This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
0. Then 3 > I and hence (3a > 1a = a. If (ice = a, this is a representation of the desired kind. Otherwise fla > a. Now, if a, (3, and y arc ordinals with (3a > y, then y has a unique repreProof.
110
The Natural Number Sequence and its Generalizations
I
C11 A P. 2
sentation in the form y = 13a1 + 01 where a, < a and #, < P. The proof is left to the reader. We apply this result with a as y to obtain a = j3a, + #, where /31 < 13. Again we have a representation of the type desired. The proof of uniqueness is left as an exercise.
EXAMPLES 7.12. Ordinal arithmetic presents a wide assortment of oddities. There follows a sketchy sampling.
(a) For n > 1, w"+' = (w" + w)2 -
(wn)2. However, it can be shown that,,
cannot be represented as a difference of squares of ordinals.
(b) The ordinal w2 has infinitely many representations as a difference of squares: w2 = [w(n + 1)]2 - (wn)2 for n = 1, 2, . (c) For every n > 1, (w + 1)"w" is an nth power of an ordinal, namely W2. On the other hand, w"(w + 1)" has no such representation. Indeed, since Wn(w + 1)n _ w2n + W2n
1 + ... f Wn
(W2 _+. On = W2n + W2n I < Wn(W + 1)n < W2n + /
In -1 + ...
-1- w + 1 = (w2 + W
That is, wn(w + 1)" lies between the nth powers of two successive ordinals and hence cannot be an nth Dower.
(d) If a = wan + w"-1 and 0 = W2n + w for n > 1, then a2 = A3; yet there is no ordinal y such that a = y3 and Q = y2. 7.13. With S = 2 in Theorem 7.11 we may conclude that every ordinal can be represented either as 2t or 2t + 1, that is, is either even or odd. For example,
(w + 1)2 = 2(w2) + 1 is odd! Again, with ;Q = w, we may conclude from Theorem 7.11 that any ordinal a can be represented as a = wE + p where p is finite. If p > 0, then a is a num-
ber of the first kind since a = w + (p - 1) + 1. It follows that every ordinal of the second kind is of the form wt. The converse statement is easily established, yielding a characterization of ordinals of the second kind as those having W as a left-hand divisor.
EXERCISES 7.1. Prove that if a relation p (restricted to a set A) is antisymmetric and if in each nonempty subset A, of A there exists an element a, such that a,pb for every b E A1, then p well-orders A. 7.2. Prove that a simply ordered set which is ordinally similar to a wellordered set is well-ordered. 7.3. Show that any infinite chain, no infinite subset of which has a first element, is of order type w*. 7.4. Establish the principle of definition by transfinite induction given in the text.
2.8
I
The Axiom of Choice and Zorn's Lemma
111
7.5. Supply the missing details in the proof of Theorem 7.4.
7.6. In the text we inferred from Theorem 7.4 that any two ordinals are comparable. Give an independent proof of comparability, using Theorem 7.5. 7.7. Prove that the sum of two ordinals is an ordinal. 7.8. Complete the proof of the assertion that the product of two ordinals is an ordinal. 7.9. Find how many different values are assumed by the sum of the ordinals 1, 2, 3, 4, and w in all possible arrangements. 7.10. Determine an arrangement of w, w2 + 1, w3, w5, and w2 for which their sum is w2 + w11 + 1.
7.11. Prove Theorem 7.10 by first proving the forward implications in (I)--(IV).
7.12. Prove that if a and P are ordinals with a > 6, then a + n > fi + n for = 1,2,... 7.13. Prove that 1 + a - a for an ordinal a iff a > w. 7.14. Find ordinals a and fl such that (a - fi) ± f3 P6 a. 7.15. Show that if a, 0, y, and 5 are ordinals such that a > f3 and y > 5, then ay > 06. Use this to prove that if a > f3 and y is an ordinal of the first kind, then ay > fiy. Then show that if cry = 0y and y is of the first kind, a = 0. This is a right-hand analogue of (VI) in Theorem 7.10. 7.16. Show that (W' -'- W)5 = (W., + w')2.
7.17. Suppose that a and f3 are positive ordinals with a -{- a = w. What is af3?
7.18. Give an example of two ordinals a and 13 such that a + fi = + a but a2 -l- f32 _ /#2 + a2.
7.19. Prove that for ordinals a and fl, af3 = f3a implies a2f32 = 2a2. 7.20. Prove that an ordinal f3 is of the second kind iff nf3 = 0 for n = 1, 2, 7.21. Prove that the product of two nonzero ordinals is a number of the first kind iff both factors are of the first kind. 7.22. Complete the proof of Theorem 7.11.
8. The Axiom of Choice, the Well-ordering Theorem, and Zorn's Lemma A theorem to the effect that all sets occurring in mathematics can be well-ordered would be extremely valuable. "Then, for instance, definitions and proof's could be fornmulated by induction for all sets, just as for the natural number sequence. In 1904, Zernielo gave a dernonstration of the well-ordering theorem which asserts that every set can be well-ordered. Soon after its publication it was pointed out by E. Borel, that the proof employed a property of sets which may be deduced easily
from the well-ordering theorem, thereby making the two properties
112
The Natural Number Sequence and its Generalizations
I
c 11 A P. 2
equivalent. The property assumed by Zcrmelo has become known as the axiom of choice. One of its formulations is the following. (AC,) If a is a disjoint collection of noncnrpty sets, then there exists a set B such that for each A in (t, B fl A is a unit set. In other words, if a is a disjoint collection of nonernpty sets, then there exists a set which has a single member in common with each member of a. Since a disjoint collection of nonemjrty sets is a partition of the union of the collection, (AC,) is clearly equivalent to: For each partition (a of a set. U there exists a subset of U consisting of exactly one
member of each member of a. Such a set is called a representative set for the partition as well as for the associated equivalence relation. The restriction in (A(.,) to disjoint collections may be circumvented by formulating it for families of sets. This version reads: If {A,} is a. family of
nonempty sets indexed by a nonempty set 1, then there exists a family {xi} with i E I such that xi C Ai for each i C 1. Intuitively, one thinks of arriving at a set B of the type mentioned in (AC,) by a constructive process; one chooses, in turn, an x from each of the sets A; this accounts for the presence of the word "choice" in the name. That it should be named an axiom is simply an indication that
no one has been able to infer the existence of such a set, in general (other than from an equivalent property of sets). 1 he axiom of choice has been the subject of serious controversy among
mathematicians. Some reject it totally on such grounds as the utter impossibility of making infinitely many selections (needless to say, it is only the case where a is infinite that the axiom injects anything new) or, on the lack of precise definition of a representative set. Others accept
the axiom for the case where a is denumerable and reject it in the uncountable case. Many accept it without any reservatin. Of those to whom the plausibility of (AC,) is indisputable, some revise their attitude
when propositions which can be proved equivalent are encountered. Several equivalent forms, which are in the nature of more useful working forms, are derived in this section. 'T'here is another category of propositions equivalent to the axiom of choice which might be catalogued as illuminating. For example, in Section 10 it will be shown that it is equivalent to the assertion that the ordering of cardinal numbers, discussed in Section 3, is a simple ordering. This is not a useful version of the choice axiom, but rather serves to point out that someone who "believes" that cardinals should be simply ordered must also "believe" the axiom of choice. Thus, such equivalent
2.8
I
The Axiom of Choice and Lorn's Lemma
113
formulations serve primarily to sharpen the delineation between the two schools of thought. Because of the diversity of opinion about the axiom of choice, it is common practice for some present-day authors to point out the occurrences of their usages of it. In cases where a new result rests on some classical mathematics this may amount to merely superficial honesty, since in much of classical mathematics the axiom of choice has slipped into proofs without being noticed, and no one has ever combed through all of it and sorted out the tainted theorems from the untainted. To the best of our knowledge the axiom of choice has been employed in the foregoing only in the proof of Theorem 4.4. Henceforth, we accept the axiom of choice as a valid principle of intuitive set theory and use
it without reservation. In this matter we are guided by Cantor, who tacitly accepted the axiom. For intuitive set theory it has the same status as the principles of extension and abstraction (Section 1.2); collectively, these three assumptions serve as a basis for the theory.
Turning to the derivation of propositions which are equivalent to the axiom of choice, we present first two variations which are so closely related to (AC,) that. they are also known by the same
(AC2) For every set X there exists a function f on the collection, U,(X) - 10 }, of nonempty subsets of X such that f(A) C A. Such a function is a choice function for X. Thus, (ACC2) asserts that every set has a choice function.
(AC3) If {A,} is a -family of none,npty sets indexed by a nonempty set I, then
X,EIAi is noncmpty.
The equivalence of (AC,)--(AC3) is easily established. That of (AC,) and (AC3) follows directly from the formulation of (A(',,) for-a family of
sets and the definition of cartesian product. Further, it is clear that (AC,) implies (AC2). To complete the proof of the equivalence of (AC,)--(AC3) it is sufficient to prove that (AC2) implies (AC,). The reader can do this easily.
'1'o bridge the gap between the axiom of choice and other useful equivalent forms, we prove a fixed point theorem due to N. Bourbaki (1939). Before tackling its proof, as well as the statement and proof of the theorem that follows it, the reader would do well to refresh his memory with regard to the definitions given at the end of Section 1.11.
THEOREM 8.1. Let E be a nonempty partially ordered set such that every chain included in E has a least upper bound in E. If
The Natural Number Sequence and its Generalizations
1 14
I
CHAP. 2
f : E --} E has the property that f (x) x for all x in E, then there exists at least one x in E such that f(x) = x. Proof. Let a be a fixed element of E. A subset A of E will be called admissible (relative to a) if it has the following properties. (1) a C A. (2) J(A) C A.
(3) if F is a chain included in A, then lub F C A.
Clearly E is admissible. Moreover, it is easily verified that M, the intersection of all admissible subsets, is admissible. Thus M is the
smallest admissible subset. It follows that if a subset M0 of M can be shown to be admissible, then MO = M. This technique is used to derive each of three properties of M (designated by Roman numerals), from which the theorem follows easily.
(I) The clement a is the first element of M. It is sufficient to prove
that the subset A = ;x C Mix > a } of M is admissible. For this we verify, in turn, properties (1), (2), and (3) of an admissible set. (1)' (2)'
(3)'
Since aCMand a>a,aCA. I.et x E A; to prove f(x) C A. Now x E A implies x E M, and hence, f(x) E M by (2). Also, x C A implies x > a and this, with J(x) > x, yields f(x) > a. Thus f(x) E A. Let to = lub F, where F is a chain included in A. Since A C M, we have F C-: M and hence to C M by property (3) of admissible sets. Also, F C A implies x > a for all x E F, and
hence to > a. Thus A is admissible, A = M, and (1) is proved.
Before Continuing, we make a definition. An element x of E is said to have property P, in symbols P(x), if y C M and y < x implies f (y) < X.
(II) If x C M and P(x), then for each z E M either z < x or z > f(x). It is sufficient to prove that the subset B = }z C MIz < x or z > J '(x) } is admissible.
(1)" aCIv!anda <x [indeed for each x C M by (I)1,so aCB. (2)" Let z C B; to show that f (z) E B. As in (2)', f (z) E M. Also, z C B implies z < x or z > f(x) by definition. if z = x, then
2.8
1
The Axiom of Choice and Zorn's Lemma
115
f(z) = f(x) so that f(z) > f(x) and hence f(z) C B. If z < x, then since P(x), f(z) < x and f(z) C B. Finally, if z > f(x), then f(z) > z > f(x) and, again, f(z) C B. (3)" Let w = lub F, where b 'is a chain included in B. As in (3)', w C M. Also, for each z C F either z < x or z > f(x). If the first alternative holds for all z C F, then x is an upper bound for F, and hence w < x, so zo C B. Otherwise, there exists a
z C F such that z > f(x). Then w > z > f(x) and, again, w C B.
Thus B is admissible, B = M, and (II) is proved. (III) Every element of M has property P. It is sufficient to prove that the subset C = [x C MIP(x) [ is admissible.
(l)"' a C M and is, moreover, the least element of M. Thus, for no z of M is z < a. Hence, a satisfies I' vacuously and, consequently, is in C.
(2) "' Let x C C; to show that f(x) C C. As in (2)', f(x) C M. It remains to prove that f(x) has property P, that is, y C M and y < f(x) imply f(y) < f(x). Applying (II) to x we have either y < x or y > f(x). The second possibility cannot hold,
for y > f(x) with y < f(x) is impossible. Thus y < x. If y < x, then f(y) < x, using property P for x. This, with x < f(x), implies f(y) < fi(x), as required. Also it is immediate that if y = x the same conclusion holds. Thus f(x) C C. (3)"' Let w = lub F, where F is a chain included in C. As in (3)', w C M. Thus it remains to show that P(ro), that is, y C M
and y < w imply f(y) < w. For this we show first that for such a y there exists y, C F such that y < y,. Indeed, if no such y, C F exists, then by (II) [Note: y, C F implies that P(y,) 1, y > f(y,) > y, for ally, C F. Then y is an upper bound
for F and hence, y > iv, which contradicts the assumption that y < w. Thus a y, C b with y < y, exists. If y < y,, then by property P for y,, f(y) < y, < w, so that f(y) < iv as required. If y = y,, then P(y) and hence, by (11), either w < y orf(y) < w. The first possibility is excluded and hence, again,
f(y) < w. Thus C is admissible, C = M, and (III) is proved. Now for the coup de grace. From (II) and (I1I) it follows that if x,
1 16
'1 he Natural Number Sequence and its Generalizations
I
CHAP. 2
y E M, then either y < x or y > f(x) > x, so M is simply ordered. Let xo = lub M. Since M is admissible, xo C M and, moreover, f(.to) E M. 'thus f(xo) < xo. But xo < f(xo) by hypothesis. It follows that f(xo) = xo.
THEOREM 8.2. The following statements arc equivalent to one another.
(I) Zcrmclo's axiom of choice: For every set X there exists a function f on the collection of noncmpty subsets A of X into X, such that for each A, f(A) C A. (II) 1-!ausdorff's maximal principle: Every partially ordered set
includes a maximal chain, that is, a chain which is not a proper subset of any other chain.
(lll) 'horn's lemma: Every
partially ordered set in which each chain has an upper bound contains a maximal
clement. (IV) Every set can be well-ordered. Proof. (I) implies (II). Let (P, <) be a partially ordered set and assume that (11) is false for it. ']'his means, if . is the family of all subsets X of P which arc simply ordered by <, that for each X in a there exists Y in a. with X C Y. That is,
ax - IYC aIXC Y} is noncmpty for each X in U. By (1) there exists a function f on }axlX C a} into a such that f(ax) C ax. "Thus g: a -*- a with g(X) = f(ax) has the property that X C g(X) for all X in U. As such, the partially ordered set (a, C), together with the function g, satisfies the hypotheses. of Theorem 8.1. (It is left as an exercise to show that if a* is a subset
of a which is simply ordered by C, then U a* = lub ct*). But X C g(X) for all X in a is a contradiction of the conclusion of 'l'heorcin 8.1. Thus, since (1) and the denial of (11) lead to a contradic-. Lion, (1) does imply (I1).
(11) implies (111). Assume that the partially ordered set (1', <) satisfies the hypothesis of (111). By (11), there exists a maximal sub-
set A of 1', simply ordered by <. Let a be an upper bound for A. Then a is a maximal clement for P. Indeed, assume that a < x for some x in 1'. "Then A U {x} is a simply ordered subset which properly
includes A. This is a contradiction.
'
2.8
The Axio,n of Choice and Zorn's Lemma
117
(111) implies (IV). Let X be any set. We consider ordered pairs (A, p) where A C X and p simply orders A. Let S be the set of all (A, p) such that p well-orders A. If (A,, pl) and (A2, p2) are members of g, define (A,, p,) < (A2, P2) ill
(a) A. S A2, (b) P, C P2, and (c) if a, C A,, a2 C A2, and 02 V A,, then (a,, 02) C P2
In other words, we require that A, be a subset of A2, that the ordering
of A2 be an extension of that of A,, and that the elements of A2, not in A,, be greater than the elements of A,, relative to the ordering of A2. It is immediate that < partially orders S. We prove next that (S, <) satisfies the hypothesis of (Ill) ; that is, a chain included in S has an upper bound in S. For a chain e C S, we propose as an upper bound, (A*, p*), where A* = U {AI(A, p) C Cl and p* = U IPI(A, p) C e}. Clearly the only question is whether (A*, p*) C S. To show this we prove that (A*, p*) satisfies the conditions stated at the beginning of Section 7. The proof that p* is antisymmetric is left as an exercise. It remains to prove that if B is a noncnnpty subset of A*, then there exists bo C B such that (b(j, b) C p* for each b C B. For such a B there exists (A,, p,) C e such that B fl A, 7`- 0. In turn, there exists bo C I3 n A, such that (be, b) C p, for all b C 13 () A,. More generally, for each b in 13 there exists p with (A, p) C C such that (bo, b) C p. Indeed, given b in 13, there exists (A, p) C e with b C A. If A e A,, then (bo, b) C p,. Otherwise, A D A, and, (bo, b) C p and so (b,,, b) C p*, as desired.
hence, p D p,. Then
Since the hypothesis of (Ill) is satisfied, we may infer the existence of a maximal element (A, p) of S. The proof will be complete if it can be shown that A = X. To this end assume the contrary, that x C X - A. We now adjoin x to A and extend the ordering p of A to one of A U {x} by defining x to be greater than each element of A.
This yields the ordered pair (A', p'), where A' = A U {x} and p' = p U { (a, x)la C A'{ . Then p' well-orders A' and hence, (A', p') C S.
Moreover, (A, p) < (A', p'), which is a contradiction, since (A, p) is a maximal element. Hence, A = X. (IV) implies (I). Let X be any set. By (IV), Xcan be well-ordered, so we assume that this is given. If A is a noncmpty subset of X, let f(A) be the first element of A. Then f is a choice function for X.
118
The Natural Number Sequence and its Generalizations
I
C H A P. 2
In mathematics, labeling a proposition with the name of an individual
usually indicates his priority to that result. This is not the case in the assignment of names to be found in the literature to the above equivalent formulations of the axiom of choice or in the extensive variety of other formulations which have proved to be useful. We have made what is a relatively common assignment of names.
It should be mentioned that prior to the emergence of a variety of statements equivalent to the well-ordering theorem, transfinite induction was a standard proof technique. This has now given way to the use of some form of Zorn's lemma or a maximal principle. f Usually the modern procedure yields a shorter proof.
EXERCISES 8.1. Establish the axiom of choice in the form (AC,) for finite collections of sets by proving, by induction, that if A, X A2 X .. X A. is empty, then at least one A; is empty. 8.2. The following is known as Hilbert's axiom. If 6' is the set of all properties P such that there exists at least one object having property P, then there exists a function a whose domain is 6' and such that e(P) is an object having property P. Prove that this axiom is equivalent to the axiom of choice (AC2). 8.3. Following A. Mostowski, let us denote by [n] for n = 1, 2, . the following case of (AC,) : For every disjoint collection (t of n-element sets A, there exists
a set B such that, f o r each A in C 3 , B (l A has exactly one member. Without using the axiom of choice, prove that [2] implies [4]. 8.4. Referring to the proof that (I) implies (II) in Theorem 8.2, show that
Ua.* = lub a*. 8.5. In the proof that (I11) implies (IV) in Theorem 8.2, show that p* is antisymmetric. 8.6. Demonstrate the equivalence of the axiom of choice with the following
statement: If A and B are nonempty sets and p is a relation with domain A and range B, then there exists a function f : A --b- B such that f C p. 8.7. Show that if p partially orders A, then there exists a simple ordering relation p' such that p' Q p and p' simply orders A. (Hint: Consider the collection of partial ordering relations which include p and use Zorn's lemma.)
9. Further Properties of Cardinal Numbers With the axiom of choice available, some extensions and some simplifi-
cations of properties of infinite cardinals are at hand. It will be recalled t Of course, Zorn's leninna is not a substitute for transfinite induction in cases of justifying definitions.
2.9
I
Further Properties of Cardinal Numbers
119
that for reasons of expediency one application has already been made to the proof of Theorem 4.4. Another occurs in our next result.
THEOREM 9.1. Any infinite set includes a subset of cardinal number No.
Let X be an infinite set. It is sufficient to exhibit a function g on N into X which is one-to-one. Let f be a choice function for X. Then we define g(O) = f(X) and, proceeding inductively, let g(n + 1) = f(X - 1g(O), , g(u) }). This is possible for each n, since otherwise X would be finite, which is contrary to assumption. According to Theorem 2.4 there exists a function g on N into X such , g(rr - 1) } ). Now g is that, for each n in N, g(n) = f(X - {g(0), one-to-one. For consider r and s in N with r s. It is no loss of Proof.
generality to assume that r < s and we do so. Then g(r) (, X , g(s - 1) } and hence g(s), as a member of this g(r), set, is necessarily distinct from g(r). Ig(0),
.
COROLLARY I
If A is an infinite set then fl > Ro. The proof is left as an exercise. .
Combining Corollary 1 with Theorem 3.5, bto is established as the least infinite cardinal. Theorem 9.1 has another interesting consequence, which places in sharp relief a basic difference between finite and infinite sets.
COROLLARY 2. An infinite set is similar to a proper subset of itself.
Let A be an infinite set. According to Theorem 9.1 we may write A as the union of disjoint sets C NJ and B. 'T'hen the set A, = { C N - 1011 U B is a proper subset of A. Moreover, f: A -- A,, where f(x) = a,, t., if x = a,, and f(x) = x if x C B, is a one-to-one correspondence between A and A,. Proo/.
We infer from this corollary and Theorem 3.3 that a set is infinite iff it is similar to a proper subset of itself. What is for us a characterization of an infinite set, has been taken as the definition of an infinite set in some treatments (Dedekind, 1883).
Another simple application of the axiom of choice produces for cardinal numbers the analogue of Theorem 7.7 for ordinals.
120
The Natural Number Sequence and its Generalizations
I
C II All . 2
TIIEOREM 9.2. If C is a set of cardinal numbers, then there exists a cardinal number greater than each cardinal in C. Proof. A (noncrnpty) set C of cardinal numbers is a disjoint collection of sets. With the axiom of choice it is possible to define a representative set of the form (t = C C1 where ,, = u. Clearly, card U Ct > u for each a in C. Hence card 2" > u for each u in C.
The hierarchy of infinite cardinals in order of increasing magnitude which was mentioned in Section 3 can now be described with more accuracy. First there is the natural number sequence according to Theorem 3.4. Next we have No, by Corollary I of Theorem 9.1 and Theorem 3.5. 'T'hen we get successively greater cardinals 2k°(= K), by application of 'Theorem 3.6. After all of these we gel. a still greater one, say p, by application of Theorem 9.2. Then Theorem 3.6 may be applied again to extend the array by 271, 2"°, , and so on. A more profound consequence of the axiom of choice is that every set of cardinal numbers is well-ordered by the ordering relation < introduced for cardinals. To prove this, along with related properties of cardinals, we employ the fact that the axiom of choice implies the wellordering theorem which, in turn, implies that every cardinal number can be represented by a well-ordered set. From this it follows, first, 2K,
that any two cardinals are comparable in view of the Corollary to Theorem 7.4. Second, it establishes a correspondence between cardinal
numbers and ordinal numbers whereby with a cardinal number is associated the (nonempty) set Z(c) of all ordinals having a representative of cardinality c. Specifically, if C is a given set of cardinals, (c, a,)jc C C and a,; is the least nrcrnber of Z. (c) I
is a function, f,-with C as do'rnain and a set A of ordinals as range. Clearly, f is one-to-one and consequently a one-to-one correspondence between C and A. Further, it is easily shown that f is an order-preserving map. It follows that the simply ordered sets C amid A are isomorphic and since one is well-ordered, so is the other. We record this result as our next theorem.
THEOREM 9.3. Any set of cardinal numbers is well-ordered. This property of cardinals is the basis of the following notation for infinite cardinals. If a is the ordinal number of the set of infinite cardinals
less than an infinite cardinal u, then u is designated by N. The desig-
2.9
Further Properties of Cardinal Nurnliers
1
121
nation of N as No is an instance of this notation. Again, Mr stands for the immediate successor of No, and consequently the continuum hypothesis may be phrased as the assertion that 2", = 1!ti. This has been extended to
the generalized continuum hypothesis which asserts that 2"- = i`t.i I. Another consequence of the axiom of choice is that multiplication and addition of infinite cardinals arc idcmpotent operations; that is, if it
is an infinite cardinal, then u2 = u and 2u = u. To prove these
results we use the fact that the axiom of choice implies Zorn's lemma.
The following proof of the idcmpotcncy of multiplication is due to Zorn (1944).
To facilitate the exposition we introduce a temporary definition: To say that a set A has property 9, symbolized i(A), shall mean that A has at least two distinct members and A2 = 1. Additional properties of such sets, as well as some properties of related sets, are derived below.
I.
If i(A), then A is infinite. The proof is left as an exercise. If #(A) and A > 11, then A + 13 = t1. 12-
Let ao and ar be distinct members of A and suppose that 13NA0CA.Then
Proof.
A
Proof.
The assertion is a corollary of 12-
If 4(A) and 0 < lJ < A, then All = A. Let b E B and let B - A' C A. Then
A=AX U
I.
If ,4(A,), Ao C A, and I - Ao < 71o, then 4(A).
The proof is left as an exercise. Is.
Let A be a set with disjoint subsets Ao and Al such that (i) there
exists a one-to-one correspondence f between Ao and Au X Ao, (ii) 9(A,), and (iii) lo < A. Then there exists a one-to-one correspondence g between C = Au U Ar and C X C such that g D f.
The Natural Number Sequence and its Generalizations
122 Proof.
f
cHAp, 2
Such a correspondence exists provided there is a one-to-one
correspondence between the sets C - Ao = A, and C X C - Ao X
Ao = (Ao X A,) U (At X Ao) U (At X A,). In turn, this is the case if Ao X A, + A, X Ao + A, X A, = A1,. This is trivial if A = 0. If Ao - 0, its validity follows from 14 and 13.
The next two results are special cases of I6. That is, both assert the existence of a one-to-one correspondence between a subset C of 'A and C X C which properly extends a given mapping of the same variety. L. Assume that Ao is a finite subset of the infinite set A and that f is a one-to-one correspondence between Ao and Ao X Ao (so that As has at most one member). Then there exists a proper extension g of f of the type described in 16.
As an infinite set, A has a denumerable subset arid, therefore, a subset A, such that :1(Ar) (Theorems 9.1 and 4.3). Since Ao is finite, Proof.
we may assume that Ao (1 A, = 0. Moreover Ao < A,. Thus, 16 may be applied and provides the desired extension.
Assume that Ao C A, g (Ao), A - A > A0, and that f : Ao -,. A0 X Ao is a one-to-one correspondence. Then f has a proper extension g of the type described in Is. Proof. In view of 16 and 12 it is sufficient to determine a subset A, of A - A,; which has A0 as cardinal number. Such a set exists by virtue of the assumption that A - Ao > A0. 18.
We can now quickly dispose of the principal theorem.
THEOREM 9.4.
If A is an infinite set, then ii (A). Irr other words, if u is an infinite cardinal, then u2 = it. Proof. Consider the collection if of all one-to-one correspondences: f : A' -+- A' X A', where A' C A. This collection is. nonempty since A has a denumerable subset and, as a collection of sets, is partially ordered by inclusion. Each chain e included in if has an upper bound in 9; indeed, U e qualifies. The proof of this is left as an exercise. Hence, by Zorn's lemma, if has a maximal element fo. Now fo is a a subset A0 of A and Ao X Ao one-to-one correspondence be
having no proper extension. In view of 17 the set Ao is not finite and, therefore, is idempotent. So, according to I8, it is false that
2 .9
I
Further Properties of Cardinal Numbers
123
- Ao > Ao, and hence A - Ao < Au according to Theorem 9.3. Since g(Ao), the last inequality implies, using I5, that 9(A).
'T'HEOREM 9.5. If u is an infinite cardinal, then 2u = u. Proof.
In view of the preceding theorem this follows from I;,.
It is left as an exercise to deduce from this theorem the following, whereby the arithmetic of infinite cardinals is reduced to a triviality.
TI-IEOREM 9.6. If u and v are infinite cardinals, then u + v uv = max
{ u, v I.
EXERCISES 9.1. Consider the following three assertions about a set X. (i) X is infinite. (ii) There exists a one-to-one mapping on X onto a proper subset. (iii) There exists a one-to-one mapping on N into X. Show that each of these assertions implies the other two if the axiom of choice may be used. Which of these six implications can be proved without the axiom of choice?
9.2. Expand the proof in the text that any set of cardinals is well-ordered. 9.3. Prove property I,. 9.4. Prove property I5. 9.5. Supply the missing part of the proof of Theorem 9.4. 9.6. Prove Theorem 9.6. 9.7. Extend Theorem 9.6 to the case where only one cardinal is infinite. 9.8. Give a proof, using Zorn's lemma but no properties of well-ordered sets, that any two cardinal numbers are comparable. Hint: Recalling the analysis in Section 3, it is sufficient to prove that if A and B are sets, then there exists Bo and either A = A subsets Ao and Bu of A and B respectively such that Ao or Bo = B. Prove this by applying 'corn's lemma to the partially ordered set (1, C), where if is the collection of all one-to-one correspondences f: A' -i- B'
with A'CAandB'CB. 9.9. Devise a direct proof that if u is an infinite cardinal, then 2u = u. Hint: be the collection of all pairs (A, fA) where A is a subset of S such that A X T = A and fA is a lixed mapping which demonstrates the similarity of A X T and A. Show that e5' is nonernpty and is partially
Let S = u and 7' = (0, 1). Let
ordered by the relation <, where (A, fA) < (B, fit) means that A C B and AI A X T = fA. Then deduce that Zorn's lemma may be applied. 9.10. Deduce from the result in Exercise 9.9 that if the set B has an infinite subset A such that A < B, then B - A = B.
124
The Natural Number Sequence and its Generalizations
I
CHAp. 2
9.11. Use the results of Exercises 9.9 and 9.10 to give another proof of Theo. u. Then consider the collecrem 9.4. Hint: Let u be an infinite cardinal and tion of all pairs (A, fn), where A is a subset of S such that A X A = A and fA is a fixed mapping which demonstrates the similarity of A X A and A. 9.12. Show that for infinite cardinals r, s, u, and v, if r < s and u < v, then
r + u < s + v and ru <.cv.
10. Some Theorems Equivalent to the Axiom of Choice In the preceding section we proved, among other things, that the axiom of choice implies that (i) any two cardinals are comparable, (ii) if u is an infinite cardinal, then u2 = u, and (iii) if u and a are infinite
cardinals, then u + v = uv. It is a remarkable fact that each of (i), (ii), and (iii) is equivalent to the axiom of choice. As a preliminary to
the proofs required to substantiate this statement, we return to the discussion of the relation between cardinal and ordinal numbers which
appears prior to Theorem 9.3. If we understand by an aleph a trans. finite cardinal number which has a well-ordered set as a representative, then the first step in the proof of Theorem 9.3 amounts to the observation that the axiom of choice implies that every transfinite cardinal is an aleph. The converse of this implication is easily verified, so the axiom of choice is equivalent to the assertion that every transfinite cardinal is an aleph. Without using the axiom of choice it is possible to prove the following results concerning alephs.
TIIEOREM 10.1. To each cardinal number c there corresponds an aleph ti(c), which is not less than or equal to c. Proof. If c is a finite cardinal, we may choose Ho for the cardinal in question. So assume that c is transfinite. We now make a definition.
For an ordinal number a, all sets of order type a are similar; we denote the common cardinality of such sets by a and call this the power of a. Now let A be the set of all ordinals cx such that a < c. Then A is an infinite set because every natural number belongs to it and A is well-ordered, being a set of ordinals. I fence the order type of A is a transfinite ordinal l; and & is some aleph, rt(c). We prove that $t(c) is not less than or equal to c by deriving a contradiction from the c. Then, since R (c) = &, we contrary assumption. So assume R(c) have i < c, whence C A. Hence 13 = (01(3 < El C A, since S < l; implies that 0 < E c and E (Z B. Now B = E by 'T'heorem 7.5 and
2 .10
I
Some Theorems Equivalent to the Axiom of Choice
125
by the definition of . It follows that A is ordirially similar to one of its initial segments, contradicting Theorem 7.2. A
THEOREM 10.2. Let c be a transfinitc cardinal number and 1`t an alcph. If cR = c -{- K, then either c > b or c < R. Proof. Let C and A be disjoint representatives of c and fit, respectively. By assumption we may take A to be well-ordered. Then
CXA=cm =c - Fm, and hence there exist disjoint subsets C, and Ar of C X A such that
C,UA,=C'XA, Z;,=c, and fl,=M. Now, either (i) there exists an element b, of Csuch that for all a in A, (b,, a) C C1, or (ii) for every element b of C, there exists an element a of
A such that (b, a) V C1. If (i) holds, let Az be {(b,, a)la C A}. Then Az C C1, i = ti, and hence c > t. If (ii) holds, let rp(b) be the least clement of A such that (b, Wp(b)) C A, and let Cz be {(b, (p(b))Ib C C). Then Cz C A,, t:;z = C:, and hence, r. < R.
We can now prove the theorems in question.
THEOREM 1 0.3 (IIartogs). The axiom of choice is equivalent to the assertion that any two cardinal numbers are comparable. Proof. It remains to prove that if any two cardinals are comparable, then the axiom of choice is valid. Let C be a given set and c = G. In view of Theorem 10.1 and the assumed comparability of cardinals,
there exists an alcph R(c) such that c < bt(c). It follows that C is similar to a subset of a well-ordered set, whence follows the existence of a relation that well-orders C.
THEOREM 10.4 (Tarski). The axiom of choice is equivalent to the assertion that if u and v are infinite cardinal numbers, then u -l- v = uv.
It remains to deduce the axiom of choice from the hypothesis u + v = uv for infinite cardinals a and v. Let c be an infinite cardinal and K(c) be the aleph of Theorem 10.1. Theorem 10.2 is applicable and we conclude that either c > bt(c) or c < KK(c). The inequality c > K(c) is impossible in view of Theorem 10.1. Hence c < 1!I(c), from which it follows that every transfinite cardinal is an aleph. This, in turn, yields the axiom of choice. Proof.
126
The Natural Number Sequence and its Generalizations
I
C H A P P. 2
'I' I I E 0 R E M 10. 5 (Tarski). The axiom of choice is equivalent to the assertion that if It is an infinite cardinal number, then u2 = u. Proof. It remains to deduce the axiom of choice from the assumption that 112 = u for infinite cardinals It. Let c and d be infinite cardinals. Then c2 = c, d2 = d, and (c + d)2 = c -}- d. Since (c + d)2 = c2 + 2cd + d2, it follows that c + d = c +- 2cd -I- d, whence cd < 2cd < c + 2cd A- d = c + d.
But we may also set c = c, +- 1 and d = d, -}- l for cardinals c, and d,, to conclude that cd = (c, + 1) (d, + 1) = c,d, + c, + di + 1
> 1 +c,+d,+1 =c+d.
Hence our assumptions imply that c + d = cd for infinite cardinals.
But this implies the axiom of choice according to the preceding theorem.
EXERCISES 10.1. Deduce the axiom of choice from the hypothesis that every transfinite cardinal is an aleph. 10.2. Another of Tarski's results concerning the axiom of choice asserts that it is equivalent to the proposition if2u
11. The Paradoxes of Intuitive Set Theory The theory of sets which has been l)resemcd so far is that used by mathematicians in their daily work. Many theorems which are accepted by a majority of the mathematical couttnuttity, both past and present,
2.11
(
The Paradoxes of Intuitive Set Theory
127
rely on this theory. Unfortunately, it is not free of difficulties. Indeed, as mentioned earlier, it yields contradictions. However, matters are not as bad as this fact might indicate. This is suggested, at least, by its very inclusion in a present-day text. A firm vantage point from which to view the "reliability" of Cantor's theory is one of the axiornatizations which have been devised. The version of axiomatic set theory which we shall
discuss later (Chapter 7) is based on the conclusion that the known contradictions of Cantor's theory are associated with "too large" sets. These are not the sort which occur ordinarily in mathematics. Before discussing the best-known contradictions, a preliminary remark
is in order. A cornerstone of Cantor's theory is that we are guided by intuition in deciding which objects are sets and which are not. For this
reason the name "intuitive set theory" is often applied to it. The implicit faith that individuals have in their intuition seems to be responsible for the contradictions of intuitive set theory commonly being
called paradoxes. This is a misnomer, since the connotation of the word "paradox" is that of a seemingly, or superficial, contradiction, whereas the examples in question are bona fide contradictions. As such,
they should be labeled "antinomies," which is the correct technical word to describe their status. Few do this, however. The principle of intuitive set theory which asserts that every property determines a set may be regarded as its Achilles' heel. Indeed, when used without restriction, this principle yields at least three sets from which logical contradictions can be derived. The three which we shall discuss
are called the Russell paradox, the Cantor paradox, and the RuraliForti paradox. The simplicity of the Russell paradox is apparent from the fact that it was possible to mention it as early as Section 1.2. We consider it now in more detail. The formula which Russell considered is xCZx or --i(xCx) where, in the second version, we have used one of the standard symbols for negation (--1). According to the principle of abstraction, this formula determines a set R such that x C R iff --i (x E x). In particular,
R C R ill -' (RC R), which is logically equivalent to the contradiction
R C R and -(R C R). We note that R (assuming that it exists) is a very large set. For example, its defining property is satisfied by all objects which are not sets, since
128
The Natural Number Sequence and its Genet alizaliuns
i
(: t-t A P. 2
such objects can have no members and in this event cannot be members of themselves. Moreover, the property is satisfied by most sets; to mention two examples, the set of even integers is not an even integer, nor is the set of all polynomial functions a polynomial function. It is only when one turns to such figments of the imagination as the set of all sets, or the set of all abstract ideas, that violations of the defining property of R can be found.
The Cantor paradox, which was discovered by Cantor in 1899 but which was first published only with his correspondence in 1932, is derived from the set defined by the formula x is a set.
Let e be the set defined by this formula. Then C is the set of all sets. By Theorem 3.6, W(C) > G. Also, since C is the set of all sets and 6'(C) is a set (the set whose incnibers are the subsets of C), o'(C) C C. Hence, (i'(C) < c or, in other words, it is false that (t'(C) > G. Thus, it follows that both "m((O) > 0" and the negation of this statement are valid. This is a contradiction.
The Burali-Forti (1897) paradox, which was known to Cantor as early as 1895, is derived from the set defined by the formula x is an ordinal number. The set 1', which it determines by virtue of the principle of abstraction, is that of all ordinal numbers. As a set of ordinals, 1' is well-ordered according to Theorem 7.6, and hence has itself an ordinal number y. By Theorem 7.5, s(y) is a well-ordered set of ordinal number y; hence
s(y) is ordinally similar to F. With F as the set of all ordinals, y C 1' and hence s(y) is an initial segment of 1'. Thus, we have proved that I' is ordinally similar tQ one of its initial segments. This is a contradiction of Theorem 7.2. Instead of offering the above paradoxes as proofs of the assertion that the unrestricted use of the principle of abstraction yields a contradictory
theory, we may say that if we adhere to ordinary logic, then the paradoxes demonstrate that it is false that corresponding to every property there is a set of objects having that property. Interestingly enough, the converse is also false. That is, it is false that every set has a defining property. The well-known proof of this. is due to Skolcin (1929)
and is as follows. It is possible to map the set of real numbers into a collection of sets in a one-to-one fashion. For example, we can assign to a real number x the set of all real numbers less than x. Since the set
Bibliogrn/ikical Notes
129
of all real numbers is uncountable, it follows that there exists an uncountable collection of sets. So, if every set has a defining property, the set of defining properties is uncountable. On the other hand, a property (written in English) is a finite sequence of letters of the English alphabet.
The set of all such sequences is denumerable so that, in particular, the set of all properties is denumerable. Hence, there exist sets without defining properties.
Intuitive set theory with its paradoxes certainly invites a critical examination with the goal of creating a theory which is both consistent and which enjoys as many features of the intuitive theory as is possible.
Of the points of departure which may be taken in this matter, that of developing set theory as a formal axiomatic theory has been popular. The present-day status of such axiomatic theories is this: they are flexible enough to permit one to carry on essentially as in intuitive set theory, and they circumvent the classical paradoxes (and thus suggest that they are consistent) ; however, no one of them has been proved to be consistent.
BIBLIOGRAPHICAL NOTES Section 1. The development of the basic properties of the arithmetic of the natural numbers from the Peano axioms in Dedckind (1888) (reproduced in Dcdekind (1932)] is well worth reading. In H. Wang (1957) there is an interesting account of how Dedekind arrived at his characterization of the natural numbers. The standard classical account of the development of properties of natural numbers and their extension to the real numbers (see the next chapter) is E. Landau (1930). Section 2. An excellent account of both proof and definition by induction is to be found in Rosser (1953). Sections 3-7. For full accounts of the topics considered in these sections, W. Sierpinski (1958) and A. Fraenkel (1961) should be consulted. Sections 8-10. For more complete accounts of consequences of the axiom
of choice and propositions which are equivalent to it, Rosser (1953) and Sierpinski (1958) should be consulted. The proposition known as Zorn's lemma appears in M. Zorn (1935). Theorem 10.3 is in F. Hartogs (1914). The various propositions which are equivalent to the axiom of choice and which have been credited to A. Tarski appear in Tarski (1923).
CIIAP TER
3
the Extension of the Natural Numbers to the Real .Numbers
IN THIS CHAPTER we carry out another variety of extension of (N, -l-, , <), f the systemn discussed in Sections 2.1 and 2.2. Three successive extensions are made, the last of which yields the real number system. The first of these may be described as the completion of N_ with
respect to addition-- that is, the minimum enlargement of N to insure the solvability of all equations of the type x + n = nz with n, m C N. The extended set is the set Z. of integers. The second extension amounts
to the completion of Z with respect to multiplication--that is, the minimum enlargement of 'Z to attain the solvability of all equations of the form. xb = a with a, b C Z and b s 0. The resulting set is the set Q of rational numbers. The third extension amounts to the completion ofO with respect to order---that is, the minimmum enlargement of a which provides least upper bounds for nonempty subsets of Q which have upper bounds. In addition to those theorems which are of permanent interest, any
development of the real number system includes a great number of results having just temporary interest (for example, results which justify various definitions). Each statement of the latter sort is labeled a lemma, and if no proof is in evidence the reader can count on his being asked to supply one in an exercise.
Finally, we mention that elementary properties of operations and relations for natural numbers are used without explicit reference.
1. The System of Natural Numbers On the basis of definitions and theorems appearing in Sections 2.1 and 2.2, the natural number sequence (N, ', 0) determines the system t It better suits our presentation to adopt <, instead of <, as the basic ordering relation in U.
3.1
1
The System of Natural Numbers
131
of natural numbers, (N, -I-, , <), by which we mean the set N together with the two binary operations and ordering relation which have been defined in this set. Below is a list of those properties of +, , and < and their interrelations upon which this chapter is based. These were all derived as theorems in Sections 2.1 and 2.2 from the assumption that (N, ', 0) is an integral system. Thus, for one who has studied Sections 2.1 and 2.2, this section (which is preliminary to the developments described
in the above summary) is simply an abstract of already demonstrated properties of the system of natural numbers. For anyone who, for some other reason, admits the following as valid properties of (N, +, , <), the chapter is self-contained.
The properties of (N, +, , <) to which we call attention are the following.
A,. A2.
x + (y + z) = (x + y) + z.
x+y =y+x.
A3. O+x=x. A4. x+z=y +zorz+x=z+y implies thatx =y. Ml. x(yz) = (xy)z. xy = yx.
M2.
Ms. M4. D.
1X = X.
xz = yz or zx = zy, and z x(y + z) = xy + xz.
0, imply that x = y.
Further, the relation < has the following properties.
x < y and y < z imply that x < z (transitivity). For each pair x, y of natural numbers, exactly one of x < y, x = y, y < x hold (trichotomy). 03. (N, <) is a well-ordered set. 01. 02.
OAR.
x
otonicity of -1- with respect to <). OA2. x -l- z < y + z or z + x < z + y implies that x < y (cancellation property of + with respect to <).
OMI. If z > 0, then x < y implies that xz < yz and zx < zy (monotonicity of with respect to <). OM2.
If z > 0, then xz < yz or zx < zy implies that x < y (can-
cellation property of with respect to <).
A comment about some of the terminology used above is in order. The meaning of the statement that an operation has the cancellation.
132
Extension o f the Natural Numbers to the Real Numbers
I
CHAP. 3
property with respect to some relation at hand can be inferred immediately from 0A2. (Although 0M2 involves a restriction, no special terminology will be introduced as a reminder of this restriction.) When
we state simply that an operation has the cancellation property, we shall mean with respect to the equality relation. 't'hus a binary operation * has the cancellation property ifl' each of x * z = y * z and z * x = z *y implies that x = y. Further, the meaning of the statement that some binary relation is trichotomous or that a binary operation is monotonic with respect to some relation should be clear from the above, examples.
Although the less than relation was given a central position in our development of the theory of the natural number system, it can be introduced as an offshoot of the operations of addition and multiplication and the notion of positiveness which stems from the definition of a positive natural number as a nonzero natural number. Indeed, since x < y iff there exists a positive natural number z such that x + z = y, this characterization of less than may be taken as the definition of less than in terms of addition and positiveness. Then properties of less than can be derived as consequences of properties of positive elements, properties of
addition, and properties of multiplication. In such a treatment, parts As and Mr, of Theorems 2.1.5 and 2.1.7 respectively (which may be stated as "The sum and the product of two positive natural numbers is a positive natural number") occupy a key role. As an illustration we derive 01 within this framework. Assume that x < y and y < z. Then there exist positive natural numbers u and v such that x -1- u = y and y + v = z. Hence x + (u -}- a) = z. Since it positive and v positive imply that u + v is positive, it follows that x < z. We have called the reader's attention to the foregoing approach to the theory of order for the natural numbers because we shall employ it in each of the forthcoming extensions of the system of natural numbers.
2. Differences This section includes the necessary preliminaries for a definition of the integers and a rapid development of their properties, all of which is presented in the next section. In this section the letters "m," "n," "p," and "q" will designate natural numbers. The intuitive motivation for our point of departure is the observation that a solution of x + n = m is determined solely by m and n in a specific order. Thus ordered pairs
3.2
1
Ui(Je)cnccs
133
of natural numbers become the object of study. The way in which one might naively expect such objects to behave in view of their intended role is the source of the succession of definitions which we make.
By a difference we shall mean an ordered pair ()n, n). In the set N X N of all differences we introduce the relation ^'d (the subscript is for "difference") by defining (in, n) ',t (p, q) in + q = p + n.
if
LEMMA 2.1. -,t is an equivalence relation on N X N. We shall call a difference (m, n) positive iff m > n. Two fundamental properties of positive differences arc stated next.
LEMMA 2.2. If (in, n) is positive and (m, n) ^'d (p, q), then (p, q) is positive. If (in, n) is positive, then there exists a difference (p, 0), with p > 0, such that ()n, n) -,) (/), 0).
An operation, which we call addition and symbolize by f-, is defined for differences by
(in, n) + (p, q) = (m + p, n + q) Clearly, addition is a binary operation in N X N. The motivation for the definition is the expectation that if x + n = in and y + q = p, then it should follow that (x + y) + (n + q) = in + p. Properties of addition which interest us are given next.
LEMMA 2.3. If x, y, u, and v arc differences and x -d u and y ^-',i v,
then x+y' 'du+v.
LEMMA 2.4. Addition of differences is associative and conunutative. The sum of two positive differences is a positive difference. Further, addition is cancellable with respect to -,r.
LEMMA 2.5. If x and y are differences, then there exists a difference z such that z + x
Another binary operation in N X N, which we call multiplication and symbolize by , is defined for differences by (in, n) . (p, q) = (mp + nq, rnq + )zp).
Usually we shall write "xy" instead of "x y" for a product of differences.
134
Extension o f the Natural Numbers to the Real Numbers
I
CHAP.
3
LEMMA 2.6. If x, y, u, and v are differences and x -d u and y rvd v, then xy ^'d UV-
LEMMA 2.7. Multiplication of differences is associative and commutative, and distributes over addition. The product of two positive differences is a positive difference. Further, multiplication is cancellable with respect to -d for differences other than those of the form (m, in).
EXERCISES 2.1. Prove Lemma 2.1. 2.2. Prove Lemmas 2.2 and 2.3. 2.3. Prove Lemma 2.4. 2.4. Prove Lemma 2.5. 2.5. Prove Lernma 2.6. 2.6. Prove Lemma 2.7.
3. Integers Recalling Lemma 2.1, we define an integer to be a -,,-equivalence class. We shall write [x1
for the equivalence class determined by the difference x. (The new subscript is for "integer.") The set of integers will be symbolized by Z. We shall call an integer positive iff one of its members is a positive difference. It follows from Lemma 2.2 that if [x]; is positive, then every member of [x], is positive. The set of positive integers will be symbolized by Z-1 .
We consider next a relation from Z X Z into Z : {(([x];, [y],), [x d- y];)l x and y are differences}.
According to Lemma 2.3 this relation is a function which, by virtue of
its form, is a binary operation in Z. We call this operation addition 'and symbolize it by +. Thus,
[x]i + [y], _ [x + y],.
LEMMA 3.1. Addition of integers is associative and commutative, and has the cancellation property. Further, the sum of two positive integers is a positive integer.
3.3
I
Integers
135
LEMMA 3.2. If x and y are integers, then there exists exactly one integer z such that z + x = Y.
From this result it follows that if x is an integer, then there exists exactly one integer, which we call the negative of x and symbolize by
-x, such that
(-x) + x = x ± (-x) = [(0, 0)1,. Finally, we consider the following relation from Z X Z into Z: {(([x];, [y],), [xy];)l x and y are differences}. According to Lemma 2.6 this relation is a function which, by virtue of its form, is a binary operation in Z. We call this operation multiplica-
tion and symbolize it by . Thus [XI.
[_fl, = [.ry l,.
LEMMA 3.3. Multiplication is associative and commutative, distributes over addition, and has the cancellation property if (0, 0) is not a member of the factor to be canceled. Further, the product of two positive integers is a positive integer.
Now let us tidy up our notation for the integers. The first step is the observation that the set Z° of integers of the form [(n, 0) J; with n E N and the set of integers of the form [(0, m)], with in C N - {0} are dis-
joint and exhaust Z. The former statement is obvious. To prove the latter, consider any integer [(p, q)Jj. Exactly one of p > q and p < q holds. In the former case, p = q + it with it C N, and hence [(p, q)]i = [(n, 0)] C Z°. In the latter case, q = p -l- in with in C N - {0) and [(p, q)]i = [(0, rn)J which completes the proof. It is a straightforward exercise to demonstrate that the ordered triple whose coordinates are, in turn, Z°, the map on Z° which takes [(n, 0)]; into [(n + 1, 0)J and [(0, 0)J; is an integral system. 't'heorem 2.1.8 implies that the mapping f on N into Z such that f(n) = [(n, 0)]j is one-to-one, onto Z°, and preserves addition, multiplication, and less than. f We summarize these properties off by calling it an order-isomorphism of N onto Z° and indicate the relationship of L° to N by referring
to Z° as an order-isomorphic image of N (or, saying that L° is orderisomorphic to N). Parenthetically we remark that it should be clear t'Tu be precise, Theorem 2.1.8 states that f(x + y) = f(x) -I- f( Y), f(xy) = f(x)f(y), and x < y iflf f(x) < f(y). The last property implies, in turn, that x < y iff f(x) < fly), according to Exercise 1.11.9.
136
Extension of the Natural Numbers to the Real Numbers
I
ctIAP. 3
that these definitions are applicable to any two systems each of which consists of a set along with two binary operations and an ordering relation in that set. Thus we may apply the definitions to other such pairs of systems. The order-isomorphism of N onto Z° suggests that we call the members of Z° the integers which correspond to the natural numbers as names for them. We shall do this, and adopt "Oi," "1 i," "2i," which means we agree that ni = {(n, 0)]t
if
n C N.
Since the remaining integers (that is, the members of _L - Z°) have the form 1(0, in)]i with in C N - {0], and since {(0, m)]i = - [(in, O)]i = -ini,
we acquire "-1i," "-2i,"
as names for the so-called negative
integers. Ilcnceforth we may write, therefore, Z = { ... , - 2 i, -1 i, 0;, I i, 2 i, ...) . We summarize our results concerning (Z, +, , Oi, 1 i, Z ), the system of integers, in the following theorem. The theorem does not include all the properties which have been stated. However, in the exercises for this section, the reader is given the opportunity to show that the properties listed in the theorem are complete in the sense that from them follow as logical consequences all others which have been mentioned or might be expected. In particular, it is implied that from the properties listed it may be inferred that for each integer x the equation z + x = Oi
in part (4) has a unique solution (which we have already agreed to
symbolize by -x). Then the notation "y - x" [which appears in part (14) of the theorem] may be introduced as an abbreviation for "y -i- (-x)." Further, the exercises call for the derivation of all expected properties of less than, as defined in part (14).
TIIEOREM 3.1. The operations of addition and multiplication for integers, together with Oi, 1 i, and the set Z+ of positive integers, have the following properties for all integers x, y, and z.
(1) x + (y + z) = (x + y) + Z.
(2) x + y =y+x.
(3) Oi + x = x. (4) There exists an integer - such that z + x = 0i. (5) x(yz) = (xy)z. (6) xy = yx.
3.4
I
Rational Numbers
137
(7) 1 ix = x.
(8) x(y + z) = xy + xz. (9) xz = yz and z 54 Oi imply that x = y. (10) 0i 1 i.
(11) x,yEZi imply that
x+yCZ+.
(12) x, y E Z+ imply that xy C Z'-. (13) Exactly one of x C Z+, x = 0, -x C Z I- holds.
(14) If < i is defined by x < i y if y - x c z+, then < i simply orders Z and well-orders (Oil U Z+. EXERCISES 3.1. Prove Lemma 3.1. 3.2. Prove Lemma 3.2. 3.3. Prove Lemma 3.3. 3.4. Prove part (13) of Theorem 3.1. Remark. Exercises 3.5-3.8 arc concerned with proving that from the propertics of (-7,, +, , 0i, 1;, Z+) in Theorem 3.1 can he deduced the other propertics mentioned in this section and the familiar properties of less than. 3.5. From properties (1), (3), and (4) of addition, prove that (i) addition has the cancellation property, (ii) for each x the solution of z -I- x = Oi is unique, and (iii) for each x and y, the equation z + x = y has a unique solution. 3.6. Prove each of the following properties of negatives of integers:
-(x -f y) = -x -y, (-x)y = -(xy), (--x)(-y) = xy. 3.7. Using properties of addition and multiplication, prove that
(-1i)x = -x. 3.8. Prove each of the following properties of the system of integers. (i) x is positive iff 0 < i X. (ii) The square of a nonzero integer is positive. (iii) < i is transitive. (iv) For each pair x, y of integers, exactly one of x
(v) x
(vi) If 0
4. Rational Numbers The steps which precede the definition of a rational number parallel
those which lead to the definition of an integer. Now we concern ourselves with the solution of equations of the form xb = a where a
1 38
Extension of the Natural Numbers to the Real Numbers
I
C I3 A P .
3
and b are integers and b 54 0;. So again we consider ordered pairs of the numbers at hand but with the quotient (instead of the difference) in mind as the intended interpretation. Since the formal developments are so similar to those in the two preceding sections, our treatment will be rather summary. The letters "a," "b," "c," and "d" will designate integers in this section. An ordered pair (a, b) with b 0 0; will be called a quotient. The quotient (a, b) will be written as a b
The relation -y is introduced into the set of all quotients by defining
bd a
c
iff ad=bc.
This is an equivalence relation on the set of all quotients and has the further property that ac
a
bcb
if c O 0;.
We shall call a quotient b positive if ab is a positive integer. Further, we introduce operations of addition and multiplication into the set of quotients by way of the following definitions: a
c
b
+d
ad -I--bc bd
a
c
ac
b
d
bd
Since b 0 0; and d 0 0; imply that bd 34 0 these are operations in the set of quotients.
LEMMA 4.1 . If x, y, u, and v are quotients and x rvq u and y -y v, then x + y Ny u + v, xy Nq uv and, if x is positive, then u is positive.
In summary, this lemma asserts that the equivalence relation defined for quotients has all expected substitution properties. We forego proving for quotients the analogues of the properties derived in Section 3.2 for addition, multiplication, and positive elements. Instead, we turn to the Nq-equivalence classes to obtain the rational numbers.
A rational number is a -,equivalence class of quotients. The ra-
3.4
I
Rational Numbers
139
tional number having the quotient x as a representative we write, for the moment, as [x],.
The letter "s" is intended to refer to "rational"; we do not use "r" since we want to reserve it for real numbers. The set of rational numbers will be symbolized by Q.
We shall call [x], positive iii it contains a quotient y such that y is positive. It follows from Lemma 4.1 that if [x], is positive then each of its members is positive. The set of positive rationals we symbolize by Q+. The definitions of addition and multiplication for rationals are [x], + [y]n = [x + y13, [y], = [xy]8. Of course, Lemma 4.1 plays a crucial role in these definitions. Next we make a further definition: [XI.
a,=[a] for aCZ. Clearly, [ (a, a,) I a C Z) is a function on Z into Q. Further, it is oneto-one and, since
a, + b, _ (a -I- b) a,b, _ (ab),,
the operations of addition and multiplication are preserved under this mapping. Finally, the image a, of an integer a is a positive rational number ifl' a is a positive integer. This last property implies that if <, is the ordering relation which can be defined in O in terms of its positive
elements (see below), then a <; b ill' a, <, b,. Thus, the mapping a -+- a, is an order-isomorphism. The members a, of this order-isomorphic image of Z in 0, we shall call integral rational numbers. There follows one comprehensive theorem concerning properties of (Q, +, , 0 1 (Q 1), the system of rational numbers.
THEOREM 4.1. The operations of addition and multiplication for rational numbers, together with 0 1 and the set () i- of positive ratiohals have the following properties for all rationals x, y, arid z.
(1) x+ (y+z) _ (x+y) +z. (2) x + y =y+x. (3) 0, + x = x.
(4) There exists a z such that z + x = 0,.
140
Extension of the Natural Numbers to the Real Numbers
I
C 11 A P . 3
(5) x(yz) = (xy)z. (6) .y = yx. (7) 1.x = x. (8) If x 7-1 0., there exists a z such that zx = 11. (9) x(y -I- z) = xy + xz.
(10) 1.7 0.. (11) x,yCQ,+imply that (12) x, y C Q' imply that xy C (2;t.
x+yCQt.
(13) Exactly one of x C QF-, x = 0., -x C Q'- holds.
(14) If P is the intersection of all subsets of Q3 which contain 1, and are closed under addition, then, for each x C Q1', there exist a, b C P such that xb = a. In the exercises the reader is asked to prove the various parts of this theorem [including a more familiar formulation of (14)] and to derive some of the immediate consequences of these properties of the system of rational numbers. Certain results in the latter category are worthy of comment. First, since addition of rationals enjoys the same properties as does addition of integers [properties (1)- (4) of Theorems 4.1 and 3.1, respectively], the results [derived from (1)-(4) of Theorem 3.1 J which appear in Exercises 3.5 and 3.6 hold for rationals.
Next, since the basic properties of multiplication [parts (5), (7), and (8) of Theorem 4.11 for nonzero rationals mimic properties (1), (3), and (4) of addition, with 1. in place of 0., we may infer the following multiplicative analogues of the results in Exercise 3.5.
(i) xz = yz and z 0, imply that x = y. (ii) For each x 0, the solution of zx = 1, is unique. This solution is called the inverse of x and is symbolized by x-'. (iii) For given x and y with x 5 0., the equation zx = y has a unique solution. Finally, we call attention to the fact that if less than, which we symbolize
by <., is defined in Q by ilf y - xCQ' then it enjoys all of those properties stated in Exercise 3.8 for
x<.y
We are now in a position to simplify the notation for rationals. The string of identities CaavJA
t
iiJ=
l 8
= [1J]e L1 tie'
= aiba
3.4
I
Rational Numbers
141
shows that each rational can he written in terms of integral rational numbers.
We shall drop the subscript "s" from now on (the context should make plain whether a. or a is the appropriate entity) and further, agree that b or (when convenient) alb
is another name for ab-1. In this way we obtain the familiar notation for rationals. In practical terms this means that we agree to adopt names of repre-
sentatives (that is, members) of rational numbers as names of rational
numbers. To clarify this remark, let us consider, for example, the rational number C[(2, Mil «3, 0)]=B
By our convention, "2/3" is a name of this rational number. The statement "2/3 = 4/5" means that "4/5" is another name of the sartu: number. This is true iff C3JA = C5Js' which, in turn, is true ill 2 5 = 4 3. Since 2 5 4 3, the original statement is false. In general, the same type of analysis yields the following results for rational numbers: alb = c/d iff ad = cb,
alb + c/d = (ad + bc)/bd, (c/d) = ac/bd. We derive next two significant properties for rational numbers. At this point we begin to use elementary properties of rationals without (a/b)
explicit references.
THEOREM 4.2. Between any two distinct rational numbers there is another rational number. Proof. Suppose that r, s E Q with r < s. It is sufficient to prove that
r < (r + s)/2 and (r + s)/2 < s. To prove the first inequality we start with r < s and infer, in turn, r + r < r + s, 2r < r + s, and r < (r + s)/2. The second inequality is derived similarly.
THEOREM 4.3. (Archimedcan property). If r and s are positive rational numbers, then there exists a positive integer n (properly, a positive integral rational number n) such that nr > s.
142
Extension of the Natural Numbers to the Real Numbers
I
CHAP. 3
Let r = alb and s = c/d where a, b, c, and d are positive integers. If n is a (rational number corresponding to a) positive Proof.
integer, then nr > s if nad > bc. If for n we choose 2bc, this inequality is satisfied, since ad > 1. We conclude our discussion of rational numbers with the introduction
of two functions pertaining to rational numbers. The first has Q as domain and as its value at x, which we symbolize by [x], the greatest integer equal to or less than x. For example,
[2] = 2 and
U
[= 2.
The second function has Q as domain. Its value at x, which we symbolize
by lxl and call the absolute value of x, is defined as
Ixl=l-x
<
0.
x
TI I E OR E M 4.4. If x and y are rational numbers, then (I) 1XI > 0, (Il) Ixyl = 1XI lyl, (II1) Ix + yI <- 1XI + lyl,
(IV) lxi - 1,1 <_ lx - A. EXERCISES 4.1. Prove Lemma 4.1. 4.2. Prove that the mapping a --)a. on Z into Q, introduced prior to Theorent 4.1, is an order-isomorphism. 4.3. Prove Theorem 4.1. As for part (14) of this theorem, show that P is simply the set of ratitmals which correspond to the positive integers. 4.4. Write a short paragraph to substantiate the assertion made after Theorem 4.1 that the properties of multiplication listed may be inferred without giving new proofs.
4.5. Prove that the relation <. for rational numbers may be characterized as follows:
[a] <. [fl
iff abd2 <; b2cd.
4.6. Prove Theorem 4.4.
5. Cauchy Sequences of Rational Numbers The set of rational numbers includes nonempty sets which have an upper bound but fail to have a least upper bound. One of these is
3.5
I
CaucIty Sequences of Rational Numbers
143
S= (xC(Qt"Ix2<3(, as we proceed to prove. Clearly, any positive rational number whose square is greater than 3 is an tipper bound. On the other hand, no positive rational number whose square is less than 3 (that is, no member of S) is an upper bound, since, if s C S, then
3 - s2 s+5-I-2s is obviously greater than s and, as a direct calculation shows, is a member of S. Since there exists no rational whose square is equal to 3, it follows that those positive rationals whose square exceeds 3 exhaust the set of upper bounds for S. Now this set has no least member. Indeed, if u is a positive rational such that u2 > 3, then u -I- 3/u 2
is positive, less than u (since 3/u < u'2/u = u), and its square is greater than 3, since
u-3u ,+ (u+3/2=(-L)2 2
J
2
It follows that S has no least upper bound. The failure of the rational number system to include the least upper bound of every nonempty set having an upper bound may be taken as
the motivation for the extension of Q that is presented in the next section. We now set the stage for this by developing the theory of Cauchy sequences of rational numbers.
We recall that a sequence is a function having '_L° (or, when convenient, Z t) as its domain. The value of the sequence x at n will be denoted by x,,. A sequence of rational numbers is a sequence x such that x E 0, for every n. A Cauchy sequence of rational numbers is a sequence x of rational numbers such that for every positive rational number e there exists a positive integer N such that for every m, n > N
<e. EXAMPLES 5.1. The sequence x such that X" _
n+1 u
is a Cauchy sequence (of rational numbers). To prove this we must exhibit for each positive rational number a an integer N such that for m, n > N lx - xml < e.
144
Extension of the Natural Numbers to the Real Numbers
Since
Ixn - xm! =
m_-n _ mn
1 I
i
CHAP. 3
- l `_1 ml
n
min
if we let N = [1/e] -I- 1, then for all m, n > N,
Ixn -x,n1 <_-1 min (rn, n)
< e.
is
5.2. The sequence x such that x0 = 0, x, = 1, and xn _ 2 (Xn -t -l- xn- 2) for a Cauchy sequence. To prove this we note first that
xn}1-xn=
n > 2
\1)n 2n
This can be established by induction. Further, from the recursive definition of xn it is clear that for all m > n, x,, falls between xn and xni*.,. So, if a is a positive
rational number and we choose N so that 2N > l/e, then, for all m, n > N, Ixm. - xnj
Ixnu - xnI
< F.
We define the operations of addition and multiplication for sequcnccs of rational numbers in the following way: x --1- y = u
where u = x -1-
xy = v where
y,,
Un = Xnyn
Clearly, if x and y arc sequences of rational numbers, then sa arc x + y and xy. It is an important fact that if x and y are Cauchy sequences of rational numbers, then so are x + y and xy. In other words, addition
and multiplication are binary operations in the set of all Cauchy sequences of rational numbers. The proof for the case of multiplication requires the following preliminary result.
3.5
1
Gauclay Sequences of Rational Numbers
145
LEMMA 5.1
. If x is a Cauchy sequence of rational numbers, then there exists a positive rational number 5 such that for every n
Ixnl < S.
Corresponding to the positive rational number 1 there exists by assumption an integer N such that for every m, n > N Proof.
IXn - XmI <
(1)
Let (2)
S = max (Ixol, Ixil, ..., IxNI, IAN+II) + 1.
Clearly, if n < N + 1, then Ixni < 5. Suppose then that n > N + 1. By virtue of (1), lxn - XN+11 < 1 and, hence, Ixnl < I XN-, lI + I.
According to (2), IXN_yl + 1 < S. Hence, for all n, Ixtti < S.
LEMMA 5.2. If x and y are Cauchy sequences of rational numbers, then x -{- y and xy arc Cauchy sequences of rational numbers.
(Sum.) Let e > 0. By hypothesis there exist N, and N2 such that for all m, n > N,
Proof.
IXn - xml < E/2,
and for all m,n> N2 lyn - yml < e/2. Then for all m, n > max (N,, N2) ixn + yn - (Xm + ym) I = I (Xn
Xm) + (yn - ym)
+lyn - ymi
< E.
(Product.) Let e > 0. By virtue of the preceding lenuna, there exist positive rational numbers S, and2 such that for all n Ixnl < Si, SIynl < S2.
Further, there exist integers NA and N2 such that for all m, n > N, iXn - xmI < E/(262),
and for all m, n > N2 lyn - yml < E/(251).
146
Extension of the Natural Numbers to the Real Numbers
I
C11 A P. 3
Then for all m, n > max (N,, N2)
-
Ixnyn - xym1 = Ixnyn - xmyn + x,,.yn Iy,.I Ix, - xml + Ix,,nl Iyn
/ml
2. )
US-) G E.
The basic properties of addition and multiplication may be summarized by the statement that they satisfy properties (1) -(8) of Theorem 3.1, where the distinguished elements in (3) and (7) are taken to
be the sequence 0, (whose value is 0 for all n) and the sequence 1 (whose value is 1 for all n), respectively. Again, the results stated in Exercises 3.5- 3.7 hold. The negative of the Cauchy sequence x is the
sequence -x such that (-x) = -x,. for all n. We introduce next a relation, which we symbolize by -,, in the set of all Cauchy sequences of rational numbers. If x and y are Cauchy sequences of rational numbers, then x
y
if for every positive rational number E there is an integer N such that for every n > N, Ixn - ynl < E.
As an illustration, consider the sequences x and y such that x,, _ (n + 2)/(n + 1) andy,. = 1 fot all n. These are Cauchy sequences and clearly x -,.y, since x,, -y,, = 11(n + 1). It is an easy matter to establish the following property of this relation.
LEMMA 5.3. The relation N, is an equivalence relation on the set of all Cauchy sequences of rational numbers. If x is a Cauchy sequence of rational numbers, then x is called positive
ifl there is a positive rational number e and an integer N such that for every n > N X. > E.
The expected substitution properties of the equivalence relation with respect to addition, multiplication, and positiveness are stated next.
LEMMA 5.4. If x, y, u, and v are Cauchy sequences of rational numbers and x
u and y -. v; then x + y
if x is positive, then u is positive.
u + v, xy -. uv and,
3.5
I
Gaucliy Sequences of Rational Numbers
147
u + v is left as an exercise. Turning The proof that x + y to the result concerning multiplication, let e be a positive rational number. By Lemma 5.1 there exist positive rational numbers 51 Proof.
and 52 such that for every n ly,.l < 5., lung < 52.
Since x ti, u, there exists an integer N, such that for every n > N1 Ixn - un l < E/25,, and since y ' v, there exists an N2 such that for every n > N2 ly. - v,81 > E/252.
Then, for every n > max (N1, N2) unV,, = IXnyn - unyn + unyn -
unvnl
< lynl Ixn - u,, + 111"1 lyn -- and
<
52(e/252)
< E.
That if x is positive and x -, u, then u is positive is shown as follows. By assumption, there exists a positive rational number 2E and
an integer N, such that for every n > N, x > 2E, and there exists an N2 such that for every n > N2 E. Ixn Hence, for n > max (N,, N2), un> Xn - E
> 2E-e > E.
LEMMA 5.5. The sum and the product of two positive Cauchy sequences are positive Cauchy sequences. Further, if x is any Cauchy sequence, then exactly one of the following hold: x is positive, x 'C 06, -x is positive. Proof. We shall prove only the last statement. Clearly, at most one of the three possibilities for x can hold. So we need to prove that at least one holds. Suppose that x is not equivalent to 0,. By the definition of x N, 0, this means that there is a positive rational number 2E such that for
every integer N there is an n > N such that (1)
I xn1 > 2E.
Exten.uon of the Natural Numbers to the Real Nun;be;.c
148
I
C uu A P . 3
Since x is a Cauchy sequence there is an integer N, such that if m, n > N1, then Ixn
(2)
- x+nl < E.
From our observation which led to (1), it follows that there exists an
integer p > N, such that (3)
I x,, l > 2E.
Since (3) implies that x 0 0, either x,, > 0 or x, < 0. Suppose that x, > 0. Then xp > 2E
by (3), and, as a consequence of (2), for every n > p Ixn - x,, < E. Hence, for every z1 > p E
> 2E-E > E.
Thus, if .x,, > 0, then x is positive.
By a similar argument it can be proved that if x,, < 0, then -x is positive.
With the foregoing result available it is easy to prove the following lemma, which is of basic importance when we turn our attention to the -,-equivalence classes of Cauchy sequences (that is, real numbers).
LEMMA 5.6. If the Cauchy sequence x is not equivalent to 0, then there is a Cauchy sequence z such that zx N, 1 Proof.
The preceding lemma implies that for an x which is not
equivalent to 0. there is a positive rational a and an N such that for every n > N Ixnl > E.
Consider now the sequence x' such that xn = E if n < N and x;, = x if n > N. Clearly, x' is a Cauchy sequence, x' x, and for all n (1)
Ixnl > E.
0 for every it, the sequence z where z = l /x, is a sequence of rational numbers. Further, z is a Cauchy sequence as we proceed to prove. Let 17 be a positive rational number. Since x' is a Cauchy sequence, there exists an N such'that for every m, n > N Since x'
(2)
Ixn - xmI < 'qE2.
3.6
I
Real Numbers
149
Further, by virtue of (1), we have I
From (2) and (3) it follows that for all m, n > N Izm - zn! < 77,
which proves that z is a Cauchy sequence. It is clear that zx' Finally, since x'' x, it follows that zx -, I.
1
EXERCISES 5.1. Prove that the sequence x such that n
xn=1-3+5-...--2n+1 is a Cauchy sequence.
5.2. Prove that the sequence x such that
xn=1+1
d-I-+...+I
is a Cauchy sequence.
5.3. Prove that addition and multiplication for Cauchhy sequences satisfy parts (1)-(8) of Theorem 3.1. 5.4. Prove Lemma 5.3. 5.5. Complete the proof of Lemma 5.4. 5.6. Complete the proof of Lemma 5.5.
6. Real Numbers As promised earlier, we define a real number as a -,equivalence class of Cauchy sequences of rational numbers. The real number having the Cauchy sequence x as a representative we write as [x l r,
for the time being. The set of real numbers will be symbolized by R. We shall call a real number positive if it contains a positive Cauchy sequence. In view of Lemma 5.4, if [x]r is positive, then each of its members is positive. The set of positive real numbers we symbolize by R-1.
The following definitions of addition and multiplication for real numbers will scarcely offer any surprise: [XI,
[x]r
+ylr,
-+-
[[Ylr = [X
-
[y]r = [x! ]r.
150
Extension of the Natural Numbers to the Real Numbers
+
CHAP P. 3
Of course, it is Lemma 5.4 which ensures that these are binary operations in R. Next we call attention to a distinguished set of real numbers--those which correspond to rational numbers in a natural way. If a is a rational number, then (a, a, , a, - ) is a Cauchy sequence of rational numbers. Since T.. is a partition of the set of all such sequences, there exists exactly one real number, let us call it ar, which contains (a, a,
, a,
-
).
This means that j
{(a, ar)Ia C Q.]
into R. It is easily proved that this function is oneis a function on to-one and, moreover, that the operations of addition and multiplication are preserved by this mapping. Finally, a rational number is positive iff its correspondent in B. is positive. Thus, B. includes an order-isomorphic
image of Q. Members of this image of Q, will be called rational real numbers. The rational real number corresponding to the rational number 0, we again call zero and symbolize by Or. Thus, Or = ](0=, 0e, .. ., 0 .. .)]r. The rational real number corresponding to the rational number 1. we again call one and symbolize by J r. Thus,
The first major theorem concerning the real number system (g, -1-, , Or, 1 r, B.1) is the following. Its proof relies entirely on those properties of the rational number system appearing in Theorem 4.1, the properties of Cauchy sequences appearing in Lemmas 5.5 and 5.6, and the definitions of addition, multiplication, and positiveness for real numbers.
THEOREM 6.1. The operations of addition and multiplication for real numbers, together with Or, 1, and the set of positive reals, have properties (1)-(13) listed in Theorem 4.1.
Those further properties of addition and multiplication for rational numbers which are listed immediately after Theorem 4.1 are enjoyed by the corresponding operations for real numbers. If less than, which we
symbolize by
if y- x E 8+,
then it has all those properties which <, possesses Also, the definition
Real Numbers
3.6
151
:)f the absolute value function extends to the real numbers and Theorem 4.4 applies. Our results up to this point may be summarized by the statement that the extension of Q to R has resulted in no loss of ground. That a gain has been made will be demonstrated when we have proved that every
nonempty set of real numbers which has an upper bound has a least upper bound. The proof of this property of the real number system requires some other results which are important in their own rights. The first of these is usually phrased as the statement that the rational numbers are dense in the set of real numbers.
THEOREM 6.2.t Between any two distinct real numbers there is a rational real number. Precisely, if x and y are distinct real numbers,
then there exists a rational real number z such that if x < y, then x < z < y while if y < x, then y < z < x. We shall consider the case x < y. Let a C x and b C Y. Then x < y implies the existence of a positive rational 4c and an integer NJ such that for every n > N, Proof.
fi - a > 4E.
(1)
Further, since a and b are Cauchy sequences, there exist integers N2 and N3 such that for every rn, n > N2 Ian - a,,,I < E,
(2)
and for every m, n > N3 (3)
Ibn - b,,11 < e.
Let N = max (N1, N2, N3) + I and let s be a rational number such that e < s < 2e (see Theorem 4.2). Now consider the real number z corresponding to the rational number aN + s. We contend that x < z and z < y. From (2) we may conclude that for every n > N an - aN < E. Hence, aN -- an > -e and, therefore, for every n > N
(aN+S) -an > s
e> 0.
This means that the Cauchy sequence
(aN ± S, aN +s, ..., a, - S, ...> - a is positive. Since this sequence is a member of z - x, the real number
z - x is positive, and hence x < z. t In the remainder of this chapter we shall omit the letter "r" as a subscript for the symbols used in connection with real numbers.
152
Extension of the Natural Numbers to the Real Numbers
I
C II A r . 3
Using the identity bm - (aN + s) = (bv - aN) + (bm - bN) - s, it follows by a similar argument [which employs (1) and (3) ] that y - z is positive, and hence z < y.
The following theorem is a generalization of the corresponding property (Theorem 4.3) for the system of rational numbers.
THEOREM 6.3 (Archiincdean property).
If x and y are positive real numbers, then there exists a positive integer n (properly, a real
number n which corresponds to a rational which, in turn, corresponds to a positive integer) such that nx > y. Proof. Let b C y. According to Lemma 5.1, there exists a positive rational number b such that for every n bn < b. ), then If d is the real number corresponding to (b, b, , a,
y < d. Also, by assumption,
0 <x. By the preceding theorem there exist rational real numbers s and t such that
0<s<x,
y
By the Archimedean property of rationals (which obviously carries over to rational reals) there exists a positive integer n such that
ns> t. It follows that
nx>ns>t>y.
THEOREM 6.4. A nonempty set of real numbers which has an upper bound has a least upper bound. Proof. In the proof which follows, if x is a real number and a is a rational number such that x < a, we shall abbreviate this to simply
"x < a." Let A be a set which satisfies the hypothesis of the theorem. According to Theorem 6.3 there exist integers m and M such that m is not an upper bound of A and M is an upper bound of A. (To obtain
3.6
1
Real Numbers
153
an in, select an element a of A, apply Theorem 6.3 to secure an integer n such that n > -a, and then let m = -n.) Then we may infer the existence of an integer bo such that bo is an upper bound of A
while bo - I is not. We now define b inductively as follows: if bn_.1 - 2-n is an upper bound of A, bn _ bn_t - 2_n if bn_t - 2-4 is not an upper bound of A. hn_ t For all n, bn is an upper bound of A and, as may be proved by an
induction argument, bn - 2-n is not an upper bound. Hence, for every m > n
bn - 2-" < b,n. Further, it is clear that for every m > n, (1)
b,n < b
(2)
.
Combining (1) and (2) gives Ibn -- bml < 2-n.
It follows that if N is a positive integer and m, n > N, then Ibn - bml < 2-N, whence b is a Cauchy sequence of rational numbers. Let u be the real number which it determines. Then by virtue of (1) arid, in turn, (2), for every n (3) (4)
bn-2-n
We shall now prove that u is an upper bound of A. Assume to the
contrary that a > u for some a in A. Then there exists an n such that 2" > (a - u)--l or
2-n
Addition of this to (3) yields the inequality bn < a, a contradiction of the fact that bn is an upper bound of A. Finally we prove that u is the least upper bound of A. Assume to the contrary that v is a smaller upper bound. As above, there then exists an n such that (5)
2-n
f "That (2) implies (4) is a consequence of the following result. If x and y are Cauchy sequences and there exists an integer N such that for all n > N, x < y,,, then lx], < ,[y],. For the contrary implies that the Cauchy sequence z such that z = x - y" is positive and hence there exists an e > 0 and an N, such that for all n > N,, x - y > e. If we choose n = N + N, we are led to a contradiction of the hypotheses. We note further that x < y" does not imply [xJ, < ,[y], but only [x], <, [Y],.
154
Extension of the Natural Numbers to the Real Numbers
I
CHAP. 3
Since b" - 2-" is not an upper bound of A, there exists an a in A such
that b - 2-" < a, which implies that
b"-2"
b" < u, which contradicts (4).
Later we shall prove that the properties of the real number system stated in 't'heorems 6.1 and 6.4 characterize it to within an orderisomorphism.
EXERCISES 6.1. Prove that the system of rational real numbers is order-isomorphic to 6.2. Prove Theorem 6.1. 6.3. Prove the assertion made in the proof of Theorem 6.4 that b" - 2-" is not an upper bound of A. 6.4. Derive as a corollary to 't'heorem 6.4 that a nonempty set of real numbers which has a lower bound has a greatest lower bound.
6.5. Let f be a real function --that is, a function whose domain and range are each a set of real numbers. Such a function is called continuous at a member a of its domain iff for every e > 0 there exists a S > 0 such that for IhI < S and
a -l- h in the domain off I f(a + h) - f(a) I < E.
Prove that if f is a continuous at each point of the closed interval [a, b] and f (a) < 0 and J(b) > 0, then there exists a c such that a < c < b and f (c) = 0. Hint: Define c to be the least upper bound of all x between a and b for which
f(x) < 0. 6.6. Assume that it has been shown that a real polynomial function is continuous. Let f be the polynomial function such that for all real numbers x, f(x) = x" - a where n is a positive integer and a is a positive real number. Prove that there exists exactly one positive real number c such that f (c) = 0. This number is called the nth root of a and symbolized by Va or a'1" 6.7. If a > 0, b > 0, and ri is a positive integer, prove that
1/a 17'b = ab.
7. Further Properties of the Real Number System A sequence x of real numbers is a sequence such that x" C R for every n. A Cauchy sequence of real numbers is a sequence x of real
3.7
I
Further Properties of the Real Number System
155
numbers such that for every positive real number e there exists a positive
integer N such that for every m, n > N IX'n - xml < E.
We define next the notion of limit. This notion is the cornerstone of the calculus and, indeed, of analysis in general. The real number y is a limit of the sequence x of real numbers iff for every positive real number e there exists a positive integer N such that for every n > N Ixn - yl < E. The proof of the following lemma is left as an exercise.
LEMMA 7.1. A sequence of real numbers has at most one limit. Thus, if the sequence x of real numbers has y as a limit, then y is its only limit, and we are justified in introducing the following familiar notation for y :
lim x,, = y n-
or simply lim x = Y.
LEMMA 7.2. Let a be a sequence of rational numbers and let x be the sequence of real numbers such that for every n, x = the real number corresponding to a,,. Then x is a Cauchy sequence ifr a is a Cauchy sequence. Further, if a is a Cauchy sequence and y is the real number which it defines, then lim x,, = y. Proof. We shall consider only the second assertion. Let e be a positive real number and let b = (d)r he a rational real number such that 0 < b < E. Since a is assumed to be a Cauchy sequence, there exists an N such that for all m, n > N Ian - a,I < d Since a - a,,, < d it follows that Ran, a,,, ..
an,
...) - (a,, a2, ... , an, . ..) ]r < b
(see the footnote in the proof of Theorem 6.4) or, in other words, that
xn - y<S. Similarly, the inequality a. - an < d implies that Hence, for all m, n > N
Ixn - yl < b <
Extension of the Natural Numbers to the Real Numbers
156
I
C I I A P. 3
THEOREM 7.1. (Cauchy convergence principle). A sequence of real numbers has a limit iff it is a Cauchy sequence. Proof.
It is left as an exercise to prove that if a sequence of real
numbers has a limit, then it is a Cauchy sequence. Turning to the converse, assume that u is a Cauchy sequence of real numbers. The strategy for proving that u has a limit calls for the determination of a sequence a of rational numbers which approximates u sufficiently closely that a is a Cauchy sequence and the real number which it defines is the limit of rc. For each positive integer n, u,, < un + 1/n, and hence (Theorem 6.2) there exists a rational real number x such that un < x < un -i- 1 /n. Let a be a positive real number. Then there exists an integer Nn such that N, > 3/e and, hence, for every n > Nr tun - xnl < e/3. (1) Further, x is a Cauchy sequence, since lxn - Xnel < Ixn - unl + Jun - Umt + June - Xrnl, and for in and n sufficiently large each summand on the right side of the inequality is less than e/3. Let an be the rational number to which xn corresponds. By Lemma 7.2, a is a Cauchy sequence of rational numbers and hence defines a real number y. Further, by Lemma 7.2, urn Xn = Y.
Hence, there exists an integer N2 such that for n > N2
Ixn - YI < e/2. We infer from (/) and (2) that for n > Inax (Nn, N2) (2)
tun - yI ` Iun - xnl + Ix,, - yj < e/3 + e/2 < e, which establishes that y = Limn un.
We establish next the possibility of representing a real number by a nonterminating decimal. A precise formulation of this, generalized to any integer radix greater than or equal to 2, is given in the following theorem.
THEOREM 7.2. Let r be an integer greater than or equal to 2e Corresponding to each nonncgative real number x there is a sequellOG
3.7
I
Further Properties of the Real Number System
157
- - , d,,, - -) of integers which is uniquely determined by x (relative to r) such that
(a, dl, d2,
-
(i) a = [x], the largest integer less than or equal to x, (ii) 0 < do < r for all n, (iii) the sequence whose terms are defined inductively by
yo=a, yn+-I = yn + doI
l/rnf-1
is a Cauchy sequence and lien yn = x.
Let r be an integer greater than or equal to 2, x be a nonnegative real number, and a = [x]. Then xr = ar + xl for some number xl such that 0 < xl < r. Let dl = lxi ] so xlr = dlr + x2 for some number x2 such that 0 < X2 < r. Let d2 = [x2] so Proof.
x2r = d2r A- xa
for some number x3 such that 0 < xa < r. In general, define xn by xn_Ir = do-lr + xn
and set do = [xn]. 't'hen
x=a
.-- -n --z+..+ r r r
do
d2
l
a
Y I>> n.i
r
where 0 < x,,.1I < r. Hence r
r//
r
1.11
According to the definition of yn given in (iii), this may be written as
0 <x - yn < r
n.
It follows that ix - ynI < r '`, whence limy. = x. The proof of the uniqueness (relative to r) of the sequence corresponding to x is left as an exercise.
If r = 10 in the' preceding theorem we obtain the familiar representation of a nonncgative real cumber as a nontcrminating decimal ;upon writing a = [xj in decimal notation. Of the two possible decimal 'representations of numbers of the form i 10-) where i and j are non-
158
Extension of the Natural Numbers to the Real Numbers
I
CHAP P. 3
negative integers, the theorem chooses that one which consists of all zeros after a certain point. The proof of the converse of Theorem 7.2 is left as an exercise.
THEOREM 7.3.
) be a sequence of nonnegative integers such that for some integer r > 2, 0 < do < r for Let (a, dl, d2,
, d,,,
all n. Then there exists a unique nonnegative real number x such that the sequence whose terms y. are defined inductively by yo = a and is a Cauchy sequence having x as its limit. yA+, = y +
We conclude our development of the real number system by calling attention to # common feature of the three extensions whereby R is obtained from N. If the details (see Sections 2 and 3) of the first extension are reviewed, it will become evident that it could be mimicked using integers instead of natural numbers as the initial elements. Suppose this construction is carried out to obtain what might be called the system of superintegers. Then this system has all those properties which the system of integers possesses. Moreover, the system of superintegers has an
additional property-one which destroys any further interest in it. Namely, as the reader can readily prove, it is order-isomorphic to the system of integers. In other words, the extension of Z by the method used to extend N to Z yields nothing essentially different from Z. A corresponding result holds for the second type of extension we introduced : the extension of (Q by the method used to extend Z to Q is a system which is order-isomorphic to Q. Finally, let us consider the extension of R, which can be made in terms of Cauchy sequences. We shall call the numbers we get in this way superreal numbers. Thus, a superreal number is an equivalence class of Cauchy sequences of real numbers. Corresponding to the results stated above, it is possible to prove that the system of superreal numbers is order-isomorphic to the system of real numbers. In other words, essentially nothing new results if R is extended by the method used to extend Q to R. To prove this we point out first that, as discussed in Section 6, there is a one-to-one map on the initial system (which is now R) into the extended system SR. This map determines an order-isomorphic image of B. in SR. Suppose
now that X is any superreal number. Let x E X, which means that x is a Cauchy sequence of real numbers. According to Theorem 7.1, x has a limit y, whence x is -.,equivalent to (y, y, , y, ), which implies that X = yer. This means that the image of B. in SR exhausts SR or, in other words, that SR is an order-isomorphic image of R.
Bibliographical Notes
159
EXERCISES 7.1. Prove Lemma 7.1. 7.2. Prove the first assertion made in Lemma 7.2. 7.3. Prove that if a sequence of real numbers has a limit, then it is a Cauchy sequence.
7.4. In the proof of Theorem 7.2, show that an explicit definition of d" is
d" = [xr"] - r[xr"-']. 7.5. Prove Theorem 7.3. 7.6. Prove that the system of superintegers is order-isomorphic to the system of integers.
BIBLIOGRAPHICAL NOTES The two basic set-theoretical methods of constructing the system of real numbers from the system of natural numbers are due to Cantor and Dedekind. The difference in these methods appears in the extension of the rational numbers to the real numbers. The extension of the rationals to the reals via Dedekind's method is given in Landau (1930) and in N. FI. McCoy (1960).
CHAPTER
4
Logic
AS
WE SHALL study it, mathematical or symbolic logic has two aspects. On one hand it is logic--it is an analytical theory of the art
of reasoning whose goal is to systematize and codify principles of valid reasoning. It has emerged from a study of the use of language in argument and persuasion and is based on the identification and examination of those parts of language which are essential for these purposes. It is
formal in the sense that it lacks reference to meaning. Thereby it achieves versatility: it may be used to judge the correctness of a chain of reasoning (in particular, a "mathematical proof") solely on the basis of the form (and not the content) of the sequence of statements which make up the chain. There is a variety of symbolic logics. We shall be concerned solely with that one which encompasses most of the deductions of the sort encountered in mathematics. Within the context of logic itself, this is "classical" symbolic logic. The other aspect of symbolic logic is interlaced with problems relat-
ing to the foundations of mathematics. In brief, it amounts to formulating a mathematical theory as a logical system augmented by further axioms. The idea of regarding a mathematical theory as an "applied" system of logic originated with the German mathematician G. Frege (1848-1925), who developed a system of logic for use in his study of the foundations of arithmetic. The Principia Mathematica (1910-1913) of Whitehead and Russell carried on this work of Frege and demonstrated
that mathematics could be "reduced to logic." In the later chapter treating axiomatic theories some indication will be given of this approach to mathematical theories.
1. The Statement Calculus. Sentential Connectives In mathematical discourse and elsewhere one constantly encounters declarative sentences which have been formed by modifying a sentence with the word not or by connecting sentences with the words and, or, if . . . then (or implies), and if and only if. These five words or combina-
4.1
1
T 1w Statement Calculus: Sentential Connectives
161
tions of words are called sentential connectives. Our first concern here is the analysis of the structure of a composite sentence (that is, a declarative sentence in which one or more connectives appear) in terms of its constituent prime sentences (that is, sentences which either contain no connectives or, by choice, are regarded as "indivisible"). We shall look first at the connectives individually. A sentence which is modified by the word "not" is called the nega-
tion of the original sentence. For example, "2 is not a prime" is the negation of "2 is a prime," and "It is not the case that 2 is a prime and 6 is a composite number" is the negation of "2 is a prime and 6 is a composite number." It is because the latter sentence is composite that grammatical usage forces one to use the phrase "It is not the case that" instead of simply the word "not." The word "and" is used to join two sentences to form a composite sentence which is called the conjunction of the two sentences. For example, the sentence "The sun is shining, and it is cold outside" is the conjunction of the sentences "The sun is shining" and "It is cold outside." In ordinary language various words, such as "but," are used as approximate synonyms for "and"; however, we shall ignore possible differences in shades of meaning which might accompany the use of one in place of the other. A sentence formed by connecting two sentences with the word "or" is called the disjunction of the two sentences. We shall always assume that "or" is used in the inclusive sense (in legal documents this is often
expressed by the barbarism "and/or"). Recall that we interpreted "or" in this way in the definition of the union of two sets.
From two sentences we may construct one of the form "If . . . , . . ."; this is called a conditional sentence. The semtcnce irnipediately following "If" is the antecedent, and the sentence iuianediately following "then" is the consequent. For example, "If 2 > 3, then 3 > 4" is a conditional sentence with "2 > 3" as antecedent and "3 > 4" as consequent. Several other idioms in English which we shall regard as having the same meaning as "If P, then Q" (where I' then
and Q arc sentences) are P implies Q, only if Q, P is a sufficient condition for Q, Q, provided that P, Q if P, Q is a necessary condition for P.
162
Logic
I
CHAP. 4
The words "if and only if" are used to obtain from two sentences a biconditional sentence. We regard the biconditional P if and only if Q as having the same meaning as if P, then Q, and if Q, then P; Q is a necessary and sufficient condition for P.
By introducing letters "P,"
.
.
.
to stand for prime sentences, a
special symbol for each connective, and parentheses, as may be needed for punctuation, the connective structure of a composite sentence can
be displayed in an effective manner. Our choice of symbols for the connectives is as follows:
-, for "not," A for "and," V for "or,"
- for "if .
. . , then . for "if and only if." Thus, if P and Q are sentences, then
.
.
,"
-1P,PAQ,PVQ,P-4Q,P++ Q are, respectively, the negation of P, the conjunction of P and Q, and so on. Following are some concrete examples of analyzing the connective structure of composite sentences in terms of constituent prime sentences.
EXAMPLES 1.1. The sentence 2 is a prime, and 6 is a composite number may be symbolized by
P A C, where P is "2 is a prime" and C is "6 is a composite number." 1.2. The sentence If either the Pirates or the Cubs lose and the Giants win, then the Dodgers will be out of first place and, moreover, I will lose a bet is a conditional, so it may be symbolized in the form R --> S.
The antecedent is composed from the three prime sentences P ("The Pirates lose"), C ("The Cubs lose"), and G ("The Giants win"), and the consequent
4.1
1
The Statement Calculus. Sentential Connectives
163
is the conjunction of D ("The Dodgers will be out of first place") and B ("I will lose a bet"). The original sentence may be symbolized in terms of these prime sentences by
((PVC.,) AG)-'(DAB). 1.3. The sentence If either labor or management is stubborn, then the strike will be settled iff the government obtains an injunction, but troops are not sent into the mills is a conditional. The antecedent is the disjunction of L ("Labor is stubborn") and M ("Management is stubborn"). The consequent is a biconditional whose left-hand member is S ("The strike will be settled") and whose right-hand member is the conjunction of G ("The government obtains an injunction") and the negation of R ("Troops are sent into the mills"). So the original sentence may be symbolized by (L V M) -> (S c-), (G A (-1R))).
To avoid an excess of parentheses in writing composite sentences in symbolic form, we introduce conventions as in algebra. We agree that +-> is the strongest connective (that is, it is to encompass most), and then
follows -+. Next in order are V and A, which are assigned equal strength, and then follows -, , the weakest connective. For example,
P A Q-+Rmeans(P A Q)-iR, P +--> Q --+R means P F-' (Q - ' R),
-, P A Q means (-n P) A Q, and the sentence in Example 1.3 may now be symbolized as L v M -> (S +-> G A -, R). EXERCISES 1.1. Translate the following composite sentences into symbolic notation, using letters to stand for the prime components (which here we understand to mean sentences which contain no connectives). (a) Either it is raining or someone left the shower on. (b) If it is foggy tonight, then either John must stay home or he must take a taxi. (c) John will sit, and lie or George will wait. (d) John will sit and wait, or George will wait. (e) I will go either by bus or by taxi. (f) Neither the North nor the South won the Civil War. (g) If, and only if, irrigation ditches are dug will the crops survive; should the crops not survive, then the farmers will go bankrupt and leave.
Logic
164
I
cnnP. 4
(h) If I am either tired or hungry, then I cannot study. (i) if John gets up and goes to school, he will he happy; and if he does not get up, he will not be happy. 1.2. Let C be "Today is clear," R be "It is raining today," S be "It is snowing today," and Y be "Yesterday was cloudy." Translate into acceptable English the following.
(a) C->-,(RAS).
(d) (Y->R) V C.
(b) Y .-4 C.
(e) C 4 .* (R A -18) V Y.
(c) Y A (C V R).
(f)
(C H R) A (--,S V Y).
2. The Statement Calculus. Truth Tables Earlier we agreed that by a statement we would understand a declarative sentence which has the quality that it can be classified as
either true or false, but not both. That one of "truth" or "falsity" which is assigned to a statement is its truth value. Often we shall abbreviate "truth" to T and "falsity" to F. If P and Q are statements, then, using the everyday meaning of the connectives, each of
-,P,PAQ,PV Q,P--3Q,PHQ is a statement. Let us elaborate.
On the basis of the usual meaning of "not," if a statement is true, its negation is false, and vice versa. For example, if S is the true stateinent (has truth value T) "The moon is a satellite of the earth," then --S is false (has truth value F). By convention, the conjunction of two statements is true when, and
only when, both of its constituent statements are true. For example, "3 is a prime, and 2 + 2 = 5" is false because "2 + 2 = 5" is a false statement. Having agreed that the connective "or" would be understood in the inclusive sense, standard usage classifies a disjunction as false wlicn, and only when, both constituent statements are false.
Truth-value assignments of the sort which we are making can he summarized concisely by truth tables wherein are displayed the truthvalue assignments for all possible assignments of truth values to the constituent statements. Below are truth tables for the types of composite statements we have already discussed, as well as those for conditional and biconditional statements.
4.2
The Statement Calculus. Truth Tables
I
Negation P T F
I
165
Conjunction
_1 P
P
QIPAQ
F
T T
T
T
F
F F
T
Disjunction
P
Q
T T
T
F
T
F
F
T
T T T
F
F
F
F
F
Condi tional
P T T
T
T
F
F
F
T
F
F
T T
F
IPVQ
Bicond itio na l
I PHQ
P
Q
T T
T
T
F
F
F
T
F
F
F
T
The motivation for the truth-value assignments made for the conditional is the fact that, as intuitively understood, P -- * Q is true if Q is deducible from P in sortie way. So, if P is true and Q is false, we want
1' - Q to be false, which accounts for the second line of the table. Next, suppose that Q is true. Then, independently of P and its truth value, it is plausible to assert that P-' Q is true. This reasoning suggests the assignments made in the first and third lines of the table. To justify the fourth line, consider the statement P A Q - -) P. We expect this to be true regardless of the choice of I' and Q. But, if P and Q are both false, then P A Q is false, and we are led to the conclusion that if both antecedent and consequent are false, a conditional is true. The table for the biconditional is determined by that for conjunction and the conditional, once it is agreed that P <- Q means the same as (P -+ Q) A (Q ---' P). These five tables are to be understood as definitions; they arc the customary definitions adopted for mathematics. We have made merely a feeble attempt to make them seem plausible on the basis of meaning. It is an immediate consequence of these definitions that if P and Q are statements, then so are each of -, P, P A Q, I' V Q, P ---' Q, and P E--' Q. It follows immediately that any composite sentence whose prime colnponents are statements is itself a statement. If the truth values of the prime components are known, then the truth value of the composite statement can be determined in a mechanical way.
166
Logic
I
CHAP. 4
EXAMPLES 2.1. Suppose that a composite statement is symbolized by
PVQ-+(RH-1S) and that the truth values of P, Q, R, and S are T, F, F, and T, respectively. Then the value of P V Q is T, that of -1S is F, that of R H -1S is T, and, hence, that of the original statement is T, as a conditional having a true antecedent and a true consequent. Such a calculation can be made quickly by writing the truth value of each prime statement underneath it and the truth value of each composite constructed under the connective involved. Thus, for the above we would write out the following, where, for study purposes, we have put successive steps on successive lines.
P V Q -i (R E3 -,S) T
F
T
F
T
F
T T
2.2. Consider the following argument. If prices are high, then wages are high. Prices are high or there are price controls. Further, if there are price controls, then there is not an inflation. There is an inflation. Therefore, wages are high. Suppose that we are in agreement with each of the first four statements (the premises). Must we accept the fifth statement (the conclusion)? To answer this, let us first symbolize the argument using letters "P," "W," "C," and "I" in the obvious way. Thus, P is the sentence "Prices are high." Then we may present it as follows: P -, W
PVC C -' -i I I W
To assume that we are in agreement with the premises amounts to the assignment of the value T to the statements above the line. The question posed then can be phrased as: If the premises have value T, does the conclusion have value T? The answer is in the affirmative. Indeed, since I and C -b -,I have value T, the value of C is F according to the truth table for the conditional. Hence, P has value T (since P V C is T) and, therefore, W has value T (since P -- W is T). 2.3. We consider the conjunction
(PVC)A(C-'-,I) of two of the statements appearing in the preceding example. In general, the
4.2
(
The Statement Calculus. Truth Tables
167
truth value which such a statement will receive is dependent on the assignments
made to the prime statements involved. It is realistic to assume that, during periods of changing economic conditions, the appropriate truth value assignments to one or more of P, C, and I will change from T to F or vice versa. Thus the question may arise as to combinations of truth values of P, C, and I for which
(P V C) A (C -' -,l) has value T or value F. This can be answered by the examination of a truth table in which there appears the truth value of the composite statement for every possible assignment (211) of truth values to P, C, and I.
This is called the truth table for the given statement, and it appears below. Each line includes an assignment of values to P, C, and I, along with the associated value of (P V C) A (C--+ -,l). The latter may be computed as in the first example above. However, short cuts in filling out the complete table will certainly occur to the reader as he proceeds. P C I (PVC) A (C -- -1I) T T T T
T T
T
F
F
T
F F
T
T
F
T
F
T
F
F
T T
F
T
F F
F F
T
F F
F
2.4. If P is "2 is a prime" and L is "Logic is fun," there is nothing to prohibit our forming such composite statements as
PV L,P->L, -,P --*PV L. Since both P and L have truth values (clearly, both are T), these composite statements have truth values which we can specify. One's initial reaction to such nonsense might be that it should be prohibited- -that the formation of conjunctions, conditionals, and so on, should be permitted only if the component statements are related in content or subject. However, it requires no lengthy reflection to realize the difficulties involved in characterizing such obscure notions. It is much simpler to take the easy way out: to permit the formation of composite statements from any statements. On the basis of meaning, this amounts to nonsense sometimes, but no harm results. Our concern is with the formulation of principles of valid reasoning. In applications to systematic reasoning, composite statements which amount to gibberish simply will not occur.
EXERCISES 2.1. Suppose that the statements P, Q, R, and S are assigned the truth values T, F, F, and T, respectively. Find the truth value of each of the following statements.
Logic
168
I
(b) P V (Q V R).
(f) P V R 4- R A --,S. (g) S*--+P--(-,PV S).
(c) R -+ (S A P).
(h) Q A -,S -> (P
(d) P -> (R --> S).
(i)
(a) (P V Q) V R.
CHAP. 4
S).
RAS--1(P--), -,Q VS).
(j) (P V -,Q) V R -+ (S A -,S). (c) P -> (R V S). 2.2. Construct the truth table for each of the following statements.
(a) P-->(P-'Q).
(d) (P-'Q)<-- -1PV Q.
(e) (P -4 Q A R) V (-,P A Q). (f) P A Q - (Q A -, Q --> R A Q). 2.3. Suppose the value of P -> Q is T; what can be said about the value of (b) P V Q,(-+ Q V P. (c) P --> -, (Q A R).
-iPAQHPVQ?
2.4. (a) Suppose the value of P 4-, Q is T; what can be said about the values of P- -1 Q and --, 1' <- Q? (b) Suppose the value of P <-* Q is F; what can be said about the values of P <- --IQ and -1 P 4-4 Q?
2.5. For each of the following determine whether the information given is sufficient to decide the truth value of the statement. If the information is enough, state the truth value. If it is insufficient, show that both truth values are possible.
(a) (P - Q) - R.
(d) -1 (P V Q) -. -1P A -, Q. T
T
(b) P A (Q ' R).
(e) (P - Q) ---> (-,Q --* -,P). T
T
(c) P V (Q --> R).
(f)
T
(P A Q) -> (P V S). T
F
2.6. In Example 1.3 we symbolized the statement If either labor or management is stubborn, then the strike will be settled if the government obtains an injunction, but troops are not sent into the mills as
L V Al - ( S -,R). By a truth-value analysis, determine whether this statement is true or false under each of the following assumptions.
(a) Labor is stubborn, management is not, the strike will be settled, the government obtains an injunction, and troops are sent into the mills. (b) Both labor and management are stubborn, the strike will not be settled, the government fails to obtain an injunction, and troops are sent into the mills.
2.7. Referring to the statement in the preceding exercise, suppose it is agreed
that
4.3
I
T /w Statement Calculus. Validity
169
If the government obtains an injunction, then troops will be sent into the mills. If troops are sent into the mills, then the strike will not be settled. The strike will be settled. Management is stubborn.
Determine whether the statement in question is true or not.
3. The Statement Calculus. Validity The foregoing is intended to suggest the nature of the statement calculus, namely, the analysis of those logical relations among sentences
which depend solely on their composition from constituent sentences using sentential connectives. The setting for such an analysis includes the presence of an initial set of sentences (the "prime sentences") and the following two assumptions.
(i) Each prime sentence is a statement; that is, there may be assigned to a prime sentence a truth value.
(ii) Each sentence under consideration is composed from prime sentences using sentential connectives and, for a given assignment of truth values to these prime sentences, receives a truth value in accordance with the truth tables given earlier for negation, con-
junction, and so on. With this in mind, let us make a fresh start on the statement calculus.
Suppose there is given a noncmpty set of distinct sentences and that we extend this set by adjoining precisely all of those sentences which can be formed by using, repeatedly and in all possible ways, the various sentential connectives. Then the extended set has the following property. If A and B are members, then so are each of -i A, A V B, A A B,
A --* B, and A-+ B. We shall call the members of the extended class formulas. The members of the initial set are the prime formulas, and the others are composite formulas. The prime formulas which appear in a composite formula are said to be contained in that formula and are called its prime components. To display a composite formula unambiguously, parentheses are used. However, to avoid excessive use of parentheses, the conventions introduced earlier will be employed. The classical statement calculus, which is the only one we treat, assumes that with each prime formula there is associated exactly one member of IT, Fl. Further, it assumes that it is irrelevant if it is T or F that is associated with a prime formula. Thereby, maximum versatility in the applications is achieved-truth values may be assigned as the
170
Logic
I
CHAP. 4
occasion demands. The truth value of a composite formula is defined inductively in accordance with the following tables. AAB AvB A--*B A*-3B A -,A A B T
T
T
T
F
F
F
T
F
T T T
F
F
F
F
T
T
T
F
F
F
F
T
T T
F
T
EXAMPLES 3.1. If the prime components in a formula A are P1, P2, , P,,, then the definition of the truth value of A in terms of truth values of P1, P2, , P" can be exhibited in a truth table, as described earlier. There are 2" rows in such a table, each row exhibiting one possible assignment of T's and F's to P,, P2, . , P". 3.2. Let A be a formula having P,, Ps, , P as its prime components. Then A provides a rule for associating with any ordered n-tuple of T's and F's, whose ith coordinate is the assignment to Pi, for i = 1, 2, , n, one of T and F. If we set V = IT, F}, then we can rephrase our observation: A defines a function on V" into V. A function on V" into V we shall call a truth function (of n arguments). Truth functions will be designated by such symbols as
.., P.), g(ql, q2, ..., q"), and so on.
,/ (P1) P2,
Note that we depart from our practice of designating functions by single letters and use notation heretofore reserved for function values. Our excuse is that
composition of functions can be described more simply. For example, the notation 1(pl,
..., pi-1, g(gl) . . .' qm), pi-hl, ..., pn)
is self-explanatory as a function obtained by composition from the truth function f of n arguments and g of nl arguments. We shall refer to this function as that obtained by substitution of g for the ith variable in f. Clearly, such combinations of truth functions are again truth functions. An alternative approach to the statement calculus can be given in terms of truth functions: There are 22 different truth functions of n arguments. Of the four for n = 1, that whose value at T is F, and whose value at F is T, we shall denote by -Ip. Among the sixteen truth functions of two arguments appear the four listed below in tabular form. The reason for the denotations chosen should be clear.
(T, T) (T, F) (F, T) (F, F)
A (p, q)
v (p, q)
-->(p, q)
H(p, q)
T
T
T
F
F
T T T
F F
F
F
F
T T
T
4.3
1
The Statement Calculus. Validity
171
Since the outfix notation for these functions seems unnatural, we shall put the reader at ease by employing the more familiar infix notation [for example, p A q instead of A(p, q)]. It is of interest that by using just those functions which have been mentioned so far and the operation of function composition in the form mentioned above, all truth functions of any number of arguments can be obtained. Indeed, the three functions -,p, p V q, and p A q suffice. To prove this, let f(p,, p2, be some truth function. If the value off is F for all values of p (that is, f is the constant function F), then it is equal to
,
(p, A A (P2 A -,p2) A .. A (pn A-,p,.). Otherwise, f assumes the value T at least once. For each element of the domain off such that f takes the value T, let its form the function
q, A q2A ... Aq,,, where qi is p (or -,p,) when p; has the value T (or F). Then we contend that f is equal to the function obtained "by disjunction" from all such functions. For example, iff(p, q) takes the value F when p = q = T and the value T otherwise, then
f(1p,q)=(pA-1 q) V (-iPAq)V (-,pA-,q). The reader can verify this and supply a proof of the general statement. Actually, each of the pairs -,p, p A q and -,p, p V q is adequate to generate all truth functions, using the operation of function composition, since p V q = -, (-,p A q) and p A q = -1 (-,p V -, q). The same is true of the pair -,p, p --- q, as the reader can verify. Although no member of any of the three pairs mentioned can be discarded to obtain a single function which generates all truth functions, such functions do exist. For example, the function plq (as it is customarily written) of two arguments, whose value is T except at (T, T), where its value is F, suffices. To prove this it is sufficient to show (for example) that both -,p and p V q can be expressed in terms of it. As we have already observed, each formula of the statement calculus defines a truth function. It should be clear that it is only the structure of a composite formula A regarded as a truth function that one considers when making a truth value assignment to A for a given assignment of truth values to its prime components. When it is convenient, we shall feel free to regard a formula as a truth function. In such an event, the prime components (statement letters) will be considered as variables which can assume the values T and F.
The statement calculus is concerned with the truth values of composite formulas in terms of truth-value assignments to the prime components and the interrelations of the truth values of composite formulas having some prime components in common. As we proceed in this study it will appear that those formulas whose truth value is T for every
Logic
172
CHAP. 4
I
assignment of truth values to its prime components occupy a central position. A formula whose value is T, for all possible assignments of truth
values to its prime components, is a tautology or, alternatively, such a formula is valid (in the statement calculus). We shall often write
GA for "A is valid" or "A is a tautology."f Whether or not a formula A is a tautology can be determined by an examination of its truth table. , P,,, then A is a tautolIf the prime components in A are P1, P2, ogy if its value is T for each of the 2" assignments of T's and F's to Pt, P2, , P,,. For example, P -- P and P A (1' -4 Q) -* Q are tautologies, whereas P --' (Q --> R) is not. These conclusions are based on an examination of Tables I, II, and 111, below.
Table I
PJP--4P T F
T T
P
Q
Table II P A (P -- Q) -- Q
T T
T
T
T
F
F
F
F
T
F
F
F
F
T T
T T T T
Table III P
Q
R P -> (Q --> R)
T T T T
T T
T
T
T
F
F
F
F
T
T
F
F
F
T
F
T T
F
F
T
F
F
F
T T T T T T
F
T
T F
T T
The definition of validity provides us with a mechanical way to decide whether a given formula is valid--namely, the computation and examination of its truth table. Although it may be tedious, this method can always be used to test. a proposed formula for validity. But, clearly, it is an impractical way to discover tautologies. This state of affairs has prompted the derivation of rules for generating tautologies from tautologies. 'I he knowledge of a limited number of simple tautologies and
several such rules make possible the detivation of a great variety of valid formulas. We develop next several such rules and then implement their with a list of useful tautologies.
THEOREM 3.1. Let B be a formula and let B* be the formula resulting from B by the substitution of a formula A for all occurrences of a prime component P contained in B. If l- 13, then r- B*. t This symbol for validity (l=) appears to be due to Klecne.
4.3
Die Statement Calculus. Validity
1
173
For an assignment of values to the prime components of B* there results a value v(A) of A and a value v(B*) of 13*. Now v(B*) = v(B), the value of B for a particular assignment of values to its prime components, including the assignment of v(A) to P. If B is valid, then v(B) and hence v(B*) is always T. That is, if B is valid, then so is B*. Proof.
EXAMPLES 3.3. From Table IV below it follows that G P V Q E-a Q V P. Hence, by Theorem 3.1, !_ (R --- S) V Q <-' Q V (R --I S). A direct verification of this result (Table V), using the reasoning employed in the proof of Theorem 3.1, should clarify matters, if need be. To explain the relationship of Table V to Table IV, we discuss the displayed line of Table V. Table IV Table V R S Q (R - S) V Q <--' Q V (R --' S) 1' Q PV QE-+Q VP T T
T
F
T
F
F
F
T T
FTTTTTF
T
F
T
F
T
T
F
There was entered first (at two places) the value F of R -' S for the assignment of T to R and F to S. Then the value T assigned to Q was entered twice. The rest of the computation is then a repetition of that appearing in the third line of Table IV after the entries underlined there have been made. 3.4. The practical importance of Theorem 3.1 is that it provides a method to establish the validity of a formula without dissecting it all the way down to its prime components. An illustration will serve to describe the application we have in mind. Suppose the question arises as to whether the formula
(RVS) A ((RVS)-4(PAQ))-'(PAQ) is a tautology. The answer is in the affirmative, with Theorem 3.1 supplying the justification, as soon as it is recognized that the formula in question has the "same form as" the tautology P A (P --+ Q) --' Q (Table II), in the sense that it results from P A (1' -' Q) -* Q upon the substitution of R V S for P and P A Q for Q.
We introduce next a relation for formulas. For the definition it is convenient to interpret formulas as truth functions and observe that a formula whose prime components are P1, P2, , P. may be regarded as a function of an extended list P,, . , 1',,, . , P,,, of variables. Let us now agree to call formula A equivalent to formula B, symbolized A eq B,
174
Logic
I
CHAP. 4
if they are equal as truth functions of the list of variables P1, P2,
. , Pm,
where each P occurs as a prime component in at least one of A and B. In terms of truth tables, the definition amounts to this. Suppose that I P,, P2, . -, P,4 is the union of the sets of prime components contained in A and B, respectively, and that we compute the truth tables of A and of B as if both contained P,, P2, . , P,,, as prime components. Then A eq B if the resulting truth tables are the same. For example, from
Tables VI and VII below we infer that (P--; Q) eq -,P V Q and
Peg PA (QV -,Q). P Q
Table VI P -> Q -1P V Q
PQ
Table VII P P
T
T
T
T
F
F
F
T T
T
F
T T
T
T F
T
T
F
T
F
F
F
F
T T
T
F
F
F
F
T
It is left as an exercise to prove that eq is an equivalence relation on every set of formulas and, further, that it has the following substitutivity
property: If CA is a formula containing a specific occurrence of the formula A and Ca is the result on replacing this occurrence of A by a formula B, then if B eq A, then egCA. Henceforth, equivalent formulas will be regarded as interchangeable, and the substitution property will often be employed without comment. Equivalence of formulas can be characterized in terms of the concept of a valid formula, according to the following theorem.
TIIEOREM 3.2. KA+->BifAegB. Let P,, P2, : , P,,, be the totality of prime components appearing in A and B. For a given assignment of truth values to these components, the first part of the computation of the value of A -+ B consists of computing the values of A and B, after which the computation is concluded by applying the table for the biconditional. According to this table, the value of A 4-1 B is T if the values computed for A and B are the same. COROLLARY. Let CA be a formula containing a specified occurProof.
rence of the formula A and let CD be the result of replacing this occurrence of A by a formula: B. If t- A 4--> B, then t= CA 4-> CB. t=CA,then t=Co. If This proof is left as an exercise.
4.3
The Statement Calculus. Validity
1
THEOREM 3.3. If
175
A and K A - - + B , then 1= B.
Let P,, P2, , P,,, be the totality of prime components appearing in A and B. For a given assignment of truth values to these, the first part of the computation of the value of A --> B consists of computing the values of A and B, after which the computation is completed by applying the table for the conditional. The assumptions A and 1= A --+ B imply that both the value obtained for A Proof.
and that for A -- B are T. According to the table for A --> B, this implies that B must also have the value T. Since this is the case for all assignments of values to P,, P2,
, Pm, B is valid.
As the next theorem we list a collection of tautologies. It is not intended that these be memorized; rather, they should be used for reference. That many of the biconditionals listed are tautologies should be highly plausible on the basis of meaning, together with Theorem 3.1. That each is a tautology may be demonstrated by constructing a truth
table for it, regarding the letters present as prime formulas. Then, once it is shown that the value is T for all assignments of values to the components, an appeal is made to the substitution rule of Theorem 3.1
to remove the restriction that the letters be prime formulas. In the exercises for this section the reader is asked to establish the validity of
some of the later formulas by applying one or more of Theorems 3.1--3.3 to tautologies appearing earlier in the list.
THEOREM 3.4. Tautological Conditionals
1.1=AA(A>B)-4 B. 2. 3.
1= -,BA k= -,AA (AV 13)>B.
4. tzA-a(B-) AA13). 5. k=AAB--->A. 6. 1= A --> A V B. 7. 1- (A -* B) A (B -> C) - > (.4 --a C). 8. - (A A B-> C) -- (A > (B -- > C)). 9. 1= (A --> (13 -+ C)) -> (A A B -- C). 10. 1= (A --> B A -, B) > -, A. 11. (A -- B) --* (A V C --* B V C.'). 12.
(A -* B) --* ((B --> C) > (A -). C)). 14.
1= (A EI 13) A (B .-> C) > (A .--3 C).
Logic
176
CHAP. 4
I
Tautological Biconditionals 15. 16. 17. 18. 19.
t= AAA. A
--,A
(A+-* B) +-4 (B A). (A -4 B) A (C ---> B) E--> (A V C ---* B).
t= (A -4 B) A (A
C) 4-4 (A -a B A C).
20. K (A-4 B).+(-,B-- -, A). 21.
22.
KAVB4BvA. (A V B) V C H
AV(BVC). 1= AV (BAC) -
21'. tAABt-+BAA. 22'. 1= (AAB)ACH A A (B A C).
24. i=AvA<--'A.
23'. KAA(BVC)f--> (AAB)V (A A C). 24'. IzAAA+-+A.
25. t= -, (A V B) . >
25'. t= -, (AAB)<--4
23.
(A V B) A (A V C).
-,A A -,B.
-,A V -,B.
Tautologies for Elimination of Connectives
26. KA-'B4-,AVB. 27. I A -->BH 28. KAVB<-29. KAVB<-30. t7- AAB<-31. l AAB
32. t= (A+-->B)*-+(A-). B) A (B--->A).
We conclude this section with the description of a powerful method for obtaining tautologies from scratch. Initially we consider only formulas composed from prime formulas P,, P2, , 1' using -,, A, and V. The denial, Ad, of such a formula A is the formula resulting from A by replacing each occurrence of A by V and vice versa and replacing
each occurrence of P, by an occurrence of -, P; and vice versa. As illustrations of denials in the present context we note that the denial of P V Q is -1P A -, Q and the denial of -, (-, P A Q) is -1 (P V The theorem relating denials and tautologies follows.
THEOREM 3.5. Let A be a formula composed from prime components using only -,, A, and V, Let Ad be the denial of A. Then
K -,AHAd.
4.3
I
The Statement Calculus. Validity
177
A proof of this assertion can be given by induction on the number of symbols appearing in a formula. We forego this, but do include in the first example below a derivation of an instance of the theorem. Another example describes the extension of the theorem to the case of a formula which involves --f or 4--p.
EXAMPLES 3.5. An instance of Theorem 3.5 is the assertion that
K-1((-,P V Q) V (Q A (R V -1 P))) 4 (P A -,Q) A (-i Q V (--iR A P)), or, in other words, that the left-hand side and the right-hand side of the biconditional are equivalent formulas. Using the properties of transitivity and substitutivity of equivalence, this is established below. Each step is justified by the indicated part of Theorem 3.4 (in view of Theorem 3.2).
-, ((-,P V Q) V (Q A (R V -,P)))
eq-,(-,PVQ)A-i(QA(RV-,P)) eq (-,-,P A -,Q) A (-,Q V -,(R V -1 P)) eq (--i --i P A -,Q) A (-'Q V (-1R A -,-,P)) eq (P A -,Q) A (-,Q V (-,RAP))
(25)
(25, 25') (25) (16)
3.6. Using tautology 32 in Theorem 3.4 we can derive from a formula in which
appears an equivalent formula in which H is absent. For instance,
P 4-+ (Q A R) eq (P - Q A R) A (Q A R -4 P). That is, H can be eliminated from any formula. Similarly, using tautology 26 or 27, -a can be eliminated from any formula. Thus, any formula A is equivalent to a formula B composed from prime components using -,, A, and V. Then we may define the denial of A to be the denial of B.
3.7. According to the preceding example, H and -> can be eliminated from any formula. Using tautology 29 it is possible to eliminate V or (with tautology 31), equally well, A. That is, any formula is equivalent to one composed from prime components using -, and V or using -, and A. This conclusion should be recognized as merely another version of a result obtained in Example 3.2.
3.8. From tautology 22 follows the general associative law for V, which V A. to render it unambiguous, the resulting formulas are equivalent. From tautology 22' asserts that however parentheses arc inserted in A, V A2 V follows the corresponding result for A A.
EXERCISES 3.1. Referring to Example 3.2, write each of the following formulas as a truth
function in outfix notation. For example --,P --' (Q V (R A S)) becomes -+ (-,P, V (Q, A (R, S)).
178
Logic
(a) P A
(
CHAP. 4
P -* (Q -R) H Q -- * (P - 4 R).
(b) -,P - Q.
(1) P V R -> (B A (S V -,P)). (g) (P - Q) --> (S A -,P -> Q).
(c) P V (Q V R). (d) P A (Q-->B).
3.2. (a) Referring to Example 3.2, complete the proof that every truth function can be generated from -,p, p V q, and p A q. (b) Referring to the same example, show that every truth function can be generated from p(q. 3.3. Suppose that P,, P2, , P. are prime components of A. Show that the truth table of A, regarded as having P1, , P,,, , P," as prime components can be divided into 2'"`-" parts, each a duplicate of the truth table for A computed with P1, Pz, , P. as the prime components. 3.4. Prove that eq is an equivalence relation on every set of formulas and that it has the substitutivity property described in the text. 3.5. Prove the results stated in the Corollary to Theorem 3.2. 3.6. Derive each of tautologies 28-31 from earlier tautologies in Theorem 3.4, using properties of equivalence for formulas. As an illustration, we derive tautology 27 from earlier ones. From 26, A -a --I B eq --1 A V --I B, and, in turn, - A V -1 B eq -, (A A B) by 25'. Hence, A --> -i -1 B eq -I (A A --I B). Using 16 it follows that A --+ B eq -, (A A -1 B), which amounts to 27. 3.7. Instead of using truth tables to compute the value of a formula, an arithmetic procedure may be used. The basis for this approach is the representation of the basic composite formulas by arithmetic functions in the following way. Arithmetical Formula representation
-'P
1+P
PAQ
P -I- Q+ PQ
1'
PQ (1 -I- P) Q P -}- Q
PVQ Q
P f- Q
When the value T (respectively, F) is assigned to a prime component in a formula - for example, 1' -the value 0 (respectively, 1) is assigned to the variable P in the associated arithmetical representation. Further, values of the arithmetical functions are computed as in ordinary arithmetic, with one exception:
namely, 1 + 1 = 0. In each case a simple calculation shows that when the formula takes the value T (respectively, F), then its arithmetical representation takes the value 0 (respectively, 1). In these terns, tautologies are represented by functions which
are identically 0. For example, that K P V -j P is clear from the fact that P V --IP is represented by P(1 + P). To prove that the formula in 1 of Theorem 3.4 (regarding A and B as prime components) is a tautology, we form first [corresponding to A A (A -+ 13)J,
4.4
I
The Statement Calculus. Consequence
179
A + (1 -1- A) B -1- A(1 + A) B,
which reduces to A + (1 + A)B since A(1 + A) is identically 0. Then to the entire formula in I corresponds the function (I + A + (1 + A) B) B, which, as one sees immediately, is identically 0.
In the algebra at hand, 2x = 0, x(x + 1) = 0, and x2 = x for all x. These facts make the simplification of long expressions an easy matter. Prove some of the tautologies in Theorem 3.4 by this method.
3.8. (a) With Exercise 3.7 in mind, show that the function (I + P)(1 + Q) is an arithmetical representation of the truth function PIQ defined in Example 3.2. (b) The result in (a), together with that in Exercise 3.2(b), may be reformulated as follows: Every mapping on (0, 1) ft into (0, 11 can be generated from the mapping f : {0, 1) 2 -} {0, 1 ) such that f (x, y) _ (1 + x) (1 +y). Show that the same is true ofg: {0, 1}a {0, 1}, where g(x, y, z) = 1 + x -1- y -1- xyz.
4. The Statement Calculus. Consequence In the introduction to this chapter we said that it was a function of logic to provide principles of reasoning--that is, a theory of inference. In practical terms this amounts to supplying criteria for deciding in a
mechanical way whether a chain of reasoning will be accepted as correct on the basis of its form. A chain of reasoning is simply a finite sequence of statements which are supplied to support the contention that the last statement in the sequence (the conclusion) may be inferred
from certain initial statements (the premises). In everyday circumstances the premises of an inference are judged to be true (on the basis of experience, experiment, or belief). Acceptance of the premises of an inference as true and of the principles employed in a chain of reasoning from such premises as correct commits one to regard the conclusion
at hand as true. In a mathematical theory the situation is different. There, one is concerned solely with the conclusions (the so-called "theorems" of the theory) which can be inferred from an assigned initial set of statements (the so-called "axioms" of the theory) according to rules which are specified by some system of logic. In particular, the
notion of truth plays no part whatsoever in the theory proper. The contribution of the statement calculus to a theory of inference is just this: It provides a criterion, along with practical working forms thereof,
for deciding when the concluding sentence (a statement) of an argu-
180
Logic
cnIAF. 4
I
ment is to be assigned the value T if each premise of the argument is assigned the truth value T. This criterion is in the form of a definition. The statement B is a consequence of statements At, A2, A,,, (by the statement calculus), symbolized A,, A2, ... , Am 1= B,
iff for every truth-value assignment to each of the prune formulas , P,, occurring in one or more of A,, A2j 1',, P2, , A,,,, and B, the formula B receives the value T whenever every A receives the value T. In terms of truth tables, "A,, A2, , Am K B" means simply that if truth tables are constructed for A,, A2, , Am, and B, from the list , P. of prime formulas occurring in one or more of these P,, P2, formulas, then B receives the value T at least for each assignment to the F's which make all A's simultaneously T.
EXAMPLE 4.1. From an inspection of Table VIII below we obtain the following three illustrations of our definition: P, R, Q A P --> -, R l=
(line 3)
--, Q,
P,P--*R,RK PV Q
(lines land 3)
R,
Q A P - --, R, --, Q, 1' - R l -, (P A Q).
Table VIII R QAP--*-1R P
P
(lines 3, 7, 8)
-,(PAQ)
P
Q
T T T T
T T
T
F
F
T
T
F
F
F
F
F
F
F
T
T
F
T T
T
F
F
F
F
T T
T
F
F
F
T
F
F
F
T T T T
T
F
T T T T T T T
T T T T T T
F
F
T T
Q
F
T T
THEOREM 4.1. (I) A 1= B iff l= A - - + B.
(II) A1, A2, , Am 1= B iff A,AA2A KAlAA2A ... AA,,--+ B(m> 2).
AAm1=B or, if
For (I), let A 1= B. By the table for --+, A > B receives the value F if A receives the value T, and, simultaneously, B receives the Proof.
value F. From the hypoth-,sis, this combination of values does not
4.4
Die Statement Calculus. Consequence
1
181
occur. Hence A -+ .13 always receives the value T, that is, t= A -> B.
For the converse, let t= A --* B, and consider an assignment of values to the prime components such that A receives the value T. Since A -4 B receives the value T, it follows from the table for that B takes the value T, whence, A J-_ B.
The first assertion in (II) follows from the table for A, and the second follows from the first by an application of (I).
COROLLARY. A,, ,Am_,,Am t- BiffA,, More generally, At, , Am_t, Am l B iff
-, Am-, t-_ Am
B.
t-- A, --, (A2 -.* (
(Am -.._, B) ... )).
For m = 1, the first assertion is (I) of the theorem. So, assume that A,, , Am_,, A. f= B for m > 1. Then f= (At A . A Am_,) A A,,,--* B, according to the theorem. From tautology 8 of TheoProof.
rem 3.4 and Theorem 3.3, we deduce that t= (A, A
A Am_,)
According to (1) of the theorem, it follows that A, A A Am_, t= Am --+ B and hence, by (II), that A,, , A,,,_, I- Am -> B. The converse is established by reversing the foregoing steps. Finally, the second assertion follows by repeated application of the (Am ---> B).
first.
Thus, the problem of what statements arc consequences of others (by the statement calculus) is reduced to the problem of what statements arc valid (which accounts for the importance of tautologies). On the other hand, there is something to be said for approaching the concept of consequence directly. One reason is the possibility of converting the definition into a working form which resembles that used in math-
ematics to infer theorems from a set of axioms. Indeed, we can substantiate a working form as a sequence of formulas (the last formula being the desired consequence of the premises) such that the presence of each is justified by a rule, called a rule of inference (for the statement calculus). The basis for the rules of inference which we shall introduce is the following theorem.
THEOREM 4.2. (I) At, A2, ,A,,,1 Aifori= 1, 2, , ,n. (II) If At, A2, - ,At= B;for j = 1, 2, ,p,andifB,,B2, -
B,,
C, then At, A2,
-, A. I= C.
Logic
182 Proof.
I
CHAP. 4
Part (I) is an immediate consequence of the definition of
, A. 1= B." For (II) we construct a truth table from the "A,, A2, list P1, P2j - , P. of all prime components appearing in at least one of the A's, the B's, and C. Consider any row in which A,, A2,
, A,,,
each receive the value T. Then, by the hypotheses, each B has the value T, and hence C has the value T. That is, for each assignment of values to the P's such that every A takes the value T, formula C receives the value T. This is the desired conclusion. With this result, a demonstration that a formula B (the conclusion) is a consequence of formulas A1, A2, - , A. (the premises) may be pre-
sented in the form of a string (that is, a finite sequence) of formulas, the last of which is B and such that the presence of each formula E is justified by an application of one of the following rules. Rule p : The formula E is a premise. Rule t: There are formulas A, , D preceding E in the string such A D -' E. that t= A A
That is, we contend that A,, A2, , A. K B if we can concoct a string E,, E2, ... , E,(= B)
of formulas such that either each E is a premise (rule p) or there are preceding formulas in the string such that if C is their conjunction, C - E (rule t). Indeed, assuming that each entry in the disthen played sequence can be so justified, we shall prove that A,, A2, - , A. i (any E in the sequence). This is true of E_, by Theorem 4.2(I). Assume that each of E,, E2, ., Ek_, is a consequence of the A's; we prove that the same is true of the
next formula Ek. If Ek is a premise, then Theorem 4.2(I) applies. Otherwise, there are formulas preceding Ek such that if C is their conjunction, then i C-> Ek. Let us say
E A E,, A ... A E,. -f Ek. Then, by Theorem 4.1 (II), E,,,
E,,,
.
. ., E`.
(' Ek,
and, by assumption,
A1, A2, ...,AmI=Er
j = 1, 2, ...,s.
Hence, by Theorem 4.2(11), A1, A2, ... , A. K Ek.
4.4
1
The Statement Calculus. Consequence
183
We note, finally, that by an application of rule t any tautology may be entered in a derivation. Indeed, if i D, then for any formula A we have K A -+ D. Thus, D may be included in a derivation by an application of rule l wherein we take any premise as the "A." The examples which follow illustrate the foregoing method for demonstrating that some formula is a consequence of given formulas. To make
the method entirely definite, let us agree that when applying rule t, only the tautological conditionals which appear explicitly in Theorem 3.4 or are implicit in the biconditionals of that theorem (for exA H A yields the tautological conditional 1= A -a A and ample, A --+ K -, -, A H A yields 1= -i -1 A ---> A and -i A) may be used. Admittedly, this is an arbitrary rule. Our excuse for making it is that it serves to make the game to be played a definite one. i
EXAMPLES 4.2. We demonstrate that
AVB,A-*C,B--+DrCVD. An explanation of the numerals on the left is given below. {1} {1 }
(1) A --+C (2) A V B --+ C V B
Rulep
{3}
Rule p
{3)
(3) B -- 1) (4) C V B --' C V 1)
{1, 3}
(5) A V B --j C V I)
Rule t; l (2) A (4) - (5) by
(6)
(6) A V B
Rule p
11, 3, 6)
(7) C V D
Rule t; 1= (6) A (5) --, (7) by
Rule t; 1= (1) -a (2) by tautology 11.
Rule i; G (3) - (4) by tautology 11. tautology 7.
tautology 1.
The numbers in parentheses adjacent to each formula serve to designate that
formula as well as the line of the derivation in which it appears. The set of numbers in braces for each line corresponds to the premises on which the formula in that line depends. That is, the formula in any line n is a consequence of the premises designated by the numbers in braces in that line. Thus, the formula in line 5 is a consequence of the premise in line 1 and the premise in line 3, and the formula in line 7 is a consequence of the premises in lines 1, 3, and 6--that is, of all the premises. In particular, for a line which displays a premise there appears in braces at the left just the number of that line, since such a formula depends on no other line. Using the brace notation in connection with the numerals on the left is deliberate in that it suggests that the for-
184
Logic
I
C. H A P. 4
mula in that line is a consequence of the set of premises designated by those numbers. We now rewrite the above derivation, incorporating some practical abbreviations. In this form the reader is called on to supply the tautologies employed. {1} (1) A -> C p
V B--3CV
{1)
(2) A
(3)
(3) B -- D
{3}
(4) CV
{6}
(6) A V B
B
It
p
D
B
2,41 p
{1,3,6) (7) CV D 4.3. As a more elaborate illustration we prove that
5,61
WVP->I,I-CVS,S-U,-,CA-,U1=-,W by the following string of thirteen formulas. {1)
(1) -, C A -, U
{1)
(2)
(3)
(3) S -> U (4) -1S (5) -1C (6) -1C A --1S (7) -, (C V S)
{1,3}
{I} {1, 3} 11,3) {8}
(8)
p 1 1
p
2,31 1t 4, 5 t 6t
W V P -+I
p p
(9) 1 -> C V S
{9}
{8,9}
U
(10) WVP -). C VS 8,91 (11) -,(WVP) 7,101 (12) -,WA -,P 111
{1,3,8,9) {1,3,8,9) 121 (13) -,W (1,3,8,9) We note that the foregoing takes the place of a truth table having 26 = 64 lines for the purpose of verifying that
t= (WVP - I) A (I - C V S) A (S -3 U) A (-1 (: A
U) -> -, W. 4.4. Many theorems in mathematics have the form of a conditional, the assumptions being the axioms of the theory under development. The symbolic form of such a theorem is A,, A2,
, A. k= B
. C,
where the A's are the axioms and 1? -4 C is the consequence asserted. In order to prove such a theorem it is standard practice to adopt B as a further assumption and then infer that C is a consequence. Thereby it is implied that A1, A2, ..., A,,, K B - C ifF A1, A2, ..., A,,,, B k= C.
This is correct according to the Corollary to Theorem 4.1. It is convenient to
4.4
!
The Statement Calculus. Consequence
185
formulate this as a third rule of inference, the rule of conditional proof, for the statement calculus. Rile rp: The formula B -. C is justified in a derivation having At, Az, A. as premises if it has been established that C is a consequence of At, Az, , Am, and B. As an illustration of the use of this rule we prove that
A-4(B-+C),-,D V A,B1=D--*C. (1) {2} {3}
(1) A -+ (B ---> C) p fi (2) -11) V A
{4)
(4) D
{2, 4}
(5) A (6) B (7) C
(1,2,4) {1, 2, 3, 41
{1, 2, 3}
(3) B
P
p (introducing "D" as an additional premise) 2,41 C
1, 5 1
3, 6 t 4, 7 rp
(8) D -* C
The usefulness of the braced numbers to show precisely what premises enter into the derivation of the formula in that line is clear. 4.5. Even if an alleged consequence of a set of premises does not have the form of a conditional, the application of the strategy as described in the preceding example may simplify a derivation. As an illustration we rework the first example, starting with the observation that the conclusion C V D is equivalent to -,C --> D. This suggests adding -,C as a premise and hoping that D can be derived as a consequence of this and the other premises. An advantage gained thereby is the addition of a simple assumption. The derivation follows. {1} {2} {3} {4} {2, 4} {1, 2, 4)
(1)AVB
p
(2) A -- C (3) B - I) (4) -,C
p
(5) -, A (6) B (7) D
p p 2, 4 t 1, 5 t
{1,2,3,4) 3,61 4,7cli (8) -,C -- D {1,2,3) 81 {1,2,3} (9) CVD 4.6. Each of the tautological implications in Theorem 3.4 generates a rule of inference, namely, the instance of rule t, which is justified by reference to that tautology alone. For example, tautology 1 in Theorem 3.4 determines the rule from A and A --i B to infer B. This is called the rule of detachment or modus ponens. In a textbook devoted to logic, names for many rules of inference of this sort will be found. Probably modus ponens is the one used most frequently in derivations.
Logic
186
Char. 4
I
EXERCISES Note: It is intended that the restrictions described prior to Example 4.2 shall apply to applications of rule 1. 4.1. By an examination of Table VIII in Example 4.1, justify the conclusions drawn in that example. 4.2. Complete each of the following demonstrations of consequence by supplying the tautologies employed and the numbering scheme discussed in Example 4.2.
(a) A --> B, --1(B V C) l= --,A
(c) (A A B) V (C A D),
A->11
A--->
-,(BVC) -,11A-,C
-,A
A
-,11 A
A A
A
C
D->AVC
DrB
D
A
A ---> (C --> B), -, D V A,
-,DVA
CAD --'B
1)
AVC
A
A->B
A-->(C-*B)
C->B C
B
B
B
4.3. Justify each of the following, using only rules p and 1.
(a) -,AV B,C,->--, B A
E ->F1
AF.
(d) A->(BAC),-,BVD,(E->-,F)->-,D,B->(AA -,F.) K B-->E. (e) (A --> B) A (C -> D), (B --> E) A (D --> F), -, (E A F), A --> C
-,A.
4.4. Try to shorten your proofs of Exercise 4.3(a), (b), (c), (d) using rule cp (along with rules p and t).
4.5. Can the rule of conditional proof be used to advantage in Exercise 4.3(e)? Justify your answer.
4.5
!
The Statement Calculus. Applications
187
5. The Statement Calculus. Applications We now turn to some household applications of the theory of inference which we have discussed. Usually the circumstances accompanying the presentation of an argument include the audience having the privilege of accepting or rejecting the contention that some statement B is a consequence of statements A,, A2, , A,,,. In this event, the man who thinks for himself will want to prove either that B is a consequence
of the A's or that the argument is invalid, that is, that there can be made an assignment of truth values to the prime components at hand such that simultaneously each A receives value T, and B receives value F.
The most expedient way to cope with the entire matter is this: Assume that B has value F and that each A has value T, and analyze the consequences so far as necessary assignments of truth values to prime components are concerned. Such an analysis will lead to either a contradiction, which proves that B is a consequence of the A's, or an assignment to each prime component such that all assumptions are satisfied, which proves that the argument is invalid. The foregoing method for proving that some formula is a consequence
of others undercuts that promoted in the preceding section since it proceeds so quickly. However, the earlier method has (at least pedagogical) merits. For example, it leads to an acquaintance with the tautologies in Theorem 3.4. Instances of these are commonplace in proofs in mathematics, and the reader should learn to recognize them as such. As an illustration, tautology 20 justifies the familiar conclusion that if the contrapositive, - Q -a -i P, of P -+ Q is a consequence of A, then so is P--> Q.
EXAMPLES 5.1. Consider the following argument.
If 1 go to my first class tomorrow, then I must get up early, and if I go to the dance tonight, I will stay up late. If I stay up late and get up early, then I will be forced to exist on only five hours of sleep. I simply cannot exist on only five hours of sleep. Therefore, I must either miss my first class tomorrow or not go to the dance. To investigate the validity of this argument, we symbolize it using letters for prime statements. Let C be "I (will) go to my first class tomorrow," G be "I must get up early," D be "I (will) go to the dance tonight," S be "I will stay
188
Logic
I
CHAP. 4
up late," and E be "I can exist on five hours of sleep." Then the premises may be symbolized as
(C - C) A (1) - S),
SAC ->I, -, E,
and the desired conclusion as
ACV -iD.
Following the method of analysis suggested above, we assume that -, C V -i D has value F and that each premise has value T. Then each of C and D must have
value T. Further, according to the first premise, both C and S have value T. This and the second premise imply that E has value T. But this contradicts the assumption that the third premise has value T. Thus we have proved that -,C V -1D is a consequence of the premises. 5.2. Suppose it is asserted that
A-'B,C-->D,AVCt= BAD. Assume that B A D has value F and each premise has value T. The first assumption is satisfied if T is assigned to B and F is assigned to D. Then C has value F, and A has value T. With these assignments, each premise receives value T, and B A D takes value F. Hence the argument is invalid.
Related to the foregoing, but distinct from it, is the question of the satisfiability of a set of statements which is proposed as the set of premises for an inference. A set { A,, A2, - -, Am } of statements is satisfiable (within the statement calculus) iff there exists at least one assignment of truth values for the prime components such that the A's simultaneously receive value T. It is clear that {A,, A2, , Am} is satisfiable if - A A. is T for at least one combination of truth-value asAt A A2 A signments to the prime components and is not satisfiable if A, A A2 A A Am is F for all combinations of truth-value assignments to the prune components. The nonsatisfiability of a set of statements can be established within
the framework of the methods described in the preceding section as soon as the following dclinition is made. A contradiction is a formula which always takes the value F (for example, A A --,A).
THEOREM 5.1 . A set
{ A,, A2,
-, Am } of statements is not satis-
fiable if a contradiction can be derived as a consequence of the set. Assume that A,, A2, , A,,, 1= B A -, B for some formula B. A A. -, B n -, B, and the conclusion follows Then A, A A2 A from the truth table for the conditional. Proof.
4.5
I
The Statement Calculus. Applications
189
Contradictions also play an important role in the method of indirect
proof (also called proof by contradiction or reductio ad absurdum proof). The basis for this type of proof is the following result.
THEOREM 5.2.
A,, A2,
,
K B if a contradiction can be
derived as a consequence of A,, A2, Proof. Assume that A,, A2, . , mula C. Then A,, A2,
,
, A,,, and -, B. A -, C for some forA -, C. Consider now an
assignment of values to the prim; components at hand such that every A receives value T. Then -, B ---3 C A -, C has value T. This and
the fact that C A -, (, receives value F imply that -, B has value F and hence that B has value T.
EXAMPLES 5.3. We illustrate the usefulness of Theorem 5.1 in proving the nonsatisliiability of a set of statements. Such a proof follows the same pattern as one devised to establish the correctness of an argument in all but one respect: in a proof of the correctness of an argument the final line, which is the conclusion, is assigned in advance, whereas, in a proof of nonsatisliability the final line is any contradiction. For example, suppose that it is a question of the satisfiability of a set of statements which may be symbolized as
AFiB, B --*C,
--1
We adopt these as a set of premises and investigate what inferences can be made. {1} {2)
(1) A <-> I.
(2) B -C
p
{3}
(3) -, C V 1)
p
{4} {5}
(4) --,A --' D
p
(4,5) {4, 5}
(1,2) 11, 2, 4, 5}
{3, 51
{1, 2 ,3, 4, 5}
(5)
D
(6) -, -, A (7) A (8) A --I C (9) C
(10) -,C (11) CA -,C
p
P
4, 5t 6t
1,21 7,81 3, 5t 9,101
We conclude that the set is not satisfiable.
5.4. We could introduce a further rule of inference based on Theorem 5.2. Alternatively, we may employ the rule of conditional proof and the tautology r- (-, B --+ C A -,C) -b B to justify an indirect proof. As an illustration, we rework Example 5.1 in this section, starting with the negation of the desired conclusion as an additional premise.
190
Logic
(1) (C-'G) A (D --'S)
p
(4) -,(ACV -,D)
p p p
(2) SAC-,E (3)-,E (5) C A D
i
CHAP. 4
(6) C
(7) C--'C (8) G
(9) D - S (10) D (11) S (12) S A G (13) E
(14) Ii A -,E (15) -, (-,(; V -,D) - E A -1E (16) --j C V --,D
It is left as an exercise to supply the missing details.
EXERCISES Use the method discussed in this section to prove the validity or invalidity, whichever the case might be, of the arguments in Exercises 5.1-5.12 below. For those which are valid, construct a formal proof. In every case use the letters suggested for symbolizing the argument.
5.1. Either I shall go home or stay and have a drink. I shall not go home. Therefore I shall stay and have a drink. (II, S) 5.2. If John stays up late tonight, he will be dull tomorrow. If he doesn't stay up late tonight, then he will feel that life is not worth living. Therefore, either John will be dull tomorrow or he will feel that life is not worth living. (S, D, L)
5.3. Wages will increase only if there i3 inflation. If there is inflation, then the cost of living will increase. Wages will increase. 'Therefore, the cost of living will increase. (W, I, C)
5.4. If 2 is a prime, then it is the least prime. If 2 is the least prime, then I is not a prime. The number 1 is not a prime. Therefore, 2 is a prime. (P, L, N) 5.5. Either John is exhausted or he is sick. If he is exhausted, then he is contrary. Ile is not contrary. Therefore, he is sick. (E, S, C) 5.6. If it is cold tomorrow, I'll wear my heavy coat if the sleeve is mended. It will be cold tomorrow, and that sleeve will not be mended. Therefore, I'll not wear my heavy coat. (C, H, S)
5.7. If the races are fixed or the gambling houses are crooked, then the tourist trade will decline, and the town will suffer. If the tourist trade decreases, then the police force will be happy. The police force is never happy. Therefore, the races are not fixed. (R, H, D, S, P)
4.5
1
The Statement Calculus. Applications
191
5.8. If the Dodgers win, then Los Angeles will celebrate, and if the White Sox win, Chicago will celebrate. Either the Dodgers will win or the White Sox will win. However, if the Dodgers win, then Chicago will not celebrate, and if the White Sox win, Los Angeles will not celebrate. So, Chicago will celebrate if and only if Los Angeles does not celebrate. (D, L, W, C)
5.9. Either Sally and Bob are the same age or Sally is older than Bob. If Sally and Bob are the same age, then Nancy and Bob are not the same age. If Sally is older than Bob, then Bob is older than Walter. Therefore, either -Nancy and Bob are not the same age or Bob is older than Walter. (S, 0, N, W) 5.10. If 6 is a composite number, then 12 is a composite number. If 12 is a composite number, then there exists a prime greater than 12. If there exists a t'rirne greater than 12, then there exists a composite number greater than 12. If 2 divides 6, then 6 is a composite number. The number 12 is composite. Therefore, 6 is a composite number. (S, W, P, G, D) 5.11. If I take the bus, and the bus is late, I'll miss my appointment. If I miss my appointment and start to feel downcast, then I should not go home. If I don't get that job, then I'll start to feel downcast and should go home. Therefore, if l. take the bus, and the bus is late, I will get that joh. (B, 1., M, D, FI, J) 5.12. If Smith wins the nomination, he will be happy, and if he is happy, he is not a good campaigner. But if lie loses the nomination, lie will lose the confidence of the party. lie is not a good campaigner if he loses the confidence of the party. If lie is not a good campaigner, then he should resign from the party. Either Smith wins the nomination or he loses it. Therefore, he should resign from the party. (N, H, C, P, R) 5.13. Investigate the following sets of premises for satisfiability. If you conclude that a set is not satisfiable by assigning truth values, then reaffirm this using Theorem 5.1 and vice versa. Substantiate each assertion of the satisfiability of a set of premises by suitable truth-value assignment,, (c) (A --> B) A (C --), D) (a) A -, (B A C)
DV E- C
C-.-> -(IIV I)
-,CAEAI1
(b) AV B -'CAD
DVE -'C AV --,G
(B
D) A (-1C--rA)
(E -4G) A (G-'-, D)
-,I- Z
(d) (A-aBAC)A(D--), BAE) (G-a--,A)AH--i1
(1I--'I)-'GAD -,(-,C--GIs)
(e) The contract is fulfilled if and only if the house is completed in February.
If the house is completed in February, then we can move in March 1. If we can't move in March 1, then we must pay rent for March. If the contract is not fulfilled, then we must pay rent for March. We will not pay rent for March. (C, H, M, R)
Logic
1 92
I
CHAP. 4
5.14. Give an indirect proof of the validity of the argument in the following. (d) Exercise 5.7. (a) Example 4.3. (e) Exercise 5.11. (b) Example 4.4. (f) Exercise 5.12. (c) Example 5.1.
5.15. Prove that if A, -1B Ir- C (a contradiction), then A I-= B.
6. The Predicate Calculus. Symbolizing Everyday Language The theory of inference supplied by the statement calculus is quite inadequate for mathematics and, indeed, for everyday arguments. For example, from the premises every rational number is a real number, 3 is a rational number, certainly
3 is a real number is justified as a conclusion. Yet the validity of this argument cannot be established within the context of the statement calculus. The reason is that the statement calculus is limited to the structure of sentences in
terms of component sentences, and the above inference requires an analysis of sentence structure along the sul)ject--predicate lines that grammarians describe. In other words, the statement calculus does not break down a sentence into sufficiently "fine" constituents for most purposes. On the other hand, with the addition of three additional logical notions, called terms, predicates, and quantifiers, it has been found that much of everyday and mathematical language can be symbolized in such a way as to make possible an analysis of an argument. We shall describe these three notions in turn.
It is standard practice in mathematics to introduce letters such as "x" and "y" to reserve a place for names of individual objects. For example, in order to determine those real numbers such that the square of the number minus the number is equal to twelve, one will form the
equation x2 - x = 12, thereby regarding "x" as a placcholder for the name of any such (initially unknown) number. Again, as it is normally understood, the "x" in such an equation as sine x + cost x = 1 reserves a place for the name of any real or, indeed, complex number. As it is employed in "x2 - x = 12," one is accustomed to calling "x"
4.6
I
77ie Predicate Calculus. Symbolizing Everyday Language
193
an unknown, and in "sing x + cos2 x = 1" one is likely to refer to "x" as a variable. The usage we shall make of letters from the latter part of the alphabet in symbolizing everyday language shall be like that just described- that is, as an unknown or a variable. In logic it is customary to employ the word "variable" for either usage; the decision as to whether "x" is intended to be a variable in the intuitive sense or an unknown is made on the basis of the form of the expression in which
it appears. Since, ultimately, we intend to strip all symbols of any meaning whatsoever, it is simplest to do this at the outset for variables. This we do by defining an individual variable to be a letter or a letter with a subscript or superscript. Variables constitute one class of terms. We shall also find use for letters and symbols as names of specific, well-defined objects; that is, we shall use letters and symbols for proper names. Letters and symbols used for this purpose are called individual
constants. For example, "3" is an individual constant, being a name of the numeral 3. Again, "Winston Churchill" is an individual constant. In order to achieve a compact notation we shall use a letter from the beginning of the alphabet to stand for a proper name if there is no accepted symbol for it. For example, we might let a = Winston Churchill if we intend to translate the sentence Winston Churchill was a great statesman into symbolic form. Proper names are often rendered by a "description," which we take
to be a name that by its own structure unequivocally identifies the object of which it is a name. For example, the first president of the United States and
the real number x such that for all real numbers y, xy = y are descriptions. If we let b = George Washington, then we may write b = the first president of the United States. Further, we have 1 = the real number x such that for all y, xy = y. Collectively, individual variables and individual constants (either in
194
Logic
I
CIIAP. 4
the form of proper names or descriptions) are classified as terms. The grammatical function of variables is similar to that of pronouns and common nouns in everyday language, and the function of individual constants is similar to the role of proper nouns. We now turn to the notion of predicates. In grammar a predicate is the word or words in a sentence which express what is said of the subject; for example, "is a real number," "is black," "is envious." In logic the word "predicate" has a broader role than it has in grammar. The basis for this is the observation that if a predicate is supplemented by including a variable as a placeholder for the intended subject (for example, "x is a real number"), the result behaves as a "statement function" in the sense that for each value of x (from an appropriate domain) a statement results. Although "John loves" is not a predicate in gram-
mar, if "x" is introduced as a placeholder for the object (of John's affections), which yields
John loves x, the result is a statement function in the sense just described. An obvious generalization is at hand, namely, the extension to statement functions of more than one variable. Examples are x is less than y, x divides y, z is the sum of x and y. The upshot is the notion of an n-place predicate P(x,, x2, , as an
expression having the quality that on an assignment of values to the variables x,, x2, - , x from appropriate domains, a statement results. For convenience we include 0 as a value of n, understanding by a 0-place predicate a statement. We now consider some examples of translations into symbolic form. EXAMPLES 6.1. The sentence Every rational number is a real number (1) may be translated as For every x, if x is a rational number, then x is a real number. (2) In ordinary grammar, "is a real number" is the predicate of (1). In the translation (2) the added predicate "x is a rational number" replaces the common noun "rational number." Using "Q(x)" for "x is a rational number" and "R(x)" for "x is a real number," we may symbolize (2) as
4.6
I
The Predicate Calculus. Symbolizing Everyday Language
195
For every x, Q(x) - R(x). Further, the statement "3 is a rational number" may be symbolized by (3)
(4)
Q (3).
In terms of symbolism available at the moment, (3) and (4) are the translations of the premises of the argument appearing at the beginning of this section. 6.2. The sentence Some real numbers are rational we translate as For some x, x is a real number and x is a rational number. Using the predicates introduced above, this may be symbolized as (5) For some x, R(x) A Q(x). 6.3. The sentence For some x, R(x) --> Q(x)
(6)
should have the same meaning as For some x, -, R(x) V Q(x),
(7)
since we have merely replaced "R(x) -- Q(x)" by its equivalent "-y R(x) V Q(x)." Now (7) may be translated into words as There is something which is either not a real number or is a rational number. Certainly, this statement [which has the same meaning as (6)] does not have the same meaning as (5). Indeed, as soon as we exhibit an object which is riot a real number we must subscribe to (6). In summary, (6) and (5) have different meanings.
By assumption, on suitable assignments of values to the variables in a predicate, a statement results. For example, if S(x) is "x is a sophomore," this predicate yields the statement "John is a sophomore." A
statement may also be obtained from S(x) by prefixing it with the phrase "for every x": (8)
For every x, x is a sophomore.
No doubt, one would choose to rephrase this as (9)
Everyone is a sophomore.
The phrase "for every x" is called a universal quantifier. We regard "for every x," "for all x," and "for each x" as having the same meaning and symbolize each by
(bx) or (x).
196
Logic
I
CrrAP. 4
Using this symbol we may symbolize (8) or (9) as (x)SW
Similarly, prefixing S(x) with the phrase "there exists an x (such that)" yields a statement which has the same meaning as "There are sophomores." The phrase "there exists an x" is called an existential quantifier. We regard "there exists an x," "for some x," and "for at least one x" as having the same meaning, and symbolize each by (3x).
Thus, "(3x)S(x)" is the symbolic form of "There are sophomores." In each of Examples 6.1 6.3 above a quantifier prefixes not merely a
predicate but a "formula in x," by which we shall understand for the time being an expression compounded from one-place predicates using sentential connectives. Using the symbol introduced P(x), . for the universal quantifier, we can now render "Every rational number is a real number" in its final form: (10)
(x)(Q(x) -* R(x)).
Possibly it has already occurred to the reader that this means simply that Q C R. Indeed, if one recalls the definition of the inclusion relation for sets, it becomes clear that (10) is an instance of that definition. Further, we note that (10) is characteristic of statements of the form "Every so and so is a such and such." Similarly, the sentence "Some real numbers are rational" may be translated as (11)
(3x)(R(x) A Q(x)).
The meaning of this sentence is simply that R f1 Q is nonempty; that is, it is a symmetrical form of the original sentence. A mistake commonly
made by beginners is to infer, since a statement of the form "Every so and so is a such and such" can be symbolized as in (10), that. the statement "Some so and so is a such and such" can be symbolized by (3x)(R(x) -* Q(x)).
However, as is pointed out in Exaniple 6.3, this should have the same meaning as (3x)(
R(x) V Q(x)).
This should be accepted as true as soon as we exhibit an object which is not a real number. In particular, therefore, it has no relation to what it is intended to say, namely, that some real numbers are rational.
4.6
I
The Predicate Calculus. Symbolizing Everyday Language
197
EXAMPLES 6.4. The notion of a formula in x, as (vaguely) described above, is the same as that given in Chapter 1. There it was stated that such an expression is often called a property (of x). Associated with a property is a set, according to the intuitive principle of abstraction. Extending in the obvious way the notion of a formula in x to that of a formula in x and y, one can associate with a formula A(x, y) those ordered pairs (a, b) such that A(a, b) is true. That is, a formula in x and y may be used to define a binary relation. This being so, formulas in two variables are often called binary relations, those in three variables are called ternary relations, and so on. 6.5. If A(x) is a formula in x, consider the following four statements.
(c) (x)(-1A(x)) (d) (3x)(-A(x)).
(a) (x)A(x) (b) (3x)A(x).
We might translate these into words as follows.
(a) Everything has property A. (b) Something has property A. (c) Nothing has property A. (d) Something does not have property A. Now (d) is the denial of (a), and (c) is the denial of (b), on the basis of everyday meaning. Thus, for example, the existential quantifier may be defined in terms of the universal quantifier by agreeing that "(3.x)A(x)" is an abbreviation for « --I (x) -, (A (x)). "
6.6. Traditional logic emphasized four basic types of statements involving quantifiers. Illustrations of these along with translations appear below. Two of these translations have been discussed. All rationals are reals. No rationals are reals. Some rationals are reals. Some rationals are not reals.
(x)(Q(x) -' R(x)) (x)(Q(x) --+ n R(x)). (3x)(Q(x) A R(x)).
(3x)(Q(x) A -R(x)).
6.7. If the symbols for negation and a quantifier modify a formula, the order in which they appear is relevant. For example, the translation of -I(x)(x is mortal)
is "Not everyone is mortal" or "Someone is immortal," whereas the translation of (x)(--I (x is mortal))
is "Everyone is immortal." 6.8. By prefixing a formula in several variables with a quantifier (of either
Logic
198
I
CHAP. 4
kind) for each variable, a statement results. For example, if it is understood that all variables are restricted to the set of real numbers, then
(x)(Y)(z)((x + y) + z = x + (y + Z)) is the statement to the effect that addition is an associative operation. Again, (x) (3y) (x2 - y = y2 - x)
translates into "For every (real number) x there is a (real number) y such that x2 - y = y2 - x." This is a true statement. Notice, however, that (3y) (x) (x2 - y = y2 - x),
obtained from the foregoing by interchanging the quantifiers, is a differentindeed, a false- statement. 6.9. We supplement the first remark in the preceding example with the observation that a formula in several variables can also be reduced to a statement by substituting values for all occurrences of some variables and applying quan-
tifiers which pertain to the remaining variables. For example, the (false) statement (x) (x < 3)
results from the 2-place predicate "x < y" by substituting a value for y and quantifying x.
We conclude this section with the remark that there are no mechanical rules for translating sentences from English into the logical notation which has been introduced. In every case one must first decide on the meaning of the English sentence and then attempt to convey that same meaning in terms of predicates, quantifiers, and, possibly, individual constants. Beginning with the exercises below we shall often omit parentheses
when writing predicates. For example, in place of "A(x)" we shall write "Ax," and "A (x, y)" will be written simply as "Axy."
EXERCISES 6.1. Let Px be "x is a prime," Ex be "x is even," Ox be "x is odd," and Dxy be "x divides y." Translate each of the following into English.
(a) P7.
(e) (x)(--,Ex -4 --i D2x). (b) E2 A P2. (f) (x)(Ex -' (y)(Dxy -Ey)). (c) (x) (D2x -' Ex). (g) (x)(Px -y (3y) (Ey A Dxy)). (h) (x)(Ox - (y)(Py --i -,Dxy)) (d) (3x) (Ex A Dx6). (i) (3x) (Ex A Px) A-,(3x)((Ex A Px) A (3y) (x 0 y A Ey A Py)).
6.2. Below are twenty sentences in English followed by the same number o
4.6
I
The Predicate Calculus. Symbolizing Everyday Language
199
sentences in symbolic form. Try to pair the members of the two sets in such a way that each member of a pair is a translation of the other member of the pair. (a) All judges are lawyers. (Jx, Lx) (b) Some lawyers are shysters. (Sx) (c) No judge is a shyster (d) Some judges are old but vigorous. (Ox, Vx) (e) Judge Jones is neither old nor vigorous. (j) (f) Not all lawyers are judges. (g) Some lawyers who are politicians are Congressmen. (Px, Cx) (h) No Congressman is not vigorous. (i) All Congressmen who are old are lawyers. (j) Some women are both lawyers and Congressmen. (Wx) (k) No woman is both a politician and a housewife. (lIx) (1) There are some women lawyers who are housewives. (m) All women who are lawyers admire some judge. (Axy) (n) Some lawyers admire only judges. (0) Some lawyers admire women. (p) Some shysters admire no lawyer. (q) Judge Jones does not admire any shyster. (r) There are both lawyers and shysters who admire Judge Jones. (s) Only judges admire judges. (t) All judges admire only judges. (a)' (3x)(Wx A Cx A Lx). (b)' -, Oj A -, Vj.
(c)' (x)(Jx -- -,Sx). (d)' (3x)(Wx A Lx A Hx). (e)' (x) (Ajx --r -m Sx).
(f)' (x) (Jx -4 Lx) . (g)' -, (x) (Lx -* Jx). (h)' (x)(Cx A Ox --+ Lx). (i)' Ox) (Lx A Sx). (j)' (3x)(Lx A Px A Cx). (k)' (x) (Wx --> -, (Px A IIx) ). (1)' (x) (Cx - Vx). (m)' (3x) (Jx A Ox A Vx). (n)' (x)(y)(Ayx A Jx ---' .Iy).
(o)' (3x)(Sx A (y)(Axy -* -,Ly)) (p)' (3x) (3y) (Lx A Sy A AV A Ayj). (q)' (x) (Wx A Lx - (3y) (Jy A Axy)).
(r)' (3x) (Lx A (3y) (Wy A Axy) ) (s)' (x)(Jx --+ (y)(Axy - Jy)). (t)' *(3x) (Lx A (y) (Axy --4 Jy)).
200
Logic
I
afire['. 4
7. The Predicate Calculus. A Formulation The examples and exercises of the preceding section serve to substantiate the contention that if the sentential conncctives are supplemented with predicates and quantifiers, much of everyday language can be symbolized accurately. Predicate calculus is concerned with a theory of inference based on the structure of sentences in terms of connectives,
predicates, and quantifiers. In particular, therefore, it is an extension of the statement calculus. The type we shall discuss admits of quantification only of individual variables. To distinguish this simple type from others, it is usually called restricted predicate calculus or predicate calculus of first order. Incidentally, it is not our intention to develop the restricted predicate calculus to the same degree of completeness as we did the statement calculus. Rather, we shall merely formulate it and sketch how it might be developed and applied. A formulation which is comparable to that of the statement calculus in Section 3 is our starting point. We assume that for each of n = 0, 1, 2, there is given an unspecified number of n-place predicates (or, statement functions of n variables). 'These we shall denote by such symbols as P(x, y) (to stand for some one 2-place predicate), P(x, y, z) (to stand for some one 3-place predicate which would necessarily represent a predicate different from
that symbolized by P(x, y), being a function of a different number of variables), Q(x, y, z) (to stand for another 3-place predicate), R (to stand
for some one 0-place predicate, that is, a statement), and so on. It is assumed that the set of all n-place predicates for n = 1, 2, is nonempty. Henceforth we shall call the given predicates predicate letters. From the given set of predicate letters we generate those expressions
which we shall call "formulas (of the predicate calculus)." A prime formula is an expression resulting from a predicate letter by the substitution of any variables, not necessarily distinct, for those variables which appear in the predicate letter. For example, some of the prime formulas which the predicate letter P(x, y, z) yields are P(x, y, z), P(x, y, y), P(y, x, x), and I'(tt, u, u). We extend the set of all prime formulas by adjoining all those expressions which can be formed by using, repeatedly and in all possible ways, the sentential connectives and quantifiers. Precisely, we extend the set of all prime formulas to the smallest set such that each of the following holds. If A and B are members of the set, then
so are -1(A), (A) A (B), (A) V (B), (A) -* (B), and (A) H (B). Also,
4.7
I
The Predicate Calculus. A Formulation
201
if A is a member of the set and x is a variable, then (x)A and (3x)A are members of the set. The members of this extended set are called for-
mulas. Those which are not prime formulas are called composite formulas.
Parentheses are inserted automatically in a formula, but in some cases arc unnecessary. (Indeed, the sole purpose of such lavish use of parentheses is to make possible the formulation of a mechanical procedurc for demonstrating that some juxtaposition of symbols is a formula.) In other cases parentheses can be omitted by the same conventions established earlier. We extend those conventions by agreeing
that quantifiers, along with -,, have the least possible scope. For example, (3x)A V B stands for ((3x)(A)) V (B).
The foregoing description is vague only with respect to the nature of a predicate letter. From the standpoint of the theory of the firstorder predicate calculus, the nature of predicate letters is irrelevant, for there they are treated in a purely formal way, that is, simply as certain strings of letters, parentheses, and commas. From the standpoint of the applications, the vagueness is deliberate, for thereby versatility is achieved. The examples which follow may serve to substantiate this assertion. Each example describes the initial steps which one might take in axiomatizing a mathematical theory.
EXAMPLES 7.1. Suppose that a practitioner of the axiomatic method were to set out to reconstruct the set theory of Chapter 1 as an axiomatic theory. After analyzing how that subject matter was developed, he might conclude that all concepts stemmed from the membership relation --that is, the 2-place predicate "is a member of." This might motivate the practitioner to set up a system of the type
introduced above, one having a single predicate letter C(x, y) intended to denote the membership relation. Of course, the intended interpretation of individual variables would be as sets. The prime formulas of the system would consist of all expressions of the form C(x, y) or, using more suggestive notation, x C y. Then, for convenience, further predicates could be introduced by definition. Following are some instances:
xtZyfor -,(xCy),
xcyfor (a)(aCx-aCy), x= y for (x _C y) A (y c x),
x0yfor -i(x=y), xCyfor(x9y)A(xg-y).
The next step would be the adoption of certain formulas as axioms.
Logic
202
I
CHAP. 4
7.2. As every high school student knows, the basic ingredients of elementary geometry are "points," "lines," and the relation of incidence, " lies on _." In formulating an axiomatic theory intended to have intuitive geometry as an interpretation, one might choose as primitive terms a list of individual variables (intended to range over points and lines), two 1-place predicate letters,
P(x) and L(x), and one 2-place predicate letter, I(x, y). These might be read, in turn, "x is a point," "x is a line," and "x is on y." Among the axioms might appear the following: (3x)P(x), (3x)L (x), (x) (y) (AX, y) <- -' I(.y, X)),
(x)(P(x) -, (3y)(L(y) A I(x, y))) 7.3. As the first step in axiomatizing the theory of partially ordered sets as described in Chapter 1, one might introduce as the primitive terms a list of individual variables and two 2-place predicate letters, = (x, y) and < (x, y). Then the prime formulas would consist of all expressions of the form x = y and x < y, using more familiar notation. As nonlogical axioms for the theory (that is, those axioms which serve as a basis for the intended mathematical structure), we might then take
(x) (x = x), (x) (y) (x = y - y = x), (x) (y) (z) (x = y A y = z (which mean that = is an equivalence relation),
x = z)
(x) (y) (z) (x = y A x < z --b y < z), (x) (y) (z) (x = y A z < x -- z < y)
(which assert that "equals may be substituted for equals"), and, finally,
(x) -, (x < x), (x) (y) (z) (x < y A y < z - x < z) (which establishes < as an ordering relation).
As part of the formulation of the predicate calculus there must be introduced definitions for distinguishing between the circumstances in
which a variable is intended to play the role of a variable or an unknown in the intuitive sense. As a preliminary to this we define the scope of a quantifier occurring in a formula as the formula to which the quantifier applies. A possible ambiguity is removed by use of parentheses. Below are several examples illustrating the scope of the quantifier "(x)," in which the scope is indicated by the line underneath: (x) P(x) A Q (x), (3y) (x) (P(x, y)
(z) Q (z) ),
(x) (y) (P(x, y) A Q(y, z)) A (3x)P(x, y),
(x)(P(x) A (3x)Q(x,z) -* (3y)R(x,y)) v Q(x,y)
4.7
I
The Predicate Calculus. A Formulation
203
It is now possible to give the key definitions in connection with the matter at hand. An occurrence of a variable in a formula is bound if this occurrence is within the scope of a quantifier employing that variable or is the explicit occurrence in that quantifier. An occurrence of a variable is free ill this occurrence of the variable is not bound. For example, in (x)P(x, y)
both occurrences of x are bound, and the single occurrence of y is free. Again, in the formula (3y) (x) (P(x, y) -* (z) Q W)
each occurrence of every variable is bound. A variable is free in a formula if at least one occurrence of it is free, and a variable is bound in a formula if at least one occurrence of it is bound. A variable may be both free and bound in a formula. This is true of z in the formula (z)(P(z) A (3x) Q(x, z) -a (3y) R(z, y)) V Q(z, x).
If a variable is free in a formula, then, on an assignment of meaning to the predicates involved, that variable behaves as an unknown in the familiar sense, since the formula becomes a statement about that variable. The formulas x < 7 and (3y)(y < x), in each of which x is free, serve to illustrate this point. The formula (3y) (y < x) A (x) (x > 0),
wherein the first occurrence of x is free and the other two are bound, illustrates the remark that insofar as meaning is concerned, the free and bound occurrences of the same variable in the same formula have
nothing to do with each other. Indeed, the formula (x) (x > 0) is simply a statement and has the same meaning as (u)(u > 0) and (W) (W > 0).
In bound occurrences in a formula a variable behaves like a variable in the intuitive sense. For example, in (x) (x 2 - 1 = (x - 1)(X + 1) )
all occurrences of x are bound and, clearly, x serves as a variable. That x in the formula (3x) (y ; x)
204
Logic
I
CHAP. 4
serves as a variable is made more plausible on recalling that this formula has the same meaning as (x) --I (Y 7,!5 X) -
In conclusion, we note that it is now possible to give a precise definition of the word "statement." A statement is a formula which has no free variables.
EXERCISES 7.1. List the bound and the free occurrences of each variable in each of the following formulas.
(a) (x)P(x). (b) (x)P(x) --' P(J')
(c) P(x) - (3x)Q(x)
(d) (3x)A(x) A 11(x). (e) (3x)(y)(P(x) A Q(y)) --> (x)R(x). (f) (3x)(3y)(1'(x, y) A Q(z)).
7.2. Using the letters indicated for predicates, and whatever symbols of arith-
metic (for example, "+" and "<") may be needed, translate the following. (a) If the product of a finite number of factors is equal to zero, then at least one of the factors is equal to zero. (Px for "x is a product of a finite number of factors," and Fxy for "x is a factor of y.") (b) Every common divisor of a and b divides their greatest common divisor. (Fxy for "x is a factor of y," and Gxyz for "z is the greatest common divisor
of x and y.") (c) For each real number x there is a larger real number y. (Rx) (d) There exist real numbers x, y, and z such that the sum of x and y is greater than the product of x and z. (c) For every real number x there exists a y such that for every z, if the sum of z and 1 is less than y, then the sum of x and 2 is less than 4.
7.3. An abelian group may be defined as a (noncmpty) set A together with a binary operation -l- in A which is associative, commutative, and such that for given x and y in A the equation x -l- z = y always possesses a solution z in A. A familiar example is that of L_ with ordinary addition as the operation. A formulation within the predicate calculus can he given by taking as primi(x, y), and a 3-place tive terms a list of variables, a 2-place predicate letter predicate letter S(x, y, z). The prime formula x = y is read "x equals y," and the prime formula S(x, y, z) is read "z is the sum of x and y." As axioms we take th.e following formulas. (x)(x = x)-
(x) (y) (x = y --> y = X).
(x) ()) (z) (x = y A y = z - x = z).
.
4.8
I
205
The Predicate Calculus. Validity
(u) (v) (w) (x) (y) (z) (S(u, v, w) A u = x A y = v A w = z -+ S(x, y, z) ). (x) (y) (3z)S(x, y, z)
(x) (y) (z) (w) (S(x, y, z) A S(x, y, w) - + z = w) . (u) (v) (w) (x) (y) (z) (S(u, v, w) A S(w, x, y) A S(v, x, z) --3 S(u, z, y)). (x) (y) (z) (S(x, y, z) --, S(.Y, x, z)) (x) (y) (3z)S(x, z, y)
Write a paragraph in support of the contention that, collectively, these axioms do serve to define abelian groups.
8. The Predicate Calculus. Validity The system described in the preceding section is essentially the com-
mon starting point in the formulation of various predicate calculi. Distinguishing features of the classical predicate calculus (which is our concern) include further assumptions which extend the one assumption made in Section 3 for the statement calculus, namely, that with each prime formula there is associated exactly one of T and F. The corresponding assumption about a prime formula in the sense of the predicate
calculus is much more complicated. We shall introduce it in several steps. First, it is assumed that with the system described in the preceding section there is associated a nonempty set D, called the domain, such that each individual variable ranges over D. Further, it is assumed that with each n-place predicate letter there is associated a logical function, that is, a function on D" into IT, F}. (For 0-place predicates the associated function is assumed to be a constant, one of T or F.) Finally, it is assumed that a truth-value assignment to a prime formula P(yt, y23 , y") can be made, relative to an assignment of an element , y", in the following way. in D to each distinct variable among yl, y2, If toy; is assigned d; in D and if to the predicate letter P(xt, x2, is assigned X: D" --*- IT, F }, then the truth value of 1'(yt, y2, , is
,
, d"). For example, if P(x, y, x) is the prime formula and X is assigned to P(x, y, z), then the truth value of P(x, y, x), relative to the assignment of a to x and b toy, is A(a, b, a). For the theory of the statement calculus, that one of T and F which is assigned to a prime formula is assumed to be irrelevant. In the prediA(dt, r12,
cate calculus this is extended to the assumption that the theory is independent of the domain D and the assignment of functions to predicate letters.
Logic
206
I
CHAP P. 4
The foregoing is the basis of the valuation procedure for a formula C of the predicate calculus. For this it is assumed that (i) a domain D is given, (ii) a function is assigned to each predicate letter appearing in C, and (iii) to each distinct free variable in C is assigned a value in D. Collectively, these constitute an assignment to C. A truth value is assigned to C by a procedure which parallels the formation of C. , yn) is a prime formula in C and A is assigned to (I) If P(y1j Y2, , xn) and d= is assigned to ys, then the truth value of P(x1, x2, P(Y1,Y2, ...,yn) is X(d1, d2, ..., dn).
(II) For a given assignment of values to the predicate letters and free variables of --,A, the value of --,A is F if the value of A is T, and the value of -, A is T if the value of A is F. Similarly, for a
given assignment of values to the predicate letters and free variables of A V B, A A B, A --+ B, and A <--, B, the truth tables from the statement calculus apply.
(III) For a given assignment of values to the predicate letters and free variables of (x)A, the value of (x)A is T if the value of A is T for every assignment to x, and the value of (x)A is F if the value of A is F for at least one assignment to x. For'a given assignment of values to the predicate letters and free variables of (3x)A, the value of (3x)A is T if the value of A is T for at least one value of x, and otherwise it is F.
As an illustration, we consider the problem of the assignment of truth values to the formula (x) (P(x) --+ Q) V (Q A P(Y))
Although the domain D is fixed, it is unknown. Suppose D = {a, b}. By assumption there is associated with P(x) a logical function on D into IT, F} and with Q a truth value. Further, the free variable y may assume any value in D. The possible logical functions which may be associated with P(x) are tabulated( here: x
a b
I
X1(x)
X2(x)
A3(x)
X4(x)
T T
T
F
F
F
T
F
The possible values which may be associated with Q are T and F, and to y may be assigned the value a or b. Thus, we may fill out a table with 16(= 4 2 2) entries exhibiting the truth-value assignment in all possible cases:
intended
is
calculus
predicate
the
of
description
Our
parallel
the that
for
far,
So to
3.
Section
with
beginning
calculus
statement
the
of
T
F
v(TAF)
T
--j
(x)(P(x)
X3(a))
P(y))
A A
(T (Q
V V
T) Q)
--*
(x)
(x)(X
:
form
tabular
in
steps
these
summarize
We
evaluated
is
--+T)
(x)(X3(x)
for
T
is
conditional
T
--'
X3(x)
compute
must
we
T)
--p
(x)(X3(x)
^3(a)
D,
a
is
this
for
table
T.
to
assignments
T
-
a3(x)
x
evaluate
to
order
as
T
of
value
the
TT TT
TF
ba
The
x.
of
function
logical
In
A x in is
formula
entire
the
F,
X3(a)
Since
T.
the
of
value
the
Since
of = all
value
the
V,
for as
table
the
by
Finally,
F.
is
(a)).
A3
A
(T
V
T)
-+
(x)(X3(x)
obtain
to
formula
the
into
assignments
the
substitute
we
First
follows.
as
are
table
the
consideration.
under
formula
the
to
assignment
of
row
ninth
the
of
details
the The
in
given
assignment
the
accompanying
computation
an
rip
make
row
fixed
a
in
y
and
Q,
P(x),
under
appearing
entries
The
X4(x)
4(x)
X X3(x) X3(x) X3(x) X3(x) X2(x)
X2(x)
X2(x)
A2(x)
ar(x) X,(x)
Xj(x) Ai(x)
P(x)
FF F F F F TF F F F TF F T T
P(y))
A
(Q
T T T TF F T TF F T TF F T T V
Q)
->
(x)(P(x)
F F T TF F T TF F T T FF T T Q b a b a 6 a ba ba 6 a ba b a y
X4(x)
)fi(x)
207
Validity
Calculus.
Predicate
The
1
4.8
208
Logic
I
CHAP. 4
predicate calculus, we have introduced the symbols to be employed, given the definition of a formula, and described a valuation procedure. We imitate the next step in the earlier theory by defining validity in the predicate calculus. A formula is valid in a given domain iff it takes the value T for every assignment to the predicate letters and free variables in it. A formula is valid if it is valid in every domain. For "A is valid" we shall write K A.
It is appropriate to use the same terminology and symbolism as before, since this definition of validity is an extension of the earlier one. It is
obvious that to establish the validity of a formula, truth tables must give way to reasoning processes. On the other hand, to establish nonvalidity, just one D and one assignment based on this domain will suffice. For example, the fourth line of the above table demonstrates that the formula considered there is not valid. The case with which the validity of some formulas can be established may come as a surprise.
EXAMPLES 8.1. Let us illustrate the assignment of functions to predicate letters in an application of the predicate calculus. Suppose that L is the domain and that we are told that 1'(x, y, z) is to be interpreted as "z is the sum of x and y." Then to this predicate letter we would assign the function X: Z' ->- IT, F} such that X (a, b, c) = T if a + b = c, and X (a, b, c) = F otherwise. If, on the other hand, we are told that P(x, y, z) is to be interpreted as "z is the product of x and y," then we would define X(a, b, c) to be T if ab = c, and to be F otherwise. 8.2. We prove that l= (x)P(x) --> P(y). A prerequisite for the formula to take the value F is that. P(y) receive the value F for some assignment in some domain. But in that event, (x)1'(x) receives the value F. Hence, (x)P(x) -+ P(y) always receives the value T. 8.3. Let us prove that
K P(y) - (3x)P(x). As in the preceding example, we need concern ourselves only with assignments
in some domain D such that (3x)P(x) takes value F. This is the case if P(x) receives value F as x ranges over D. But then P(y) must receive the value F. 8.4. Let us establish the nonvaliclity of the formula (3x)P(x) -* (x)P(x). Let D contain at least two individuals, a and b. Assign to P(x) a logical function A such that X(a) = T and ,\(b) = F. Then (3x)P(x) receives the value T and (x)P(x) receives the value F. Hence, the entire formula receives the value F.
8.5. A proof that (x)P(x) V (x)Q(x) -' (x)(P(x) V Q(x))
4.8
I
The Predicate Calculus. Validity
209
may be given as follows. Suppose that the consequent takes the value F for an assignment A1, A2, and a to P(x), Q(x), and x, respectively. Then, for this assignment, P(x) V Q(x) takes the value F. Hence, Ai(a) = F and A2(a) = F, from which it follows that (x)P(x) and (x)Q(x), and hence their disjunction, each take the value F.
We turn now to the question of general methods for proving validity,
looking first at what we can take over from the statement calculus. Theorem 3.2 (with "A eq B" now assigned a meaning in terms of our present valuation procedure) and Theorem 3.3 carry over unchanged. The proofs employ essentially the earlier reasoning. The substance of Theorem 3.1 is the possibility of proving validity of a formula without dissecting it into prime components. This same technique has applications in the predicate calculus. To proceed with our first illustration, let us call a formula of the predicate calculus prime for the statement calculus if no sentential connectives appear in it. In terms of the composition of a formula from such prime formulas we can introduce the notion of tautology into the predicate calculus. For example, P(x) -4 P(x) is a tautology, and we may recognize tautologies (for example, A --+ A) even when the prime formulas are not displayed. Clearly, a tautology is a valid formula. In particular, Theorem 3.4 holds for the predicate calculus.
In order to illustrate further the technique under discussion some definitions are required. To substitute a variable y for a variable x in a formula A means to replace each free occurrence of x in A by an oc-
currence of y. If y is to be substituted for x in A, it is convenient to introduce a composite notation such as "A(x)" for the substituend and then write "A(y)" for the result of the substitution. Such notation as "A(x)" for the formula A is used solely to show the dependence of A on x and is not to be confused with the notation for predicate letters; indeed, we do not require that x actually occur free in A and do not exclude the possibility that A(x) may contain free variables other than x. In the future we shall often use such notations as "A(x)" or "A(x, y)"
instead of "A" when we are interested in the dependence of A on a variable x or variables x and y, whether or not we plan to make a substitution. Let us consider an example. If A(x) is (x = 1) A (3y) (y $ x), (1)
then, clearly, A(y) says something different about y than A(x) says about x. The reason is that the occurrence of x in (3y) (y x) is free, whereas an occurrence of y in the same position is bound. In everyday
210
Logic
I
c1AP. 4
mathematics we are not likely to make a substitution which changes the meaning of a formula. A safeguard against inappropriate substitutions in purely formal situations can be given. A formula A(x) is free for y if no free occurrence of x in A(x) is in the scope of a quantifier (y) or (3y). For example, if A(x) is P(x, Y) A (y)Q(y), then it is free for y,
whereas if A(x) is (1), above, then it is not free for y. If substitutions for x in A(x) are restricted to variables y such that A(x) is free for y, difficulties of the sort mentioned are avoided.
We turn now to Example 8.3, where we proved that K P(y) -, (3x)P(x) for a predicate letter P(x). Using the same reasoning we can prove that 1= A(y) --> (3x)A(x), where A(x) is any formula which is free
for y. The computation of the value of the formula at hand for a given assignment consists of (i) the computation of a value of the logical function assigned to A, and (ii) the computation of the value of the formula. The second step will coincide with that by which the value of P(y) -, (3x)P(x) is computed for some assignment; this, as we have seen, is always T. In general, although a formula A may contain several prime formulas, we may consider A as a prime formula and speak of "the logical function assigned to A." We state the result just derived along with a companion valid formula as our next theorem.
THEOREM 8.1. Let A (x) be a formula which is free for y. Then (I) 1= (x)A(x) --+ A(y). A(y) -> (3x)A(x).
(II)
COROLLARY. If Proof.
(x)A(x), then
A(x).
We apply (I) of the theorem, taking x as the y to obtain
(x)A(x) - A(x). Now assume that (x)A(x). Then we may conclude that K A(x) by the extension of Theorem 3.3 mentioned above. 'r I-1 E O R E M 8.2.
Let x be any variable, B be any formula not
containing any free occurrence of x, and A(x) be any formula. Then (1) If I-- B -* A(x), then 1= B -> (x)A(x).
(II) if r- A(x) - B, then l (3x)A(x) -> B. Proof. To prove (I), we assume that K f3 -. A(x). Let. D be any domain and for this domain consider any assignment a to the formula
B-+ (x)A(x). Note that since x does not occur free in either B or (x)A(x), a does not include an assignment of a value in D to x. For a,
4.8
I
The Predicate Calculus. Validity
211
B takes either the value F or T. If B takes the value F, then B -* (x)A(x)
takes the value T. If B takes the value T, then this is still the case when a is extended to include any assignment of a value in D to x. Hence, for a so extended, A(x) receives the value T, since, by assumption, B --+ A(x) has value T. That is, for each assignment to x along
with the given assignment a, A(x) receives the value T. It follows that t-- B --* (x)A(x). The proof of (11) is similar and is left as an exercise.
COROLLARY. If t= A(x), then 1= (x)A(x). Assume that l= A(x). Since J= B ---, (C -* B), if P is any 0-place predicate letter, then 1= A(x) --+ (P V -, P --* A(x)). hence, K P V -,P --- A(x) by Theorem 3.3. By (1) of the above theorem, it follows that r. P V --1 P -+ (x)A(x). Finally, since P V --j P, another application of Theorem 3.3 gives K (x)A(x). Proof.
An illustration in familiar terms of the above corollary is this. A proof of "For all real numbers x, sine x + cos" x = 1" begins by regarding x as some unknown (but fixed) real number. After proving that, for this x, Sine X + COS2 X = 1, it is argued that since x is any real
number, the assertion follows. Note that this involves the transition from the consideration of x as a free variable to that of a bound variable. When we initially raised the questions of what methods for proving validity in the statement calculus carry over to the predicate calculus, we ignored the possibility of a direct generalization of Theorem 3.1. It has generalizations to the predicate calculus, but they are complicated
because of the necessity of the avoidance of binding, in a way which is not intended, of free variables by quantifiers which may be present.
In order to present one theorem of this type, we must describe the mechanics of substituting in a formula for all occurrences of prime formulas resulting front a particular predicate letter. We begin with an illustration. In the formula (x)P(.r) -. P (y) there are two occurrences
of prime formulas resulting from the predicate letter P(w). By the result of substituting a formula A(zu) for the predicate letter P(w) in (x)P(x) --* P(y) we shall mean the result of replacing P(x) by A(x) and P(y) by A(y). For instance, if we take A(w) to be (3z)Q(w, z), then the result of the substitution is (2)
(x)(3z)Q(x, z) --- (3z)Q(y, z),
212
Logic
I
CHAP. 4
and if we take A(w) to be (3y)Q(w, y), then the result of the substitution is (3)
(x) (3y) Q (x, y) -> (3y) Q (y, y)
There is a basic difference between results (2) and (3). Namely, the free y in P(y) remains free in (2) but in (3) it becomes bound by the quantifier (3y) of our second choice for A(w). The effect of binding y in (3) is disastrous as may be seen by considering, for instance, the interpretation of (3) which results on choosing Z as the domain and Q(x, y)
Such mixups in the way the variables are bound after a substitution can be avoided by observing two restrictions. To formulate the substitution process and these restrictions in general, let us suppose the substitution is of the formula A(wi, W2, -, wk) for the predicate letter
as x <
Y.
, ZOO in a formula B not containing any one of wI, w2i , wk. The substitution with result B* is effected by replacing each part of B of the form P(rI, r2, - - , rk) by A(r1, r2, , rk), where , wk A(r,, r2, . , rk) is the result of substituting rI, r2j - , rk for wI, w2, , Wk). The substitution is called admissible iff' none of in A(wj, w2, the variables in B occur bound in A(w1, w2, , wk) and none of the free variables in A(wI, w2, - , wk) occur bound in B. The generalizaP(wI, w2,
tion of Theorem 3.1 which we have in mind can then be stated as follows (the proof is omitted).
THEOREM 8.3. Let B be a formula containing a prime formula resulting from the predicate letter P(wI, IV2, - , Wk) and let B* be
the formula resulting from B by an admissible substitution of the formula A(w,, w2, - , rvk) for P(wI, w2, . , wk). If 1= B, then K B*. -
Although this theorem is not the most general of its kind, it serves to reduce the proof of the validity of each of the formulas in the next theorem to the case of prime formulas in place of arbitrary formulas. Since the formulas of Theorem 3.4 extend to the predicate calculus, we continue the numbering used there to emphasize that we arc introducing additional valid formulas for the predicate calculus.
THEOREM 8.4. Let x and y be distinct variables, A(x), B(x), and A(x, y) be any formulas, and A be any formula not containing any free occurrences of x. Then 33. t
(3x) (3y)A(x, y) E-> (3y) (3x)A(x, y) 33'. i (x)(y)A(x,y) -- (y)(x)A(x,y').
4.8
1
The Predicate Calculus. Validity
213
34. 1= (3x)A(x) E-> -, (x) -, A(x). (x)A(x) F-4 -, (3x) -,A(x). 34'. 1
-, (3x)A(x) H (x) -,A(x). 35'. 1= -i (x)A(x) - (3x) -, A(x). 36. K (3x)(y)A(x,y)--. (y)(3x)A(x,y) 37. (3x)(A(x) V B(x)) H (3x)A(x) V (3x)B(x). 37'. G (x)(A(x) A B(x)) <-+ (x)A(x) A (x)B(x). 38. K (x)A(x) V (x)B(x) -' (x)(A(x) V B(x)). 38'. (3x)(A(x) A B(x)) - (3x)A(x) A (3x)13(x). 35.
39. K (3x) (A V B(x)) E-> A V (3x) B(x).
A A (x)B(x). 39'. K (x)(A A B(x)) 40. K (x)(A V B(x)) --' A V (x)B(x). 40'. I_ (3x)(A A B(x)) H A A (3x)B(x).
The proofs of the validity of these formulas are left as exercises. Tha some of the formulas are valid should be highly plausible on the basis
of meaning; formulas 33 and 33', which mean that existential (or universal) quantifiers can be interchanged at will, are in this category. Again, formulas 34 and 34', which describe how an existential quantifier can be expressed in terms of a universal quantifier and vice versa, were discussed in the preceding section. Formulas 35 and 35' provide rules for transferring -, across quantifiers. Formulas 37, 37', 38, and
38' are concerned with transferring quantifiers across V and A in general, and formulas 39, 39', 40, and 40' treat special cases of such transfers.
EXAMPLES 8.6. We consider some practical illustrations of the use of formulas 35 and 35' in arithmetic. That is, we take as the domain D the set of natural numbers. Further, let < and + have their familiar meanings; thus <(x, y) is a 2-place predicate letter, and + (x, y, z) is a 3-place one. The (true) statement "There does not exist a greatest natural number" may be symbolized by (x)(3y)(x < y)-
Its negation, -, (x) (3y) (x < y),
may be rewritten, using 35', as
(3x) -i ((3y) (x < y)) In turn, using 35, this may be rewritten as (3x) (y) -, (x < y)
or
(ax)(Y)(x ? Y) -
214
Logic
I
CHAP. 4
In English this last formula reads "There exists a greatest natural number." The (false) statement "For every pair m, n of natural numbers there is a natural
number p such that m + p = n" may be symbolized by (m)(n)(3p)(m + p = n). Its negation may be transformed into (3m) (3n) (p) (m +p
n).
The reader can translate this into acceptable English. 8.7. Take for D the set R of real numbers. The definition of continuity of a function J at a, namely, "J is continuous at a iff for every e > 0 there exists a
5 > 0 such that for all x, if Ix - al < 5, then If (x) - f(a)l < e" can be translated into the symbolic form (E) (E > 0 -. ((36) (6 > 0 A (x)(Ix - al < S
If(x) - 1(a)I < e)))) This can be shortened considerably using the notion of restricted quantification, which in practical terms amounts to restricting the range of a and S to the set R 1. Then the above may be contracted to (E)(35)(x)(Ix - aI < S --: 11(x) - J(a)I < e).
With mild restrictions, the valid formulas of Theorem 8.4 remain valid when some quantifiers are restricted. This makes it possible, for example, to obtain the negation of complicated formulas quickly and in greatly abbreviated form. As an illustration, the reader is asked to form the negation of the original formula above and show that, in terms of restricted quantifiers', it reduces to the negation of the abbreviation of the original formula, which is (3E)(5)(3x)(Ix - al < S A 11(x)
- 1(a)I > e).
EXERCISES 8.1. For a domain of two elements, construct a truth table for the formula (x)(P V Q(x)) *-> P V (x)Q(x). 8.2. Prove that the formula in Example 8.4 is valid in a domain consisting of one element. 8.3. Establish the validity of formulas 34, 35, and 36 in Theorem 8.4, regarding all constituent formulas as primes. 8.4. Establish the validity of formulas 37, 38, and 39 in Theorem 8.4, regarding all constituent formulas as prunes. 8.5. Supply an example to show that the converse of formula 36 in Theorem 8.4 is nonvalid. 8.6. Prove Theorem 8.2 (II). 8.7. As in Example 8.6, let us take for 19 the set of natural numbers. Using Theorem 8.4, justify the equivalence of the left-hand and right-hand members of each of the following pairs of formulas.
4.9
I
The Predicate Calculus. Consequence
215
(a) (3x)(y) -, (y > x), (3x) -i (3y)(y > x).
(b) (3x)(y)(y > x V -, (9 > 0)), (3x)(y)(y > 0 ->y > x). (c) (x) (3y) (3z) (x < y A z2 > y), (x) (3y) (x < y A (3z) (z2 >y)).
be a sequence of real numbers. Using restricted 8.8. Let a0, a,, , a,,, quantification, translate into symbolic form (a) the assertion that a is the limit of the sequence, (b) the assertion that the sequence has a limit, (c) the assertion that the sequence is a Cauchy sequence (that is, given e > 0 there exists a positive integer k such that if n, m > k, then Ia. - a,,,j < e).
8.9. Write the negation of each of the formulas obtained in the preceding exercise.
8.10. With R as domain, translate each of the following statements into symbolic form, write the negation of each (transferring -I past the quantifiers), and translate each resulting formula into English. (a) For x, y C R and z E R+, xz = yz implies x = Y. R. (b) The number a is the least upper bound of A (c) The set A has a greatest element.
9. The Predicate Calculus. Consequence The concept of consequence for the predicate calculus is an extension of that for the statement calculus as given in Section 4. In this extension, statement letters give way to predicate letters, and assignments of truth values give way to the more elaborate assignments of the predicate calculus. In addition, a further ingredient appears for the first time: the possibility that an assumption forrmrla contains a free occurrence of a variable. For example, in a theorem in an assurnption may have the form "Let x be an integer greater than 0" or "Suppose that x is divisible by 3." An examination of how such an x is "treated" in a proof' reveals that it is regarded as a constant; that is, it is regarded as a name of one and the same object throughout the proof. Outside of the context of the proof, however, it is it variable. (For exam ple, having proved some result concerning an x which is divisible by 3, one feels free to apply it to all such numbers.) The
reader is familiar with. such names as "parameter" and "arbitrary constant" for symbols employed in this way. This brings its to our basic definition. The formula 13 is a consequence of formulas A,, A2, , A. (in the predicate calculus), symbolized by
A,, A2, ---,A,,,K B,
216
Logic
(
CCHHAP. 4
if for each domain D and for each assignment to the A's in D the formula 13 receives the value T whenever each A receives the value T. Further, if a variable x occurs free in any A, then in each assignment to the A's one chooses for all free occurrences of x one and the same
value in D; that is, in making an assignment to the A's, such an x is regarded as a constant.
The statement and proof of Theorem 4.1 and its Corollary carry over unchanged to the present case. Thus, these results are available. In particular, to conclude that A1, A2, , A. B, it is sufficient to . . A A. -p B. Since Theorem 4.2 likewise prove that K A, A A,. A extends to the predicate calculus, it is possible to give a demonstration that a formula B is a consequence of A1, A2, , A. in the form of a finite sequence of steps, the last of which is B. In addition to the two basic rules p and 1, which in the statement calculus serve to justify the
appearance of a formula E in a demonstration, we may introduce others for the predicate calculus. The most fundamental of these are the following two. Rule (of universal specification) us: There is a formula (x)A(x) pre-
ceding E such that E is A(y), the result of substituting y for x in A(x),
such substitutions being restricted by the requirement that none of the resulting occurrences of y is bound. Rule (of universal generalization) ug: E is of the form (x)A(x) where A(x) is a preceding formula such that x is not a variable having a free occurrence in any premise. The state of affairs regarding a demonstration of consequence in the predicate calculus is then this. We contend that A,, A2, , Am K B if we can devise a string
E1,E2j...,Er(=13)
of formulas such that the presence of each E can be accounted for on the basis of one of the rules p, 1, us, or ug. Indeed, as in Section 4, it is possible to prove that if the presence of each E can be so justified, then A1, A2, (any E in the sequence). The earlier proof carries over (using the extended form of Theorem 4.2) to dispose of the case where the presence of an E is justified by either rule p or rule 1. The cases which involve rule us or Kg are dispatched using Theorem 8.1(1) and Theorem 8.2(I). The details are left as an
,
exercise.
,
We are now in a position to construct formal derivations of simple arguments in the style developed in Section 4.
4.9
I
The Predicate Calculus. Consequence
217
EXAMPLES 9.1. Consider the following argument. No human beings are quadrupeds. All women are human beings. Therefore, no women are quadrupeds. Using the methods of translation' of Section 6, we symbolize this as follows. (x) (Hx -+ --i Qx) (x) (Wx --> Hx)
(x) (Wx -* -, Qx)
The derivation proceeds as follows. {1} {2}
{2} {1}
{l, 2} {1, 2}
(1) (x) (Hx --> -, Qx) (2) (x) (Wx - a Hx) (3) Wy --> Hy
(4) Hy -' -' Qy (5) Wy -' Qy (6) (x) (Wx -a
p p 2 us I us
3,4t Qx)
5 ug
9.2. The following argument is more involved.
Everyone who buys a ticket receives a prize. Therefore, if there are no prizes, then nobody buys a ticket. if Bxy is "x buys y," Tx is "x is a ticket," Px is "x is a prize," and Rxy is "x receives y," then the hypothesis and conclusion may be symbolized as follows.
(x)((3y)(Bxy A Ty) - (y)(Py A Rxy)) -i (3x)Px -+ (x) (y) (Bxy --> -, Ty)
Since the conclusion is a conditional, we employ the rule cp in the derivation below. The deduction of line 3 from line 2, that of line 7 from line 6, and that of line 11 from 10 should be studied and justified by the reader. {1} {2} {2} {2} {2} {2} {2}
(1) (x)((3y)(Bxy A Ty) --' (3y)(Py A Rxy)) p (2) -, (3x)Px p (3) (x) -1 Px 2t 4t
{l}
V Rxy (6) (y) (-, Py V -,Rxy) (7) -,Gy)(Py A Rxy) (8) (3y) (Bxy A Ty) --> (yy)(Py A Rxy) (9) -, (3y) (Bxy A Ty)
I us 7, 8 t 9t
11,2)
(4) -, Py (5) -, Py
11,21
(10) (y)(-,Bxy V -,Ty)
11,2)
(11) (y)(Bxy --> -, Ty) (12) (x)(y)(Bxy -> Ty) (13) -, (3x)Px -4 (x) (y) (Bxy --,
{l, 2} {1}
3 us
5 ug
6t
lot
hug Ty)
2, 12 cp
218
Logic
I
CHAP. 4
9.3. Once the reader has subscribed to the soundness of the derivation in the preceding example, he has, in effect, endorsed further rules of inference which serve to expedite derivations. We introduce two further derived rules of in-
ference which render the same service. These are formal analogues of two familiar everyday occurrences in mathematics. If one is assured that "(3x)A(x)" is true, one feels at liberty to "choose" ay such that A(y). Then y is an unknown fixed quantity such that A(y). Conversely, given that there is some y such that A(y), one does not hesitate to infer that "(3x)A(x)" is true. In the predicate calculus the rule which permits the passage from (3x)A(x) to A(y) is called the rule (of existential specification) es. The rule which permits the passage from A(y) to (3x)A(x) is called the rule (of existential generalization) eg. These are the analogues for existential quantifiers of the rules us and ug for universal quantifiers. We shall not validate these rules nor even discuss the restrictions which must be observed in using them. In the following simple example illustrating them we employ a lower-case Greek letter to designate an object which is involved in the "act of choice" accompanying an instance of the rule es.
Every member of the committee is wealthy and a Republican. Some committee members are old. Therefore, there are some old Republicans. {1}
(1) (x) (Cx -, Wx A Rx)
p
{2}
(2) (3x) (Cx A Ox)
p
{1}
(3) Ca A Oa (4) Ca - Wa A Ra
{2}
(5) Ca
3t
(6) Wa A Ra
4, 5t
(7) Oa (8) Rot
3t 61
(9) Oa A Ra
7, 8 t
{2}
{1, 2} {2}
{1, 2} {1, 2) (1, 2}
(10) (3x)(Ox A Rx)
2 es
1 us
9 eg
9.4. The derivation corresponding to the following argument employs all of the rules which we have described. Some Republicans like all Democrats. No Republican likes any Socialist. Therefore, no Democrat is a Socialist.
The reason for the introduction of "x" in line 3 below is this. By virtue of the form of the conclusion, (x)(Dx -> -,Sx), a conditional proof is given. Thus, Dx is introduced as a premise in line 3. Since x occurs free here, we note its presence (as well as in subsequent lines which depend on this premise) to assist in avoiding any abuse of rule ug.
4.9
219
The Predicate Calculus. Consequence {1} {2} {3} (1) {1} {1}
(1) (3x)(Rx A (y)(Dy - Lxy))
p
(2) (x) (Rx -+ (y) (Sy -> -, Lxy))
p
(3) Dx (4) Ra A (y)(Dy - Lay) (5) (y)(Dy -> Lay)
x, p 1 es
4t
(6) Dx -, Lax
5 us
(7) Lax
x, 3, 6 t
{2}
(8) Ra --> (y) (Sy --> -i Lay)
2 its
{l}
(9) Ra
4t 8, 9 t 10 us x, 7, 11 t
{1, 31
{1, 21
{1,2} {l, 2, 31
{1,2} (1,21
(10) (y)(Sy - -,Lay) (I1) Sx --i --,Lax (12) -,sx (13) Dx -> --,Sx (14) (x)(Dx -> --1Sx)
3,12cp 13 ug
The foregoing examples lend plausibility to the contention that the predicate calculus is adequate for formalizing a wide variety of arguments. Lest there be concern over the lengths of derivations of such simple arguments as those considered, we assure the reader that an extended treatment would include the introduction of further derived rules of inference to streamline derivations. The outcome is the concept of an "informal proof." In mathematics this amounts to a derivation in the conversational style to which one is accustomed: mention of rules of inference and tautologies used is suppressed, and attention is drawn only to the mathematical (that is, nonlogical) axioms and earlier theorems employed. (Further details of this are supplied in the next chapter.) The principal advantage accrues in having informal proofs as the evolution of formal derivations is this: One has a framework within which it can be decided in an objective and mechanical way, in case of disagreement, whether a purported proof is truly a proof. EXERCISES Construct a derivation corresponding to each of the following arguments. 9.1. No freshman likes any sophomore. All residents of Dascornh are sophomores. Therefore, no freshman likes any resident of Dascomb. (Fx, LV, Sv, Dx) 9.2. Art is a boy who does not own a car. ,Jane likes only boys who own cars. Therefore, ,Jane does not like Art. (Bx, Ox, Lxy, a, j) 9.3. No Republican or Democrat is a Socialist. Norman Thomas is a Socialist. Therefore, he is not a Republican. (Rx, Dx, Sx, t)
9.4. Every rational number is a real number. There is a rational number. Therefore, there is a real number.
220
Logic
I
CHHAP. 4
9.5. All rational numbers are real numbers. Some rationals are integers. Therefore, some real numbers are integers. (Qx, Rx, Zx) 9.6. All freshmen date all sophomores. No freshman dates any junior. There are freshmen. Therefore, no sophomore is a junior. 9.7. No pusher is an addict. Some addicts are people with a record. Therefore, some people with a record are not pushers. 9.8. Sonle freshmen like all sophomores. No freshmen likes any junior. Therefore, no sophomore is a junior. (Fx, Lxy, Sx, Jx) 9.9. Some persons admire Elvis. Some persons like no one who admires Elvis. Therefore, some persons are not liked by all persons. (Px, Ex, Lxy)
BIBLIOGRAPHICAL NOTE Extended treatments of symbolic logic, pitched at approximately the same level as that of this chapter, appear in Copi (1954), Exncr and Rosskopf (1959), Suppes (1957), and Tarski (1941). Formulations of both the statement calculus and the first-order predicate calculus as axiomatic theories are given in Chapter 9 of this book. The bibliographical notes for that chapter include references to more comprehensive accounts of this subject matter.
CHAPTER
5 Informal Axiomatic Mathematics
ONE OF THE striking aspects of twentieth century mathematical research is the enormously increased role which the axiomatic approach
plays. The axiomatic method is certainly not new in mathematics, having been employed by Euclid in his Elements. However, only in relatively recent years has it been adopted in parts of mathematics other than geometry. This has become possible because of a fuller understanding of the nature of axioms and the axiomatization of logic.
The axiomatization (in the way we shall discuss it presently) of various fragments of mathematics was the main subject of studies of the
foundations of mathematics, from the late 1880's until the 1920's. At that time the present-day approach began to flourish. Distinctive features of this modern approach include the explicit incorporation into the set of axioms of a theory, those which provide for a "built-in" theory of inference, and the concentration on the theory of models for structures characterized by sets of axioms. Chapter 9 is devoted to an
introduction to this modern approach. The present chapter, when judged relative to standards imposed by the present stage of investigations of the foundations of mathematics, belongs to the past. But, we repeat, it expounds the axiomatic method as it is used currently in everyday mathematics.
1. The Concept of an Axiomatic Theory The concept to be described is an outgrowth of the method used by Euclid in his Elements to organize ancient Greek geometry. The plan of this work is as follows. It begins with a list of definitions of such notions
as point and line; for example, a line is defined as length without breadth. Next appear various statements, some of which are labeled axioms and the others postulates. It appears that the axioms are intended to be principles of reasoning which are valid in any science (for example, 221
222
Informal Axiomatic Mathematics
I
CtfAP. 5
one axiom asserts that things equal to the same thing are equal to each
other) while the postulates are intended to be assertions about the subject matter to be discussed-geometry (for example, one postulate asserts that it shall be possible to draw a line joining any two distinct points). From this starting point of definitions, axioms, and postulates, Euclid proceeds to derive propositions (theorems) and at appropriate places to introduce further definitions (for example, an obtuse angle is defined as an angle which is greater than a right angle). Several comments on Euclid's work are in order. It is clear that his goal was to deduce all of the geometry known in his day as logical consequences of certain unproved propositions. On the other hand, we can only conjecture as to his attitude toward other facets of his point of departure. From a modern viewpoint it may be said that he treated point and line essentially as primitive or undefined notions, subject only to
the restrictions stated in the postulates, and that his definitions of these notions offer merely an intuitive description which assists one in thinking about formal properties of points and lines. I lowevcr, since the geometry
of that era was intended to have physical space as an interpretation, it is highly plausible that Euclid assigned physical meaning to these notions. Further evidence to support this conclusion is to be found in some proofs where Euclid made assumptions that cannot be justified on the basis of his primitive notions and postulates, yet which, on the basis of the intended interpretation of his primitive notions, appear to be evident. If, indeed, Euclid was confused between formal or axiomatic questions and problems concerning applications of geometry, then herein lies the source of the only flaws in his work as judged by modern stand-
ards. Concerning the postulates, he probably believed them to be true statements on the basis of the meaning suggested by his definitions of the terms involved. Since proofs were not provided for the postulates,
they acquired the status of "self-evident truths." This attitude with respect to the nature of postulates or axioms (now, incidentally, no distinction is drawn between these two words) still persists in the minds of many. Indeed, in current nonmathematical writings it is not uncom-
mon to see such phrases as "It is axiomatic that" and "It is a fundamental postulate of" used to mean that some statement is beyond all logical opposition. Within mathematics this point of view with respect to the nature of axioms has altered radically. The change was gradual and it accompanied the full understanding of the discovery by J. Bolyai and (independently) N. Lobachevsky of a non-Euclidean geometry. Let us elaborate on this matter.
5.1
1
The Concept of an Axiomatic Theory
223
In the traditional sense a non-Euclidean geometry is a geometry whose formulation coincides with that of Euclidean geometry with the one exception that Euclid's fifth postulate (the "parallel postulate") is denied. The fifth postulate is "If two lines are cut by a third so as to make the surn of the two interior angles on one side less than two right angles, then the two lines, if produced, meet on that side on which the interior angle sum is less than two right angles." An equivalent formulation, in the sense that either, together with the remaining postulates, implies the other, and one which is better suited for comparison purposes,
is "In a plane, if point A is not on the line 1, then there is exactly one line on A parallel to 1." This is one of many axioms equivalent to the parallel postulate which were obtained as by-products of unsuccessful attempts to substantiate the belief that the parallel postulate could be derived from Euclid's remaining axioms. Bolyai and Lobachevsky dispelled this belief by developing a geometry in which the parallel postulate was replaced by the statement "In a plane, if the point A is not on line 1, then there exists more than one line on A parallel to 1." Apparently, the "truth" of this new geometry was initially in doubt. But on the basis of measurements that Could be made in the portion of physical space available, there appeared to be no measurable differences between the predictions of the Bolyai-Lobachevsky geometry and those of Euclidean geometry. Also, each geometry, when studied as a deductive
system, appeared to be consistent so far as riot yielding contradictory statements. The ability to examine these geometries from the latter point of view represented a great advance, for, in essence, it amounted to the detachment of physical meaning from the primitive notions of point, line, and so on. A second advance in the attitude toward the axiomatic method accompanied the creation of various models in Euclidean geometry of the Bolyai-Lobachevsky geometry. A typical example is the model proposed
in 1871 by helix Klein, for which he interpreted the primitive notions of plane, point, and line, respectively, as the interior of a fixed circle in the Euclidean plane, a Euclidean point inside this circle, and an openended chord of this circle. If, in addition, distances and angles are computed[ by formulas developed by A. Cayley, in 1859, then all axioms
of plane Bolyai-Lobachevsky geometry become true statements. The immediate value of such an interpretation was to establish the relative consistency (a concept which will be described in detail later) of the Bolyai-Lobachevsky geometry. That is, if Euclidean geometry is a consistent logical structure, then so is the Bolyai-Lobachevsky geometry.
224
Informal Axiomatic Mathematics
i
CH A P. 5
Of greater significance, so far as understanding the nature of axiomatic theories, was the entertainment of the possibility of varying the meaning of the primitive notions of an axiomatic theory while holding fixed its deductive structure. This evolution in the understanding of the nature of the axiomatic
method set the stage for the present-day concept of an axiomatic theory. In its technical sense the word "theory" is applied to two sets of statements, of which one is a distinguished subset of the other. The entire set of statements defines the subject matter of the theory. In the sciences, apart from mathematics, the members of the distinguished subset arc those statements which are classified as true statements about the real world, with experiment the ultimate basis for the classification. In sharp contrast, it is a characteristic feature of an axiomatic theory that the notion of truth plays no role whatsoever in the determination of the distinguished subset. Instead, its members, which arc called theorems or provable statements, are defined to be those statements of the theory that can be deduced by logic alone from certain initially chosen statements called axioms (or postulates). A precise definition of theorem can. be given in terms of the notion of proof. A (formal) proof is a finite column (Si, S2, - , Sk) of statements of the theory such that each S either is an axiom or comes from one or more preceding S's by the rules of inference of the system of logic employed. A theorem or provable statement is a statement which is the last line of some proof. Note that, in particular, an axiom is a theorem with a one-line proof.
In the consideration of an axiomatic theory the notion of truth is relegated to possible applications of the theory. In any circumstance in
which the axioms are accepted as true statements and the system of logic is accepted, then the theorems must be accepted as true statements since the theorems follow from the axioms by logic alone. That is, it is
the potential user of an axiomatic theory who is concerned with the question of the truth of the axioms of the theory.
Today, axiomatic theories are usually presented in essentially the same way that Euclid began his development of geometry--by listing the primitive notions and the axioms of the theory. However, in order to meet one of the present-day requirements of an axiomatic theorythat truth play no role--the primitive potions are taken to be undefined and the axioms are taken as simply an initial stock of theorems. We, shall elaborate on these matters in connection with a discussion of the evolu-
5.1
I
The Concept of an Axiomalii: Theory
225
tion of axiomatic theories from intuitive theories (which constitute a primary source of axiomatic theories). Usually one's first exposure to some branch of science is by way of an intuitive approach; subjects such as arithmetic, geometry, mechanics,
and set theory, to cite just a few, are approached in this way. An axiomatization of such an intuitive theory can be attempted when the fundamental notions and properties are believed known and the theory appears to be sound to the extent that reliable predictions can be made with it. The first step in such an attempt is to list what are judged to be the basic notions discussed by the theory together with what are judged to be a basic set of true statements about these notions. In order to carry out this step efficiently, one often elects to presuppose certain theories previously constructed. In most axiomatic work in mathematics it is customary to assume a theory of logic along with a theory of sets. t In axiomatic work in an empirical science such as economics or physics it is standard procedure to assume, in addition to logic and general set theory, parts of classical mathematics. Once it has been decided what theories will be assumed, the key steps in the axiomatization can be carried out. The first of these is the introduction of symbols (including, possibly, words) as names for those notions which have been judged to be basic for the intuitive theory. These are called the primitive sym-
bols (or, terms) of the axiomatic theory. The only further symbols which are admitted (aside from symbols of the presupposed theories) are defined symbols, that is, expressions whose meaning is explicitly stated in terms of the primitive symbols. (The intuitive theory in mind often suggests the introduction of some such symbols.) The next step is the translation of those statements that were singled out as expressing
fundamental properties of the basic notions of the intuitive theory into the language which can be constructed from just the primitive. and defined terms (and those of any theory which is presupposed). To obtain an example of a language of the sort mentioned above, let us consider an axiomiatization of intuitive set theory with the first-order predicate calculus as the only presupposed theory. In addition to logical symbols, only one further (primitive) symbol, the familiar one for the membership relation, shall be employed. Then the language which is available is that described in Section 4.7, with expressions of the form
xCy t By a theory of sets we mean some development which includes roughly the content of Chapters 1, 2, and 3. Often a theory of sets which encompasses this material is referred to as "general set theory."
226
Informal Axiomatic Mathematics
I
CHAP. 5
constituting the totality of prime sentences (formulas). A list of useful defined symbols for the theory appears in Example 4.7.1. If some theory of logic is not assumed for an axiomatization, then one must include in
the presentation of the theory an axiomatized version of a theory of inference. A detailed discussion of this begins in Section 9.3.
In a program of the sort we have described for axiomatizing an intuitive theory, there is often considerable leeway in the choice of primitive notions. Different. sets may be suggested by various combinations of notions which occur in the intuitive theory. In the modern axiomatization of Euclidean geometry devised by D. Hilbert there are six primitive notions: point, line, plane, incidence, betweenness, and congruence. On the other hand, in that created by M. Pieri there are but two primitive notions: point and motion. Obviously the choice of primitive notions for an axiomatic theory influences the choice of axioms.
A great variety of more subtle remarks can be made concerning the selection of axioms for a particular theory. Some are presented in Section 4. While we are dealing in generalities we will mention another stimulus
for the creation of axiomatic theories-the observation of basic likenesses in the central features of several different theories. This may prompt an investigator to distill out these common features and use them as a guide for defining an axiomatic theory in the manner described above. Any one of the theories which an axiomatic theory is intended to formalize serves as a potential source of definitions and possible theorems
of this axiomatic theory. An axiomatic theory which successfully formalizes an intuitive theory is a source of insight into the nature of that theory, since the axiomatic theory is developed without reference to meaning. One which formalizes each of several theories to some degree has the additional merit that it effects simplicity and efficiency. Since
such an axiomatic theory has an interpretation in each of its parent theories (on a suitable assignment of meaning to its primitive terms), it produces simplicity because it tends to reduce the number of assumptions which have to be taken into account for particular theorems in any one of the parent theories. Efficiency is effected, because a theorem of the axiomatic theory yields a theorem of each of the parent theories. Herein
lies one of the principal virtues of taking the primitive terms of an axiomatic theory as undefined.
A by-product of the creation of an. axiomatic theory which is the common denominator of several theories is the possibility of enriching and extending given theories in an inexpensive way. For example, a
5.2
I
Informal Theories
227
theorem in one theory may be the origin of a theorem in the derived theory and it, in turn, may yield a new result in another parent theory. In addition to the possible enrichment in content of one theory by another, by way of an axiomatic theory derived from both, there is also the possibility of "cross-fertilization" insofar as methods of attack on problems are concerned. That is, a method of proof ' whic h is standard
for one theory may provide a new method in another theory with a derived theory serving as the linkage. A full understanding of such remarks as the foregoing cannot possibly be achieved until one has acquired some familiarity with it variety of specific theories and analyzed some successful attempts to bring diverse theories under a single heading. The field of algebra abounds in such successful undertakings. Indeed, it is perhaps in algebra that this type of genesis and exploitation of theories has scored its greatest successes. Several important examples of algebraic (axiomatic) theories are discussed later.
2. Informal Theories In this section we shall discuss the formulation of axiomatic theories when a theory of inference and general set theory are presupposed as already known. Such axiomatic theories will be called informal theories. As has already been mentioned, it is common practice in mathematics to present axiomatic theories as informal theories. The first [natter to be thoroughly understood about informal theories
is the working forms which are adopted for the assumed theories of inference and of sets that is, the actual settings in which informal theories are presented. Concerning the theory of inference, it is simply the intuitive theory which one. absorbs by studying That this theory is clearly defined is suggested by the fact that what is judged to be it proof by one competent is usually acceptable to other mathematicians. '['his is not the end of the matter, however. The contents of Chapter 4 indicate that there is it systctu of logic (the first-
order predicate calculus) which is adequate for much of mathematics and which can be described in precise terms. Both the preciseness and adequacy of the first-order predicate calculus take on sharp forms later when we give an version of this theory (Section 9.3) and then prove its in the sense that every valid formula is a theorem. Further, there is considerable evidence to support the contention that the definition of logical correctness which is supplied by this
228
Informal Axiomatic Mathematics
I
c tt A P. 5
symbolic logic is closely attuned to the corresponding intuitive notion which mathematicians acquire. Such a book as Logic for Mathematicians, by J. B. Rosser (1953), is rich in examples which illustrate his thesis that logical principles which are judged correct by most mathematicians are
classified as correct by symbolic logic and vice versa. That is, there is considerable evidence in support of the thesis that the system of logic which is presupposed for an informal theory is a clearly defined theory which can be spelled out if necessary. This empirical conclusion does not
evidence itself in mathematicians giving formal proofs and then using the mechanical procedures provided by the predicate calculus for testing their correctness. However, it is usually not difficult to convince oneself that an accepted, informal proof could be formalized if demanded. The set-theoretical framework which is assumed for an informal theory
is the general set theory developed in Chapters 1-3. Although contradictions can be devised within this intuitive theory, that part which is employed in developing informal theories does not lead to such difficulties so far as is known. For the moment we shall support this latter statement with only the following remark. The intuitive set theory we have discussed can be axiomatizcd in such a way that (i) so far as is
known, all undesirable features (that is, the known paradoxes) are avoided, and (ii) all desirable features consonant with (i) are retained. An outline of such a development is given in Chapter 7. We turn now to some examples of informal theories. These will serve to illustrate the two circumstances described at the end of the preceding section under which axiomatic theories are devised (namely, to axiomatize some one intuitive theory and to formalize simultaneously several
theories). Further, they will serve to illuminate our later discussion of informal theories.
EXAMPLES 2.1. In Example 2.1.2 appears what is essentially Peano's axiomatization of the natural number system. The primitive notions are natural number, zero (0), and successor ('), and the axioms are the statements Pi--P6 appearing there. 2.2. Immediately following Theorem 3.4.1 we called attention to certain likenesses in the properties of the rational numbers and integers. Specifically, we noted that the system consisting of Q, the operation of addition, and 0,, as well as the system consisting of 0 - {0,}, multiplication, and 1, share, with the system made up of Z, addition, and 0;, properties (1)-(4) of Theorem 3.3.1. Thus, we argued, any further properties of the integers which can be derived from (1)-(4) (for example, those mentioned in Exercise 3.3.5) also hold for the other
5.2
I
Informal Theories
229
two systems. In terms of our current discussion we may classify that argument as a bit of axiomatic mathematics. Before formulating explicitly the axiomatic theory involved we remark that for the derivation of the results stated in Ex-
ercise 3.3.5 the property of commutativity of addition is not required (we "allowed" the reader to use this property because simpler proofs can be given with it). Essentially the same simplifications in the proofs can be achieved if commutativity is assumed only in part, as in the axioms below. The axiomatic theory to be described is called group theory. The primitive notions are an unspecified set C, a binary operation in G, for which we use multiplicative notation (that is, the operation will be symbolized by and the value at (a, b) of this function on C X G into G will be designated by a b), and an element e of G. The axioms are the following. G1. G2.
For all a, b, and c in C, a (b c) = (a b) c. For all a in G there exists an a' in G such that a a' = a'
a = e.
The above is a formulation of group theory as one might find it in an algebra text. In harmony with the agreement to write the value of at (a, b) as a b, we call this element the product of a and b. Henceforth we shall use the simpler notation ab for it. An element which has the property assumed for e in G2 is called an identity element and an element which satisfies Ga for a given a is called an inverse of a (relative to e). A few theorems of group theory, including those to which reference has been
made in connection with number systems, are proved next. G4.
Proof.
G contains exactly one identity element. In view of G2, only a proof of the uniqueness is required. Assume that
each of el and e2 is an identity element of G. Then ela = a for every a, and ae2 = a for every a. In particular, ele2 = e2 and eie2 = el. Hence, el = e2 by properties of equality. G6.
Each element in G has exactly one inverse.
Since G$ asserts the existence of an inverse for each element a, only the proof of its uniqueness remains. Assume that both a' and a" are inverses of a. Proof.
Then a"a = e and aa' = e. By G1, (a"a)a' = a"(aa'), and, hence, ea' = a"e. Using G2 it follows that a' = a". In multiplicative notation the inverse of a is designated by "a'1"; thus a -'a = as 1 = e (the unique identity element of C). Ge.
For every a, b, and c in G, if ab = ac, then b = c, and, if ba = ca, then b = c.
Informal Axiomatic Mathematics
230
I
CHAP P. 5
Assume that ab = ac. Now a-'(ab) = (a-'a)b = eb = b. On the other hand, a-'(ab) = a-'(ac) _ (a-'a)c = ec = c. Hence, b = c. The proof of the Proof.
remaining assertion is similar. Proofs of the next two theorems are left as exercises. G7. G8.
For all a and b in G, each of the equations ax = b and ya = b has a unique solution in G. For all a and b in G, (ab)-' = b-'a
2.3. The theory to be described has its origin in Euclidean plane geometry. It is that generalization of Euclidean geometry known as afline geometry. The primitive notions arc a set (1' (whose members are called points and will be denoted by capital letters), a set 2 (whose members are called lines and will be denoted by lower-case letters), and a set q called the incidence relation. The axioms are as follows. AG,.
J C 6' X 2. ((P, 1) C 9 is read "P lies on l," or "l contains P," or "1 passes through P.")
AGs.
For any two distinct points P and Q there is exactly one line passing
through P and Q. (This line will be denoted by P + Q.) Before stating the next axiom we make a definition. If I and in are two lines such that either I = m or there exists no point which lies on both l and in, then I and in are called parallel. AG3.
AG,.
AG6.
For any point P and any line I there exists exactly one line in passing
through P and parallel to I. If A, B, C, D, E, and F are six distinct points such that A + 13 is parallel to C + 1), C + I) is parallel to E -F F,, A 4- C is parallel to B + D, and C + E is parallel to 1) + F, then A -F- E is parallel to B + 1". There exist three distinct points not on one line.
Proofs of a few simple theorems are called for in the following exercises.
Since axiomatic theories are often elaborate structures, they deserve
elaborate symbols as n;lnes. To our mind, capital (;ernrut letters suffice. Consielcr now an iniorinal theory T. Associated with it is a language which can be constructed from the primitive and defined terms of I and the terminology of set theory and logic. We shall call this language the T -language and its nlcniber sentences `;-sentences. "Those T-sentences which involve no free variables shall be called a-statements. (Parenthetically we remark that Z--sentences, are usually
5.2
I
Informal Theories
231
written using a combination of words and symbols, as in the foregoing examples, instead of the purely symbolic style of Examples 4.7.1 -4.7.3.) An interpretation of `;" consists of selecting a particular nonempty
set D (called the domain of the interpretation) as the range for the individual variables of ` and assigning to each primitive term an object
of the same "character" constructed from D; that is, to a binary relation symbol we assign a binary relation in D, to a binary operation symbol we assign a binary operation in D, to an individual constant we assign an element of D, and so on. This can scarcely be regarded as a definition of an interpretation in view of its vagueness. Until such time as we correct this deficiency we shall rely on the reader's intuition and the examples below. If I is an interpretation of `3: in a system X9J1 and if S is a `3"-sentence, then we shall call the sentence, which results on the assignment of meaning (as specified by I) to the primitive terms of `;" that occur in S, an interpretation of Sin 931. If an interpretation of S in 931 is it true statement of 931, we shall say that S is true in 931, or that 931 is a model of S. If 2; is a set of a-sentences, then 9X is
called a model of I iff it is a model of each member of 2;. If T1 is a model of the set of axioms of T, then 931 is called a model of `i;". Notice
that such definitions are relative to some one interpretation of `; in P. As our first illustration of the notion of it model we note that each of the progressions described in Example 2.1.1 is a model of the Peano axioms under the obvious interpretations of natural number, zero, and successor. Next, the set 0(X) of all one-to-one mappings on a nonempty set X onto itself together with function composition and ix is a model of the theory of groups or, more simply, is a group. Again, the power set of any set together with the symmetric difference operation and the empty set is a group. As for models of atline geometry, one who is familiar, to some degree, with intuitive Euclidean geometry will undoubtedly accept it as an afline geometry. A radically different model results on setting (f' = 11, 2, 3, 4},,c _ {{1, 21, {1, 3}, 11, 4}, {2, 3}, {2, 4}, 13, 4}} and defining P to be on I iff P C 1. The verification that all axioms are satisfied is left as an exercise. It is an accepted property of a model T1 of an informal theory `;` that each theorem of is true in 91.f The supporting argument is simply that
(by definition of a model of Z) each axiom is true in 931 and each theorem of T_ is derived from the axioms by logic alone. An illustration may be given in terms of Theorem G8 of Example 2.2. 'The interpretat A 2-sentence in which an individual variable x has a free occurrence is interpreted as if the quantifier "For all x" were prefixed to the sentence.
232
Informal Axiotnatic Mathematics
I
e n n t' . 5
Lion of Gg in the group G(X) of mappings is the statement that if a, b C G(X) then (a o b)-' = b'-' o a-', which is an important property of functional inversion. The interpretation of Gg in the group consisting
of Z, addition, and 0 is the statement that -(a + b) = (-b) + (-a). Thus, these two results, diverse in appearance, are interpretations of a single statement of group theory.
EXERCISES 2.1. Prove Theorems G7 and Cg in Example 2.2. 2.2. The theory of commutative groups differs from the theory of groups in that it includes one further axiom: G9.
For all a and
b
in G, ab = ba.
It is common practice to use additive notation for the operation in a commutative group (that is, to write a -I- h instead of ab), to write 0 instead of e, and
to write -a instead of a`. Suppose that G together with A- and 0 is a commutative group. Prove each of the following theorems.
(a) -(a -l- b) = (-a) + (-b). (b) If "a - b" is an abbreviation for "a + (-b)," then a + b = c iff b = c - a.
(c) a-(-b)=a-I-batic] -(a-b)=b-a.
(d) If f: G-+- G where f(a) _ -a, then f is a one-to-one and onto mapping. 2.3. Let Z be the set of residue classes [a] of Z modulo n (see Section 1.7). Show that the relation {(([a], [b]), [a -I- bbl)I Jal, [b] F_ Z,,} is a binary operation in 7.,,. Show that Z. together with this operation and [0] is a commutative group.
2.4. Show that an operation + can be introduced in the set I of equivalence classes defined in Exercise 1.7.11, by the definition [a, bJ A- [c, d] = [a + c, b + d], where [a, b] is the equivalence class determined by (a, b), and so on. Prove that I together with this operation and [1, 1] is a commutative group. 2.5. Show that R together with the operation * such that x * y = (x3 +ya) 13 and 0 is a group. 2.6. Write out the elements of G(X) for X = 11, 2} and for X = 11, 2, 3}. Show that the group associated with the latter set of mappings is not commutative.
2.7. Let G be a nonempty set and be a binary operation in G such that G, and G7 hold. Prove that C, , and a suitable clement of C is a group. 2.8. Let G be a nonempty finite set and be a binary operation in C such that G, and G6 hold. Prove that C, , and a suitable clement of C is a group.
5.3
Definitions of Axiomatic Theories
I
233
2.9. This exercise is concerned with afine geometry as formulated in Example 2.3.
(a) Prove that "is parallel to" is an equivalence relation on C. An equivalence class is called a pencil of lines. (b) Let a, and 7r2 be two distinct pencils of lines. Using only AG2 and AG3, prove that the number of points on any line 1 of 7r, is the same as the number of lines of 7r2.
(c) Using (b), prove that if there exist three distinct pencils of lines, then all lines have the same number of points, all pencils have the same number of lines, and every pencil has the same number of lines as the number of points on every line. (d) From AG6 infer that there exist at least three distinct pencils of lines. (c) Show that the set of four points and six lines given in the text is a model of the theory. (f) Show that any affine geometry contains at least four points and six lines.
2.10. Let S be the axiomatic theory having as its primitive notions two sets P and L and as its axioms the following.
A3.
If l C L, then I -C P. If a and b are distinct elements of P, then there exists exactly one member I of L such that a, b C 1. For every I in L there is exactly one I' in L such that I and I' are
A4. A6.
disjoint. L is nonempty. Every member of L is finite and nonempty.
A,. As.
Establish the following theorems for e.
(a) Each member of L contains at least two elements. (b) P contains at least four elements. (c) L contains at least six elements. (d) Each member of L contains exactly two elements.
3. Definitions of Axiomatic Theories by Set-theoretical Predicates We continue our discussion of the axiomatization of intuitive theories with a description of a uniform approach which takes fuller advantage of the expressive powers of general set theory. The point of departure is the observation (which is substantiated, in part, by those theories discussed in Examples 2.1-2.3) that the primitive notions of a great variety of mathematical theories consist of a set X and certain constants
Informal Axiomatic Mathematics
234
I
CHAP. 5
associated with X. These constants may be of various types: elements of X (such as the identity element of a group), subsets of X, collections of subsets of X (such as the lines of an affine geometry), subsets of X" for some n (which include relations in X and operations in X), and so on. Collectively, the constants serve as the basis for imposing a certain structure on X (which is the object of study of the theory). The structure itself is given in the axioms, which are the properties assigned to X and the constants (including, possibly, the existence of inner relations among them). The approach to the axiomatization of theories which stems from the foregoing observations calls for definitions of axiomatic theories by way of set-theoretical predicates. A consideration of several examples will serve to bring the procedure into focus. In our first example we consider the theory of partially ordered sets. The purely set-theoretical character of the predicate "is a partially ordered set," which is defined should be apparent.
DEFINITION A. W is a partially ordered set if there is a set X and a binary relation p such that 4C = (X, p) and 01. p is reflexive in X, 02. p is antisymmetric in X, 03. p is transitive in X.
This definition illustrates a convention which we shall follow in this discussion, namely, to exhibit the basic set as the first coordinate of an ordered n-tuple, and the associated constants, in some order, as the remaining coordinates.
The sentence in Definition A may be regarded as being in need of recasting if it is to appear in the running text since it begins with a symbol. The following version meets this objection.
A partially ordered set is an ordered pair (X, p) where X is a set, p is a binary relation, and the following conditions are satisfied. 01.
p is reflexive in X.
02.
p is antisymmetric in X.
O. p is transitive in X.
5.3
1
235
Definitions of Axiomatic Theories
An alternative to Definition A, which is closer to standard mathematical practice, is a conditional definition.
DEFINITION B. Let X be a set and p be a binary relation. Then (X, p) is a partially ordered set if
0t. p is reflexive in X, 02. 03.
p is antisyrnmetric. in X, p is transitive in X.
This definition is conditional in the sense that the proper definition is prefaced by a hypothesis. When it definition is so formulated it is common practice to omit the hypothesis in stating theorems of the theory. Our second example is a definition of group theory along the lines suggested by the axiomatization appearing in Example 2.2.
DEFINITION C. (i is a group if there is a set X, a binary operation
in X, and an element e of X such
that V = (X, , e) and G1.
for all a, b, and c in X, a
Cs.
for all a
(b
c) _ (a
b)
c,
in X there exists an a' in X such
that a- a' = a theory Z is axiomatized by defining a set-theoretical predi-
cate, what we have called up to this point the primitive symbols (or terms) of the theory appear in the running text immediately preceding the axioms. Also in this circumstance models of are simply those `.
entities which satisfy the predicate. For the theory of groups, for example, the point can be put quite trivially as follows: If (X, , e) is a group, then (X, , e) is a model for the theory of groups.
EXERCISES 't'hese exercises are concerned with the theory of simply ordered commutative groups, which may be defined as follows: (SS is it simply ordered continut,ttive group (.r.o.c.g.) iff (SS = (C, 1-, 0, <), where SG,. SG2. SG3.
(G, -+, 0) is a commutative group, ((;, <) is a simply ordered set, for all a, b, and e in G, if a < b, then a -- c < b -1- c. (I Iere, "a < b" is an abbreviation for "a < b and a X. b.")
Informal Axiomatic Mathematics
236
I
C H A P. 5
All results obtained earlier for groups, in particular, commutative groups, may be used when needed. Also, properties of simply ordered sets may be used. 3.1. Find two s.o.c.g. within the real number system.
3.2. If (G, +, 0, <) is a .s.o.c.g., define. GE to be (a C GI0 < a). Prove the following properties of
G+.
(a) If a E G+, then -a (Z GE. (b) If a ; 0, then either a E G+ or -a C G1.
(c) If,a,bCG',then a+bCG+.
3.3. Prove the following theorems for a s.o.c.g.
(a) Ifa
(b) If a + c < b + c, then a < b.
(c) If a
4. Further Features of Informal Theories In this section we introduce a variety of notions which have relevance
to informal theories. Most of these serve to provide a classification scheme for a given theory. Thereby its status and its merits can be summarized concisely.
Suppose that A is a formula of some theory Z and that both A and -1A are theorems. Then, if the system of logic employed includes the statement calculus with modus ponens as a rule of inference, any formula B of the theory is a theorem. Indeed, A (-i A---> B) is a theorem
since it is a tautology, and two uses of modus ponens establish B as a theorem. A theory Z is called inconsistent if it contains a formula A such that both A and -i A are theorems. A theory is called consistent if it is not inconsistent-that is, if it contains no formula A such that both A and --,A are theorems. Since in any theory which we shall consider the logical apparatus will
include what was used above, we regard an inconsistent theory as worthless, since every formula is a theorem. Thus, the question of establishing the consistency of a theory becomes of primary importance. A moment's reflection will point out the high degree of improbability of
reaching an answer by direct application of the definition and, consequently, of the need for a "working form" of the definition of consistency. That which is usually adopted in mathematics is: the existence
5.4
I
Further Features of Informal Theories
237
of a model of a theory implies the consistency of the theory. The supporting argument is based on (i) the property of a model mentioned at the end of Section 2, namely, if 9x1 is a model of the theory Z, then each theorem of X is true in T Z, and (ii) the assumption that if S is a Z-statement then not both of S and -,S are true in 92. Indeed, assuming (i) and (ii), suppose that has a model TZ. If both of the T-statements S and -,S are theorems, then both S and --,S are true in X71 by (i) and
this is a contradiction by (ii). Hence, if Z has a model, then Z is consistent.
In essence, the foregoing working form of consistency merely substitutes an inspection of true statements about a model of a theory for an inspection of theorems of the theory. If a model of a theory (X, ) can be found such that the interpretation of X is a finite set, one may expect that the question of whether it is free from contradiction can be
settled by direct observation. For example, the fact that ((e}, , e), where e e = e, is a model of group theory establishes the consistency of group theory beyond all doubt.
If, on the other hand, a theory has only infinite models (that is, models where the interpretations of the basic set are infinite), then no net gain results upon substituting an inspection of true statements about a model for that of theorems of the theory. Such models of a given theory T really amount to interpretations of T in another theory such
that the interpretation of each axiom of Z is a theorem of the other theory. If this other theory is consistent, then T must be. For suppose that a contradiction were deducible from the axioms of Z. Then, in the other theory, by corresponding inferences about the objects constituting the model, a contradiction would be deducible from the corresponding theorems. Such demonstrations of consistency are merely relative: The theory for which a model is devised is consistent if that from which the model is taken is consistent. Let us consider some examples. As described in Section 1, the plane geometry of Bolyai-Lobachevsky has a model in Euclidean plane geometry. Thereby the relative consistency of this nonEuclidean geometry is established in the form : If Euclidean geometry
is consistent, then so is the Bolyai-Lobachevsky geometry. A proof of
the consistency of Euclidean geometry, as precisely formulated in Hilbert (1899), can be given by interpreting a point as an ordered pair of real numbers and a line as a linear equation; in more familiar guise this is simply the standard coordinatization of the Euclidean plane.
However, since the theory of real numbers has never been proved
238
Informal Axiomatic Mathematics
I
C H A P. 5
consistent, one may conclude merely that if the theory of real numbers is consistent, then so is Euclidean geometry. In other words, we obtain a relative consistency proof. In turn, since we have seen that a construction of the real numbers can be given, starting from Peano's axioms, within a sufficiently rich theory of sets, a consistency proof of the theory of real numbers can be given relative to a theory which embraces both Peano's theory and this theory of sets.
Assuming that the consistency of a theory has been settled in the affirmative by proof or by faith, the question of its completeness may be raised. In rough terms, a theory is called complete if it has enough theorems for some purpose. The variety of purposes which may enter in this connection are responsible for a variety of technical meanings being assigned to this notion. However, most definitions of completeness fit into either the category which corresponds to a positive approach or that which corresponds to a negative approach to the question of a sufficiency of theorems. We shall give one definition in the first category and two in the second. The setting for the first of these, which is in the positive vein, is as follows. We know that if Dl is a model of a
theory Z and T is a theorem of Z, then T is true in V. We might regard T as being complete with respect to T1 if, conversely, whenever a
Z-statement has a true statement of 1J as its interpretation, then that
a-statement is a theorem. This suggests calling Z complete if it is complete with respect to every model. If we understand by a (universally) valid statement of a theory one which is true in every model, then the notion of completeness which we have in mind may be formulated as: A theory Z is deductively complete if every valid statement
is provable. The statement calculus can be formulated as an axiomatic theory which is complete in this sense (see Section 9.2);
of
that is, every tautology is a theorem. If we approach the question of a sufficiency of theorems in a negative fashion, we are led to a second category of formulations of completeness. For example, we might say that a theory is complete if the axioms
provide all theorems we can afford to have without some dire consequence (such as inconsistency) ensuing. A circumstance which might suggest this interpretation of completeness is an attempt to devise an axiomatic theory intended to formalize some intuitive theory. For then one strives to include sufficient axioms that as many as possible true propositions of the intended model aan be obtained as interpretations of theorems of the theory. Hence, one keeps adding, as axioms, formulas
5.4
I
Further Features of Informal Theories
239
which express true propositions of the model up to the point that an inconsistent theory results. This approach to completeness may be crystallized in the following definition. An axiomatic theory Z is formally complete provided that any theory ', which results from Z by the adjunction to the axioms of of a statement of which is not already a theorem of ';', is inconsistent. A theory which is formally complete may be said to have maximum consistency. is said to be negation complete if, for any An axiomatic theory statement A of the theory, either A or --,A is a theorem. It is clear that negation completeness implies formal completeness. Conversely, if the theory of inference employed in developing an axiomatic theory includes a deduction theorem-that is, a theorem which asserts that if a formula B is deducible from formulas A,, A2, ., A,,,, then A. B is deducible , A,,,_, -then formal completeness implies negation cornfrom A,, A2, pleteness. To show this, suppose that a theory T. is formally complete
and that the T--statement A is not a theorem. Th.'n the theory which results on the adjunction of A as an axiom is inconsistent. That is, if I' is the set of axioms of :?", then a contradiction C can be derived from IF U I A y, whence A - C can be derived from 1'. In turn, since (A --> C) --+ --,A is a theorerrt (being a tautology), A can be derived from A > C. I fence, -, A can be derived from I'; that is Z is negation complete. We may loosely relate consistency and completeness in the following
way. An axiomatic theory is consistent if it does not have too many theorems and it is complete if it does not have too few. If an axiomatic theory is both consistent and negation complete, then all questions which arise within the framework of the theory are theoretically decidable in exactly one way. For any statement of the theory is either provable or refutable (that is, its negation is provable) because of completeness, and cannot be both proved and refuted because of consistency. Such a state of affairs for a theory does not always imply that proofs or refutations of specific statements of the theory are automatically made
available, but in some interesting cases it does. 't'hat is, for some con-
sistent and complete theories there exists a method which can be described in advance for deciding in a finite number of steps whether a given formula of the theory is a theorem. Such theories are called decidable (sere Section 9.5).
Notions of the sort which we have introduced so far in this section as well as that of categoricity (which is described next) cannot, in general,
240
Informal Axiomatic Mathematics
(
CFI A P. 5
be discussed in a precise and definitive way at our present intuitive level
of discourse. A precise account is possible only when the theory of inference is explicitly incorporated into an axiomatic theory. In Chapter 9 we shall show how this can be done for an important class of theories. Then we shall re-examine for this class the concepts of consistency, completeness, categoricity, and decidability, including interrelations which exist among them. The remaining notion which we shall introduce as an ingredient of a classification scheme for informal theories arises in connection with the purpose for which a theory is devised. If it is intended that an axiomatic theory formalize some one intuitive theory, a natural requirement for the successfulness of the axiomatization is the presence of a theorem to the effect that any two models of the theory arc indistinguishable apart
from the terminology they employ. In other words, the theory has essentially only one model. For example, one would certainly hope to have such a theorem for any theory designed to formalize Euclidean geometry or the real number system, since we think of each of these as a single clearly delimited theory. A theory is called categorical if it has essentially only one model. This will qualify as a definition as soon as the vague notion that models of a theory are indistinguishable is made precise. The sort of indiscernibility of models which is involved is known as isomorphism. A definition which could cover all conceivable situations
would be too unwieldy to attempt. This is the reason for the repeated occurrence of definitions bearing this name. Each is tailored to fit the distinguishing features of the theory under consideration. Already we. have given three such definitions: one for partially ordered sets, one for integral systems, and another which is applicable to systems consisting of a set with two binary operations and an ordering relation. In order to further strengthen the reader's comprehension of the concept and to serve as a vehicle for several general comments, we offer definitions in three specific cases (labeled I,, Ia, and 13). These together with those definitions given earlier should serve to clarify the essence of isomorphism.
I,. Let (X,, pi) and (X2, p2) be two models of a theory having a set and a pertinent relation as primitive notions. Then (X,, p,) is isomorphic to (X2, p2) if there exists a function f such that-
(i) f is a one-to-one correspondepce between X, and X2, (ii) if x, y C X, and x p, y, then f(x) P2 f(y),
(iii) if x, y C X2 and x ply, then f-'(x) pl f-'(y)
5.4
I
Further Features of Informal Theories
241
This definition is patterned after that of isomorphism for partially ordered sets (Section 1.11). It is applicable to the case where pi is a function on Xi into Xi, i = 1, 2. In this event the definition of isomorphism can be simplified to the following, as the reader can verify. Let (X1, f,) and (X2, fz) be models of a theory whose primitive notions are a set and a function on that set into itself. Then (XI, fl) is isomorphic to (Xz, fz) if there exists a function f such that
(i) f is a one-to-one correspondence between X, and Xz, (ii) if x C Xi, then f(f,(x)) = fz(f(x)).
Thus, in this case only one of the two requirements for isomorphism must be proved; the other, which completes the symmetry inherent in the concept of isomorphism, necessarily follows.
Let (XI, -i) and (Xz, 02) be two models of a theory having a set 12. and a binary operation in that set as its primitive notions. Then (Xi, ^i) is isomorphic to (X2j -z) if there exists a function f such that (i) f is a one-to-one correspondence between X, and Xz, (ii) if x, y C XX, then f(x -,y) = f(x) -z f(y).
It is left as an exercise to show that this formulation of isomorphism is an equivalence relation in any collection of models of the theory described. In particular, therefore, as in the specialized version of I, given above, the symmetric nature of the concept follows automatically. Is. Let (Xi, Y,, p,) and (Xz, Y2, P2) be two models of a theory having as its primitive notions two sets and a relation whose domain is the first set and whose range is the second set. Then (X,, Y,, pi) is isomorphic to (X2j Yz, p2) iff there exists a function f such that
(i) f is a one-to-one correspondence between X, U Y, and Xz U Yz such that f(XI) = Xz and f(Y,) = Y2, (ii) f preserves the relations p, and P2 in the sense of definition I,.
This is not the only definition of isomorphism which might be made under the circumstances. The one given takes into account the preservation of set-theoretical interconnections between Xi and Yi, i = 1, 2. We now define an informal theory to be categorical if any two models of it are isomorphic. In view of Theorem 2.1.8, the theory of integral systems, which was devised to axiomatize the natural number
242
Informal Axiomatic Mathematics
I
CHAP. 5
sequence, is categorical. t This result is one which might be hoped for since the theory is intended to formalize just one intuitive theory. An elementary example of a categorical theory is obtained by adding to the five axioms for affinc geometry (Example 2.3), the following. AG6. The set (P has exactly four members.
The resulting theory is consistent by virtue of the model given in Section 2. The proof that it is categorical is left as an exercise. Analogous to the acceptance of the existence of a model as a criterion
for consistency, the existence of essentially only one model (that is, categoricity) is often accepted as a criterion for negation completeness. To state the pertinent result we make a definition. A statement of a consistent theory T will be called a consequence of if it is true in every model of T. Then, if Z is a consistent and categorical theory, for each T-statement S, either S is a consequence of Z or -1 S is a consequence of Z. This,
it will be noted, amounts to negation completeness with provability replaced by a weaker notion. The proof makes use of the following property of models. If Y, and 9Jtz are isomorphic models of a theory Z, then for every a-statement S, either S is true in both f l1 and 9)22 or S is false in both. Assuming this as proved, the main result can be derived as follows. Suppose that the T-statement S is not a consequence of the
consistent theory Z. Then, by the definition of consequence, there exists a model 9N, of Z which does not satisfy S. Let 9N be any model of Z. Then, since 99J1 is isomorphic to 9 lt, S is not true in T2, and, hence
-,S is true in P. Since 9t is any model of T, this means that --is is a consequence of T. A theory which is consistent and noncategorical has essentially different (that is, nonisomorphic) models. This is precisely what should be
anticipated for a theory intended to axiomatize the common part of several different theories. The theory of groups is an excellent example. Because it has such a general character it has a wide variety of models, which means that it has a wide range of application. We conclude this section with several miscellaneous remarks. The first involves assigning a precise meaning to the word "formulation" which we have used frequently. As we described it, an informal theory Z includes a list TO of undefined terms, a list T1 of defined terms, a list
P of axioms, and a list Pl of all thqse other statements which can be inferred from Po in accordance with some system of logic. The set TO t Later we shall find it necessary to modify this assertion.
5.4
I
Further Features of Informal Theories
243
serves to generate TO U Ti, the set of all technical terms of ;'; the set P0 serves to generate Pu U P1, the set of all theorems of Z. For the ordered pair (To, Po) we propose the name of a "formulation" for T. A study of T may very well culminate in the discovery of other useful formulations. To obtain one amounts to the determination of: (i) a set To' which is a
subset of To U T, (which may or may not differ from 7o), and (ii) a subset P of Po U P, whose member statements are expressed in terms of the members of To' and from which the remaining theorems of the theory can be derived. For a pair of the form (To', PP) to be a formulation of T, it is clearly sufficient that the members of To can be defined by means of those in To and that the statements of Po can be derived from those
of P. For many of the well-known axiomatic theories there exists a variety of formulations. This is true, for example, of the theory of Boolean algebras discussed in Chapter 6. A rather trivial example appears
in Section 1.11, and we may rephrase it to suit our present purposes: As a different formulation of the theory of partially ordered sets we may take that consisting of a set X together with a relation that is it-reflexive and transitive on X (see Exercise 1.11.3). Another example is implicit
in a remark made in Section 1; rephrased, it amounts to the assertion that Hilbert and Pieri gave different formulations of a theory which axiomatizes intuitive plane geometry. Different formulations of a theory amount to one variety of possible
approaches which can be made to one and the same mathematical structure. Depending on the criteria adopted, one may show a marked preference for one formulation over others. Aesthetic considerations may
influence one's judgment, and the simplicity of the set of axioms in conjunction with the elegance of the proofs may also play an important role. One may prefer a particular formulation because he feels it has a
"naturalness" that others lack. He may favor a formulation which involves the fewest number of primitive notions or axioms.
A notion which is pertinent to a formulation of an informal theory is that of the independence of the set of axiomns. A set of axioms is independent if the omission of any one of them causes the loss of a theorem; otherwise it is dependent. A particular axiom (considered as a member of the set of axioms of some formulation) is independent if its omission causes the loss of a theorem; otherwise it is dependent. Clearly, an independent axiom cannot be proved from the others of a set of which it is a member, and conversely. Further, the set of axioms of a formulation is independent iff each of its members is independent. Models may be used to establish the independence of axioms. For
244
Informal Axiomatic Mathematics
I
c t r A P. 5
example, the independence of the axioms 0r, 02, 03 for the theory of partially ordered sets (see Section 3) may be shown by constructing a model of each of the three theories having exactly two of 0r, 02, and 03 as axioms and in which the interpretation of the missing axiom is false. Otherwise expressed, the independence of 03, for example, is equivalent
to the consistency of the theory having 01, 02, and the negation of 03 as axioms. The independence of a set of axioms is a matter of elegance. A dependent set simply contains one or more redundancies; this has no effect on the theory involved. The foregoing concepts of independence for both individual axioms and sets of axioms have analogues for primitive terms. A given primitive term (considered as a member of the set of primitive terms in a formula-
tion of a theory) is independent if it cannot be defined by relation to the remaining primitive terms and a set of primitive terms is indepcndent if each of its members is independent. Models are also used to show
such independence in the following way. To prove that a particular primitive symbol Q of some formulation of a theory Z is independent of the remaining primitives, we exhibit two models 9121 and J 2 of $ which have the same domain and in which the interpretation of each primitive term except Q is the same but which give different interpretations to the symbol, Q. This is known as Padoa's method for demonstrating definitional independence. A complete account of this method,
which is due to the Italian logician, A. Padoa, is given in J. C. C. McKinsey (1935); we shall be content. to consider an example. In the exercises for Section 3 is a formulation of the theory of simply ordered
commutative groups. We will show that the binary relation < is an independent primitive. For this we introduce the interpretations I2r and 9)22 in both of which we take G as Z, + as ordinary addition, 0 as zero and, in Mgr we take < to be the familiar relation of less than or equal to, while in 122 we take < to be the familiar relation of greater
than or equal to. Then, clearly, the interpretations of < are different (for example, 2 < 3 is true in 912r but false in T22)- We conclude that < cannot be defined in terms of the remaining primitives, for otherwise its interpretation would have to be the same in both models since the other primitives are the same. In order to motivate the final remark we recall'i'heorern 1.11.1, which asserts that every partially ordered set is isomorphic to a collection of sets partially ordered by inclusion. That is, to within isornorphisrn, all models of the theory of partially ordered sets are furnished by col-
lections of sets. In general, a theorem to the effect that for a given
5.4
I
Further Features of Informal Theories
245
axiomatic theory Z a distinguished subset of the set of all models has the property that every model is isomorphic to some member of this subset is a representation theorem for Z. Analagous to the case of the theory of partially ordered sets where, from the outset, collections of sets constitute distinguished models, in the case of an arbitrary theory Z,
even though it is noncategorical, one particular type of model may seem more natural. In this event a representation problem arisesthe question whether there can be proved a representation theorem for X which assi its that this type of model yields all models to within isomorphism. When such a problem is answered in the affirmative, new theorems may follow for Z by imitating proof techniques that have proved useful in those theories which, in effect, supply all models.
EXERCISES 4.1. (a) Establish the consistency of the theory of partially ordered sets by way of a model. (b) Show that this is a noncategorical theory. (c) Show that the set of axioms {O,, 02, 03) for partially ordered sets is independent. 4.2. (a) Show that the theory of groups is noncategorical. (b) Defining a group as an ordered triple (C, , e) such that G1, G2, and G3 of Example 2.2 hold, establish the independence of {G,, G2, G3} . (Suggestion: Use a multiplication table for displaying the operation which you introduce into any set.)
4.3. Consider the axiomatic theory having as its primitive notions two sets A and (Id and having as axioms the following.
(i) Each element of B is a two-element subset of A. (ii) If a, a' is a pair of distinct elements of A, then {a, a') C B. (iii) A V B. (iv) If B, B' is a pair of distinct elements of B, then B (1 B' C A. Show that this theory is consistent. Is it categorical? 4.4. Consider the axiomatic theory whose primitive notions are a nonempty set A and a binary operation (x, y) -9- x - y (that is, we write the image of (x, y) as x - y) in A, which satisfies the identity
y=x-[(x-z)-(y-z)].
Show that this theory is consistent. 4.5. Consider the axiomatic theory whose primitive notions are a nonempty set A, a binary operation (x, y) x X y in A, and a unary operation x-4- x' in A. The axioms are the following.
(i) X is an associative operation.
(ii) (x X y)' = y' X x'.
246
Informal Axiomatic Mathematics
CHAP. 5
(iii) IIxXv= zXi for some z, then x=y. (iv) If x = y', then x X y = z Xi for all z. (a) Show that the theory is consistent. (b) Show that this set of axioms is dependent. 4 6. Prove the assertion made in the text to the effect that if p; is a function or. A, into X;, i = 1, 2, then (X,, pl) is isomorphic to (X2, p2), provided there exists a one-to-one correspondence f : X, -+- X2 such that f (pi(x)) = ps(f (x)) for all x in X. 4.7. Prove that the type of isomorphism labeled 12 is an equivalence relation in any set whose members are systems consisting of a set together with an operation in that set. 4.8. Assume that of two isomorphic models of the theory considered in Exercise 4.4, one is a group. Prove that the other is a group. 4.9. The set {e, a, b, c} together with the operation defined by the following multiplication table is a group. Determine six isomorphisms of this group with itself. e
a
e
a
b
c
a
e
c
b
b
c
e
a
c
b
a
e
b
c
4.10. Devise a definition of isomorphism for systems consisting of a set together with two operations. 4.11. Consider an axiomatic theory Z formulated in terms of two sets, whose members are called points and lines, respectively, and whose axioms are as follows.
(i) Each line is a nonempty set of points. (ii) The intersection of two lines is a point. (iii) Each point is a member of exactly two lines. (iv) There are exactly four lines.
(a) Show that Z is a consistent theory. (b) Show that there are exactly six points in a model of Z. (c) Show that each line consists of exactly three points. (d) Find two models of T. (e) Is T categorical? Give reasons for your answer. 4.12. Show that the axiomatic theory defined in Exercise 4.4 is a formulation of the theory of commutative groups. 4.13. Show that the axiomatic theory defined in Exercise 4.5 is a formulation of the theory of groups. 4.14. Show that the following is another formulation of the theory of groups.
247
Bibliographical tote A group is an ordered triple (C, , ') such that G is a set, in C, ' is a unary operation in C, and (i) C is nonempty, (ii)
(iii) a'
is a binary operation
is associative, (a
b) = b = (b a) a' for all a and b.
4.15. Show that the following is another formulation of the theory of groups. A group is an ordered triple (C, , e) such that C is a set, is a binary operation in C, e is a member of C, and is an associative operation, (i) (ii) for each a in C, e a = a, and there exists a' in G such that a'
a = e.
4.16. Consider the theory whose primitive notions are a set X, a binary operain X, and whose axioms are the following.
tion
(i) X is nonempty. is an associative operation. (iii) To each element a in X there corresponds an element e of X such that (ii)
e
a = a e = a, and a possesses an inverse a' relative to e in X (that is,
Show that if (S, ) is a model of the theory, then there exists a partition of S such that each member set determines a group. 4.17. Consider the theory T whose primitive notions are the power set of a set S and a mapping f on (P(S) into itself, and whose axioms are as follows. (i) For all .Y in (P(S), Xf 13 X.
(ii) For all X in P(S), (Xf)f = X'. (iii) For all X and Y in PPS), X D Y implies Xf 1) Y'. Show that another formulation of T. results on adopting as the sole axiom:
(X U Y)f D (Xf)f U Yf U Y, for all X and Y in 6'(S).
BIBLIOGRAPHICAL NOTE Discussions of axiomatic theories and the axiomatic method, pitched at about
the same level as ours, appear in R. L. Wilder (1952), E. It. Stabler (1953), and A. Tarski (1941).
CHAPTER
6 Boolean Algebras
T; r. THEORY or Boolean algebras has historical as well as present-day practical importance. For the beginner its exposition should prove a serviceable vehicle for assimilating many of the concepts discussed in relation to informal theories in Chapter 5. Moreover, it illustrates the important type of axiomatic theory known as an "algebraic
theory." The theory of Boolean algebras is, on one hand, relatively simple and, on the other hand, exceedingly rich in structure. Thus, its detailed study serves in some respects as an excellent introduction to techniques which one may employ in the development of a specific axiomatic theory. The only possible shortcoming is that the ease with which it may be put into a relatively completed form is somewhat misleading, so far as axiomatic theories in general are concerned. This chapter presents first a natural formulation of the theory. Then a formulation which is commonly regarded as being more elegant is given. This second formulation is used in the development of the next topic, the representation of Boolean algebras as algebras of sets. Next, it is shown that a statement calculus determines a Boolean algebra in a
natural way. It is by way of this Boolean algebra associated with a statement calculus that statement calculi can be analyzed by so-called Boolean methods and interconnections be established between the theory of Boolean algebra-, and that of statement calculi. This is developed in the last three sections of the chapter.
1. A Definition of a Boolean Algebra By an algebra of sets based on U we shall mean a nonempty collection a of subsets of the rronempt.y set U such that if A, B E a, then A U B, A fl B C a, and if A C a, then fl C a. For example, the power set of U, (P(U), is an algebra of sets. However, certain proper subsets of (p(U) may be an algebra of sets (see Exercise 2.6). If a is an algebra of
sets based on U, then U C a (since if A C a, then U = A U A C a) and 0 C a (since if A C a, then 0 = A n A E a). Further, Theorem 248
6.1
I
A Definition of Boolean Algebra
249
1.5.1 may be interpreted as a list of properties of an algebra of sets. That this is a fundamental list of properties is suggested by the variety of other properties (for example, those in Theorem 1.5.2) which may be deduced solely from them. As formulated below, the theory of Boolean algebras may be regarded as the axiomatized version of algebras of sets when viewed as systems having the properties appearing in Theorem 1.5.1.
A Boolean algebra is a 6-tuplc (B, U, n, ', 0, 1), where B is a set, U is a binary operation (called union or join) in B, n is a binary operation (called intersection or meet) in B, ' is a binary relation in B having B as its domain, 0 and 1 are distinct elements of B, and the following axioms are satisfied. (i) Each operation is associative: for all a, b, c C B,
aU (bUc) _ (aUh)Uc and an (bnc) _ (anb)nc. (ii) Each operation is commutative: for all a, b E B,
aUb=bUa and anb=bna (iii) Each operation distributes over the other: for all a, b, c C B,
aU (bnc) = (aUb) n (aUc) and
an (bUc) = (anb) U (anc). (iv) For all a in B,
aUO=a and and =a.
(v) For each a in B there exists a '-related element a' such that
a U a' = I
and ana'=0.
The consistency of the theory that we have just formulated can be established by choosing for B the power set of a nonempty, finite set U,
taking U and n as set-union and set-intersection, respectively, ' as complementation relative to If, and, finally, choosing 0 and 1 as 0 and U, respectively. The uniqueness of the elements 0 and I is established in Theorem 2.1. These uniquely determined elements are called the zero element and unit element, respectively, of a Boolean algebra. It was in anticipation of this uniqueness and terminology that the symbols "0" and "1" were used in the axioms. We might have postulated their uniqueness; however, we would then be obligated to prove uniqueness as part of any verification that an alleged Boolean algebra is truly
just that. An element which is '-related to an element a is called a complement of a; that each element has a unique complement (and,
Boolean Algebras
250
I
CHAP P. 6
hence, that ' is a function having B as its domain) is proved below. The set of axioms is not independent, since the two associative laws can be derived from the remaining axioms. A hint as to how this can be done
is given in an exercise accompanying the next section. If the set of remaining axioms is regarded as having seven members, which is the case when each of (ii)-(iv) is divided into two parts, then it is an independent set of axioms. This fact, which is interesting but unimportant, was established by E. V. Huntington (1904) with appropriate models.
EXERCISES 1.1. Accepting for the moment the fact that the associative laws (i) in the formulation of the theory of Boolean algebras are redundant, the independence of the remaining set of seven axioms can be demonstrated by a collection of seven systems of the form (B, U, n,', 0, 1), one of which satisfies (ii)-(v) except the commutativity of U, another of which satisfies (ii)-(v) except the commutativity of n, and so on. For a B having just a few elements, an operation in B can be defined by means of a "multiplication table," that is, a square array whose rows and columns are numbered with the elements of B and such that at
the intersection of the ath row and the bth column the composite of a and b appears. For example, the following two tables define two operations in the set B = {a, b}.
nl
a
b
b
a
a
a
a
b
a
b
UI
a
b
a
a
b
b
Show that (B, y, n,', 0, 1)-where B = {a, b}, U and n are defined as above, ' is the relation {(a, b), (b, a)) (that is, a' = b and b' = a), 0 is a and 1 is b-satisfies all of (ii)-(v) except the first half of (iii), thereby demonstrating the independence of this axiom. Next, show that the system which results from the foregoing upon substituting the multiplication tables
U
a
b
a b
a
b b
b
n
a
b
a
b
a
b
a
b
for U and n establishes the independence of the second half of (iii). 1.2. Construct five other systems which demonstrate the independence of the other axioms.
2. Some Basic Properties of a Boolean Algebra ,
The properties of a Boolean algebra which are derived in this section
are the abstract versions of the results obtained in Section 1.5 for an
6.2
Some Basic Properlies of a Boolean Algebra
251
1
algebra of sets. The only essential difference is that now the set of axioms of a Boolean algebra is used in place of the first theorem of that earlier section.
We begin by describing the principle of duality for Boolean algebras. By the dual of a statement formulated within the framework of a Boolean algebra is meant the statement that results from the original upon the replacement of U by n and n by U, I by 0 and 0 by 1. We observe that each axiom is a dual pair of statements, with (v) regarded as self-dual. Hence, if 1' is any theorem of Boolean algebras, then the dual of T is a theorem, the duals of the steps appearing in the proof of 7' providing a proof of the dual. This is the principle of duality for the theory at hand; it yields a free theorem for each theorem which has been obtained, unless that theorem happens to be its own dual. Turning to theorems of the theory of Boolean algebras, we mention first the validity of the general associative law and the general commutative law for each operation, as well as the general distributive law for each operation with respect to the other. Theorem 2.2.2, Exercise 2.2.4, and Exercise 2.2.5 dispose of these matters. The next group of results, which make up our next theorem, is the Boolean algebra version of Theorem 1.5.2.
THEOREM 2.1. In each Boolean algebra (B, U,
0, 1) the
following hold.
(vi) The elements 0 and 1 are unique. (vii) Each element has a unique complement. (viii) For each element a, (a')' = a.
(ix) 0' = 1 and 1' = 0. (x) For each element a, a U a = a and a n a = a. (xi) For each element a, a U 1 = 1 and a n 0 = 0. (xii) For all a and b, a U (a n b) = a and a n (a u b) = a. (xiii) For all a and b, (a U b)' = a' n b' and (a n b)' = a' U Y. Proof. For (vi) assume that Ol and 02 are elements of B such that a U OL =a and a U 02 = a for all a. Then 02 U Oi = 02 and OL U
02 = 01. By axiom (ii), 02 U 01 = 01 U 02, and, hence, 02 = 01. Thus there is a single element in B satisfying the first property in (iv). (The uniqueness of 1 follows by the principle of duality.)
For (vii) assume that a,' and a. are both complements of a. Then by (iv); a. = a. U 0, since anal = 0; = al U (a 0 a6),
Boolean Algebras
252
= (a'. U a) n (al' U as), = (a U ai) n (a,' U as),
by (iii) ;
= 1 n (ai U ai),
since a U a,' = 1; by (ii) ; by (iv).
_ (al' U a,ae) n 1, al
= U 2) By a similar proof we get
CHAP. 6
by (ii);
az=a'Uat. Hence, by (ii), a, = aa. For (viii), by definition of the complement of a, a U a' = 1 and
a n a' = 0. Hence, by (ii), a' U a = 1 and a' n a = 0. That is, (a')' = a, by (vii). The proof of (ix) is left as an exercise. The proof of (x) is the following computation.
aUa=(aUa)n1,
by (iv) ;
_ (a U a) n (a U a'),
by (v); by (iii) ; =aUO, by (v); = a, by (iv). The proofs of the remaining parts of the theorem are left as exercises.
=aU(ana'),
The property of cornpleuientation stated as (vii) means that { (a, a')Ia C B) is a function on B into B (that is, complementation is a unary operation in B). According to (viii) this function is of period 2 and, consequently, one-to-one and onto. It is possible to introduce into the set B of an arbitrary Boolean algebra (B, U, n,', 0, 1) a partial ordering relation which resembles that of set inclusion. The characterization of inclusion in Theorem 1.5.3 in terms of set intersection is the origin of the following definition. If (B, U, n, ', O, 1) is a Boolean algebra, then for a, b E B
a
anb=a if aUb=b. The proof of this as well as the proofs of such related facts as a < b iff a nb' = 0 and a < b if b' < a' are left as exercises. Important features of the new relation are stated in the next theorem.
6.2
I
Some Basic Properties of a Boolean Algebra
253
THEOREM 2.2. If (B, U, n, ', 0, 1) is a Boolean algebra, then (B, <) is a partially ordered set with greatest element (namely, 1) and least element (namely, 0). Moreover, each pair { a, b } of elements
has a least upper bound (namely, a U b) and a greatest lower bound (namely, a n b). The proof is straightforward and is left as an exercise.
EXERCISES 2.1. Referring to Theorems 1.5.2 and 2.1, it is obvious that (viii)-(xiii) of Theorem 2.1 are the abstract versions of 8, 8'-13, 13' of Theorem 1.5.2. Show that (vi) and (vii) of Theorem 2.1 are the abstractions of 6, 6' and 7, 7', respectively, of Theorem 1.5.2. 2.2. Supply proofs for parts (ix), (xi), (xii), and (xiii) of Theorem 2.1. 2.3. In regard to a proof of the assertion that the associative laws for U and n can be derived from the remaining axioms for a Boolean algebra, we observe first that the given proofs of (vi)-(viii) and (x) do not employ (i). Further, the proofs of (ix), (xi), and (xii) called for in the preceding exercise need not use (i). Hence, (ii)-(xii) are available to prove (i). Supply such a proof. Hint: Given a, b, and c, define x=aU(bUc) and y=(aUb)Uc,
and then deduce, in turn, that all x = any,a,nx=a'ny,x=y. 2.4. Establish each of the following as a theorem for Boolean algebras.
(a) a
(b) a
(c) a
(d) For given x and y, x = y if 0 = (x n y') U (y n x'). 2.5. Prove Theorem 2.2. 2.6. Let d be the collection of all subsets A of Z+ such that either A or A is
finite. Show that (a, u, n, ^, 0, z+), where the operations are the familiar set-theoretical union and intersection, is a Boolean algebra. Remark. The remaining problems in this section are concerned with a type
of generalization of a Boolean algebra called a lattice. A lattice is a triple (X, U, n), where X is a nonempty set, U and n are binary operations in X (read "union" and "intersection," respectively), and the following axioms are satisfied. For all a, b, c C X,
L,. aU(bUc)_(aUb)Uc, U. an(bnc)_(a() b)flc, L2. aUb=bUa, U. anb=bna, L3.
(a U b) n a = a,
U. (a fl b) U a = a.
254
Boolean Algebras
I
C11 A P. 6
2.7. State and prove a principle of duality for a lattice. 2.8. Derive the following properties of a lattice.
(a) For all a, a U a = a andafla = a. (b) For all a, b, the relations a U b = a and a () b = b are equivalent. (c) For all a, b, the relations a (1 b = a and a U b = b are equivalent. 2.9. Let (X, <) be a partially ordered set such that each pair of elements has a least upper bound and a greatest lower bound in X. Thus, if we set a U b = lub {a, b} and a (1 b = gib {a, b}, then U and (1 are operations in X. Prove that (X, U, fl) is a lattice. Next, prove that, conversely, if in a lattice (X, U, (l)
we define the relation < by a < b if a (1 b = a, then (X, <) is a partially ordered set such that each pair of elements has a least upper bound (namely, a U b) and a greatest lower bound (namely, a fl b). Remark. This result gives, in effect, a second formulation of the axiomatic theory called lattice theory. Thus, one may think of a lattice in either way. If the formulation is in terms of <, then, by U and (l, one understands the operations in Exercise 2.9. If the formulation is in terms of U and fl, then, by <, one understands the ordering relation defined, again, in Exercise 2.9.
2.10. Let (X, U, fl) and (X', y', fl') be lattices. Show that they are isomorphic (using the definition of isomorphism suggested by 12 in Section 5.4)
iff the associated partially ordered sets (X, <) and (X', <') are isomorphic (using the definition in Section 1.11). 2.11. Show that there are exactly five nonisomorphic lattices of fewer than five elements and that there are exactly five nonisomorphic lattices of five elements. (Hint: For this problem it is more convenient to think of a lattice as a partially ordered set.)
3. Another Formulation of the Theory The formulation which we have given of the theory of Boolean algebras has much to recommend it. The primitive notions are few, and the simplicity and symmetry of the axioms lend aesthetic appeal. Moreover,
if the associative laws are omitted, the resulting set is independent. Finally, the formulation clearly reflects the type of system that motivated it. However, it is always a challenge to see if a formulation can be pared down in one or more respects. In the case of Boolean algebras this challenge has been successfully met by a great variety of formulations. We shall describe one that has become quite popular. It achieves for arbitrary Boolean algebras the analogue of the familiar fact for an algebra of sets that either of the operations of union and intersection can be eliminated in terms of the other together with complementation
[for example, A U B = (A n B)) .
6.3
Another Formulation of the Theory
1
255
If (B, V, n,', 0, 1) is a Boolean algebra, then B is a set with at least two distinct members. Moreover, the binary operation fl and the unary operation ' have the following properties. fl is commutative. fl is associative.
For a,binB,ifaflb'=cflc'forsome cinB,then aflb=a. For a,binB,ifaflb=a,then aflb'=cflc'forallcinB. The first two properties are axioms, and the last two follow from the
facts that for all c in B,cflc' = 0, andaflb' = 0iffaflb = a. We shall prove next that a triple (B, fl, ') having the properties mentioned above (a precise description appears in the next theorem) may be taken as a formulation of the theory of Boolean algebras. That is, the primitive notions of the initial formulation of the theory can be defined and the axioms (i)-(v) can be derived as theorems.
THEOREM 3.1. The following is a formulation of the theory of Boolean algebras. The primitive notions are an unspecified set B of at
least two elements, a binary operation fl in B, and a unary operation ' in B. The axioms are as follows. B1. fl is a commutative operation. B2. fl is an associative operation. B3. For all a, b in B, if a fl b' = c fl c' for some c in B, then
aflb=a. B4.
For all a,binB,ifaflb=a,then aflb' =cflc'forallc
in B.
It remains to prove that the primitive notions of the original formulation can be defined and the axioms derived from a triple Proof.
(B, fl, ') satisfying B1--B4. As the undefined set, the meet operation,
and the binary relation ' of the original formulation we take B, fl, and ', respectively. A Join operation and the distinguished elements 0
and 1 are defined below. The first ten results (TI-T 10) which we prove, al.)uut (13, fl, '), together with 13, and 133, establish the validity of all axioms of the original formulation except the distributive laws.
The remainder of' the proof is concerned with them. A,telegraphic style of presentation is used for ease in reading.
Ti.
afla=a.
Pr. T2. Pr.
anal = anal. Now apply 113.
afla'=bflb'. T1 and 134.
Boolean Algebras
256
(
CHAP. 6
This result justifies the following definition.
Dl. T3.
Pr.
0 = a fl a' and 1 = 0'.
an0=0. afl0=afl(afla'), (ana)na',
=
by B2;
by TI and DI.
0,
T4. Pr.
by DI;
all = 'a. 1 . 2. 3. 4. 5.
6.
a" l a' = 0,
a" fla=a" , a"" n a" = a.". al", n a
a""
a"', na'=0 a' n a". = a' a," n a' =
, ,
from D1 and B2. from I by B3. from 2.
from 2 and 3, by B2. from 4, by B4 and Dl. from 5, by B, and B3.
10.
ana"'=0 ana"a, ,
from 2. from 6 and 7. from 8 and D1. from 9 by B3.
11.
a" = a ,
from 2 and 10, by B1.
7. 8.
9.
a
= a,
afll =a. afl(afla')'=a, afll =a, 001.
T5. Pr.
afl(ana')"=0,
by T4, Ti, and DI, from the above, by B3.
by Dl.
Assume 0 = 1.
a fl 0 = a,
from 1 and T5. by T3. from 2 and 3. a = 0, This contradicts the assumption that there exist at least two distinct elements in B.
an0 = 0, 5.
D2.
T7. Pr. T8. Pr.
T9. Pr.
TI0. Pr.
a U b = (a'nb')'. (a U b)' = a' fl b' and (a fl b)' = a'Ub'. Both follow from D2 and '1'4.
aUb = b U a and a U (b U c) = (aUb)Uc. The first follows from B2, and the second follows from Ba and T4.
aUa'=1.
This follows from D2, T4, B1, and
aUO=a. This follows from D2, D1, and T4.
63
I
Another Formulation of the Theory
Tll. Pr.
T12. Pr.
257
afl (aUb) =a. 1. b' fl (a fl a') = 0, 2. a (l (a' fl b') = 0, 3. a fl (a' fl b')" = 0,
by T3 and Dl. from 1, by B, and B2-
4. a f l (a' f l b')' = a, 5. a f l (a U b) = a,
from 3, by B3.
an (anb)'=aflb'. 1. aflb"fl(aflb)'=0,
from 2, by T4. from 4, by D2.
2. a fl (a 0 b)' fl b" = 0,
by Dl and T4. from 1, by B1.
3. a fl (a (1 b)' (1 b'
=afl(aflb)',
from 2, by B3.
4. a fl b' fl (a fl b)'
=afl(afl1')', 5. a flb'fl(aflb)' = a fl b' fl (b' U a'),
6. a fl b'f (b'Ua') =aflb',
7. a fl(a(lb)'=aflb',
by T7 and B1.
byTll. from 4, 5, and 6.
a fl c = a, a fl c' =0 and a U c = c are equivalent
T13. Pr. T14. Pr.
from 3, by B1.
properties. Left as an exercise.
aflc=aandbflc=bimply (aUb)lc=aUb.
Assume that a fl c = a and b fl c = b. Then a U c = c and b U c = c, by T13. By TI 1,
(aUb)fl [(aUb)Uc]=aUb. Two substitutions within the brackets give the desired result.
an (bUc) _ (a fl b)U(aflc)and
T15.
a U (b fl c) _ (a U b) fl (a U c). Pr.
1.
(a fl b) fl [a fl (b U c) ]
=aflbfl (bUc) =aflb, by B2, Ti, andT11. 2. (a fl c) n [a fl (b U c) ] = a fl c, similarly. 3. [(a fl b) U (a fl c) ] fl [a fl (bUc) ] from 1, 2, and T14. = [(a fl b) U (a fl c) ), 4. [a fl (bUc) ] fl [(a fl b) U (a U c) ]' =afl (bUc)fl(aflb)'fl(aflc)',
= a ll b' ( c, r) (b U c),
=afl(bUc)'fl(bUc), = 0.
by T7; by B, and T12;
Boolean Algebras
258
I
C It A P. 6
5. [an(bUc)Jn [(a n b) U (a n c) I =an(bUc), from 4byT13. 6. a n (1 U c)
= (a n b) U (an c),
from 3 and 5 by B,. The proof of the other distributive law is left as an exercise.
The set of axioms in the new formulation of the theory of Boolean algebras is independent. A proof of this requires the determination of a system (13, n, ');, which satisfies all the axioms except B i = 1, 2, 3, 4. Below arc defined four systems which demonstrate the independence of the axiom with the corresponding label. (B2) B = (a, b, c } (B,) B = (a, b } (B,) B = (a, b, c J
nI
a
b
c
nI
a
b
c
a
a
a
a
a
a
c
b
a
b
b
b
c
b
b a
c
a
c
c
c
b
a
c
nI
a
b
a
a
b
b
b
b
i
I
a b c
(B4)
b a
a
a
b
c
c
b
a b
b b
B = (A C v'(Z+) I7+ - A is a finite set).
n is set intersection. ' is defined as follows. We note that for each A in B there exists a least positive integer a such that [a, the set of all integers x > a, is included in A. Then A is the disjoint union of [a and A0, a subset of
, a - 2} (unless A = Z+, in which case A = [1). Now we (1, 2, define A' to be A U [(a + 1), where A is the complement of AD in , a - 11 (unless A = Z+, in which case A' = [2). (1, 2, Some hints for the analysis of this example, which establishes the independence of B4, appear in Exercise 3.2. Possible substitutes for B4 are described in Exercise 3.3.
EXERCISES 3.1. Prove T13 and the remaining distributive law in the proof of Theorem 3.1.
3.2. Regarding the system (B, n, '), which, it is asserted, establishes the independence of B4, it is clear that B, and B2 hold. Prove that the system
6.4
I
Congruence Relations for a Boolean Algebra
259
satisfies B3 but not B4. hint: for B3, show that if C = Co U [c, then C n C' _ [(c + 1), and, if A = Ao U [a and B = Bo U [b, then
AnB'= {(AflB)U[b+l) (Ao n B') U [a
ifa b
3.3. Show that each of B6, B6, - - -, B,o defined below implies B4 in the presence of B,, B2, and B,. Infer that each of B6, B6, - - -, B9 together with B1, B2, and B, yields a formulation of the theory of Boolean algebras. For some calculations
it is convenient to use the fact that if (B, n, ') satisfies B1, B2, and B3, then (B, <), where a < b means a n b = a, is a partially ordered set. So prove this first. B6.
For all a and b, ana'=bnb'.
Bs.
For all a, a" = a.
B. There exists in B an element m such that whenever x n m = x, x = M. Bs. There exists an integer n > 1 such that for all a, the nth iteration of a under ' is equal to a. B9. For all a and b, a < b implies b' < a'. B,o. B is finite.
4. Congruence Relations for a Boolean Algebra We turn to an examination of an aspect of the two given sets of axioms for a Boolean algebra that has not been touched on. It is sufficient to consider the second set of axioms, since the reader will readily
see what alterations are required for our remarks to apply to the first set. When the statements labeled B1, B,, B,, and 134 were introduced, no
mention was made of the precise meaning to be assigned the relation symbolized by " ="; rather, it was intended that the reader supply his own version of equality. Suppressing any preconceived notions that we might have in this connection, let us determine a set of conditions which are adequate for our purposes. An analysis of the proofs of TI-T15 in the proof of Theorem 3.1 reveals that the following is a sufficient set of conditions.
(E) " =" is an equivalence relation. (S) Let F be an element of the Boolean algebra (B, n, ') resulting fi-oin elements a, b, - of B using the operations in B, and let a = a,, b = b,, - -. Then, if P; is an element which results from F by the replacement of some or all occurrences of a by al, b by b,, - -, then F = F,. -
-
-
-
260
Boolean Algebras
I
CHAP. 6
Now (S) can be derived from the following two simple instances of this substitution principle. (
C)
If a = b, then a fl c = b fl c for all c. If a = b, then a' = V.
The proof, which we forego, is by induction on the number of symbols in the element F. Thus (E) and (C) insure (E) and (S), and, hence (E) together with (C), which are clearly necessary properties of equality, are also sufficient for our purposes. As such, equality is an instance of a congruence relation for a Boolean algebra, a notion which we discuss next. Before focusing our attentions on congruence relations for Boolean algebras we make several remarks about this concept in a general setting. When one is presented with, or constructs, some specific mathematical system, there is among its ingredients a "natural" congruence relation either explicitly or implicitly defined. This means that there is present an equivalence relation which is preserved under the operations at hand in the sense suggested by (C) above. Normally one symbolizes
this relation by "_," calls it equality, and uses it without comment. For example, in the case of sets, the relation is that of set equality; it is a congruence relation on any collection of sets. If one is attempting to demonstrate that a particular system (5 has properties BI-B4, he will interpret the occurrences of the equality sign in these as the natural equality for 0-L For example, in the verification that ((P(X), f1, ') is a Boolean algebra, " _" will be taken to denote set equality. In summary, the equality symbol, as used in B1-B4 need have no absolute nature, but merely a relative one. It suffices that it stand for some congruence relation. We return to the general discussion with the remark that when one is studying any specific mathematical system (X, - -), there are often compelling reasons for identifying elements of X which are distinct rela-
tive to the natural congruence relation. This amounts to the introduction of an equivalence relation p other than the natural one. One then
directs his attention to X/p, whose elements are the p-equivalence classes, and regards it as the basic set. If p is not merely an equivalence
but a congruence relation, then it is possible to introduce into X/p faithful analogues of whatever operations and relations are defined for X. We proceed to discuss this matter in detail for the case of Boolean algebras.
6.4
1
Congruence Relations for a Boolean Algebra
261
Let (B, (l, ') be a Boolean algebra, and let B be a congruence relation on it; that is, let 0 be an equivalence relation on B such that the following hold.
(C,) Ifa0b, then a()c0b flcfor all c. (C2) Ifa8b, then a'8b'. We shall be concerned solely with proper congruence relations, that is, those congruence relations different from the universal relation on B.
We now derive from (CI) an instance of the earlier substitutivity property (S).
(C3) If a O c and bbd, then a ll b U c f d. For proof, assume that a 0 c and b 0 d. Then a fl b 8 c (l b and b fl c 0 d f1 c, by (Ci). Since the meet operation is commutative and 8 is transitive, the result follows. The derivation of the dual of (C3) is left as an exercise. If B/8 is the set of B-equivalence classes if, then in B/8 the foregoing result (C3) becomes the following.
Ifa=candb=rl,then aflb=cfd. This means that the relation
(((a,b),a flb)jaCB/8andbCB/8; is a function on (B/8) X (BIB) into B/8, that is, an operation in B/8. We shall denote this operation in B/8 by fl and its value at (a, 6) by a () 6. So, by definition,
afb=a(lb.
Next, it follows directly from (C) that if 'a = b, then a = b . Hence, the relation {(a, a )Ja C B/8} is a function on B/e into B/8. We denote this function by ' and its value at a by a'. So, by definition,
a = W. It is a straightforward exercise to verify that (B/8, (l, ') is a Boolean algebra. For example, to verify B3, assume that a fl b' = c f) c'. Then, in turn,
anb cflc, a fT b' = c (1 c ,
by definition of x'; by definition of To fl y;
afb'8cflc',
xeyiffx=y,
(a n b')' 0 (c (1 c')',
by (Ca) ;
Boolean Algebras
262 W U b 0 1,
by property of (B,
(a'Ub) fl a01 (l a,
by (C1) ;
an boa,
CHAP. 6
by property of (B, fl, ');
anb=a, anb=a,
x =yiffxey;
by definition of x fly. In summary, we have shown that from a Boolean algebra (B, f), ')
and a proper congruence relation 0 on it one may derive a Boolean algebra (B/0, f), ') whose elements are 9-equivalence classes and whose
operations are defined in terms of those of the original algebra using representatives of equivalence classes. If 0 is different from the equality relation in B, then the derived algebra may be essentially different from the parent algebra. This is true in the first of the following examples.
EXAMPLES 4.1. Consider the Boolean algebra (D'(Z), fl, ') whose elements are the subsets of Z, the set of integers. J We recall the definition of the symmetric difference, A + B, of two sets as the set of all objects which are in one of A and B
but not both. For A and B in 61(Z) let us define A 0 B to mean that A + B has a finite number of elements. It is easily verified that 0 is an equivalence relation on 6'(Z). Further, if A 0 B, then A f l CO B (-l C, since, for all A, B, and C,
(AfC)+(BfC)=(A+B)() C,
and, hence, if A + B is finite, then so is (A fl C) + (B fl C). Finally, if A 0 B, then A' 0 B', since A + B = A' + B'. Thus, 0 is a proper congruence relation on the given algebra, and a new Boolean algebra whose elements are B-equivalence classes results on defining
3o 3=AflB and 4'=A. That a substantial collapse of elements has taken place on transition from the first to the second algebra is indicated by the fact that, in the first the zero element is 0, whereas in the second the zero element, Qf, is the collection of all finite subsets of Z. 4.2. The symmetric difference operation used in the preceding example can be defined in any Boolean algebra. By the symmetric difference of elements x and y of a Boolean algebra, symbolized x + y, we understand the element
(xflY')U(x'fly). It is an easy exercise to prove that this operation is commutative, associative, and nilpotent (x + x = 0). Other properties which we shall need later are f We prefer to use prime symbols to denote the operation of complementation relative to Z in this example, so the bar symbol will be available to denote equivalence classes.
6.4
I
Congruence Relations for a Boolean Algebra
263
x+0=x, (x+y)nz= (xnz)+(ynz), x'+y'=x+y. Further, since the symmetric difference is defined in terms of union, intersection, and complementation, if' O is a congruence relation on a Boolean algebra, then x 0 y implies that x + z 0 y+ z.
At this point it becomes desirable to simplify our notation by identifying an algebra simply by its basic set. Thus, we shall use the phrase "the Boolean algebra B" in place of "the Boolean algebra (B, n, ')." Let us consider now the relationship of a Boolean algebra B/0 to the algebra B from which B/0 is derived using a proper congruence relation. Let p be the natural mapping (see Section 1.9) on the set B onto the set B/0, that is, the mapping p: 13 -*- B/0, where p(b) = b. Since anb =a-7)_b and 27' = a-7, 1)(a n b) = p(a) n p(b)
and p(a') _ (p(a))'.
That is, p is a "many-to-one" mapping (unless 0 is the equality relation on B) which preserves operations. A mapping g on one Boolean algebra, B, onto another, G, which takes meets into meets and complements into complements, that is, g(a n b) = g(a) n g(b), g(a') = (g(a))', is called a homomorphism of I3 onto C, and C is called a homomorphic
image of B. If, in addition, g is one-to-one, then g is called an isomorphism of B onto C. If g is an isomorphisin of B onto C, then g-' (which exists) is easily proved to be an isomorphism of C. onto B, and each algebra is called an isomorphic image of the other and each is said to be isomorphic to the other. Returning to the case at hand, we may say that p is a hornonrorphisrn and 13/0 is a homomorphic image of B. That is, each proper congruence relation on a Boolean algebra determines a homomorphic image. Conversely, each homomorphic image C of a Boolean algebra B determines a proper congruence relation
on B. Indeed, if f: B -} C is a homomorphism, then the relation 0 defined by a 0 b iff f(a) = f(b) is a proper congruence relation on B. The proof is left as an exercise. We continue by showing that BIB, the algebra of 0-equivalence classes, is isomorphic to C. For this we introduce the relation g, which is defined to be
264
Boolean Algebras
I
CRAP. 6
((x,f(x))Iz C B/0}. It is easily seen that g is a function which maps B/0 onto C in a one-toone fashion and that
g(x n y) = g(x -n Y) = f(x n y) = f(x) n f(y) = g(Y) n g(y), g(x') = g(x`) = f(x') = U(x))' _ (g(X))', that is, g is an isomorphism. Moreover, if p is the natural mapping on B onto B/0, then we observe that for the given homomorphism f : B -- C we have f = g o p. The next theorem summarizes our results.
THEOREM 4.1. Let B be a Boolean algebra and 0 be a proper congruence relation on B. Then the algebra B/0 of 0-equivalence classes is a homomorphic image of B under the natural mapping on B onto B/0. Conversely, if the algebra C is a homomorphic image of B, then C is isomorphic to some B/0. Moreover, if f : B -->- C is the homo-
morphism at hand, then f = g e p, where p is the natural mapping on B onto B/0 and g is an isomorphism of B/0 onto C. It should be clear from the foregoing results that the homomorphisms (onto) of a Boolean algebra are in one-to-one correspondence with the proper con-
gruence relations on the algebra. The importance of the role which proper
congruence relations play suggests the problem of practical ways to generate them. One way is provided by a distinguished type of subset of a Boolean algebra, which we define next. A nonempty subset I of a Boolean algebra B is called an ideal iff
(i) x C I and y Cl imply x U y C 1, and
(ii) x C I and y C Bimply xnyC I. For example, if a C B, then {x C Bl x < a } is an ideal; this is the principal ideal generated by a, symbolized (a). To show that (a) i§, an ideal, we note that if x C (a) and y C (a), then a is an upper bound of {x, y } and, consequently, is greater than or equal to x U y, the least upper bound of x and y (see Theorem 2.2). Thus, x U y C (a). Finally, if x C (a) and y C B, then x n y < a, since x < a. Two trivial ideals of
B, namely, 101 and B, are both principal; indeed, 101 = (0), and B = (1). The ideal (0) is the zero ideal, and the ideal (1) is the unit ideal of B. An ideal of B which is different from B is called a proper ideal. The relationship between, proper ideals of B and proper congruences on B is given in the following theorem.
6.4
I
Congruence Relations for a Boolean Algebra
265
THEOREM 4.2. If 0 is a proper congruence relation on a Boolean algebra B, then I = { x C BIx 0 0 } is a proper ideal of B and x O y if x + y C I. Conversely, if I is a proper ideal of B, then the relation 0
defined by x O y if x + y C I is a proper congruence relation on B such that I = { x C Bjx 0 0). Thus, the proper congruence relations on B are in one-to-one correspondence with the proper ideals of B; each 0 corresponds to the ideal I of elements 0-related to 0. Proof.
Let 0 be a proper congruence relation on B and let I =
{ x C Bjx 0 0 1. Then I C B and, if x, y C I, then, in turn,
x00,
x'01,
x'ny'01 ny',
x'ny'Oy',
xUy0y.
The last fact, when combined with y 0 0, implies that x U y 0 0, which proves that I satisfies the first of the defining conditions for an ideal.
Next, let x C I and y C B. Since x 0 0 implies x n y 0 0, the second condition is also satisfied, and I is an ideal.
We prove next that x 0 y if x + y C I. Let x + y C I; that is, x + y 0 0. Then (x +,y) + y 0 0 + y, and hence x 6 y (where we have used properties of the symmetric difference stated in Example 4.2). Conversely, x 0 y implies that x+ y 0 y+ y; that is, x + y 0 0.
Turning to the converse of the foregoing, let I be an ideal of B and define 0 as stated in the theorem. Then 0 is reflexive (since x + x = 0 C I), symmetric (since x + y = y + x), and transitive (since the symmetric difference of two elements of I is in I). Further, x O y implies that x fl z o y fl z, since if x 0y, then, in turn, x +y C I,
(x + y) n z C I, and (x n z) + (y n z) C I. Finally, x 0 y implies that x' 0 y', since x + y = x' + y'. To complete the proof of the converse we must show that x 0 0 if x E I. This follows from the identity x + 0 = X. From the two preceding theorems there follows the existence of a one-to-one correspondence between the homomorphisms of a Boolean algebra and its proper ideals. If f is a homomorphism of an algebra B
onto an algebra C, the associated ideal I, which is called the kernel off, is the set of all elements of B which f maps onto the zero element of C. If 0 is the congruence relation on B that corresponds to 1, then we will often write
B/I instead of "B/O," and call the algebra so designated (an isomorphic
266
Boolean Algebras
I
CH A P. 6
the quotient algebra of B modulo I. If f is an isomorphism, then 0 is the equality relation on B and I is the zero ideal. Conversely, it is clear that if the kernel of a homomorphismf can be shown to be the zero ideal, then f is an isomorphism. Therefore, a homomorphism is an image of C.)
isomorphism iff its kernel is the zero ideal.
We conclude this section with several general remarks about homomorphisms. Since the operations of union and symmetric difference and the ordering relation are expressible in terms of intersection and complementation, it follows that a homomorphism of a Boolean algebra preserves each of the former. Further, the fact that if f is a homomorphism, then f(a () a') = f(a) l (f(a))', implies that f(O) is the zero element of the image algebra. By a dual argument, f(1) is the unit element of the image algebra.
EXERCISES 4.1. Prove the dual of property (Ca) for a congruence relation 0, namely, (Ca)'
IfaOcandb8d,then aUbBcUd.
4.2. Complete the proof of the assertion in the text that (B/B, is a Boolean algebra if (B, 1, ') is a Boolean algebra and 0 is a proper congruence relation on B. 4.3. Prove that the symmetric difference operation has the properties stated in Example 4.2.
4.4. Prove that if g is an isomorphism of the Boolean algebra B onto the Boolean algebra C, then g-' is an isomorphism of C onto B.
4.5. Prove the assertion prior to Theorem 4.1 that if f : B -- C is a homomorphism, then the relation 0 defined in B by a B b if f (a) = f (b) is a proper congruence relation on B. Further, prove that f = g o p, where g and p are the mappings defined in the text. 4.6. Prove the assertion following Theorem 4.1 that the homomorphisms (onto) of a Boolean algebra B are in one-to-one correspondence with the proper congruence relations on B. 4.7. Draw the diagram of the algebra (t; of all subsets of {a, b, c, d}. Locate the members of the ideal ({a}) on the diagram. Then use the diagram to determine the B-equivalence classes of the relation B corresponding to ({a}) in accord-
ance with Theorem 4.2. Finally, draw the diagram of the algebra a/B. 4.8. In the next section an atom of a Boolean algebra is defined to be nonzero element a such that if b < a, then either b = 0 or b = a. Show that there are no atoms in the Boolean algebra of equivalence classes defined in Example 4.1. 4.9. Referring again to Example 4.1, let A 01 B mean that A 0 B and that 3 is not a member of A + B. Prove that Bt is a congruence relation on P(Z). Determine the atoms of'P(Z)/Bi.
6.5
1
Representations of Boolean Algebras
267
5. Representations of Boolean Algebras The set-theoretical analogue of our second formulation of the theory of Boolean algebras is that of an algebra of sets. Since it was essentially the structure of such a system that motivated the creation of the axiomatic theory under discussion, an obvious representation problem arises : Is every Boolean algebra isomorphic to an algebra of sets? This we can answer in the affirmative. We shall begin with the case where the set B has a finite number of
elements, although our first definition is applicable to any Boolean algebra. An element a of a Boolean algebra is an atom if a 0 and b < a implies that either b = 0 or b = a. For x in B let A(x) denote the set of all atoms such that a < x. We next derive several properties of atoms and of the sets A(x) for the case of an algebra (B, n,') such that B is finite. A,.
If x 0 0, there exists an atom a with a < x.
This is a direct consequence of the finiteness assumption. The details are left as an exercise. Proof.
A2.
If a is an atom and x C B, then exactly one of a < x and
an x= 0 holds. Alternatively, exactly one of a < x and a < x' holds. Proof.
Since a n x< a, either an x= a or a n x= 0. Moreover,
both cannot hold, since a
0.
A(x n y) = A(x) n A(y). P r o o f. First we note that x n y is the meet of two elements in B, and A(x) n A(y) is the set of those elements common to A(x) and A(y). Now, assume that a C A(x n y). Then a< x n y, and hence A3.
a< x and a< y. Thus a E A(x) n A(y). hence A(x n y) 9A(x) n A(y). Reversing the foregoing steps establishes the reverse inequality, and hence equality.
A(x') = A(1) - A(x). Proof. First we note that A(1) is the set of all atoms of B. Now let a C A(x'). Then, by A2, it is false that a C A(x). Hence, a E A(1) A4.
A(x). Conversely, if a E A(1) - A(x), then a V A(x). Hence, by A2,
aEA(x').
Boolean Algebras
268
I
CHAP. 6
A(x) = A(y) iff x = y. Proof. Assume x 0 y. Then at least one of x < y and y < x is false. A5.
Suppose that x < y is false. Then x n y' F6 0, so that by Al there exists an atom a< x n y'. By A3, a C A(x) and a C A(y'). Thus, a C A(x) and, by A4, a V_ A(y). Hence, A(x) 0- A(y). The same conclusion follows similarly if it is assumed that y < x is false. A6.
If al, 612,
, ak are distinct atoms, A(al U a2 U . . . U ak) _
{ai, a2, ..., a&}. Clearly, jai, all,
Proof.
,
2,
U ak). For the con-
ak} C A(ai U a2 U
verse, assume that a CA (a, U a2 U
U ak) and a 5-6 a;, i = 1,
, k. Then, by A2, a n ai = 0, i = 1, 2,
af(aIUall U...Uak)= (anal)U(a(l
, k., and hence a = a2)U...U(af ak) =0,
which is impossible.
THEOREM 5.1. Let B be a Boolean algebra of n elements. Then B is isomorphic to the algebra of all subsets of the set of atoms of B. If m is the number of atoms of B, then n = 2"`. Proof.
Let T be the set of m atoms of B. Then the mapping A :
B -N 6'(T) is one-to-one by A5 and onto 6'(T) by A5. According to A3,
the image of a meet in B is the meet of the corresponding images in 6'(7). According to A4, the image A (x') of x' is the complement of the image of x, that is, the relative complement of A(x) in 7'. Thus, A is an isomorphism.
Then n = 2"' follows from the fact established earlier that the power set of a set of m elements has 2" members.
COROLLARY. Two Boolean algebras with the same finite number of elements are isomorphic.
The proof is left as an exercise.
EXAMPLE 5.1. For B we choose {1, 2, 3, 5, 6, 10, 15, 30}, the set of divisors of 30. For
a and b in B define a n b as the least common multiple of a and b and a' as 30/a. It is an easy matter to verify that (B, n,') is a Boolean algebra. The partial ordering relation introduced for the elements of a Boolean algebra takes the following form for this algebra: a <.b ifFa is a multiple of b. Thus, 30 is the least (and zero) element, and 1 is the greatest (and unit) element of the algebra. The atoms are 6, 10, and 15, and, consequently, the algebra is isomorphic to
6.5
I
Representations of Boolean Algebras
269
that determined by all subsets of {6, 10, 15} with the usual operations. The mapping which establishes this isomorphism matches 2 with {6, 10} and 30 with 0, for example. It is left as an exercise to verify that a U b, which in our second formulation of a Boolean algebra is defined as (a' (1 b')', is the greatest common divisor of a and b. Thus, if at the outset we had introduced in B, along
with the operation (l, a second binary operation U by defining a U b as the greatest common divisor of a and b, the outcome would have been the same. However, in the process we would have had to verify the distributive laws, which, in this case, is not a particularly simple matter.
Before continuing with the representation theory we urge the reader to pause and reflect on the extent to which Theorem 5.1 clarifies the structure of finite Boolean algebras (that is, algebras having a finite number of elements). Indeed, it leaves nothing to be desired in the way of a representation theorem. Possibly its definiteness, both with respect to its arithmetical aspect and the inclusion of an explicit recipe for constructing the asserted isomorphism, will be more fully appreciated when the corresponding result for the infinite case is obtained. For this, a different approach must be supplied, since there exist Boolean algebras
without atoms (see Exercise 4.8). In the infinite case the substitute for an atom is a distinguished type of ideal, which we describe next. Let S be the set of all proper ideals in the Boolean algebra B. Since 10 } C S, it is nonempty. Further, the members of S may be characterized as the ideals of B which do not contain 1. As is true of any collection of sets, S is partially ordered by the inclusion relation, and the concept of a maximal element of S is defined. A maximal element of S is a maximal ideal of B. The existence of maximal ideals in an infinite Boolean algebra is secured by an application of Zorn's lemma.
THEOREM 5.2. Maximal ideals of a Boolean algebra exist. Indeed, there exists a maximal ideal which includes any preassigned proper ideal. Proof. We consider the partially ordered set (S, C) defined above. If a is a simply ordered subset of S, then the union, A say, of the collection e is clearly an upper bound for t°. It is a straightforward exercise to verify that A is an ideal. Moreover, A C S, since 1 appears
in no member of e and, consequently, does not appear in A. Thus, since every chain in S has an upper bound in S, Zorn's lemma may be applied to conclude the existence of a maximal element. The same
argument when applied to (I C SII ? J), where J is a given proper ideal, yields the existence of a maximal element which includes J.
270
Boolean Algebras
!
C H A P. 6
We prove next a sequence of theorems about maximal ideals of a Boolean algebra B which closely parallels that derived earlier for atoms.
1, there exists a maximal ideal P with P (x) or, M,. If x what amounts to the same, x C P. Proof. This follows directly from the final statement of Theorem 5.2, choosing. (x) as the given ideal.
For each maximal ideal P and each element x of B, exactly one of x C P and x' C P holds. Proof. We note first that for no x is x C P and x' C P, since it would then follow that 1 (= x U x') C P, which is impossible. Now assume that x (Z P, and consider the set Q of all elements of B of the form b U p with b< x and p C P. Then Q is an ideal, since (i) (b, U p,) U (b2 U P2) _ (b, U b2) U (p' U P2) = b3 U Pa, and M2.
(ii)
if y C B, then (b U p) n y = (b n y) U (p n y) = b, Up,.
Also, P C Q, since, clearly, P C Q and x C Q, while x (Z P. Thus,
Q = B, since P is maximal. Hence, for some b < x and p C P, b U p = 1. It follows that x U (b U p) = x U 1, or x U p = 1. Then X,
By the second part of the definition of an ideal it follows that x' C P. To continue with the derivation of properties of maximal ideals which parallel, in a complementary sort of way, those for atoms, we introduce the analogue of the sets A (x). If x C B, let M(x) be the set of all maximal ideals P such that x V- P or, what amounts to the same by virtue of M2, x' C P. The sets M(x) have the following properties.
M(x n y) = M(x) n M(y). Proof. Let P C M(x n y). Then (x n y)' = x' U Y, C P. Since x' _ x' n (x' U y') and y' = y' n (x' U y'), it follows that x' C P and y' C P. Hence P C M(x) and P C M(y), or P C M(x) n M(y). Since Ma.
each of these steps is reversible, the asserted equality follows. M4. M(x') = M(1) - M(x), where M(1), is the set of all maximal ideals of the algebra.
Proof. We have P C M(x') iff,x' V- P iff x C P iff P C M(1) - M(x). Mb.
M(x) = M(y) iff x = Y.
6.5
I
Representations of Boolean Algebras
271
Assume x /- y. Then at least one of x < y and y < x is false. It is sufficient to consider the consequences of one of these. Let us say y < x is false. Then x U y' s 1, so there exists a maximal ideal P such that x U Y' C P. Now (x U y')' = x' n y (Z P, and, hence, by M3, P C M(x') and P C M(y), or P (Z M(x) and P C M(y). Thus, M(x) 54 M(y). Proof.
The promised representation theorem follows easily from M,-M5. It is valid for an arbitrary Boolean algebra, but, in view of the more precise
result for finite algebras, it is of interest only in the infinite case. The first proof of this result was given by the American mathematician, Marshall Stone (1936).
THEOREM 5.3. Every Boolean algebra B is isomorphic to an algebra of sets based on the set of all maximal ideals of B. Proof. Let JR denote the collection of all sets of ideals of the form M(x) for some x in B. According to M3 and M,, :l is an algebra of sets. The mapping M: B - - :711 is onto by the definition of mz and one-to-one by M5. Finally, in view of M3 and M1, M is an isomorphism.
With the representation theorem for the finite case in mind, it is natural to ask whether the above result cannot be sharpened to read, "Every Boolean algebra is isomorphic to the algebra of all subsets of some set." To discuss this matter we make two definitions. A Boolean algebra B is called atomic if for each nonzero element b of B there exists an atom a of B with a < b. A Boolean algebra B is called complete if for each nonempty subset A of B, lub A exists relative to the standard partial ordering of B. This definition has significance only when A is infinite, since in any Boolean algebra each pair, and consequently each finite set of elements, has a least upper bound. Now it is clear that the algebra of all subsets of a set is both atomic and complete. It is left as an exercise to prove that each of these properties is preserved
under an isomorphism. Hence, an algebra which fails to have either property cannot be isomorphic to an algebra of all subsets of a set. Since, as noted earlier, the algebra described in Example 4.1 is not atomic, the question in mind is settled in the negative. The same conclusion is provided by the algebra defined in Exercise 2.6, since, as the reader may prove, it is not complete. The above pair of conditions which are necessary in order that a
272
Boolean Algebras
I
CHAP P. 6
Boolean algebra be isomorphic to the algebra of all subsets of a set are also a sufficient set. This is our next theorem.
THEOREM 5.4.
Necessary and sufficient conditions that a Boolean
algebra be isomorphic to the algebra of all subsets of some set are that B be complete and atomic. In this event, B is isomorphic to the algebra of all subsets of its set of atoms. Proof. Since the necessity of these conditions has already been ob-
served, we turn to a proof of their sufficiency. Suppose, therefore, that B is complete and atomic and let 7' be the set of all atoms of B. As in the proof of the finite case, let A(x) denote the set of atoms a for which a < x. Then, exactly as in the finite case, it can be proved that
the mapping A on B into 01(T) has properties A3 and A4 (now, of course, property A, is an assumption). This means that A is a homomorphism on B onto an algebra of subsets of 7'. If U is an arbitrary subset of T, then, by the assumed completeness, U has a least upper bound, u say, in B. Then A(u) = U (this is a generalization of A6 for the finite case), so A is onto U'(7'). All that is needed to complete the proof is to show that A is one-
to-one-that is, that the kernel of A is the zero ideal. This follows
from the atornicity of B; if x P6 0, then A(x) s 0, so A(x) _ 0
ifl'x=0. EXERCISES 5.1. Prove property A, of atoms in a finite Boolean algebra. 5.2. Prove the Corollary to Theorem 5.1. 5.3. Referring to Example 5.1, verify that the set of divisors of 30 determine a Boolean algebra. Verify that in this algebra a U b is the greatest common divisor of a and b. 5.4. Referring again to Example 5.1, show that the set of divisors of any square-free integer determines a Boolean algebra in exactly the same way as does the set of divisors of 30. What does this result imply regarding the number of divisors of a square-free integer? 5.5. (a) Prove the converse of property M2 of maximal ideals to obtain a characterization of maximal ideals among the set of proper ideals. (b) Prove that maximal ideals can also be characterized as those ideals I of a Boolean algebra B such that B/I has just two elements. 5.6. (a) In Exercise 2.6 there is defined the Boolean algebra d of all subsets A of Z} such that either A or A is finite. Prove that the collection e of all finite subsets of _L+ is a maximal ideal of a.
6.6
Statement Calculi as Boolean Algebras
273
(b) The same collection a is an ideal of the algebra tP(Z+). Prove that e
is not a maximal ideal of this algebra and determine a maximal ideal which includes C. 5.7. Devise a proof of Theorem 5.3 for the case of a denumerable Boolean algebra B that does not employ Zorn's lemma. (Hint: Prove by induction that if B is denumerable then there exists a maximal ideal which includes any preassigned ideal.)
5.8. Prove that the Boolean algebra in Exercise 5.6(a) above is not complete by showing that the collection of all unit sets of positive even integers has no least upper bound. 5.9. Prove that an isomorphic image of a complete Boolean algebra is complete and that an isomorphic image of an atomic algebra is atomic. 5.10. Prove that every ideal of a Boolean algebra B is principal iff' B is finite. (Note: The proof that B is finite if every ideal is principal is difficult.)
6. Statement Calculi as Boolean Algebras Statement calculi, as described in Section 4.3, yield models of the theory of Boolean algebras. One need merely restrict his attention to the algebraic character of a statement calculus as we now discuss it. According to Section 4.3, the core of a statement calculus is a nonempty set So of statements. This set is extended to the smallest set S of statements (that is, formulas) such that the negation of each member of S is a member of S and each of the conjunction, disjunction, conditional,
and biconditional of any two members of S is a member of S. Since it was observed that the disjunction, conditional, and biconditional of two statements can be defined in terms of negation and conjunction, we may and shall assume that S is simply the closure of So with respect to these connectives. Then A takes on the role of a binary operation in S and ' (which we shall use as the symbol for negation) that of a unary operation in S.
In order to state precisely the structure of the system (S, A, '), that is, the set S together with its two operations, we must decide on the "natural" congruence relation for it. The obvious choice is the eq relation.
With the adoption of eq as the equality relation on S we assert that (S, A, ') is a Boolean algebra. For proof we note first that eq is a proper congruence relation for the system. Indeed, we already know that it is
an equivalence relation and, using truth tables, it is an easy matter to prove that A eq B implies that (A A C) eq (B A C) and A' eq B'. Moreover, it is a straightforward exercise to verify that Bt-B4 of The-
274
Boolean Algebras
I
CHAP. 6
orem 3.1 are satisfied; that is, (A A B) eq (B A A), and so on. The zero element of the Boolean algebra (S, A, ') is A A A' for any formula A, and the unit element is (A A A')'. Frequently the result which we have obtained is stated as "The statement calculus under the connectives `and' and `not' is a Boolean algebra." This is somewhat misleading, since there is a statement calculus for each set So. Actually, it is only the cardinal number of So that matters; two calculi for which the respective sets of basic statements have the same cardinal number differ only in verbal foliage. Thus, a more accurate assertion, in the sense that it recognizes the existence of different statement calculi and the congruence relation employed, is "A statement calculus under the connectives `and' and `not' is a Boolean algebra with respect to equivalence." The Boolean algebra obtained from a statement calculus by the identification of equivalent formulas will be called the Lindenbaum algebra of that statement calculus. Such algebras are discussed in the next section.
7. Free Boolean Algebras t The preceding section provides the genesis of a method for constructing, in a purely formal way, a Boolean algebra from any nonempty set. This involves the use of congruence relations in a way which extends that described in Section 4. Let us dispose of this matter first. In Section 4 the rough assertion was made that if (X, . ) is a mathematical system and p is a congruence relation for it, then, corresponding
to each operation (or relation) in X, there can be defined in X/p an operation (or relation) having all the properties of the original. (This was stated precisely and proved in the case of a Boolean algebra.) Now it can happen that the resulting system with X/p as basic set has additional properties besides those inherited from the original system. In-
tuitively, this seems quite plausible; if X is collapsed appropriately, irregular behavior present in the original system may be smoothed out in the derived one. An instance of this occurs below; a system which has some requisites of a Boolean algebra is forced into determining one by introducing a suitable congruence relation. The system with which we begin is the abstraction of the most obvious features of an intuitive statement calculus. We proceed with its f In the remainder of this chapter there are several forward references to Section 9.2. A mere perusal of that section will suffice for an understanding of the applications to be made to Boolean algebras.
6.7
I
Free Boolean Algebras
275
definition. Let So be an arbitrary nonempty set and A and ' be two symbols which do not designate elements of So. We give an inductive definition of a set S whose elements are certain finite sequences of eleinents of So U I A, '} together with parentheses. (1) If s C So, then s C S.
(II) If I C S, then (t)' C S. (III) If s, t C S, then (s) A (t) C S. (IV) The only members of S are those resulting from a finite number of applications of (1), (II), and (III).
As a direct consequence of the definition of S we may regard A as a binary operation in S and ' as a unary operation in S. In these formal circumstances the natural congruence relation for the system (S, A, ') is that of elements having identical form. As such, (S, A, ') is surely not a Boolean algebra. Can a congruence relation be defined for the system such that a Boolean algebra will result? On the basis of the discussion in Section 4, necessary and sufficient conditions which such a relation 0
must satisfy are that it be an equivalence relation different from the universal relation on S (the latter requirement reflects the fact that a Boolean algebra has more than one element) and that the following hold for all elements of S.
Ifs0t,then sAu01Auforallu.f Ifs 01, then s' 01'.
(C) sA101As.
sA (t Au)0(sAt) Au. IfsA1'0uAu'forsome u,then sA10S. IfsA10s,then sA1'OuAu'forallu. In defense of our assertion we note that the first two parts of (C) are necessary and sufficient conditions that the operations in S induce opera-
tions in S/0 in a natural way, and the remaining four parts constitute a minimal set of conditions which insure that the resulting system is a Boolean algebra. Parenthetically, we remark that at times, when an equivalence relation satisfying (C) is introduced into (S, A, '), it is more natural to continue with the elements of S (instead of those of S/0) as
the basic objects. This attitude is reflected in referring to the system (S, A, ') as a Boolean algebra with respect to 0. There is no question concerning the existence of equivalence relat Here we begin to follow the usual mathematical conventions of omitting superfluous parentheses.
276
Boolean Algebra.s
I
CH A P. 6
tions satisfying (C), since if members of S are interpreted as truth functions, then, as observed in the preceding section, the eq relation satisfies (C). We consider now the set e of all equivalence relations satisfy-
ing (C) and let µ denote the intersection of the collection e. It is left as an exercise to prove that µ C C and, consequently, is the smallest member of ('0, in the sense that it relates the fewest possible pairs of elements of S. The Boolean algebra S/µ is called the free Boolean algebra generated by So. In this context the word "free" is intended to suggest that the elements of the algebra are as unrestricted as is possible if they are to have the structure of a Boolean algebra. Intuitively this is clear,
since the only relations which have been imposed upon them are a necessary and sufficient set to insure that they do have that structure. There are alternative definitions of a free Boolean algebra that are more exotic; our old-fashioned one has the merit that it simultaneously disposes of the existence of such algebra.-,.
For an application in the next section, we note the relationship of the algebra S/0 determined by an arbitrary member 0 of e to S/µ. Since s µ I implies s 0 1, a 0-equivalence class is a union of ptequivalence-classes. Thus it is possible to define a mapping f on S/µ onto S/0 by {
J
[S]B.
That is, the image of the µ-equivalence class determined by s is the 0-equivalence class determined by s. Clearly, f is a homomorphism onto S/0; for example, the calculation f([s]µ A [1]µ) = f([s A 1]µ) _ [s A 1]e = [s]e A [1]e
shows that f preserves intersections. Since the zero element of S/µ is [u A u'],, for any u in S, the zero element of S/0 is [u A u']e. It is possible to give an interesting characterization of the congruence relation u. To this end we consider the µ-equivalence class `U = [ (u A u')'],, for some u in S. This class is independent of u since it includes all members of S having the same form. This follows from the fact that if 0 C 0;
then (s A s')' 0 (u A u')' for all s in S, and hence (s A s')' µ (u A u')' for all s in S. Since the zero element of S/µ is [u A u'],,, `U is the unit element of S/µ. It is left as an exercise to prove that if s, I E S, then
if
(s A 1')' A (s' A 1)' C `U or, introducing s E-' 1 as an abbreviation for (s A t')' A (s' A t)', sµ1
SJA 1
if SHIC`U.
6.7
I
Free Boolean Algebras
277
This characterization of µ in terms of 't) is opaque until S is interpreted as the set of formulas of a statement calculus. Then it will be recognized that p is to be interpreted as the eq relation and that `U becomes the set of valid formulas of S. Finally, the characterization of u in terms of V is simply the formal version of Theorem 4.3.2 (namely, s eq t if 1= s H t).
The same interpretation of S suggests, as an alternative approach to the definition of the free Boolean algebra generated by So, the introduction of the set co first, followed by the definition of p in terms of U. This is possible using some formulation of a statement calculus as an axiomatic
theory. The starting point is the inductive definition of the set Sin terms of the elements of So U (A, '}, just as before. We now wish to obtain the set `U as that subset of S which, under the interpretation of S, constitutes the tautologies. This is possible using the results of Section 9.2. Introducing s -> I as an abbreviation for (s A t')', we define a subset V of S as follows. (I)
Any member of S that has one of the following three forms is a member of V:
(ss). s -, (t ---> .s),
(u --> (s
((u -' s) -, (u --, l))
(II) If s and t arc members of S such that both- s and s
t are
members of V, then t is a member of V.
(III) An clement of S is a member of v iff it can be accounted for using (I) or (II). The desired conclusion, that `U = V, is then secured via the completeness theorem (Theorem 9.2.3) and its converse (Theorem 9.2.4). In terms of V, the relation µ may now be defined by soul if s - IC V. Although statement calculi served as our inspiration for introducing the concept of a free Boolean algebra, now that the latter concept has been firmly established, we may turn matters around and describe the Lindenbaum algebra of a statement calculus as simply the free Boolean
algebra generated by the set of prime formulas of the calculus in question.
EXERCISES 7.1. Show that the relation µ is a member of C.
7.2. Show that sµl iffsHl C U.
278
Boolean Algebras
I
CH A P. 6
7.3. Investigate the question of whether or not the algebra of all subsets of a set X is a free Boolean algebra.
8. Applications of the Theory of Boolean Algebras to Statement Calculi It is by way of the Lindenbaum algebra of a statement calculus that the techniques and results of the theory of Boolean algebras can be applied to the study of statement calculi. The applications include elegant characterizations of various concepts that arise in the study of statement calculi and simple proofs of important metatheorems, as we shall show in this and the next section. We begin by analyzing the theory of deducibility for statement calculi in terms of the theory of Boolean algebras. The first step is to obtain a characterization of the algebraic structure of a statement calculus when a set of formulas is singled out to serve as a set of assumptions. For this let us consider the formal analogue of a statement calculus as described
in the preceding section; that is, let us consider the system (S, A, ') generated by the set So. In it we imitate the designation of a set of formulas of a statement calculus as a set of assumptions by selecting a subset r of S and adjoining to the set (C) of conditions given earlier one of the form
a0(u Au')' for each element a of r. Here u is any member of S. (Notice that the interpretation of this condition is that a is "true.") Let (Cr) denote the resulting set of conditions and e, denote the set of all equivalence relations on S which satisfy (Cr). Further, let µr denote the intersection of er. Then '.cr C C and, indeed, is its least member. Each µr-equivalence class is the union of µ-equivalence classes. In particular, the µr-equivalence class Dr, let us call it, which includes `u, also includes r
and, hence, each is-equivalence class of the form [a],, with a c r. Assuming that there are at least two µr-equivalence classes, the system S/µr is a Boolean algebra and Vr is its unit element. According to an observation made in the preceding section, S/µr is a quotient algebra
of S/µ. Using the characterization given in Theorem 4.2 of the congruence relation which is determined by the associated homomorphism,
we conclude that s A r t iff s + I C [u A u'],,,; that is, if (s + t)' C ` In turn, this condition translates into
sµrI if sue-I C'or,
i..
6.8
1
Applications to Statement Calculi
279
which generalizes the earlier characterization of µ as s µ tiffs .-* I Cl). Before continuing we note that U,. has the following closure properties.
(i) Ifs,IC`U,.,then sAICU,.. (ii) If s E `U, and I C S, then s V I E 0r To prove (i) observe that ifs, t C 'U,., then s p,. (u A u')' and I m,. (u A u')', so s A t µ,. (u A u')'. To prove (ii) let s C. 'U,. and I C S be given. Then, in turn, s µr (t A i')', s' At, t A t', s' A I'm,, I A I', and (s' A I')' fur (t A t')', which means that s V I C `or.
We continue with our generalization of the results of Section 7 by showing that it is possible to reach 'U,, independently of µ,, and then define µr in terms of '0r. To accomplish this we define the subset Vr of S
by modifying part (I) of the earlier definition of V to include I' in Vr. Then it is clear that V,, may be characterized as the smallest subset of S that includes V and r and contains the clement t whenever it contains s
and s - t for some s. On the other hand, v,., as we have noticed, includes V (_ 'o) and F. Further, if s and s -, I (that is, s' V t) are in v,., then so is t V (s A (s' V t)), according to the closure properties which we derived for 'U,.. A calculation shows that l V (s A (s' V t)) µ,. 1, so we may conclude that if s, s --. l C '0,., then t C `o,.. Finally, in view of the minimality of µr (in terms of which 'U,. was defined), we conclude
that U,, has exactly the same characterization as does V,.. Thereby we infer that U,. = V. It follows that µ,, may be defined (or, characterized, at one's preference) as sJ.crt
ilf
V.
Now let us interpret the foregoing from the standpoint of the statement calculus. If we regard (S, A, ') as a statement calculus, then the role of r is that of a set of premises. Under this circumstance, the free Boolean algebra S/µ (the Lindenbaum algebra of the calculus) is supplanted by the quotient algebra S/µ,, and the set V of provable formulas is enlarged to V,., the set of all formulas which are deducible from F. The set Vr, which is the unit element of S/µ,., may be described as the smallest set which includes V and r and is closed under modus ponens. The above characterization of µ,. in terms of V,. amounts to this: Two formulas are in the same member of S/µ,. if each is deducible from the other relative
to 1' as a set of assumptions. Finally, we note that a necessary and sufficient conditions that Sly,, be a Boolean algebra (an assumption which we have made) is that I' be a consistent set of formulas. Further insight into the notion of provability and the nature of so-
280
Boolean Algebras
!
CHAP. 6
called deductive systems at the statement calculus level can be obtained
by reversing our point of view. For this our starting point is the consideration of a Boolean algebra (B, f1, ') whose elements are to be thought of, intuitively, as the statements of some theory. Further, assume that P is a specified noncmpty subset of B whose elements are to be regarded as the provable statements of that theory. With this interpretation of P in mind, it is reasonable to make the following assump.
tions about P. If s and t arc members of P, then so is s n t (that is, "s and t") and, if s is in P, then so is s U t (that is, "s or t") for any choice of t. Nonempty subsets of a Boolean algebra which satisfy these conditions are called filters. That is, a nonempty subset F of a Boolean
algebra B is called a filter if (i) x C F and y C F imply x f1 y C F, and
(ii) x C F and y C Bimply xUyCF. Before considering the set P as a filter we discuss a few properties of filters.
Since the defining conditions of a filter arc the duals of those for an ideal of a Boolean algebra, the term dual ideal is often used in place of filter. Filters and ideals occur in dual pairs. The pairing is easy to describe: if I is an ideal of B, then Ix C Bjx' E I} is a filter; if F is a filter, then (x C Bjx' C F} is an ideal, as is easily proved. 'T'his pairing provides a bridge for transferring observations about ideals to filters. For example, both B and (1) are filters of B. Again, if a C B, then {x E Bjx > a} is a filter; this is the principal filter generated by a. A filter of 13 which is different from 13 is called a proper filter. A proper filter may be characterized as a filter which does not contain 0. A maximal member (with respect to inclusion) of the set of proper filters of B
is called a maximal filter. For example, in the Boolean algebra of all subsets of a nonempty set A, the collection of all those subsets of A that contain a fixed element of A is a maximal filter. The dual of the earlier proof, that if M is a maximal ideal of a Boolean algebra B and x C 13, then exactly one of x and x' is in M, yields the same conclusion about maximal filters. Proofs of the foregoing assertions are left as exercises. Finally, it is left as an exercise to prove that a filter F of B may be characterized as a subset of 13 such that 1 C F and, if x, x' U y E F, then y C F. Introducing x -4y as an abbreviation for x' U y, the latter condition may be rewritten as: if x, x --+ y E F, then y E F. We return to the discussion that we began by defining a Boolean logic
6.8
1
Applications to Statement Calculi
281
to be an ordered pair (Z, P), where 8 = (B, n, ') is a Boolean algebra and P is a filter of the algebra. The elements of B will be called statements
and those of the filter P will be called provable statements. We shall abbreviate "s is in P" by "l- s." As the first logical concept that we shall introduce into a Boolean logic, we choose that of consistency. A Boolean logic (58, P) is called consistent if for no s in B both s and s' belong to P. Since P is a filter, (58, P) is consistent if P is a proper filter. Next, let us call (S$, P) negation complete if for every s in B, either s or s' is provable. We contend that (58, P) is negation complete and consistent
iff p is a maximal filter. For the proof assume first that the logic is consistent and negation complete. Consistency implies that P is a proper filter and hence has a chance to be maximal. To show that it is maximal,
suppose that Q is a filter which properly includes P, and let s be a member of Q that is not in P. Negation completeness implies that /C P and hence that s' C Q. But s and s' in Q imply that 0 = s n s' E Q, which means that Q = B. The converse is an immediate consequence of an earlier remark that for each element x of a Boolean algebra exactly one of x and x' belongs to a maximal filter. We state our result as the next theorem.
THEOREM 8.1. A Boolean logic (53, P) is consistent and negation complete if P is a maximal filter of Z. The next logical notion that we discuss for a Boolean logic ($i, P) is that of deducibility. If r is a subset of B, then we shall say that a statement s of B is deducible from I' if there exists a finite sequence ul, u2,
-,u
of statements of B such that u,, is s and if for each i, I < i < n, either u, is in I' or P or there exist j < i and k < i such that Uk is u, --' u:. Since P is a filter we know that 1 C P and y C P whenever x, x --+y C P. It
follows that P satisfies the axioms of a statement calculus [that is, conditions (I) and (II) for V in the preceding section] and, hence, the deduction theorem (Theorem 9.2.1) in the form proved for the statement calculus is available. In the present context we may state it in the following form: If I' C B, then s is deducible from r ii' there exists a finite subset { r,, r2j
,
rk) of r such that i-- r, n r2 n
n rk - s. We shall
denote the set of statements deducible from I' by r; of course, r depends on both r and the choice of P.
THEOREM 8.2. The set r of statements deducible from r is the smallest filter that includes both I' and P.
282
Boolean Algebras
I
CHAP P. 6
Clearly this is true if r is the empty set, since then P = P. Suppose that r is not empty and let Q be any filter that includes r and P. If s C F, then there exist elements r,, r2, , rk of r such that I- r, n r2 Proof.
-
n . . n rk - s, by the deduction theorem. Hence r, n r2 n rk C Q and rin r2 n
n
n rk -+ s C Q, so s C Q. Thus we have proved that every filter which includes r and P also includes P. It remains to prove that P is a filter which includes both r and P. This is left as an exercise. We shall call a subset A of B a deductive system if if includes A. By the
previous theorem, A C 0, so A is a deductive system if A = 0 and this implies that A is a filter including P. Conversely, if A is a filter that
includes P, then A = 0, by the same theorem. Thus, the notion of deductive systems coincides with that of filters that include P.
EXERCISES 8.1. Show that the relation µ,, is a member of Cr. 8.2. Show in detail that sµrt ifP s H I C u r. 8.3. Write an expanded version (supplying all proofs) of the paragraph in which V,. is defined and the result that tir = Vr, is obtained. 8.4. Prove the assertion in the text that two formulas are in the same member of S/µr if each is deducible from the other relative to r as a set of assumptions. 8.5. Prove the assertion in the text that S/µi is a Boolean algebra iT r is a consistent set of formulas.
8.6. Show that a proper filter may be characterized as a filter which does not contain 0. 8.7. Show that a filter F of B may be characterized as a subset of B such
that1 CFandx,x'UyCFimply that yCF. 8.8. Show that a maximal filter can be characterized as a filter such that for each x exactly one of x and x' is in it. 8.9. Rewrite the proof of Theorem 5.3, using filters in place of ideals. 8.10. Complete the proof of Theorem 8.2.
9. Further Interconnections between Boolean Algebras and Statement Calculi The two-element set IT, F} determines a Boolean algebra having T as unit element and F as zero element. By a two-valued homomorphism of a Boolean algebra B we shall mean any homomorphism of B onto a two-element Boolean algebra. Since all two-element Boolean algebras are isomorphic, we may always use IT, F} in considering a two-valued
6.9
I
Further Interconnections with Statement Calculi
283
homomorphism of B and, thereby, regard such a homomorphism as providing a "truth-valuation" of the elements of B. There is a natural one-to-one correspondence between the set of maximal ideals and that of maximal filters and between each of these and the set of two-valued homomorphisms of B. In fact, if I is a maximal
ideal of B, then the dual of I (that is, the set of all a' where a E I) is a maximal filter and the formula (1) v(b) _ {T
if b (Z I
defines a two-valued homomorphism of B. Similarly, if F is a maximal filter of B, then the dual of F (that is, the set I of all a' such that a E F) is a maximal ideal and (1) defines a two-valued homomorphism corresponding to F. On the other hand, if v is a two-valued homomorphism of B, then the set I = {b C Blv(b) = F} is a maximal ideal and the set
F = {b C Blv(b) = Ti is a maximal filter dual to I. By virtue of these natural correspondences, the following assertions are equivalent to each other.
(2) For every proper ideal I there exists a maximal ideal which includes I.
(3) For every proper filter F there exists a maximal filter which includes F.
(4) For every proper ideal I [proper filter F] there exists a two-valued homomorphism v such that v(b) = F for b C I [v(b) = T for b C F].
Now (2) is simply our Theorem 5.2, so the validity of (3) and (4) then follow. As an application of the foregoing we analyze the nature of truth-value
assignments to the formulas of a statement calculus. If So is the set of prime formulas of the statement calculus (S, A, '), then an assignment of truth values to the elements of S amounts to the extension of a given mapping on So into IT, F} to one on S onto IT, F}, in accordance with the inductive definition given in Section 4.3. Thereby it is insured that equivalent formulas are assigned the same value. Hence, the extended
284
Boolean Algebras
I
CH A F. 6
mapping may be construed as a mapping v on the Lindenbaum algebra (S/µ, A,') onto IT, F), and the definition of v implies that it is a twovalued homomorphism of the. Lindenbaum algebra. The kernel of v is the maximal ideal which is related to v in the natural correspondence mentioned above. On the other hand, any two-valued homomorphism
v of (S/µ, A, ') (regarded as simply a free algebra) yields a truthvaluation of the elements of S/µ and Hence of the elements of S upon assignment of T or F to a formula according as the u-equivalence class to which it belongs is assigned T or F. It is easily shown that this is a truth-value assignment in the sense of Section 4.3. In summary, truthvalue assignments to the formulas of a statement calculus coincide with two-valued homomorphisms of the Lindenbaum algebra of the calculus. Furthermore, the existence of truth value assignments to a statement calculus is insured by the existence of maximal ideals in a Boolean algebra, and conversely.
The existence of maximal ideals that include a preassigned proper ideal of a Boolean algebra also insures the existence of an isomorphic
image of the algebra in the form of an algebra of sets. Indeed, the existence of such maximal ideals is the basis for the proof of Stone's representation theorem! Conversely, from the assumption that (5) For every Boolean algebra there is an isomorphic algebra of sets.
may be inferred the existence of maximal ideals in Boolean algebras. This result, which is also due to Stone (1936), follows immediately from the existence of maximal ideals in an algebra of sets. To prove this, in turn, let us consider an algebra a of sets based on 11. Let V be any subset of U and let a(V) be the collection of all elements of a which
are included in V. Then it is possible to prove that a(V) is an ideal of a and that a(V) is a maximal ideal of a if U - V has exactly one member. Since the proof makes an interesting exercise, we shall allow the reader to carry this out. The completeness theorem (Theorem 9.2.3) for the statement calculus
can also be obtained from the theorem on the existence of maximal ideals, and hence filters, in a Boolean algebra. To show this let us consider a statement calculus e _ (S, A, ') and its Lindenbaum algebra 21 = (S/,u, A, '). In Section 9.2 we prove that the completeness theorem
for e is cc,u
6.9
I
Further Interconnections with Statement Calculi
285
assignments to elements of a statement calculus, the satisfiability of s corresponds to the existence of a two-valued homomorphism v of ?l such that v([sj,,) = T. Thus, the completeness theorem may be translated into the following form: For any nonzero element a of 21 there exists a two-valued homomorphism v of ?1 such that v(a) = T. An equivalent statement, which results upon considering the principal filter generated by a and then the equivalence of propositions (3) and (4), is: Each nonzero element of 21 is a member of a maximal filter of W. It is this proposition which we shall take as the Boolean translation of the completeness theorem. Then the completeness theorem follows immediately from the theorem on the existence of maximal filters. We note that this derivation of the completeness theorem does not involve any restriction on the cardinality of the set of primitive symbols of the statement
calculus. In particular, therefore, the set of primitive symbols may be assumed to be uncountable. Conversely, the existence of maximal filters can be deduced directly from the completeness theorem formulated in a stronger form. To be precise, we can prove the equivalence of the existence of maximal filters
and the strong completeness theorem for the statement calculus (with no restrictions on the cardinality of the set of primitive symbols). For this we use the fact (see Section 9.2) that the strong completeness theorem for t is equivalent to the proposition that (6) Every consistent set of formulas is simultaneously satisfiable. Now assume that r is a consistent set of formulas of CS and let V, denote
the set of all formulas which are deducible from r. Then V,./µ is a proper filter of $l, as we shall show. To prove that
is a filter we
use the characterization of a filter given earlier as a subset I' of a Boolean
algebra B such that (i) 1 C F and (ii) if a and a -' b are in F, then so is b. In the case at hand, (i) is satisfied because the set of theorems of S is included in Vr, and (ii) is satisfied because V, is closed under modus is a proper filter. ponens. Finally, the consistency of IF implies that Next, analyzing the satisfiability of I' as we did above for the case of a single formula, we infer that as the Boolean translation of the strong completeness theorem we may take the statement
(7) Every proper filter of the Lindenbaum algebra of a statement calculus is included in a maximal filter.
Since (7) is a special case of (3), to prove the equivalence of (7) and (3) it must only be shown that (7) implies (3). For this let B be a
Boolean Algebras
286
I
CHAP. 6
Boolean algebra and I be some proper ideal of B. We now form the statement calculus (s, A, ') generated by a set So whose members p= are in one-to-one correspondence with the elements x of B. Now consider the mapping f on S onto B given by the following inductive definition:
AM =x,
for all s in S, As') = U(s)) ', f (s. A t) = f (s) A f (t), for all s and tin S.
It is seen immediately that if t is a theorem of S, then f(t) = 1 and for all s and tin S, f(s t) = I iff f(s) = f(t). These facts imply (recalling Section 7) that if sit 1, then f (s) = f (l). Hence, f induces a mapping g on S/µ, the Lindenbaum algebra of S, onto B. Clearly, g is a homomorphism onto B, so B is isomorphic to a quotient algebra (S/ic)/J. Now let K denote the counterimage in S/µ of the given proper ideal I of B. Then K 2 J since I includes 10). From our assumption (7) follows the existence of a maximal ideal M that includes K, and consequently J.
Now M, as an ideal, is a Boolean algebra and J is an ideal of this algebra. It is left as an exercise to prove that M/J is a maximal ideal of (S/µ)/J. But then the isomorphic image of M/J in B is a maximal ideal of B that includes I. This shows that (2), and hence its equivalent (3), holds. From the results which have been obtained it is clear that the state-
ments (2)-(5) about Boolean algebras are equivalent to each other. Moreover, the equivalence of each pair has been established without recourse to the axiom of choice. On the other hand, all known proofs of (2), for example, are based upon the axiom of choice or an equivalent
principle of set theory. A problem arises as to whether (2) is really. dependent on the axiom of choice. This problem has been responsible for the derivation (without use of the axiom of choice) of a great variety of statements about Boolean algebras which are equivalent to (2) and, also, the investigation of specialized forms of the axiom of choice which are consequences of (2). The most comprehensive treatment of these' matters to date is due to J. Loi and C. Ryll-Nardzewski (1954-1955). The strongest result which they found is that (2) implies the axiom of. choice for the case of a collection of nonempty finite sets. The question as to whether the axiom of choice is independent of (2) is as yet unsolved; the evidence suggests that the answer is in the affirmativej f (Added in proof.) It has just come to my attention that a further contribution to thit matter appears in J. D. Halpern (1961). There it is asserted that in certain models of sdi theories (2) is true but the axiom of choice is not.
Bibliographical Notes
287
We conclude by remarking that the demonstration of the strong completeness theorem [in the form (6) ] for the statement calculus is not the end of the applications of Boolean methods to mathematical logic.
Many fundamental theorems about the predicate calculus and about first-order theories can be easily proved by applying Boolean methods to appropriate "Lindenbaum algebras" associated with such theories.
An outline of such applications appears in R. Sikorski (1960). EXERCISES
9.1. Show that if v is a two-valued homomorphism of the Lindenbaum algebra of a statement calculus, then it provides truth value assignments to the elements of the statement calculus in the sense of Section 4.3. 9.2. Complete the proof of the result that (5) implies (2). 9.3. Show in detail that we may take (7) as the Boolean translation of the strong completeness theorem for a statement calculus. 9.4. Fill in the details of the proof in the text that (7) implies (3).
BIBLIOGRAPHICAL NOTES Sections 1-3. An introductory account of Boolean algebras appears in E. R. Stabler (1953). A more sophisticated treatment is to be found in P. C. Rosenbloorrr (1950). A high-level, modern treatment of the theory, which treats Boolean algebras primarily from the standpoint of a generalization of algebras of sets, has been given by R. Sikorski (1960). Another high-level account, which
places more emphasis on the algebraic structure of the theory, appears in G. Birkhoff (1948). There exists a great variety of formulations of the theory of Boolean algebras. The book by Sikorski lists references to many of these. The
axioms introduced in Section 3 are due to L. Byrne (1946). The same set is used by Rosenbloom in his book.
Section 4. A discussion of congruence relations for Boolean algebras appears in Rosenbloorrr (1950). Congruence relations for algebraic systems in general are discussed in Section 8.1 of this book. For a more comprehensive
treatment of ideals, homomorphism, and so on, Sikorski (1960) should be consulted.
Section 5. An exhaustive treatment of representations of Boolean algebras by algebras of sets along with related topics is given in Stone (1936, 1937, 1938).
The fundamental representation theorem and the theorem on the existence of maximal ideals have been the subjects of many papers. References to such papers and a concise presentation of Stone's work appear in Sikorski (1960). Sections 6-7. An expository account of the subject material of these sections appears in P. R. Halmos (1956). This paper also gives an introduction .to polyadic algebras, which stand in the same relationship to the pure predicate
288
Boolean Algebras
I
CHAP. 6
calculus of first order as do Boolean algebras to the statement calculus. Rosenbloom (1950) also discusses some of these topics. Section 9. The application of Boolean methods to mathematical logic was the subject of many papers in the early 1950's. Many of these papers were published in Fundamenta Mathemalicae. Exact references are given in Sikorski's book.
CHAPTER /
Informal Axiomatic Set theory
THE ANTINOMIES OF INTUITIVE set theory pose the problem of pro-
viding a theory of sets which is free of contradictions. The analysis of the well-known antinomies (Section 2.11) for the purpose of determining possible fallacies in methods of constructing and reasoning about sets-methods which had seemed convincing before they were found to generate contradictions-has led to several reconstruction of set theory along axiomatic lines. This chapter is devoted to outlining that one known as Zermelo-Fraenkel set theory [although it would be more appropriate to call it Zermelo-Fraenkel-Skolem set theory, since it is the
theory of E. Zermelo (1908) as modified by both A. Fraenkel and T. Skolcm]. In the last section fleeting contact is made with the other axiomatization of set theory with which mathematicians feel comfortable-the von Neumann-Bernays-Godel theory. Since that part of Zermclo-Fraenkel set theory which reconstructs the theory of Chapter 1 and Chapter 2, up to cardinal numbers, closely parallels the earlier intuitive development, we shall, so to speak, merely provide the axiomatic underpinnings for it. Then, for Cantor's theory of transfinite arithmetic, we substitute the theory of ordinal and cardinal numbers due to von Neumann.
1. The Axioms of Extension and Set Formation The recipe in Section 5.2 for presenting an informal theory cannot be used here since it calls for a "general theory of sets" as an ingredient. An obvious alternative, which we shall adopt, is to presuppose only a system of logic. As the primitive notions of Zermelo-Frankel set theory, which we shall symbolize by e, we take set and (the 2-place predicate) membership. We shall denote the relation of membership by "C" and, at the outset, denote sets by lower-case letters. Before describing the prime 289
290
Informal Axiomatic Set Theory
I
CHAP. 7
formulas of CS a decision must be reached as to whether the relation of equality shall be taken as part of the underlying logic or introduced as a defined relation of the theory; either is possible. In Example 4.7.1 the latter point of view is adopted. Here we elect the former viewpoint. This is in keeping with the procedure in Zermelo (1908). With the equality relation included in the underlying logic it is possible, in an interpretation of the theory, to admit nonsets (that is, objects which, like the empty set, have no members but are distinct from the empty set) in the domain of the relation assigned to E. [Such objects are commonly called individuals; Suppes (1960) and Fraenkel-BarHillel (1958) discuss this matter. ] Although we intend that in the theory which we shall formulate all variables shall denote sets, initially we shall suggest a possible distinction between sets and objects which may be members of sets by using "a," "b," to denote the former and to denote the latter. "x," "y,"
With equality included as part of the system of logic, the prime formulas of G have the form (1)
xCa
or the form (2)
a=b.
The first of these we shall read as "x is an element of a" or "x is contained
in a." For a precise definition of a (composite) formula of Cs, we now refer the reader to the beginning of Section 4.7. However, in order to avoid completely any illusion that we are setting up a formal theory, the only symbolism that we shall employ in writing formulas is of the sort displayed in (1) and (2), along with
xQa and a0b for "not (x E a)" and "not (a = b)," respectively. Thus, we shall not use the symbolism of the predicate calculus but, instead, the (meaning-
ful) English equivalents of connectives and quantifiers. In harmony with this agreement, we shall use the word "sentence" in place of the
word "formula." In particular, a formula (in the technical sense) which contains a free occurrence of x will be called a "condition on x" or a "property of x" and symbolized A(x).
A statement (in the technical sense) we take to be true or false, since we assume that each prime formula is either true or false.
7.1
I
The Axioms of Extension and Set Formation
291
This completes our description of the ground rules. We proceed with our first two axioms. (ZF1) (Axiom of extension). If a and b are sets and if, for all x, x E a ifl' x E b, then a = b.
(ZF2) (Axiom schema of subsets). For any set a there exists a set b such that, for all x, x E b if x E a and A(x). Here, A(x) is any condition on x which (considered as a formula in the technical sense) contains no free occurrence of b.
In contrast to (ZF1), which is a statement, (ZF2) is an infinite collection of statements. That is, it is a scheme for producing axioms, one for each choice of A(x). This accounts for (ZF2) being called an axiom schema. As in intuitive set theory, to indicate the way b is obtained from a and A(x) we shall write b = (x E aIA(x) }.
It is an immediate consequence of (ZF1) that the axiom schema of subsets determines b uniquely. The usage of the term "subset" here anticipates the introduction of a C b (read : a is a subset of b, or, a is included in b) as an abbreviation for "all x, if x C a then x C b "
At this point we might derive familiar properties of the inclusion relation and continue with the definition of proper inclusion and properties of this relation. Both here and subsequently, when we have car-
ried a notion or topic belonging to the general set theory of Chapters 1-3 to a point where the earlier definitions and proofs are applicable,
we shall drop the matter. Our emphasis will be directed principally toward notions and procedures of intuitive set theory which apparently cannot be carried out within the axiomatic framework.
Our first illustration of the last remark can be given now. It is clear that (ZF2) is a substitute for the intuitive' principle of abstraction (Sec-
tion 1.2) and that (ZF2) is more restrictive in this respect. Whereas the earlier principle provides a set for each condition or property, the present version only provides the existence of a set corresponding to a condition and which is a subset of an existing set. With this restric-
tive feature, Russell's paradox cannot be reconstructed, so far as we know. What can be produced by imitating the earlier argument is the following. According to (ZF2), with A(x) as x (4 x, for any set a, if b = (xEajx(Z x), then, for ally,
292 (3)
Informal Axiomatic Set 77teory
I
CRAP. 7
yCbiffy Caand y(Zy.
It follows that b V a. The proof is by contradiction. Assume that b C a.
Now either b C b or b V b. If b C b, then in view of our assumption and (3), we have b (Z b and hence a contradiction. If b V b, then this and our assumption yield, in view of (3), b C b, a contradiction. The as-
sumption that b C a having led to a contradiction, we may conclude that b (Z a. Since the set a was unspecified in reaching this result, we infer that there is no set that contains every set. In Halmos (1960) this is paraphrased as "nothing contains everything." The axiom schema of subsets is often referred to by its German name Axiom der Aussonderung (axiom of "singling out" or "separation"). This name is suggestive since it does permit us to single out or separate off those elements of a given set which satisfy some condition and form the
set consisting of just those elements. Incidentally, this axiom schema may be considered as characterizing Zermelo's attitude with regard to a reconstruction of set theory which avoids the classical antinomies. His analysis of these contradictions led him to conclude that they resulted from the admission into intuitive set theory of "too large" sets. This led him to limit severely, by means of axioms, allowable methods of forming sets from existing sets and, in addition, to modify the principle that every condition determines a set.
2. The Axiom of Pairing The goal of anyone who aspires to axiomatize set theory has already been mentioned: To create a consistent theory within which as much as possible of the general set theory of Chapters 1-3 can be developed and, if a proof of consistency is not within reach, to incorporate ade-
quate safeguards to insure that the classical antinomies cannot be derived. Axiom schema (ZF2) has both a constructive as well as a restrictive quality, the latter evidencing itself in its conditional nature. In order to imitate the intuitive set theory of Chapter 1, further means of constructing sets from existing sets must be introduced. The next three axioms are in this category. In this section we introduce one of them.
(ZF3) (Axiom of pairing). If a and b are sets, then there exists a set c such that a C c and bCc.
7.2
!
293
The Axiom of Pairing
Using the instance of (ZF2) obtained by taking A(x) to be "x = a or x = b" and c to be a set such that a C c and b C c, we infer the existence of the set
{xCclx=aorx=b}.
Clearly this set contains just a and b and (ZF1) implies there is only one such set. We shall denote it by the symbol {a, b}
and call it the (unordered) pair formed by the sets a and b. As is easily shown, an equivalent formulation of (ZF3) is the statement that for sets a and b there exists a set c such that x C c iff x = a or x = b. If we take A(x) to be condition "x = a or x = b," the foregoing
remark means that we may express (ZF3) as: There exists a set d such that x E d ifl'A(x). Now (ZF2), applied to a set c, asserts the existence of a set d such that (1)
(2)
xCdiff(xEcandA(x)).
Comparing (1) with (2) may suggest that (1) is a special case of (2) and, in turn, that (ZF3) is superfluous. This reasoning is spurious; for it is only when the existence of a set which contains a and b is assured that (2) yields (1), and it is precisely (ZF3) which gives this assurance. With the notation of intuitive set theory in mind, it seems natural to denote the set d described in (1) by {xIA(x) } ; that is, to write
{a, b} _ {xJx =aorx = b}. Henceforth we shall use this symbolism when it is convenient and permissible. That is, if A(x) is a condition on x such that those x's which A(x) specifies do constitute a set, then we may denote that set by {xIA(x) } .
With this convention we may rewrite {x C aIA(x) }, where a is a set, as {xJx E a and A(x) }, but we shall not do so since the latter denotation is longer than the former. If a is a set, we may form the pair {a, a). This set we denote by {a}
and call the unit set of a. As an illustration of the notation agreed upon in the preceding paragraph, we may write
{a} = {xIx = a).
294
Informal Axiomatic Set Theory
I
CHAP P. 7
The specialization of (ZF3) which yields the unit set of a set insures that every set is an element of some set and (ZF3) in its general form insures that any two sets are elements of some one set. Thereby, given a set a, it becomes possible to manufacture a variety of sets such as (a (a, (a) }, ((a}, ((a} 11, and so on.
3. The Axioms of Union and Power Set None of the axioms up to this point assert the existence of any sets. It will prove to be expeditious to anticipate a later axiom which does this (and more), by introducing as a temporary axiom: there exists a set. Then we can establish the existence of a set without elements. Indeed,
let a be a set and take A(x) to be "x 9-6 x." Then, according to (ZF2), there exists the set (x C ajx s x}.
This (uniquely determined) set has no elements. We shall call it the empty set and adopt the familiar symbol
0 for it. We now turn to the first business of this section by observing that if c is a nonempty set, that is, if c 0, then there exists a set a such that
x C a if x C y for every y which is a member of c. In other words, for each nonempty set there exists a set that contains exactly those elements that belong to every member (set) of the given set. To prove this assertion, let b be any member (set) of c and define
a = (xCbiforally (ifyCc,thenxCy)}. The set a is independent of the element b since it is easily shown that a = (xj for ally (if y E c, then x E y) }. The set a is called the intersection of c. For a discussion of the notation used for intersections we refer the reader to Section 1.10. Here we shall only call attention to the notation
allb, where a and b are sets, for the set defined by
alb=Ix CaixEb}.
7.3
I
The Axioms of Union and Power Set
295
Since x C a (1 b iff' x C a and x C b, it follows that
an b = {xjxCaandxcb}. In contrast to the situation for intersection, we require a further axiom to be able to produce in CS the notion of the union of a set. The following is a generous form of the necessary axiom. (ZF4) (Axiom of union). For every set c there exists a set a such that if x C b for some member b of c, then x C a. If c is a set and a is a set of the kind specified in (ZF4), then we may apply (ZF2) to form the set
{xCal for some y(xCyandyCc)}. Clearly, for all x, x is contained in this set, which we call the union of c, if x is an element of an element of c. We may then write the union of c as
{xj for somey (xCy and y E: c)). The notation a U b, where a and b are sets, will be used for the union of the set {a, b}. By virtue of the definition of the union of a set, x C a U b if x is a member of a or x is a member of b. Thus
aUb= {xjxCaorxCb}. For a discussion of the notation used for unions we again refer the reader to Section 1.10. With the aid of (ZF4) it is possible to generalize pairs. For instance,
the (unordered) triple formed by sets a, b, and c, symbolized {a, b, c},
may be defined by
(a, b, c) = ({a) U {b}) U {c}. Then it follows easily that (a, b, c) = {xjx = a or x = b or x = c}. The extension of the notation and terminology to the case of further terms is clear. It is now possible to introduce, for sets a and b the relative complement of b in a as the set a - b, defined by
a-6= {xCajxiZ b},
296
Informal Axiomatic Set Theory
I
C H A P. 7
and in turn, the symmetric difference of a and b as the set a + b, defined by
a + b = (a-b)U(b-a).
At this point it is possible to derive all the results listed in Chapter 1 concerning properties of union, intersection, relative complement, and symmetric difference, including their interrelations. To complete the reconstruction of the intuitive theory of Chapter 1 within C5, we need the theory of relations, for which the starting point is the notion of an ordered pair. Since the (unordered) pair formed by two sets as well as the unit set of a set can be constructed, the ordered pair of sets a and b (with first coordinate a and second coordinate b) can be introduced as the set (a, b), defined by (a, b) = ( (a}, {a, b} },
just as in Chapter I. The earlier proof carries over : if (a, b) and (c, d) are ordered pairs and if (a, b) _ (c, d), then a = c and b = d. However, the existence in 6 of what we called earlier the cartesian product of two sets requires a principle of set construction which the axioms at hand do not seem to permit. We can dispose of the matter at hand as well as the existence of the power set of a set with the aid of the following axiom.
(ZF5) (Axiom of power set). For each set a there exists a set b such that, for all x, if x C a, then x C b. To secure the existence of the power set of a set from this axiom is an easy matter. If a is a set and b is a set which contains all of the subsets of a as members, then we apply (ZF2) to form the set {x C bIx g a}. For all x, x is a member of this set if x is a subset of a. We call this set the power set of a, symbolized P(a).
Thus,
6'(a) = {xix Cal. To establish the existence of the cartesian product of sets a and b, we notice first that if x C a and y C b, then { x } C a, l y } C b, and hence the sets (x} and {x, y} are included in a U b. In turn, {x} and {x, y} are members of (P(a U b), which implies that { {x}, {x, y} } =
(x, y) is a subset of 61(a U b). It follows that (x, y) C 6(G (a U b)). We infer that the set we want can be obtained by an application of (an instance of) (ZF2) to tP(6'(a U b))'. The appropriate condition is quite
7.4
1
The Axiom of Infinity
297
long; for the sake of both brevity and clarity we shall write it in sym-
bolic form. The cartesian product of sets a and b is the set a X b defined by
{wE6'(6'(aUb))I(3x)(3y)(x0yAxCaAvCbA(z)(zCwE-r z = {x} V z = {x,y}))V (3x)(x C a A x C b A
(z)(zC w-'z = {x)))}.
Since w C a X b iff w = (x, y) for some x in a and some y in b, a X b = { w l for some x in a and some y in b, w = (x, y) 1.
Defining a (binary) relation as a set each of whose members is an ordered pair, it is of importance to know that we can prove that a relation is a subset of the cartesian product of two sets. In this connection we recall Exercise 1.10.1, where it is asserted that if r is a relation,
then (using notation introduced in Section 1.10) r is a subset of the cartcsian product of UUr with itself. We may apply (ZF2) to this cartesian product, taking for the condition first "for some x ((x, y) C r)," and secondly "for some y ((x, y) E r)," to produce the sets {xI for some x ((x, y) C r) } and
{yJ for some y ((x, y) E r) },
which we call the domain and the range, respectively, of r. In particular, the domain and the range of a relation are sets and a relation is a subset of the cartesian product of its domain and its range. At this point it is possible to complete the reconstruction of the set theory of Chapter 1, obtaining the theory of equivalence relations, functions, and partial ordering relations found there.
4. The Axiom of Infinity Let us consider for a moment the theory of sets based on just the axioms (ZFI)-(ZF5) plus the temporary axiom that a set exists. The presence of the axiom of pairing makes possible the formation of an arbitrary large number of distinct (two-element) sets. We infer that the domain of any model of the theory must be infinite. On the other hand, since the union of a finite collection of finite sets and the power set of a finite set are finite sets, it does not appear that the axioms are adequate to prove the existence of an infinite set. The correctness of this surmise may be demonstrated by way of a model devised by W. Ackermann (1937).
Informal Axiomatic Set Theory
298
I
CHAP P. 7
The domain of the interpretation which can be shown to be a model is N. In order to define the relation of membership, we shall need the fact that a positive natural number a has a unique representation in the form
a = 2=' + 2- + ... + 2zr, where the x's are natural numbers and xl < X2 < . < x,.. Then, for natural numbers (that is, sets) x and a, we define x C a as true if x appears as an exponent in the representation of a in the form exhibited above. Thus, each set has only a finite number of elements. It is left as an exercise for the reader to prove that this interpretation is indeed a model of the theory under discussion. Actually, this system is a model
of the theory whose axioms are all such that they will eventually be assigned to S except the axiom of infinity which is introduced below. Thereby the system provides a proof of the independence of this axiom. Ackermann, however, devised it for a more profound purpose, namely, to provide the basis for a finitary consistency proof of the theory having
(ZF1)-(ZF5) together with the axiom of choice (see Section 5) as axioms.
There are compelling reasons for strengthening the set of axioms introduced thus far, to provide for the existence of an infinite set. Specifically, the existence of the set of natural numbers is essential for
the theory of denumerable sets and for the theory of real numbers. Although we have not as yet given a precise definition of infinity, it seems plausible that sets of the kind which are postulated by the following axiom merit being called infinite on intuitive grounds.
(ZF6) (Axiom of infinity). There exists a set a such that 0 C a
and, ifxEa, thenxU {x} Ca. Zermelo was the first to recognize the necessity of such an axiom; earlier workers regarded the existence of infinite sets as evident. He constructed the natural numbers as 0, 10 }, {{0 } ), - - , which is a satisfactory approach but one that does not generalize to the construction of infinite ordinals as easily as that adopted below. For every set x we define the successor x+ of x by
x+ = x U {x}.
Further, we shall say that a set,a is a successor set if 0 C a and if x+ E a whenever x C a. In this terminology, (ZF6) says that there exists a successor set. We shall now prove the existence of a unique minimal
7.4
I
The Axiom of Infinity
299
successor set. It is left as an exercise to prove that the intersection of a nonempty collection of successor sets is again a successor set. So, if a is some successor set, then the intersection of the (nonempty) collection
of successor sets which are included in a is a successor set which we denote by co (with the notation introduced in Section 2.6 in mind). The set w is a subset of every successor set. To prove this, consider an arbitrary successor set b. Then a n b is a successor set which is included in a. It follows that w g a (1 b, and hence w C b. In turn, the minimality of co characterizes it uniquely. For if w' is a successor set which is included in every successor set, then we have w g w' and co' C co.
Then (ZFI) implies that co = co'. We now define a natural number to be an element of the minimal successor set co. Further, we define , 9 by writing 0, 1, 2,
0=0,
I = 0+(= {0}),
2=1+(={0,1}), 9 = 8+(= 10, 1, 2, 3, 4, 5, 6, 7, 8}).
For other natural numbers we employ the usual decimal notation. We continue by proving that (w, +, 0), where now we regard + as a function on co into w, is an integral system or, what amounts to the same, that this system satisfies Peano's axioms P1-P5 in Example 2.1.2.
Since w is a successor set, 0(=Q) C co [that is, Pl is satisfied] and, if n C w, then n+ C w [that is, P2 is satisfied]. Moreover, n+ s 0 for all n in w, since n C n+ and n i[ 0 [that is, P4 is satisfied]. The minimality property of w can be expressed as: If a subset a of w is a successor set, then a = w. But this means that P5 is satisfied. It remains to prove that P4 (if m+ = n4", then m = n) is satisfied. This requires two preliminary results which we state as lemmas.
LEMMA 4.1 . No natural number is a subset of any of its elements. Proof.
Let a be the set of those natural numbers that are not included
in any of their elements. Thus, n C a iff n C w and, if x C n, then n!9 x. Clearly, 0 C a, since 0 has no elements. We assume next that n C a and consider n+. Since n+ = n U {n}, the elements of n+ are n and the elements of n. Now n'- n for, since n C n (and, n C a), n V n. Moreover, n+ is not included in any member of n, since if n+ S x, then n C x (because n C n+), which implies (since n C a)
300
Informal Axiomatic Set Theory
!
C r r A P. 7
that x (Z n. Therefore, n+ is not a subset of any of its elements and consequently n+ C a. By the principle of induction (P5), it follows that a = w, and this completes the proof. In order to state the next lemma it is convenient to make a definition. A set a is called complete if each member of a is a subset of a. Expressed
otherwise, a is complete if y C x and x C a imply that y C a.
LEMMA 4.2. Every natural number is a complete set. A proof by induction can be supplied by the reader. We now prove that if m and n are natural numbers such that m = n+, then m = n. For this we assume that m+ = n+" and m P6 n, and derive
a contradiction. From m+ = n+ it follows that m C. n+, and hence either m = n or m C n. Similarly, either n = m or n C m. Assuming, as we are, that m 0 n, we infer that both m C n and n C m hold. Hence,
by Lemma 4.2, n C n. Combining this with the fact that n C n, we conclude that n is a subset of one of its members, which contradicts Lemma 4.1. With the proof now completed that w satisfies the Peano axiorns, the stage is set for a development of the arithmetic of w. If, as in Chapter 2, the definition of a relation that well-orders co is taken as the first order
of business, then there is the following alternative to the procedure followed in Chapter 2. The first step is to prove (an exercise for the reader) the following result.
LEMMA 4.3. For each pair in, n of natural numbers, either m C n or m = n, or n C m.
Using Lemmas 4.1 and 4.2, it is then an easy matter to show that exactly one of these three alternatives holds. A further consequence of Lemma 4.3, in conjunction with Lemma 4.2, is stated next; the proof is left to another exercise.
LEMMA 4.4. If m and n are distinct natural numbers, then m C n
ifmCn.
We now define m to be less than n, symbolized
m < n, if m C n or, equivalently, in C n. Defining m < n in the usual way, one may then go on to show that < well-orders CO.
7.4
I
The Axiom of Infinity
301
Next in order is the introduction of Theorem 2.1.2, so inductive definitions of addition and multiplication can be given. Turning to other definitions and results in Chapter 2 which pertain to natural numbers, we recall that there are several in Section 2.3 phrased in the language of cardinal numbers. All such can be handled easily in the present development in terms of the notion of the similarity of two sets (that is, the existence of a one-to-one correspondence between them) and the properties of natural numbers sketched so far.
Preparatory to what we have in mind we state the following two results. Each can be proved by induction.
LEMMA 4.5. Each proper subset of a natural number is similar to some smaller natural number.
LEMMA 4.6. No natural number is similar to a proper subset of itself.
We may infer from Lemma 4.6 that a set can be similar to at most one natural number. Then, defining a set to be finite if it is similar to some natural number (and to be infinite, otherwise), it follows that a finite set is not similar to any one of its proper subsets (Theorem 2.3.3)
and, in turn, that w is an infinite set. Also, Lemma 4.5 implies that every subset of a finite set is finite. Once the Schroder-Bernstein theorem
(Theorem 2.3.1) is proved, it can also be shown that a set a is finite if a < w.
In concluding this section we note that the theory of countable sets, including Cantor's theorem (Theorem 2.3.6) stated in the forma < 61(a),
could now be presented. Also it is possible to carry out the extension of w to the system of real numbers, as described in Chapter 3. EXERCISES 4.1. Prove that Ackermann's system satisfies axioms (ZF1)-(ZF5). 4.2. Show that the intersection of a nonempty collection of successor sets is a successor set.
4.3. Prove Lemma 4.2. 4.4. Prove Lemma 4.3. 4.5. Prove Lemma 4.4. 4.6. Prove that < well-orders w. 4.7. Prove Lemma 4.5. 4.8. Prove Lemma 4.6. 4.9. Let us define the number of elements in a finite set a, symbolized n(a),
302
Informal Axiomatic Set Theory
I
CHAP P. 7
to be the unique natural number similar to a. Prove the following statements for finite sets a and b. (a) If a C b, then n(a) < n(b). (b) The set a () b is finite and n(a () b) < n(a) and n(a f b) < n(b). (c) The set a U b is finite and n(a U b) < n(a) + n(b). (d) The set a X b is finite and n(a X b) = n(a)n(b). (e) The set ab is finite and n(ab) = n(a)n(b).
(f) The set P(a) is finite and n(P(a)) = 2n(a).
5. The Axiom of Choice In order to clean up some details in connection with the subject matt°r sketched at the end of the preceding section and to develop a reasonable theory of cardinal numbers when they are defined as certain ordinals, the axiom of choice is required. With these applications in mind, we shall state it in the following form. An indication of a preference in this connection has no foundation, however, for within the framework of c it is possible to derive as equivalent statements those appearing in Section 2.8. (ZF7) (Axiom of choice). For each set a there exists a function f whose domain is the collection of nonempty subsets of a and, for
every bCawith b76 0, f(b) C b. Concerning applications of this axiom to topics touched on in Section 4, we note first that every known proof that an infinite set is similar to a proper subset of itself (Corollary 2 of Theorem 2.9.1) requires the axiom of choice. Also we recall that this axiom was needed to prove the law of trichotomy for sets; that is, for any two sets a and b, exactly one of a < b, a N b, b < a holds. This is the content of the Corollary to Theorem 2.7.4, once the well-ordering theorem has been derived from (ZF7). Looking ahead to the theory of ordinal numbers which follows, when cardinal numbers are defined as certain ordinals, the axiom of choice is needed to show that every set has a cardinal number.
6. The Axiom Schemas of Replacement and Restriction In this section we complete the description of Zermelo-Fraenkel set theory by introducing two further axiom schemas. One of these serves to guarantee the existence of "larger" sets than can be constructed on
7.6
I
The Axiom Schemas of Replacement and Restriction
303
the basis of the earlier axioms-sets which must exist if a full-blown theory of transfinite ordinal and cardinal numbers is to be possible. The other schema, whose role has not as yet been fully explored, serves to exclude the existence of certain objects as sets. To create some interest in the axiom schema of replacement consider
the theory of sets based on just (ZFI)-(ZF7). Then, as we have seen, are sets. In w is a set. In turn, by virtue of (ZF5), 6'(w), general, defining P(w) to be w and (pk+"(w) to be P(6k(w)), each of w, P(w), P2(w), is a set. Now, can we establish the ex, yn(w), istence of a set whose members are precisely these sets? That is, can we establish the existence of COI = {W, (Q(w), 192(&J),
M'
as a set? Since it does not appear possible to achieve this desirable state of affairs on the basis of just (ZFI)-(ZF7), a further axiom or (in order to cope with other similar situations) axiom schema is in order. A suitable candidate was first proposed by Fraenkel (1922), and independently by Skolem (1922). As modified by von Neumann (1928), it says, roughly, that if with each element of some subset of a set there is associated some one set, then the collection of the associated sets is itself a set. The instance of this schema which results upon choosing w as the initial set and associating with each n in w the set P"(w), declares that w" is a set. In the following official version of the schema in question, the hypothesis of the axiom means that for each x in a there is at most one y such that B(x, y).
(ZF8) (Axiom schema of replacement.) If B(x, y) is a sentence (formula) such that for each x in a set a, B(x, y) and B(x, z) imply that y = z, then there exists a set b such that y C b if there exists an x in a such that B(x, y).
It is of interest that the axiom schema of subsets, (ZF2), can be derived from (ZF8). Indeed, given a set a and a sentence A(x), take B(x,y) to be "x = y and A(x)." The hypothesis of the axiom which results is satisfied, so we may infer the existence of a set b such that y C b ill there exists an x in a such that x = y and A(x). That is, given a and A(x), there exists a set b such thaty E b iffy C a and A(y), which is (ZF2).
The axiom of pairing, (ZF3), can also be derived from the axiom schema of replacement and (ZF5), the axiom of power set. This result
304
Informal Axiomatic Set Theory
I
CHAP P. 7
appears in Zermelo (1930). To prove it, let c and d be two sets whose
pair is to be formed. As the set a in (ZF8) we select the power set 10, {0)) and as B(x, y) we take "x = 0 and y = c or, x = 10) and y = d." Then, for each x in 61(6'(0)) there is exactly one y such that B(x, y). Hence, by (ZF8), there exists a set b such that y C b if there exists an x in 61(c>'(0)) such that x = 0 and y = c or x = {0) and y = d. Thus, b is the set having just c and d as members. Next let us indicate how we can prove the existence of w, as a set with (ZF8). The intuitive idea, as we have already noted, is to replace the element n of w by 6'n(w) for n = 0, 1, 2, . A suitable choice for B(x, y) in (ZF8) is the following formula, which, for the sake of clarity, we will write in terms of the symbolism of the predicate calculus: (u) (((0, w) E u A (v) (w) ((v, w) C u -' (v, 61(w)) E u)) -* (x, y) E u).
The reader may ponder our contention that this is a suitable choice for B(x, y).
In contrast to the axiom schema of replacement which, as we shall show later, provides for the existence of enough sets to reproduce all of Cantor's theory of transfinite arithmetic, the final axiom schema has a restrictive character. Since the theory based on axioms (ZFI)-(ZF8) appears to be sufficiently comprehensive for mathematics, it is natural to consider the inclusion of an axiom which would serve to limit the theory to the minimal extension embracing these axioms. There are reasons to believe that this is too ambitious a goal. However, various axioms of a restrictive nature suggest themselves if it is desired to exclude
as sets certain models of (ZF1)-(ZFB) having features that run counter to the intuition. One such feature is the possibility of a set which is a member of itself or, more generally, a collection of n sets a,, az, , an such that a,Can,anCan-i,
The existence of such collections-even that of an infinite descending ) sequence of sets (that is, a sequence such that a i l , E a; for i = 1, 2, -is consistent with the theory having (ZF1)-(ZFB) as axioms. It is possible to prevent finite cycles of membership as well as infinite descending
sequences of sets by means of an axiom. Such an axiom was initially proposed by D. Mirimanoff (1917) as a consequence of his discovery that descending sequences of the type just mentioned might exist. It is an instance of the axiom schema which we adopt. Von Neumann (1925) was the first to introduce it.
7.6
(
The Axiom Schemas of Replacement and Restriction
305
(ZF9) (Axiom schema of restriction). Let A(x) be any condition on x which (considered as a formula) has no free occurrences of y
or z. If there exists an x such that A(x), then there exists a y such that A(y) and, for all z, if z C y then it is not the case that A(z).
If we take A(x) to be "x C a," where a is a set, the resulting axiom, which is called the axiom of regularity is : Every nonempty set a contains an element b such that a n b = 0. This axiom is due to Zermelo (1930); it is a simplified version of an essentially equivalent axiom given in von Neumann (1929). The axiom of regularity is sufficient to exclude phenomena of the type mentioned above. We substantiate, in part, this claim by deducing from it the following two results.
LEMMA 6.1. For eachseta, a iZ a. Proof. Assume, to the contrary, that a is a set such that a E a. Then, on the one hand, (1) a C {a} n a since a C (a). On the other hand, by (ZF9), there is a member of
{a} whose intersection with {a} is the empty set. Since the only mem-
ber of {a} is a, it follows that {a} n a = 0, which contradicts (1). LEMMA 6.2. For no two sets can each be a member of the other. Proof. Assume, to the contrary, that a and b are sets such that a C b and b C a. Then aC {a,b} nb and bC {a,b} na. (2) The axiom of regularity implies the existence of an element x in
(a, b) such that (a, b } n x= 0. But since we must have either x = a or x = b, it follows that either { a, b } n a = 0 or (a, b } n b = 0, which contradicts (2). In order to give an application of the axiom schema of restriction of a different nature, we recall that prior to the statement of (ZF8) we mentioned that there appears to be no way to obtain wi as a set on the basis of (ZFI)-(ZF7). When these axioms are augmented with (ZF9) it can be proved, by way of a model, that wi cannot be shown to be a set; this was done first by von Neumann (1928). For convenience in discussing this matter, let us denote the theory whose axioms are those of e, except for (ZF8), by Coo. Consider the interpretation of to, whose domain is the union of wi. We contend that it is a model of C5o. First, it is clear
that (ZF1) and (ZF4)-(ZF6) are satisfied and, since (ZF2) requires
306
Informal Axiomatic Set Theory
I
CHAP. 7
only the existence of certain subsets of a given set, it also is satisfied. To prove that (ZF3), the axiom of pairing, is fulfilled, consider two members a and b of Uw,. Then there exist m and n such that a C pm(w) and b C 61"(w) Hence, both a and b are members of 61"'+"(w). Thus {a, b} is a member of p-+"+1(w) and, therefore, a member of Uw,. A proof that the interpretation under consideration satisfies (ZF7) is complicated and we omit it. To prove that (ZF9) is satisfied we assume, to the contrary, that it is not fulfilled and derive a contradiction. So, by hypothesis, there exists a condition A(x) such that (i) there is an x such that A(x) holds, and (ii) for all y, if A(y) holds, then there is a z such that z C y and A(z) holds. Let xo be an x which satisfies (i) and take it as y in (ii). Let x, be a set which satisfies (ii) ; hence x, C xo and A(xi) holds. Thus, by (ii) again, there exists an x2 such that x2 C x, and A(x2) holds. Continuing in this fashion yields a sequence xo, x,, x2, such that , x2 C x1, x, C xo. Now there exists an n such that xo C 61"(w). It follows that, in turn, x, C 4'"-'(w), x2 C ps-2(w), - , x" C w. Finally, we conclude that for some m, x"+. C 0, which is impossible. Now we raise the question of whether Uw1 is a set in this model. The answer is "no" by virtue of Lemma 6.1. Therefore, since there is a model of Coo in which Uw, is not present, Coo is not sufficiently strong
for proving the existence of co, as a set. Furthermore, it follows that 6 is a stronger theory since, as observed earlier, we can prove the existence in it of to,. In conclusion, we call attention to Section 9.11, wherein appear some
remarks about Zermelo-Fraenkel set theory when formulated as a formal axiomatic theory.
EXERCISES 6.1. By imitating the proof of Lemma 6.2, prove the nonexistence of three sets a, b, and c such that a C b, b C c, and c C a. 6.2. Use the axiom of regularity to prove that if a is a set such that a C a X a,
then a = 0. 6.3. Prove Lemma 6.1, using the instance of (ZF9) corresponding to the condition, "there exists an x such that x C x."
7. Ordinal Numbers In this and the following section we shall outline the theory of ordinal numbers due to von Neumann (1928a) as simplified by R. M. Robinson
7.7
I
Ordinal Numbers
307
(1937). We shall presuppose familiarity with several definitions and theorems in Chapter 2. The definitions that we have in mind are those of a well-ordered set, an initial segment of a well-ordered set, and ordinal
similarity (symbolized =),of two well-ordered sets. The results which we shall presuppose are (i) for each set there is a relation which wellorders it, (ii) the principles of proof and definition by transfinite induction, (iii) the existence of exactly one isomorphism between ordinally similar well-ordered sets, (iv) a well-ordered set is not ordinally similar to any of its initial segments, and (v) for well-ordered sets a and fl, exactly one of the following hold: a is ordinally similar to an initial segment of X13, a = j9, S is ordinally similar to an initial segment of a.
Also we shall use the fact that if a is a well-ordered set, then a+ = a U { a } is a well-ordered set when we order the elements of a in the given way and, further, require that t < a for all t in a. In the von Neumann theory, an ord}nal number is a specific wellordered set of a particular kind. Thelleby the concept of order type (which, at best, is a hazy notion) is avoided completely. The defining property of those well-ordered sets which are called ordinal numbers may be thought of as qualities which well-ordered sets should have if they are to serve as "counting numbers" in the sense that the natural numbers serve this end. We begin by calling attention to several properties of natural numbers, relative to the ordering relation <, which culminate in one observation which is crucial for the generalization in mind. A natural number n is a set whose members are natural numbers; indeed, n = {x C cwjx < n}, since x < n means x C n. In particular, as a subset of the well-ordered set co, n is a well-ordered set. Suppose that in C n. Then the initial segment s(m)t of n which is determined by m is {x C njx < m } = m. That is, a natural number is a well-ordered set such that the initial segment determined by each of its elements is equal to that element.
This is the property on which the extended counting process is based. We now define an ordinal number as a well-ordered set a such that for all >; in a, s(t) = t. In addition to the natural numbers qualifying as ordinal numbers, w does also. Moreover, w+, (w+)+, are ordinal numbers, since we can prove that if a is an ordinal number then so is a+. The proof goes - -
-
as follows. If >; C a+, then either t E a, in which case s(s) = i;, by assumption, or else i; = a, in which case s(s) = a; that is, s(a) = a, t This notation for initial segments of well-ordered sets is better suited for our present exposition. We may recall at this point that an element of the initial segment determined by a member f of a well-ordered set is called a predecessor of f.
308
Informal Axiomatic Set Theory
I
CHAP P. 7
by the definition of order in a+. Anticipating notation from ordinal arithmetic, we shall denote the ordinal numbers w, w+, (w+)+, by
w,w+1,w+2, Applying (ZF8) with a as co and B(x, y) as y = co + x we may infer that w, co + 1, co + 2, form a set. The union of this set and w we shall denote by w2.
Is w2 an ordinal number? The answer would appear to depend on the choice of the definition of order in w2. Actually, the question is settled
automatically without any human intervention. The facts are these. The condition that a well-ordered set a must satisfy in order to qualify as an ordinal number, namely, s(E) = t for each t; in a, serves to specify the collection of initial segments determined by the elements of a. But, as the reader can easily show, even a simple ordering relation in a set is uniquely defined by the collection of initial segments determined by the elements of the set. (That is, if < and <' are simple orderings of a set S and, for each x in P, the initial segment determined by x relative to < is
equal to that determined by x relative to <', then < = <'.) Hence, since s(E) = t; means that the set of predecessors of l; must be the elements of , the only possible ordering of a which can lead to the conclusion that a is an ordinal number is the relation < such that for all t and 'j in a, t < 71 iff t E 'n Now either this relation is a well-ordering of a such that s(s) = t for each t in a or it is not. In the first case a is an ordinal number and in the second case it is not. In particular, it is now an easy matter to see that w2 is an ordinal number. After the ordinal number w2 comes its successor w2 + 1, followed by the successor w2 + 2 of w2 + 1, and so on. Next, after all terms of the sequence with this beginning comes w3; this set is secured by the ap-
plication of another axiom of replacement. There follows in turn and immediately after these comes w4. In this w3 + 1, w3 + 2, manner we get successively co, w2, w3, . Then with the application of another axiom of replacement we get an ordinal number which follows the members of this sequence in the same sense that w follows
the natural numbers. This ordinal number is w2. Continuing in this manner (and, continuing to anticipate the notation of ordinal arithmetic) we can secure all "polynomials" in to as ordinal numbers, in a manner parallel to that discussed in Example 2.7.10. We derive next several basic properties of ordinal numbers.
7.7
1
Ordinal Numbers
309
LEMMA 7.1. Each element of an ordinal number is itself an ordinal number.
Let i be an element of the ordinal number a. Then i is a subset of a, since from the fact that s(s) = i it follows that an eleProof.
ment of l is a predecessor of t, and hence an element of a. Therefore,
as a subset of a well-ordered set, t is well-ordered. Now consider an element '7 of E. The initial segment determined by rt in i; coincides
with the initial segment determined by rt in a and, since the latter is equal to rr, so is the former. Thus, in t, s(j) = n for all ri.
LEMMA 7.2. If two ordinal numbers are ordinally similar, then they are equal.
Let a and 0 be ordinal numbers and suppose that f is an ordinal similarity on a onto P. It is left as an exercise to prove by transfinite induction that f Q) = t for each t in a. This implies that Proof.
a=P.
The next result asserts that every set of ordinal numbers is wellordered. As in Chapter 2, we first prove that any two ordinal numbers are comparable. If a and (3 are ordinal numbers, then, as well-ordered sets, either they are ordinally similar or one is ordinally similar to an initial segment of the other. In the first case a = /3, by Lemma 7.2. To examine the consequences of the other possibilities, assume that a is ordinally similar to an initial segment of P. Now an initial segment of 0 is an element of /9 and hence an ordinal number, by Lemma 7.1. Using Lemma 7.2 again, it follows that a is an element of fl; so we may write
a
Thus, for ordinal numbers a and /3, exactly one of a = (3, a < (3, (3 < a holds. Moreover, the conditions a C P, a C /3, and a < I3 are equivalent to each other. Since the proof of the well-ordering of any set of ordinal numbers parallels that given for the earlier statement of this result (Theorem 2.7.6), we shall leave it as an exercise. For completeness we record the conclusion again.
LEMMA 7.3. Any set of ordinal numbers is well-ordered. To establish the next property of ordinal numbers it is convenient to make a definition in connection with well-ordered sets. If a and b
310
Informal Axiomatic Set Theory
I
CHAP P. 7
are well-ordered sets, we shall call b a continuation of a if a is an initial segment of b and if the ordering of the elements in a is the same as their
ordering in b. For example, if a and P are distinct ordinal numbers, then one of them is a continuation of the other. Now let C be a collection of well-ordered sets such thaf for each distinct pair of elements of C, one. is a continuation of the other. This condition may be expressed by
saying that e is a chain with respect to continuation. It is a straightforward exercise to prove the following property of such a chain.
LEMMA 7.4. Let a be a collection of well-ordered sets that is a chain with respect to continuation. Then there exists a unique wellordering of c, the union of C, such that c is a continuation of each set (other than c) in the collection C.
LEMMA 7.5. Every nonempty collection of ordinal numbers has a least upper bound.
Let C be a collection of ordinal numbers. Then e satisfies the hypothesis of Lemma 7.4, as noted above. Hence the union y
Proof.
of C is a well-ordered set such that y is a continuation of each t in C, other than y itself. Actually, y is an ordinal number, since the initial segment determined by an element of y is equal to the initial segment determined by that element in whatever set of C it occurs. If E E C, then E < y, which means that y is an upper bound for C. Indeed, y is the least upper bound for C, since if S is an upper bound for C then t C S whenever i; C C and, therefore, y C S.
As in the case of the Russell -paradox, the Burali-Forti paradox is avoided in Zermelo-Fraenkel set theory by our ability to prove that the troublesome set of the intuitive theory is not a set of the axiomatic theory. In the present case we can argue that if there were a set whose members consisted of all ordinal numbers, then we could form its least upper bound. That ordinal number would be greater than or equal to every ordinal number. But for each ordinal number there exists a greater one-its successor, for example. This contradiction rules out the existence of the proposed set.
In our concluding result we bring the present theory in still closer agreement with the intuitive theory.
LEMMA 7.6. Each well-ordered set is ordinally similar to exactly one ordinal number.
7.7
I
Ordinal Numbers
311
The uniqueness is clear, since, for ordinal numbers, ordinal similarity is the same as equality. The major step in proving the Proof.
existence of a suitable ordinal number for a given well-ordered set is the preparation for an application of the principle of transfinite in-
duction to show that each initial segment of a well-ordered set is ordinally similar to some ordinal number. Let a be a well-ordered set
and suppose that c is an element of a such that the initial segment determined by each predecessor of c is ordinally similar to some ordinal
number. There exists a set e whose members are precisely all such ordinal numbers (that is, which are ordinally similar to the initial segment determined by some element of s(c). This follows from the axiom of replacement corresponding to the set s(c) and the sentence B(x, a), which says, "a is an ordinal number and s(x) = a." [This sentence does satisfy the hypothesis of (ZF8) in view of Lemma 7.2.1 Now either c is the immediate successor of one of its predecessors or c = lub s(c). If the first possibility is true and c is the immediate successor of d, then s(c) = S+, where b is the ordinal number to which s(d) is ordinally similar. If the remaining possibility is true, then lub e. Therefore, in every case, s(c) is ordinally similar to an s(c) ordinal number. Now consider the well-ordered set i of all initial segments of a (that we may do this follows from Lemma 7.4), and let j be that subset consisting of those initial segments which are ordinally similar to some ordinal number. Then the result obtained above comes to this: If x is a member of i such that s(x) C j, then x C j. By the principle
of transfinite induction we then have j = I. That is, each initial segment of a is ordinally similar to an ordinal number. From the axiom of replacement corresponding to the set a and the sentence B(x, a) used above, it follows that there exists a set D whose members
are precisely those ordinal numbers which are similar to an initial segment of a. Then it is an easy matter to justify the conclusion that a is ordinally similar to an ordinal number by the same argument employed above to show that s(c) has the same property.
For a well-ordered set a we shall symbolize the unique ordinal number which is ordinally similar to a (that is, its ordinal number) by ord a.
If a is finite then ord a is the same as the natural number n(a) defined in Exercise 4.9. The natural numbers are, of course, the finite ordinal
312
Informal Axiomatic Set Theory
I
e }r A P. 7
numbers; the others are called transfinite. As in Chapter 2, those ordinal numbers which have an immediate predecessor (as is the case for each finite ordinal number other than 0) are called ordinal numbers of the first kind and those (like co) which do not are called ordinal numbers of the second kind or limit ordinals. EXERCISES 7.1. Prove the assertion made in the text that an ordering relation in a set a is uniquely determined by the collection of initial segments of the members of a. 7.2. Complete the proof of Lemma 7.2. 7.3. Prove Lemma 7.3. 7.4. Prove Lemma 7.4.
8. Ordinal Arithmetic There are two standard approaches to definitions of arithmetical operations for ordinal numbers: one relies on set theory and the other on the principle of definition by transfinite induction. The set-theoretical approach is based on formulating arithmetical operations in terms of operations of set theory; illustrations are provided by the definitions in Section 2.6 of addition and multiplication for order types. The inductive approach follows the pattern we employed to define operations for natural numbers with the principle of definition by induction replaced by that of definition by transfinite induction. Whichever approach one elects, the definitions in one can be proved as theorems in the other. Illustrations are suggested by results which are at hand. For example, from the inductive definitions of addition and multiplication for natural numbers, the reader proved in Exercise 4.9 that the number of elements in the cartesian product of two finite sets a and b is equal to n(a) n(b). This result could be used instead to define multiplication
of natural numbers. That is, for natural numbers r and s we could define their product by choosing sets a and b such that n(a) = r and n(b) = s and writing r s = n(a X b). Since we wish to maintain as close contact with intuitive set theory as possible, we shall emphasize the set-theoretical approach.
As a preliminary to defining operations for ordinal numbers, we recall the technique introduced in Section 2.5 for obtaining from two given sets a and b, possibly not disjoint, sets which have the same structural features as a and b and which are disjoint: replace a by a X 101
7.8
313
Ordinal Arithmetic
and b by b X 111. The obvious one-to-one correspondence which exists between such pairs as a and a X (0) may be used to transfer whatever structure is assigned to a to its replacement. This leads to the conclusion that if we are given two sets having possibly some structure we may as-
sume at the outset, without loss of generality, that they are disjoint. This conclusion can be generalized to arbitrary families of sets. If {a:lx C i } is a given family, then replace each ax by ax X {x} to obtain
a disjoint family which may be assigned all features of the original.
The definition of addition for ordinal numbers follows the same pattern as that given in Section 2.6 for addition of order types. Let a and b be disjoint well-ordered sets. In their union a U b we define an ordering relation as follows : Pairs in a and pairs in b are ordered according to the given orderings in a and by respectively, and each element of a precedes each element of b. The assumption that a and b are well-
ordered implies that a U b is well ordered. This well-ordered set we call the ordinal sum of the well-ordered sets a and b. The concept of the ordinal sum of two well-ordered sets extends directly to an arbitrary (well-ordered) family of well-ordered sets. First, a word about the notation for such families. In view of Lemma 7.6
we may take the indexing set to be an ordinal number. We shall do this and use notation like
{ael E X} for such a family. If, then, { aEI C X) is a disjoint family of well-ordered
sets, indexed by (the ordinal number) It, we define its ordinal sum as
U tat ordered as follows: If x and y are members of the union and in the same set at, then the order in at prevails; if x E at and y C all, where z; < 77, we take x < y. To define the sum of ordinal numbers a and ft we introduce disjoint
well-ordered sets a and b such that ord a = a and ord b = ft. Let c be the ordinal sum of a and b. The sum, a + I3, is defined to be ord c. It is left as an exercise to prove that a + 3 is independent of the choice of the sets a and b (provided, of course, that each has the correct ordinal number). Analogues of this remark hold for the other arithmetic operations for ordinal numbers; they will be omitted. The definition of sum extends without difficulty to an arbitrary family {atlE C X} of ordinal numbers. Let {atlt C X} be a disjoint family of
well-ordered sets at such that ord at = at for each E and let a be the ordinal sum of { atl i E It I. The sum Ztat is defined to be ord a.
314
Informal Axiomatic Set Theory
I
CHAP P. 7
The ordinal product of two well-ordered sets a and b is defined to be the cartesian product a X b ordered as follows:
(x, y) < (x', y') iffy < y' or y = y' and x < x'. It is left as an exercise to prove that an ordinally similar set (and hence an alternative definition of the ordinal product of a and b) can be obtained as follows. Let a,, = a X { y } for each y in b, and order a,, in the obvious way. Then the family {avIy C b} is disjoint and its ordinal sum is ordinally similar to a X b. This approach to the ordinal product of a and b has intuitive appeal since it corresponds to adding a to itself b .times.
To define the product of ordinal numbers a and lg we introduce well-ordered sets a and b such that ord a = a and ord b = /3. Let c be the ordinal product of a and b. The ordinal product, a#, is defined to be ord c. For properties of finite products and sums of ordinal numbers we refer the reader to Sections 2.6 and 2.7. Since it is not necessary, for the definition of the product a# of the ordinal numbers a and #, to employ disjoint well-ordered sets whose ordinal numbers are a and P, it is permissible to choose the most easily available well-ordered sets whose ordinal numbers are a and #-namely, a and S! Similarly, for the definition of product for a family of ordinal numbers it is not necessary to use disjoint "representatives." We take advantage of this fact by choosing ordinal numbers to be their own representatives. The first step in defining the product of a family {aelt C X}
of ordinal numbers is to form the cartesian product of this family of well-ordered sets. [We recall that an element of this set is a function f on X such that f Q) C at. I Let 61 be the subset of this cartesian product, which consists of all functions which have only a finite number of values different from 0. We order P in reverse lexiographical ordering. Let f
and f' be two distinct members of 6'. Then they take different values for only a finite number of arguments, and hence there exists a last argument, to, for which f(l o) 0 f'(to). If f (i o) < f'(o), then we set f < f'; if f'(to) < f(Eo), then we set f' < f. It is left as an exercise to prove that this is a well-ordering of 61. We now define the product IItat to be ord 61. Among the immediate consequences of this definition we note that
if X (the indexing set) is the empty set, then IItat = 1, since the Cartesian product of the family is {0 1. Vprther, the product of a nonempty set of ordinal numbers is equal td O if at least one of the factors is equal to 0.
7.8
!
315
Ordinal Arithmetic
Finally, we define exponentiation as iterated multiplication. If a and 13 are ordinal numbers, then we set ap _ lEE$aE,
where at = a for all E E 13. That is, aP is the ordinal number of the set of all functions on (3 into a which assume only a finite number of values different from 0, ordered in reverse lexiographical ordering. Among the laws of exponents which hold there are the following: for all a, a' = a for all a, ao= 1
0P=0
forall13> 1,
1P = I aP+r = agar aar = (as)Y
for all fl,
for all a, for all a,
y, y.
From the first and the fifth of these properties it follows that a .a ... . a (n factors) = an for each natural number n (including n = 0). Since multiplication of ordinal numbers is not commutative, no analogue from elementary arithmetic to the identity (ab)c = a°b° can be expected. A comparison of (w2)2 = w(2w)2 = w22 and w222 = w24 settles the matter.
We conclude our introduction to the theory of ordinal numbers by listing a "few" of them in order. Each number which appears immediately after a sequence of three dots is the least upper bound (indeed, the limit, in the sense explained in Example 2.7.11) of those which
precede it; the letters of the English alphabet which appear denote finite ordinals. The creation by Cantor of this so-called series of ordinals certainly ranks as an outstanding achievement: 0, 1, ...,n, ... w,w+1, --,w+n, -- w2, w2 + 1, ..
w3,
wn+m, ... w2, ...,w2+wn+m, ... w2n, ... wa, ... wn, Wnrnn + wn-lmn_1 + ... + m0, .
(WW)n, .
. .
Wm, ... WWn, ... WW+l
(WW)W, . . . ((WW)W)W, . .
The next ordinal number after all of these is usually denoted by ea. It may be "reached" more directly as the least upper bound of the sequence 1, W, wW, (wW)W, ... ; the proof that it is a set is left as an exercise. Further ordinal numbers, beginning with to, include to, to + 1, . . 0 + w, ... to + w2, ... E0 + W2, ... to + Co.,
... e02, ... eow, ... cow,, ... eo. ... .. .
Cantor called any solution of the equation we = e an epsilon numbe It is left as an exercise to prove that to is the least epsilon number.
316
Informal Axiomatic Set Theory
I
C Ii A P. 7
EXERCISES 8.1. Show that the definition of order which was adopted for the union of well-ordered sets a and b may be stated as follows. If p and a are the given well-
ordering relations in a and b, respectively, then order a U b by p U a U (a X b). 8.2. Show that the sum a + Q of two ordinal numbers is independent of the choice of well-ordered 'sets a and b such that ord a = a and ord b = Q. 8.3. Show that the ordinal product of two well-ordered sets a and b, as defined in the text, is ordinally similar to the ordinal sum of the family {a,,I y C b}. 8.4. Prove that the ordering assigned to the subset tp of the cartesian product of a family {atiE C X} is, in fact, a well-ordering. 8.5. Prove each of the laws of exponents displayed in the text. 8.6. Show that e0 is the least epsilon number.
9. Cardinal Numbers and Their Arithmetic Although in Section 2.3 we gave a definition of the concept of a cardinal number, we emphasized there that we would rely on only that consequence of the definition to the effect that
card a = card b iff a r b. Using just this property of cardinal numbers it is possible to reproduce, with the framework of Zermelo-Frankel set theory,
(i) the definition of the order relation < for cardinal numbers, the proof (after the Schroder-Bernstein theorem is established) that card a < card b if a < b, and that of Cantor's theorem; (ii) the definitions of addition and multiplication for cardinal numbers (Section 2.5) and the proofs of the properties of these operations stated in Theorems 2.5.1 and 2.5.2; (iii) the definition of exponentiation and the proofs of those properties stated in Theorem 2.5.3.
Defining a cardinal number as finite if it is the cardinal number of a finite set and as infinite if it is the cardinal number of an infinite set, we may continue by proving the following results: The arithmetic of finite cardinal numbers is the familiar finite arithmetic and, if u is an infinite cardinal number, then u u = u and u + u = u. We now consider a suitable definition of the cardinal number of a set. From earlier results we know that every set is similar to some ordinal number. In general, a set is similar to many ordinal numbers. The result
7.9
(
Cardinal Numbers and Their Arithmetic -
317
on which the von Neumann definition of the cardinal number of a set
leans is that for each set a, the ordinal numbers which are similar to a form a set. We begin the proof by observing that it is possible to find an ordinal number greater than all ordinal numbers similar to a. An ordinal number t9 which is similar to 6'(a) will serve. Then, for each ordi-
nal number a similar to a, the set a is less numerous than the set 9, and hence card a < card p. Hence, it is not the case that 0 < a, and therefore a < P. In turn, this means that a C S. Thus, P is a set that contains every ordinal number similar to a and the existence of such a set implies that the ordinal numbers similar to a form a set. In view of this result, a natural choice for card a is the least ordinal to which a is similar. This is the motivation for a consideration of the following definition : A cardinal number is an ordinal number a such that if 0 is an ordinal number similar to a, then a < 16. That is, a cardinal number is an ordinal number which is not similar to any smaller ordinal number. If a is a set, then card a, the cardinal number of a, is the least ordinal similar to a. That this definition is satisfactory follows from the fact that we can prove that card a = card b if a '' b. Indeed, since each set is similar to its cardinal number, it follows that if card a =
card b, then a - b. For the converse, we assume that a - b and infer that card a = card b. Since card a is the least ordinal similar to a, certainly card a < card b and, upon interchanging a and b in this argument, also card b < card a. Hence, card a = card b. Since a finite ordinal number (that is, a natural number) is not similar to any different ordinal number, the set of ordinal numbers similar to a finite set is a unit set. Hence, the cardinal number and the ordinal number of a finite set are the same. Notice that we are now entitled to infer from the similarity of Y(a) and 2a, where a is a set, that card P(a) = 2a,
since we now know that 2 is a cardinal number. Also, we may state Cantor's theorem in its familiar form: a < 2a. The above inequality brings to mind one of the last two questions which should be raised regarding the definition of cardinal number. We recalled at the beginning of this section that on the basis of the identity
that card a = card b if a r., b, an ordering relation can be defined for cardinal numbers and that it follows from this definition that card a < card b iff a < b. Now ordinal numbers have already been outfitted with an ordering relation. Fortunately, there is no collision of the two possible meanings of card a < card b, since they coincide. We leave the details as an exercise. The other question concerns the status of Cantor's
318
Informal Axiomatic Set Theory
I
CHAP P. 7
paradox. Its fate is settled in much the same way as the Burali-Forti paradox. Every set of cardinal numbers, as a set of ordinal numbers, is well-ordered. Moreover, we know that every set of cardinal numbers has an upper bound and that for every set of cardinal numbers there is a cardinal number greater than each member of the set (see Section 2.9). It follows that there is no largest cardinal number or, what is equivalent, there is no set that consists precisely of all the cardinal numbers. As the smallest transfinite ordinal number, co is a cardinal number and, when playing the role of a cardinal number, is denoted by bto. Since Theorem 2.9.3 (every set of cardinal numbers is well-ordered), holds in Zermelo-Fraenkel set theory, we can define the alephs in general as in Section 2.9. The immediate successor, Ki, of 14o in the order-
ing of cardinal numbers may be described as the least uncountable ordinal number, or as an uncountable well-ordered set each of whose initial segments is countable. It may come as a surprise to learn that this ordinal number is greater than all of those explicitly named in Section 8, for they are all countable! EXERCISES 9.1. Give definitions of addition and multiplication for an arbitrary family of cardinal numbers by imitating corresponding definitions for ordinal numbers.
9.2. Show that the two possible meanings of card a < card b coincide.
10. The von Neumann-Bernays-Godel Theory of Sets In this section we shall describe the theory in question (and, for brevity, refer to it simply as von Neumann set theory) only to the point where we can indicate the essential differences between it and ZermeloFraenkel set theory. The original version of von Neumann set theory
appeared in von Neumann (1925, 1928a, 1929), and in simplified form in R. M. Robinson (1937). Since a distinguishing feature of this original version was its adoption of the notion of function, rather than that of set, as primitive, it differed considerably from other axiomatizations of set theory. In a series of seven papers, beginning in 1937 (see References), P. Bernays formulated a modification of the von Neumann approach which brought it in much closer contact with Zermelo set theory. In turn, in Godel (1940) the theory is further simplified. One essential difference between the von Neumann theory and the
7.10
(
The von Neumann-Bernays-Godel Theory of Sets
319
Zermelo-Fraenkel theory reflects a difference in attitudes toward the question of how to cope with the "too large" sets of intuitive set theory. In the Zermelo theory it is possible to prove the existence of most of the sets which are necessary for mathematics, but the axioms which are concerned with the existence of sets are so designed that it seems impossible to construct any "troublesome" sets. In brief, the theory S is a conservative one! The von Neumann theory, on the other hand, reflects the attitude that it is not the existence of too large sets as such which leads to contradictions but rather their being taken as members of other sets. In the von Neumann theory a technical distinction is drawn between sets and classes. Every set is a class, but the converse is not true. Those classes which are not sets are called proper classes and their distinguishing feature is that they are not members of any other class. The class of all ordinals, for example, exists, but it is a proper class. Thus the Burali-Forti paradox cannot be constructed, since it requires that the class of all ordinals be a member of a class. The other paradoxes meet with a similar fate. In Godel (1940) three primitive notions are adopted: class, set, and the binary relation of membership. A slight modification of the theory
allows one to reduce the number of primitive notions to one-the binary relation C. Then elements of the union of the domain and the range of C are called classes and elements of the domain are called sets. The axioms of the theory, as stated in Gi del (1940), fall into several groups. The first consists of the axioms of extension and that of pairing. Using lower-case letters as set variables and capital letters as class variables, the axiom of extension is
(u)(uEXHuCY)-4X=Y. The axiom of pairing provides for the existence of the set whose members are just the sets x and y. This is formulated as
(x)(y)(3z)(u)(u C z H u = x V u = y). The eight axioms of the second group are concerned with the existence of classes. These axioms, which are due to Bernays, replace an axiom schema in the original von Neumann theory. From them Bernays proved the general existence theorem (a metatheorem), which asserts that for any formula F(x) which contains no bound class variables there exists a
class Y that contains just those x's which satisfy F(x). This result, which is
referred to as the class theorem, bears a strong resemblance to the principle of abstraction of intuitive set theory; the sole difference is
320
Informal Axiomatic Set Theory
I
CHAP. 7
that "defining conditions" determine classes and not necessarily sets. The class theorem yields as a by-product the fact that classes in the von Neumann theory play the role that formulas do in the ZermeloFraenkel theory. The remaining axioms of the von Neumann theory coincide with the remainder of those for S [that is, (ZF4)-(ZF9) ], with the one important distinction that none of the former are axiom schemas. For example, in place of (ZF8), the axiom schema of replacement, there is the axiom of replacement which is the formula (x) (y) (z) (((x, y) E X A (x, z) C X) --*y = z) -->
(3y)(x)(x Ey - (3w) (w E z A (w, x) E X)) Thus, by way of the theorem schema described above, this axiom yields all instances of the axiom schema (ZF8). This brings us to the second, and last essential difference between the two theories: von Neumann
set theory is finitely axiomatized. That is, no axiom schema of set construction is required; instead, a finite number of specific set and class constructions is adequate.
BIBLIOGRAPHICAL NOTES In Fraenkel (1961) general set theory is developed at a level which is between that of Chapters 1 and 2 and that of this chapter. Fraenkel's excellent book, Abstract Set Theory, is a thoroughly revised (and greatly improved) edition
of an earlier book. The book by Fraenkel and Bar-Hillel (1958) complements Abstract Set Theory in the same way that the present chapter complements our earlier coverage of intuitive set theory. In addition, it considers other approaches (for example, Quine's New Foundations) to set theory. Zermelo-Fraenkel set theory is also expounded in Suppes (1960). An interesting feature of this book is an elegant unorthodox treatment of finite sets by means of Tarski's definition (see Exercise 2.3.14), which allows Suppes to develop the theory of finite sets before the theory of finite ordinals. Another treatment of Zermelo-Fraenkel set theory appears in Halmos (1960). This is a beautiful presentation. An outline of von Neumann-Bernays-Godel set theory is given in Godel (1940) and in Bernays and Fraenkel (1958). The latter book presents a modification of the system developed by Bernays in the series of seven papers men-
tioned in the text. For a high-level development of transfinite arithmetic beginning with the theory of ordinal numbers, H. Bachmann (1955) should be consulted.
CHAPTER
8
Several Algebraic Theories
I T I S PER H A P S in algebra that the axiomatic method has scored its greatest successes. The majority of axiomatic theories which are regarded as belonging to algebra are noncategorical. This is by design, since the goal of algebra is a systematic analysis of various combinations of central features common to a variety of specific algebraic systems. This modern approach to algebra yields theorems which not only illuminate a multitude of classical examples by displaying them in the most general light without foreign hypotheses, but also it contributes formalism and powerful tools which are indispensable to a large part of mathematical research, including that in the theory of numbers, algebraic geometry, functions of several complex variables, integration theory, and topology. Thus, algebra is not merely a branch of mathematics, for it plays within mathematics a role analogous to that which mathematics itself has played with respect to physics for centuries. As is the case with most branches of mathematics, it is foolhardy to attempt a definition of algebra. It is possible, however, to suggest a characterization by describing basic features of those theories which may be called "algebraic theories," that is, axiomatic theories which, it is generally agreed, belong to the province of algebra. Some such features are discussed in Section 1. The theory of Boolean algebras qualifies as an example and serves to illustrate some of the concepts introduced.
The brief introduction to semigroups which appears in Section 2 is included simply because this theory can be used as a vehicle to introduce a variety of definitions that are applicable to the algebraic theories with which the remainder of the chapter is concerned. Each of these theories, apart from that of groups, had its origin in one of the number systems constructed in Chapter 3. That is, each is founded' on the basic properties of one of the system of integers, the system of rational numbers, the system of real numbers. When it is realized that these theories 321
322
Several Algebraic Theories
I
CHAP . 8
form the backbone of modern algebra, the fundamental role played by the familiar number systems in stimulating the development of modern algebra becomes apparent. Exposing the role of the familiar number systems as a source for algebraic theories is one goal of this chapter. The other is to provide vyays and means for characterizing, in turn, these number systems as models of certain algebraic theories. These characterizations are presented in the last three sections of the chapter.
1. Features of Algebraic Theories Ordinarily, algebraic theories are presented as informal theories within the context of set theory. That is, as explained in Section 5.3, an algebraic theory is formulated in terms of a nonempty set X and certain constants associated with X. These constants may be of various types : elements of X, subsets or collections of subsets of X, unary operations on X (that is, functions on X into X), binary relations or operations in X, and so on. Collectively, the constants serve as the basis for imposing
a certain structure on X. The structure is given in the axioms-that is, the properties assigned to X and the constants. It is principally the form of the axioms that distinguishes algebraic theories among axiomatic theories in general. The axioms pertaining to binary operations imitate, in part at least, the basic properties of addition and multiplication and include, possibly, the existence of interrelations such as distributive laws. Those pertaining to any binary relations present may imitate properties of "less than" for number systems. If unary operations are present, they are often called (left or right) operators.
As an indication of the form that axioms pertaining to operators might have, the properties of scalar multiplication in an elementary treatment of vector algebra are suggestive. These include
a(a + f) = as + a$,
(a+b)a=act +ba, a(ba) = (ab)a
for all vectors a and i8 and all scalars (real numbers) a and b. The first of these is a property of individual scalars (left operators). In contrast, the others are interrelations between combinations of operators and combinations of vectors and, as such, presuppose the existence of operations for the set of scalars. In general, a set of operators may or may not have some assigned structural features. The theory of Boolean algebras qualifies as an algebraic theory. If
8.1
I
Features of Algebraic Theories
323
the theory is formulated as in Theorem 6.3.1, then the constants associated with the basic set B are one binary operation and a single operator. We turn next to the description in general terms of two notions which occur so consistently in algebraic theories that they may be considered as serving to further delineate algebraic theories. If we agree that by an algebra is meant any model of some algebraic theory, then one of the notions is that of a subalgebra of an algebra. This requires two pre-
liminary definitions. Let f be an operation in a set X and A be a nonempty subset of X. Then A is said to be closed under f if the restriction off to A X A is an operation in A or, in other words, the range of fjA X A is included in A. If A is closed under f, the operation f IA X A is said to be that induced in A by f. Although f jA X A Of (assuming that A C X), if instead of "f " a familiar symbol like "-I-" or " " is used
for the initial operation, it is customary to designate that operation which it may induce in a subset by the same symbol. Next, let g: X -+- X
and A be a nonempty subset of X again. We shall say that A admits ) is an algebra having X as its g if g [A19 A. Now suppose that (X, basic set and that A is a nonempty subset of X which admits each operator on X and is closed under each operation in X. Then it may be the case that A, together with the constants induced in it by those of X, is ) is a model. In this event, (A, ) ). is called a subalgebra of (X, According to the foregoing, if (X, ) is a model of an algebraic theory, then subsets of X which are closed under the operations in X a model of the theory of which (X,
and so on provide a potential source of further models of the same theory. Another possible means for deriving further models from given models of an algebraic theory is by way of congruence relations. This notion for an arbitrary algebra is a direct generalization of that given in Section 6.4 of congruence relations for Boolean algebras. A congru-
ence relation on an algebra (X,
) is an equivalence relation 0 on X such that if * is a binary operation in X, then for all a, b, and c in X,
(CI) a9bimpliesc*aOc*banda*cOb*c and, if f is an operator on X, then for all a and b in X, (C2) a 0 b implies f(a) Of(b), and, if < is an ordering relation, then for all a, b, c, and d in X, (C3) a 0 b, c 0 d, and a
324
Several Algebraic Theories
I
CHAP P. 8
are sufficient (and, indeed, necessary) conditions that each operation and so on defined for X induces a corresponding constant for X/0 by way of representatives of 8-equivalence classes (that is, if * is a binary operation in X, defining ff * b to be a _*b, and so on). If 0 is a congruence
), then it may be the case that X/0, relation on the algebra ( X , together with the constants induced by those associated with X in the ) way described, is a model of the theory at hand. In this event, (X/9, ). is called a quotient algebra of ( X , In conclusion, it will do no harm to rephrase for algebras in general a remark made earlier for Boolean algebras. Namely, the description of any algebra (X, ) includes (usually implicitly) an equality relation on X and this is taken to be a congruence relation on X. That is, equality is assumed to satisfy whichever of (C1)-(C3) are applicable.
2. Definition of a Semigroup A semigroup (with neutral element) is an ordered triple (X, *, e), where X is a set, * is an associative binary operation in X, and e is a member of X such that
e* x= x* e = x
for all x in X. Our sole purpose in touching on this theory is to derive a few basic properties and introduce some terminology and notations. This will prove to be efficient, since we shall find a variety of applications for these items later. It is with the diversity of the applications in mind that we have adopted the neutral symbol "*" for the operation in X. The property enjoyed by the element a of the semigroup (X, *, e) characterizes this element, since if e' * x = x * e' = x for all x, then e' * e = e
and e' * e = e', whence e = e'. We shall call e the neutral element for the operation in X.
EXAMPLES 2.1. If A is a nonempty set, then (P(A), U, 0) and (6'(A), 0, A) are semigroups.
2.2. If, as usual, N is the set of natural numbers, then 0) and (N, , 1) are semigroups. 2.3. If A is a nonempty set, then (AA, o, iA) is a semigroup.
An algebra (X, ) is often identified by merely its basic set, if no confusion can arise. For example, we shall often use the term "the semi-
8.2
(
Definition of a Semigroup
325
group X" in place of "the semigroup (X, *, e)." If there is need to mention the operation, "X is a semigroup under *" may be used in place of "(X, *, e) is a semigroup." For example, we may say "the set Z of
integers is a semigroup under addition" in place of "(Z, +, 0) is a semigroup." In subsequent instances of semigroups the notation for the composite of a and b will usually be a + b (read: the sum of a and b) or ab (read:
the product of a and b). In the first case we say that we have an additive operation and in the latter case, a multiplicative operation. The neutral clement for an additive operation is always denoted by "0" and called the zero element of the semigroup; the neutral element for a multiplicative operation is usually designated by "I" and called the unit or identity element of the semigroup. One theorem that we have already proved for a semigroup is the general associative law (Theorem 2.2.2), which asserts that all composites that can be associated with , an) of elements of a sernigroup are the same clean n-tuple (a,, a2, ment of the semigroup. For an additive operation this element is denoted by
a,+a2+
+an or
"r ac 2;;'_j
while for a multiplicative operation it is denoted by
a,a2 ... a or
II,""_, a;.
, an are all equal to the same element a, then the composite , an) is denoted by "na" and "a"" in the additive and multiplicative cases, respectively. For n = 1 we agree that both na and a" are simply a. We extend the definition of na and an to all natural numbers by definIf a,, a2, of (a,, a2,
ing Oa to be 0 and a° to be 1-that is, the neutral element in each case. Then, for all natural numbers m and n and all elements a of a semigroup X, (1)
Oa = 0, 1a = a, (m + n) a = ma + na, (mn)a = m(na),
if the operation in X is additive. If the operation is multiplicative, then (2) a° = 1 a' = a, am+n = ama" amn = (a-) n. These formulas follow from our definitions and the general associative law.
A semigroup (X, *, e) is commutative or Abelian iff a * b = b * a for all a and b in X. For commutative semigroups we have the general commutative law stated in Exercise 2.2.4: If a,, a2, ", a" are elements
Several Algebraic Theories
326
of a commutative semigroup and if 1', 2', of the numbers 1, 2, , n, then
I
CH A P. 8
, n' is some rearrangement
a, * a2 * ... *a, = a,. *a2 * . *a",. From this it follows easily that for a commutative semigroup the string of formulas (1) may be supplemented by
n(a+b) =na+nb,
(3)
and those in (2) by (4)
(ab)" = a"b".
Further notation and computational rules enter in connection with our next definition. An element a of a semigroup X is invertible if there exists an element a' of X such that a * a' = a' * a = e. In that event there is just one such element a' with this property. For if with a" we can also demonstrate that a is invertible, then
all =all *e=all *(a* a')=(all *a)*a'=a*a'=a'. The element a' is the inverse of a. If a is invertible and a' is its inverse, so that a * a' = a' * a = e, then these equations demonstrate that a' is
invertible and that a is its inverse. Another important property of invertible elements is proved next.
THEOREM 2.1. If a and b are invertible elements of a semigroup (X, *, e), then a * b is invertible. If a' and b' are the inverses of a and b, then b' * a' is the inverse of a * b. Proof.
It
is sufficient
to show that (a * b) * (b' * a') = e and
(b' * a') * (a * b) = e. The first of these, for example, is shown as follows :
(a*b)*(b'*a') = a*(b*b')*a' = a*e*a' = e. COROLLARY. If a,, a2, group and a,', a2, vertible and aR * aa_, *
a" are invertible elements of a semian their inverses, then al * a2 * * a" is in* al' is its inverse.
The notation involved in discussing further properties of invertible elements is sufficiently different in the additive and multiplicative cases as to warrant separate treatments. Let us consider an additive notation first. If a is an invertible element of a semigroup (X, +, 0), then negative multiples of a can be defined. Namely, we observe that if a' is the inverse
of a (thus, a + a' = a' + a = 0); then ma = (m + 1)a + a' (5)
8.2
Definition of a Semigroup
327
for all nonnegative m. This equation we take as the basis for an inductive definition of ma for negative m. Then we observe that the third formula in (1) above is true for any fixed m and n = 0; it can be proved for all
natural numbers n by induction from n to n + 1 and for all negative n by induction from n + I to n, using the following consequence of (5) :
(m + 1)a = ma + a. One instance of the formula thus obtained is
na + (- n)a = Oa = 0 = (- n)a + na for an.arbitrary n. This means that for all n, na is invertible and (-n)a is its inverse. It follows that m(na) and (mn)a are defined for every m. The equality of these two elements for arbitrary m and n can then be proved by the two inductions used before. Thus the fourth formula in (1) and thereby all formulas in (1) hold for all integers m and n.
If a is an invertible element of (X, +, 0), then, according to (5), (-1)a is the inverse of a. We abbreviate "(-1)a" by "-a" and call it the negative of a. The earlier result that the inverse of the inverse of a is equal to a then takes the form
-(-a) = a and Theorem 2.1 translates into
-(a + b) _ (-b) + (-a) for invertible elements a and b. For an arbitrary b and an invertible ele-
ment a of X, b + (-a) C X; this element will be designated by b - a. Thus, (b - a) + a = b. Further, the element (-a) + b will be denoted
by -a + b, so that a + (-a + b) = b. These definitions lead to the following computational rules which are easily verified :
-(a - b) = b - a, -(-a + b) = -b + a. Finally, if the semigroup is commutative, (3) holds for arbitrary n. All the foregoing definitions and results have multiplicative analogues. The starting point for their derivation is the observation that if a' is the inverse of a, then (6)
ain =
a?%+Ia.
for all nonnegative m. This equation we take as the basis of an inductive definition of a'" for negative m. It is left as an exercise to verify that the
third and fourth equations in (2) above are true for arbitrary integers m and n. According to (6), a-' is the inverse of a. Moreover, (a-')-' = a, (ab)-' = b-'a-1,
328
Several Algebraic Theories
I
CH A P. 8
and, if the semigroup is commutative, (4) holds for all integral values of n.
EXAMPLES 2.4. The semigroup 0) is commutative; 0 is the only invertible element. In contrast, each element of the semigroup (Z, +, 0) is invertible. 2.5. In the multiplicative semigroup Z the only invertible elements are 1
and -1. 2.6. Let A be a nonempty set. Then ((P(A), +, 0), where + is the symmetric difference operation, is a commutative semigroup. Each element B is invertible;
indeed. - B = B. 2.7. The semigroup of all mappings on a set of at least two elements into itself (see Example 2.3) is not commutative. The invertible elements are the one-to-one and onto mappings.
EXERCISES 2.1. Let be an associative operation in a nonempty set X. An element a in X such that x a = x for all x is a right identity element. (a) Give an example of such a system that has more than one right identity element.
(b) Show that if more than one right identity element is present in X, then no identity element is present. 2.2. In a nonempty set X introduce the operation (a, b) -} ab = a. Show that this is an associative operation and that every element is a right identity. When is X a semigroup? 2.3. Show that (N, *, 0), where a * b = a + b + ab, is a semigroup.
2.4. We define NO) to be the set of all objects of the form (a
d) where
a, b, c, d C N. A multiplication is defined for these elements as follows:: a
(c
b
a'
b'
d) (c' d')
_ aa' + bc' ab' + bd'
- (ca' + dc'
cb' + dd'
Show that N(" is a semigroup under this multiplication. What elements are invertible? Defining an element x of a semigroup as idempotent iff x' = x, determine the idempotents of N('). 2.5. Establish each of the identities appearing in (1) and (2) in the text for natural numbers m and n. 2.6. Establish the identities (3) and (4) for commutative semigroups. 2.7. Give a detailed account of the extension of the identities in (1) to the case of arbitrary integers m and n. 2.8. Give a detailed account of the extension of the identities in (2) to arbitrary integers m and n.
8.3
I
Definition of a Group
329
3. Definition of a Group In spite of the repetition which results, we start afresh with the theory of groups for the sake of completeness. Our initial formulation is the one appearing in Exercise 5.4.15. A group is an ordered triple (G, , e), where G is a set, is a binary operation in G, e is a member of G, and the following axioms are satisfied.
G,. G2. G3.
is an associative operation. For each a in G, e a = a.
For each a in G there exists
a' in G such that
Two properties of a group follow directly from the axioms: An element e satisfying G2 is a neutral element for the operation (and hence is unique) and, each element of G is invertible. For proof, let a be a member of G. By G3 there exists an element a' in G such that a'a = e and there exists in element a" in G such that a"a' = e. Then
aa' = e(aa') _ (a"a')(aa') = a"((a'a)a') = a"(ea') = a"a' = e. Then e is a neutral element since, for any a, ea = a by G2 and ae = a, since
ae = a(a'a) = (aa')a = ea = a.
Further, a is invertible since aa' = a'a = e. Thus, we have proved the following result.
THEOREM 3.1. If (G,
,
e) is a group, then it is a semigroup
such that each element is invertible. In accordance with conventions introduced for semigroups, if multi-
plicative notation is used for a group operation we shall write "1" for the identity element and "a-"' for the inverse of a. If additive nota-
tion is used instead, then "0" and "-a" will be used in place of "1" and "a-'." In either case the definitions and properties pertaining to powers and multiples given in Section 2 are available for use.
The converse of Theorem 3.1 is obviously true and consequently another formulation of the theory of groups is at hand: A group is a semigroup such that each element is invertible. We prefer the initial one, however, since it is clearly a weaker formulation, which means
330
Several Algebraic Theories
i
CHAP. g
that there are fewer steps in the verification that a given system is a group. The set of axioms in the explicit formulation of the theory of groups as a semigroup in which each element is invertible is simply the result
of supplying the "left-right" symmetry which the initial formulation lacks. This symmetrical set of axioms is, like {G1, G2, G$}, independent.
A third formulation in which symmetry is an inherent part is given next.
THEOREM 3.2. An ordered pair (G, ), where G is a set and
is
a binary operation in G, defines a group if
Go. G is nonempty, G1. G4.
-
is associative,
each of the equations a x = b and y a = b has a solution in G for all elements a and b in G. Proof. Assume that (G, -, 1) is a group. Then obviously (G, ) satisfies Go and G1. Moreover, G.1 is valid since, for given elements a and 6
in G, a(a 'b) = b and (bar')a = b. For the converse, let (G, ) be a system satisfying Go, G1, and G4. According to Go there exists an element c in G. According to G4 there
exists an element e in G such that ec = c. Moreover, by G4, if a is any element of G, then there exists an element d in G such that cd = a. Hence ea = e(cd) = (ec)d = cd = a,
so e satisfies Ga. As for G3, it is a consequence of the solvability of. xa = e for each a. Hence, (G, , e) is a group.
Each of the equations ax = b and ya = b has a unique solution in a group. This is an immediate consequence of
THEOREM 3.3. For all elements a, b, and c in a group, each of ab = ac and ba = ca implies that b = c. Proof. Assume, for example, that ab = ac. Then a'(ab) = a '(ac), whence b = c. If finiteness is assumed for the set G in Theorem 3.2, then G4 can be replaced by the, in general, weaker cancellation laws.
THEOREM 3.4. A pair (G, ), where G is a finite set and binary operation in G, defines a group iff
is a
8.3
I
Definition of a Group
331
G is nonempty, is associative, G,. each of a b = a c and b a = c a implies that b = c. G5. Proof. In view of Theorem 3.2 it is sufficient to prove that G5 implies G9 in the presence of Go and G1. Let a be an element of G and conGo.
sider the mapping fa: G ; G such that fa(x) = ax. By G5, fa is oneto-one and hence onto, since G is finite. That is, for each b in G, ax = b has a solution in G. The solvability of ya = b is shown similarly.
We forego giving examples of groups until we have given several more definitions. If for group elements a and b, ab = ba, then a and b commute; if every pair of elements of a group commute, then the group is called commutative or Abelian. Examples that we shall encounter will demonstrate not only the consistency and independence of the set of axioms for a group but also the independence of the set of axioms for a commutative group. If the elements of a group are finite in number then the group is finite and the number of elements is the order of the group. If a group is not finite, then it is infinite. Finally, we mention that analogous to the convention introduced for semigroups, we shall frequently use "G" as a name of the group (G, , 1) if the operation involved is unambiguous.
EXAMPLES 3.1. If A is a nonempty set, then the set of all one-to-one mappings on A onto itself, symbolized G(A), together with function composition and the identity map iA, is a group. This conclusion simply summarizes basic properties of one-to-one correspondences. We shall call this group the group of one-to-one transformations on the set A. 3.2. If n is a positive integer, then congruence modulo n is a congruence relation on the additive group of integers. Consequently an operation + is defined in Z., the set of equivalence classes a, by choosing a + b to be a _+b. It is an easy matter to prove that (Z., +, 0) is a commutative group of order n. 3.3. Congruence modulo n is also a congruence relation on the multiplicative semigroup of integers. This leads to the commutative semigroup (Za, -, 1) where, by definition, a - b = ab. The identity element for the operation is 1 and
since 0 has no inverse, the semigroup is not a group. Discarding 0 does not always overcome the difficulty, since the resulting set may not be closed under multiplication; for example, in Z5, 2 3 = 0. This difficulty is absent if n is a prime p, since then as - b = 0 implies in turn that ab 0(mod p), p divides a or b, either a orb is equal to 0. That is, multiplication is an operation in Z; =
332
Several Algebraic Theories
I
CHAP . 8
Zn - {0}. With I the identity element, to conclude that the system is a group, it remains to prove that each element has an inverse or, in other words, that the equation ax with a P` 0 has a solution in Z. Now 331 = 3 with a P 0 is equivalent to ax = 1(mod p) where p does not divide a. If p does not divide a, then a and p are relatively prime and there exist integers r and s such that ra - sp = 1. But then ram I (mod p) or ra = 1, and a has an inverse. Thus, (Zp, , i) is a commutative group. 3.4. For groups of small order a multiplication table, as described earlier for Boolean algebras, is a practical device for exhibiting the group operation, inverses, and so on. As an illustration, consider the set F of six functions fl, f2, , f6 of a complex variable z, where I'
fl(Z) = Z,
f2(z) = I
f4(Z) =
f6 (Z) = I - Z,
1
z'
fa(z) = z
fe(z) _
z'
z
Z
z - I'
with the composite of fi and f; taken to be f; o f;. Since f, is an identity element for the operation and a is associative, (F, f,) is certainly a semigroup. The following multiplication table shows that actually it is a group and, further, that the group is noncommutative. fl
f2
fa
f4
f6
Is
ft
flf2f3f+f6f6
f2
f2
f3
f1
f6
f4
f6
f3
fa
fl
f2
f6
f6
f4
f4
f4'
f5
f6
fl
f2
f3
f5
f6
f6
f4
f3
ft
f2
f6
f6
f4
f6
f2
f3
f,
There is also the possibility of using this device to concoct groups of sma'l order. For this we start with a nonempty set S of letters a, b, , k, which ar to be the group elements, and fill out a multiplication table in such a way that all the group axioms are fulfilled. The table will exhibit an operation in S ig each entry is a member of S. A much stronger requirement is given by condition G4 in Theorem 3.2. The unique solvability of ax = b for all a and b in S means that in each row of the multiplication table each element of S must appear exactly once. Similarly, the unique solvability of ya = b implies that each column in the table is simply S in some order. A table whose rows and columns fulfill these conditions defines a group if the operation is associative. Unfortua nately, it is not easy to check the associative law directly from a multiplication table unless special preparations are made. 3.5. Let C be the set of all rotations about the origin in a Cartesian plane_ An element of G is a mapping of the form (x, y) -+- (x', y'), where x' = x cos 0 - y sin 0, y' = xsin0+ycos0.
8.4
`
Subgroups
333
Here B is the angle of rotation. Then (G, o, :), where o is a function composition and i is the identity map, is a group.
EXERCISES 3.1. (a) For the real number a, let ta: R -} R be such that xta = x + a for each real number x. Show that (T, i), where T = {t.1a E R-}, - is function composition, and i is the identity map on R, is a group. (b) For the real number a, let sa: R --- R be such that xsa = xa for each
real number x. Show that (S, a, :), where S = {s,I a E R - (0)), o is function composition, and i is the identity map on R, is a group. 3.2. For real numbers a and b with a -' 0, let [a, b] be the mapping on R into itself such that x[a, b] = xa + b. Show that A = {[a, b]ia, b C B. and a ,E 0} is a group under function composition.
3.3. Show that {(1 + 2m)/(1 + 2n)lm, n E Z} is a group under ordinary multiplication.
3.4. Show that {cos r + i sin rjr C (9} is a group under ordinary multiplication.
3.5. Write out a multiplication table for Z,'. 3.6. An operation in {e, f} may be defined as follows: ee = fe = e, of = ff = f. Show that this system satisfies the group axioms G, and G2, but not G3. Construct two other systems to complete the proof of the assertion that the set of axioms for a group are independent. 3.7. In the text an Abelian group is defined to be a group having the further property that ab = ba for all a and all b. Prove that an Abelian group can be characterized as an ordered triple (G, , ') where G is a nonempty set, is a binary operation in C, ' is a unary operation in G, and the following property holds:
if (aa')b' _ (rs')t', then b = (tr')s.
4. Subgroups A group H is a subgroup of a group G if H e G and the restriction of the operation in G to H X H is equal to the operation in H. In other words, the subgroups of a group G are the closed subsets that satisfy the group axioms. Let H be a subgroup of the group G and 1' and 1 be the identity elements of H and G respectively. Then V- 1' = 1' and t' 1' = 1', so 1'- 1' = 1.1'. By the cancellation laws it follows that 1' = 1 ; thus the identity element of a group G is the identity element of any Subgroup H of G. This result is a consequence of the following necessary I111c1 sufficient conditions that a subset of a group determine a subgroup. We have derived it independently in order that it be available for use
fr the proof.
334
Several Algebraic Theories
I
CHAP. a
THEOREM 4.1 . A nonempty subset H of a group G determines
a
subgroup of G if (i) H is closed, and (ii) the inverse (in G) of each member of H is a member of H. Proof. Let H be a nonempty subset of G having the two stated prop. erties. Then there is in H an element a of G and hence aa'' = 1 is in H by (i) and (ii). Since lx = x for x in G, 1x = x for x in H and for each a in H there is in H an element a', namely a-', such that aa' = 1. Thus H satisfies G2. Since H is closed under the operation in G, that operation restricted to H X H is certainly an associative operation in H. Hence, H is a group. Conversely, if H is a nonempty subset of G which determines a subgroup of G, then (i) must hold. Since 1 C H, as observed above, the equation ax = I has a solution in H. Since the only solution of this in all of G is a ', (ii) must hold for H.
COROLLARY. A nonempty subset H of a group G determines a subgroup of G if for all a and b in H, ab-' is in H.
THEOREM 4.2. A nonempty subset H of a finite group G deter. mines a subgroup of G if H is closed. Proof.
This follows from the definition of a subgroup and Theo-
rem 3.4.
THEOREM 4.3.
The intersection of a nonempty collection of sub, groups of a group G is a subgroup of G. The proof is left as an exercise. Every group G includes two subgroups, namely G and [ 1 } ; these are
the improper subgroups of G. Any other subgroup of G is a proper subgroup. Proper subgroups can usually be obtained by the following technique. Let S be a subset of a group G. Then the intersection of all subgroups of G which include S is a subgroup of G which includes S. This is called the subgroup of G generated by S and is symbolized by [S]. The set [S] has the following properties: (i) it is a subgroup of G, (ii) it includes S and (iii) is included in any subgroup of G that includes S. It is easily seen that these three properties characterize [S]. This characterization can he' used to obtain an explicit description of the elements of [S] as the finite products ala2 a (n arbitrary), where
8.4
I
335
Subgroups
the inverse of an element of S. To prove this assertion, a; C S or a; is
let H be the set of such products. In view of Theorem 4.1, H is a
of G and, clearly, H Q S. If K is a subgroup of G that
subgroup includes S, then K contains each member and the inverse of each mem-
ber of S. Hence, K Q_ H. Thus H satisfies the properties which characterize (S], whence [S] = H. The subgroup generated by the unit set (a}
will be called the subgroup generated by a and symbolized by [a]. It consists of all integral powers of a; a° is the unit element and a '° is the inverse of a'. The group [a] is commutative since ama" = a'+n = a"am.
A group C is called a cyclic group if there exists an element a of C such that C = [al. For example, the additive group of integers, (Z, +, 0),
and the additive group of integers modulo r, (Zr, +, 0), are cyclic groups; the first is generated by I and the second by 1. The multiplicative group (ZD, , 1) is also cyclic, but to prove this requires a few facts , exhaust from number theory. The cyclic groups Z and ZT, r = 1, 2, of all essentially different cyclic groups in a sense which the collection we now explain. An isomorphism of a group G onto a group G' is a one-to-one mapping f on G onto G' such that for all x and y in G,
f(xy) = f(x)f(y) where, on the left, the operation in G is in force while on the right it is that in G'. Thus, a one-to-one mapping on G onto G' is an isomorphism if the image of a product is the product of the images. If there exists an iso-
morphism f of G onto G', then G' is called an isomorphic image of G. In this event it is clear that f ' is an isomorphism of G' onto G so that if G' is an isomorphic image of G, then G is an isomorphic image of G'. We say then that G and G' are isomorphic groups. For example, the mapping
f : R+ -} R where f (x) = logio x is well known to be one-to-one and onto and, since log10 Ay = logio x + logio y,
it is an isomorphism of the multiplicative group of positive real numbers onto the additive group of real numbers. Isomorphism is an equivalence relation on any collection of groups and, from the standpoint of group theory, members of an isomorphism-equivalence class are indiscernible.
The sense in which the cyclic groups Z and Z, r = 1, 2, cyclic groups can be inferred from the following theorem.
, yield all
336
Several Algebraic Theories
I
CHAP P. 8
THEOREM 4.4. An infinite cyclic group is an isomorphic image of the additive group of integers and a cyclic group of order r is an isomorphic image of the additive group of integers modulo r. Proof. If C = [a] is a cyclic group, then the mapping f: Z -} C such that f(n) = a" is onto C. If it is not one-to-one, then ar = as for some distinct pair of integers r and s. We may assume that r > s. Then ar = 1, so there exists a positive integer p such that ap = 1. Let n be the smallest positive integer such that all = 1. Then 1 = a0, a,
. , a"-1 are distinct from each other, since ar = as with 0 < s, r < n
implies that ar = I with 0 < r - s < n, which contradicts the choice of n. Moreover, all distinct powers of a appear among a0, a, , a"-'. For since any integer m can be written in the form m = nq + r, 0
f(m + n) = am+n = f(m)f(n), we have shown that an infinite cyclic group is an isomorphic image of Z.
Next assume that C = [a] has order r. According to the preceding part of the proof, r is the least positive integer such that ar = 1 and , ar-' 1. It is left as an exercise for the reader to com= C = 11, a, plete the proof by proving that C is an isomorphic image of Zr.
The notion of a cyclic group provides one means of classifying the elements of any group G. If a C G, then a is of infinite order or finite order r, according as [a] is infinite or is finite of order r. In the first 1 if n is any nonzero integer; in the second case, ar = I and case, a" r is the least positive integer such that ar = 1. By virtue of the simplicity of cyclic groups it is possible to determine all subgroups of a cyclic group in a straightforward way. We discuss
this next. Let C = [a) and let H be a subgroup different from 111. Then H contains a power am of a, where m 0 0. Since, if am C H, then
a m C H, it follows that there exists a positive integer m such that am C H. Let s be the smallest positive integer such that as C H. We shall show that H = [as] and that the mapping g on the set of all sub-
8.4
337
Subgroups
groups H 0 [ 11 into _Z+ such that g(H) = s is one-to-one. To prove the first assertion let am be any element of H and write m in the form
m=sq+u, 0
Then a" =
am(al)_Q
C H, and hence, by the minimality of s, u = 0.
Thus am = (a')°. Since; on the other hand, any power of all is in H,
H = [as]. That g is one-to-one is clear, because if g(H) = s = g(H'), then H = [as] = H'. To complete the investigation of the subgroups of C = [a), we consider separately the cases where C has infinite order and has finite order. If C is infinite, then the mapping g is onto Z+, because if s C Z+, then g [a' ] = s, since the smallest positive power of a in [a-') is s itself.
If C has finite order r, then g is onto the set of positive divisors of r which are less than Y. To prove this we observe that 1 = a' E H and then repeat an argument used above to conclude that r is a multiple of s; that is, s divides r. On the other hand, let s be any positive divisor
of r which is less than r. If r = st, then (a')' = 1 and (a')" 0 1 if o < t' < t. Hence t is the order of [as]. If g[a'] = s', then [as"] = [a'], and hence [as"] has order t. It follows in turn that as" = 1, s't > r = st, and s' > s. Since s' < s by the definition of s, we have s' = s. If C is infinite, the one-to-one correspondence g can be extended to one between the set of all subgroups and the set of natural numbers by choosing 0 as the image of [ 1) . If C has finite order r, then g has a corresponding extension whose range is the set of all positive divisors of r upon choosing r as the image of [ 11. In the finite case, if H corresponds to s, so that H = [as], then the order of H is r/s. Hence another oneto-one correspondence between the subgroups of C and the positive divisors of r results if with each subgroup we associate the order of that subgroup. We summarize our results in the next theorem.
THEOREM 4.5. A subgroup H of a cyclic group C is cyclic. If C = [a] and H 7-1 111, then H = [as], where s is the least positive integer such that a' C H. If C is infinite, then the subgroups [as] of C are in one-to-one correspondence with the set of natural numbers. If C is finite of order r, its subgroups are in one-to-one correspondence
with the positive divisors of r. Alternatively, in the finite case the order of a subgroup is a divisor of r.and corresponding to each divisor t of r there is exactly one subgroup of order t; it is generated by a'".
A subgroup of the group of one-to-one transformations on a set A is called a transformation group on A. Since the theory of groups had
338
Several Algebraic Theories
I
C H A P. 8
its origin in the study of certain groups of this type, a representation problem arises : Is every group isomorphic to a transformation group? This question has an affirmative answer, which was first supplied by Cayley. We state it as our next theorem.
THEOREM 4.6 . For every group G there is an isomorphic transformation group.
As the set on which the transformations shall be defined, we take the set G itself. Consider the mapping ta: G -'- G defined by the group element a as ta(x) = ax for all x in G. Proof.
Since the equation ax = b has a solution in G for given a and b in G, this map is onto G. Since the cancellation laws hold, to is one-to-one. Thus, to is a member of the group of one-to-one transformations on the set G. We show now that { tala E G } is a transformation group L on G. Since Q. ° tb)(x) = ta(tb(x)) = ta(bx) = a(bx) = tab(x),
to o tb = tab and L is closed. Further, t.-' C L, since it is easily shown that to 1 = to Hence L is a group by Theorem 4.1. Next we prove that L is an isomorphic image of G under the correspondence a -'- .. By definition of L, this map is onto L. It is one-
to-one since, if a and b are distinct elements of G, then al 0 bl, and hence to 0 tb The validity of the relation to G tb = tab completes the proof.
EXERCISES 4.1. Prove the Corollary to Theorem 4.1. 4.2. Find two proper subgroups of each of the groups defined in Exercise 3.1. 4.3. Prove Theorem 4.3. 4.4. Complete the proof of Theorem 4.4. 4.5. Let G be the subset of the set A in Exercise 3.2, consisting of those map-
pings with a= f 1 and b C Z. (a) Show that G determines a subgroup of A. (b) Is G Abelian? Is G cyclic?
(c) Determine the orders of [1, 1) and [-1, -1 ]. (d) Specify all values of a and b fqr which [a, b] is a member of H, the subgroup of G that is generated by [1, 2] and [-1, 0]. (e) Specify two members of G which, taken together, generate G.
8.5
1
Coset Decompositions and Congruence Relations
339
4.6. Show that a group of even order has an odd number of elements of order 2. 4.7. Show that if a, b and ab are group elements each of order 2, then ab = ba.
4.8. Prove that if a and b are elements of a group, then ab and ba have the same order.
4.9. Let a and b be elements of a group such that ba = ambn for integers m and n. Show that the elements ambn-Q, am_sb", and ab-' have the same order. 4.10. Let a and b be elements of a group such that b-'ab = ak for some integer k. Show that b-'a'br = a°k'. 4.11. Show that in an Abelian group the product of an element a of order n and an element b of order m is an element of order mn, provided that m and n are relatively prime.
5. Coset Decompositions and Congruence Relations for Groups Let G be a group and H a subgroup. A subset of G of the form {gh1h C H), where g is a fixed element of G, is abbreviated to gH and called a left coset of H in G. Left cosets, along with their "right" analogue, are distinguished types of subsets of a group, as we shall show. Their basic properties include the following.
(I) For any subgroup H of 0, each element of G is a member of a left coset of H. Two left cosets of H are either disjoint or equal. (II) All left cosets of H have the same cardinal number as the set H.
To prove (I) we observe first that, since the unit element I of G is in H, an element g of G is a member of the left coset gH. Next, sup-
pose that two cosets aH and bH have a common element c. Then c = ah, = bh2, and hence a = bhs, where h8 C H. Hence ah C bH for all
h in H, which means that aH a bH. Reversing the roles of a and b gives bH S aH and hence aH = bH. Property (II) is established by the mapping h --i- gh on H into gH. From (I) it follows that there exists a family {g;Hhi C I) of left cosets of H that is a partition of the set G. This is the left coset decomposition of G modulo H. Clearly the set G is the union over a left coset decompo-
sition of G. The cardinal number of the left coset decomposition of G modulo H is the index of H in G, symbolized (G: H). In view of (II)
we have the following relation among the cardinal numbers and (G: H) :
0 = (G: H) I.
Several Algebraic Theories
340
I
CHAP . 8
Now the cardinal number of any group G may be written as an index, indeed (G: { 1 D. This is usually shortened to (G: 1); With this notation the above relation may be written as (G: 1) = (G: H) (H: 1).
It is left as an exercise to prove the following generalization : If G is a group, H is a subgroup of G, and K is a subgroup of H, then K is a subgroup of G and (G: K) = (G: H) (H: K). If G is a finite group of order n and H a subgroup of order m, we have n = (G: H)m, which implies that m divides n. This is a famous result due to Lagrange. We state it along with two immediate consequences as our next theorem.
THEOREM 5.1. The order of a subgroup of a finite group divides the order of the group.
COROLLARY 1. The order of an element of a finite group divides the order of the group.
COROLLARY 2. A group whose order is a prime is cyclic. If G is a group and H a subgroup, then a subset of G of the form {hgIh E H} where g is a fixed element of G is abbreviated to Hg and called a right coset of H. Properties (I) and (II) above hold for right cosets. The family {Hg;l j C J} of right cosets of H that is a partition of G is the right coset decomposition of G modulo H. It is left as an exercise to show that the set of inverses of the members of a left coset of H is a right coset of H and that, consequently, the left and right coset decompositions of G modulo H are similar sets. Therefore the index (G: H) can also be determined from the right coset decomposition. For later applications we introduce some notation which extends that used for cosets. Let A and B be subsets of a group G. By AB we shall
mean labia C A and b C B. If one of these subsets, for instance A, is simply { a }, then we shall write aB instead of { a } B. The extension of
this notation to more than two subsets is clear. In additive notation we shall write A + B in place of AB. In particular, a left coset modulo a subgroup H will be written as a + H and a right coset as H + a.
8.5
I
Coset Decompositions and Congruence Relations
341
EXAMPLES 5.1. Referring to the group Fwhose multiplication table is given in Example is a subgroup. The left coset decomposition of F modulo H is 3.4, H {H, f2H, faH) _ { {fl, f4}, {fs, f6}, if., f6) ) and the right coset decomposition modulo H is {H, Hfz, Hfa) = { {h, f4}, {f2, ffi}, {f3, f6} } .
It should be observed that these are different partitions of F. In a commutative group, the left cosets and right cosets of a subgroup are identical, of course. For example, in (Z,2, +, 0) the left and right coset decomposition modulo the sub-
group H= {0,4,$} is {H,1 +H,2+H,3+H}. 5.2. In the multiplicative group C* of nonzero complex numbers rei6 (r > 0, 0 real), the subset R+ of all positive real numbers is a subgroup. The coset decomposition of C* modulo R+ can be described geometrically as the collection of rays, with initial point deleted, issuing from the origin in the complex plane. If instead of R+ we start with the subgroup U of all complex numbers such that r = 1, then the coset decomposition of c* modulo U can be described geometrically as the collection of all circles with positive radii and centered at the origin in the complex plane.
Given a group G and a subgroup H, let 0 be the equivalence relation on G corresponding to the left coset decomposition of G modulo H. Thus, by definition, a B b if a and b are in the same left coset of H or, what is easily proved to be the same, if a 'b E H. The relation 0 has the further property that a Ob implies that ca 0 cb for all c in G. That is,
0 satisfies one of the two requirements [see (C1) in Section 11 for a congruence relation on G. We shall call 9 a left congruence relation on this account. How left congruence relations on G and subgroups of G are related is described next.
LEMMA 5.1 . Let G be a group and B be a left congruence relation on G. Then H=Ix E G(x 0 l} is a subgroup of G and a 0 b iff a 'b C H (or, alternatively, if a and b are members of the same left coset of H).
Conversely, if H is a subgroup of G, then the relation B such that a 0 b if a -'b C H is a left congruence relation on G. The correspondence of subgroups to left congruence relations is a one-to-one correspondence between the set of left congruence relations on G and the set of subgroups of G. Proof. Let 0 be a left congruence relation on G and consider H = {x E Gjx 6 11. Since 1 E H, this set is nonempty. Assume that a,
Several Algebraic Theories
342
i
CHAP. 8
b E H. Then b 0 1 and hence ab 0 a. Since a 0 1 and 0 is transitive, it follows that ab 0 1, whence H is closed. If a C H, so that a 8 1, then a-'a 8 a-', whence a' 0 1 or a' C H. Therefore H is a subgroup. Next, if a 0 b, then, in turn, a -'a 0 alb, a 'b 0 1, a -'b C H. Each of these steps is reversible, so that a 0 b if a-'b C H.
Turning to the converse, the fact that a 'b E H if a and b are in the same left cosct of H, coupled with the fact that the left coset decomposition of G modulo H is a partition of G, implies that the relation 0 defined in the lemma is an equivalence relation on G. That, in addition, a 0 b implies ca 0 cb, is a consequence of the identity
a 'b = (ca)-'(cb). The proof of the last assertion of the lemma is left as an exercise.
The preceding lemma has an analogue for right congruence relations (that is, equivalence relations 0 such that if a 0 b then ac 0 bc) for a group G. They determine and are determined by right coset decompositions of G modulo subgroups H of G. Now let 8 be a congruence relation on G (that is, simultaneously a left and right congruence relation). As a left congruence relation, 0 determines a subgroup H of C such that the equivalence class determined by an element g in G is gH.
As a right congruence relation, 8 determines the same subgroup H (note that II is defined independently of left congruency) and the equivalence class determined by g is Hg. Hence, for all g in G, Jig = gH or, what is equivalent, g'Hg = H. A subgroup II of C such that g -111g = H for all g in G is a normal or invariant subgroup of G. Thus a congruence relation 8 on a group G determines a normal subgroup H of C. Indeed, from Lemma 5.1 it is immediate that the congruence relations on G are in one-to-one correspondence with the normal subgroups of C. If to the congruence relation 0 on G corresponds the normal subgroup H of G, it is customary to denote the quotient set G/8 by G/H. We shall do this. Further, we shall often write the element gH of G/II as ff. We already know (see Section 1) that an operation is defined by G/II by the rule
and proceed to show that (G/H, -, 1) is a group, the quotient or factor
group G modulo H. The associativity of the operation in C/II is inherited from that of the operation in G, the element 1 is clearly an identity clement, and, finally, a-'-' is a solution of the equation TO = I. The operation in G/H admits of an alternative description. Goscts of H
8.5
I
Coset Decomposition and Congruence Relations
343
are subsets of G, and hence can be composed using the operation in G as described prior to Example 5.1. With H normal the product (aH) (bH)
is equal to abH, as the reader can prove. But the element abH of G/H is the product of the elements aH and bH of G/H. Thus, the operation in G/H may be interpreted as one for (restricted) subsets of G. EXAMPLES 5.3. Suppose that C is an additive commutative group. Then our foregoing results take the following form. If 0 is a congruence relation on G, then H = {a C Gja0O}
is a'subgroup of G and aOb if -a + b (or, equivalently, a - b) is in H. Conversely, if His a subgroup of G, then the relation 0 such that aOb if a - b C H is a congruence relation on C. In the quotient group G/H (or what is often called the difference group, G - H, in this case) the operation reads
(a+H)+(b+H) = (a+b)+H. 5.4. To assist the reader in acquiring familiarity with the additive notation introduced in the preceding example, we reestablish the fact that 0) (see Example 3.2) is a group. The normal subgroup corresponding to congruence modulo n in the additive group of integers is the cyclic group [n]. Its cosets are
[n],1+[n],...,(n-1)+[n]
and these are the elements of (Z/[n], +, [n]). 5.5. It is left as an exercise to show that the intersection of a collection of normal subgroups of a group is a normal subgroup. For a group G we may then define the normal subgroup generated by a subset S as the intersection of all the normal subgroups that include S. It is left as another exercise to prove that the normal subgroup generated by S is the subgroup generated by the subset T of G consisting of all elements of the form g 'sg for some g in G and some s in S.
To describe the relationship of a quotient group G/H to G, a definition is needed. A homomorphism of a group G onto a group G' is a mapping f on G onto G' such that for all x and y in G, f(xy) = f (x)f (y). That is, a homomorphism onto differs from an isomorphism onto only in that a homomorphism need not be one-to-one. If there exists a homom )rphism of G onto G', then G' is called a homomorphic image of G.
By virtue of the definition of the operation in a quotient group it is clear that if G is a group and G/H a quotient group, then G/H is a homomorphic image of G under the natural mapping on G onto G/Hthat is, the mapping p on G onto G/H such that p(x) = xH. We con-
344
Several Algebraic Theories
I
CHAP. 8
sider next the converse situation. Let G' be a given homomorphic image
of G and f the accompanying homomorphism. Then the equivalence relation 0 on G associated with f, namely, a 0 b iff f(a) = f(b), is a congruence relation on G. The corresponding normal subgroup K of G, namely, {a C Gjf (a) = 11, is called the kernel of the homomorphism f. The quotient group G/K is isomorphic to G'. Indeed, the relation g, which we define as {(x, f(x))Ix C G/K},
is a function on G/K onto G' such that
g(xy) = g( ) = f(xY) = f(x)f(Y) = g(x)g(y) That is, g is an isomorphism. Further, if p is the natural mapping on G onto G/K, then f = g c p. That is, any homomorphic image of a group G can be duplicated to within an isomorphism by some quotient group of G. We state our results in our next theorem.
THEOREM 5.2. If G is a group and K a normal subgroup, then the quotient group G/K is a homomorphic image under the natural mapping on G onto G/K. Conversely, if the group G' is a homomorphic image of G, then those elements which are mapped onto I determine
a normal subgroup K of G and G/K is isomorphic to G'. If f : G - G' is the given homomorphism, then f = g o p where p is the natural mapping on G onto G/K and g is an isomorphism of G/K onto G'.
EXAMPLES 5.6. We illustrate the above theorem by using it to derive again Theorem 4.4 concerning the structure of cyclic groups. Let G be a multiplicative cyclic group generated by a. The mapping m a°' is a homomorphism of the additive group of integers onto G. Hence C is isomorphic to Z/K, where K is the kernel of the homomorphism and, in particular, a subgroup of Z. Now it is easily proved that
the only subgroups of Z are the cyclic groups [n]. If K = [0], then m -} a'" is an isomorphism and G is isomorphic to Z. Otherwise G is isomorphic to Z/[n], a cyclic group of order n. It follows immediately that two cyclic groups are isomorphic if they have the same order. For this reason it is common to speak of "the" cyclic group of infinite order and "the" cyclic group of order n. 5.7. Every subgroup of a commutative group is normal and consequently determines a quotient group. Thus, the subgroup R+ of all positive real numbers of the multiplicative group C* pf nonzero complex numbers determines a quotient group; C*/R+ is isomorphic to the additive groups of real numbers. Again, the quotient group of C* modulo U, the subgroup of complex numbers
8.5
'
Coset Decomposition and Congruence Relations
345
of absolute value 1, is isomorphic to the multiplicative group R* of nonzero real numbers. 5.8. We note that with the above theorem a homomorphism can be shown to be an isomorphism by proving that its kernel is {1}. 5.9. If f : G -+- G' is a homomorphism of the group G onto the group G', then
f(l) = 1', the identity element of C' and f(a ') = (f(a))-'. 5.10. Suppose that G is a group, G' is a set in which a binary operation is defined, and f is a mapping on G onto G' such that f(ab) = f(a)f(b). Then G' is a group.
EXERCISES 5.1. Verify the relation (G: K) = (G: H)(H: K), given in the text. 5.2. Establish the two Corollaries to Theorem 5.1. 5.3. Prove the assertion made in the text that if G is a group and H is a subgroup, then there exists a one-to-one correspondence between the left coset decomposition of G modulo H and the right coset decomposition of G modulo H. 5.4. Let G be a group and H and K be subgroups of finite orders. Show that if these orders are relatively prime, then H (1 K = {1} . 5.5. Let G be a group having H and K as subgroups. Show that any left coset of H (1 K is the intersection of a left coset of H and one of K. Use this to deduce
that if H and K have finite index in G then so has H (l K. 5.6. Let H and K be two finite subgroups of a group G. Show that the subset HK of G contains precisely (H: 1)(K: 1)/(H (1 K: 1) distinct elements. 5.7. Let G be a group having H and K as subgroups. Show that HK is a subgroup iff HK = KH. .5.8. Supply the missing part of the proof of Lemma 5.1.
5.9. Let G be a group and H a subgroup. Under what circumstances is xH - - Hx a mapping on the left cosets of H onto the right cosets of H? 5.10. Show that if for a subgroup H of a group G, g -'Hg C H, for all g in C, then H is normal in C. 5.11. Show that if H is a subgroup of a group G, then g'Hg, for g C G, is a
subgroup isomorphic to H. Let N = n {g'Hg!g C G} and show that N is a normal subgroup of C, indeed the largest normal subgroup of G included in H. 5.12. Prove that if H is a normal subgroup of a group G, then (aH) (bH) = abH. 5.13. Establish the assertions made in Example 5.5. 5.14. Let if be a collection of distinct subsets Si of a given group G with the following properties.
(a) Every element of G is in at least one Si. (b) No Si is a proper subset of an S,. (c) The product of any two members of if is included in a member of if. Show that if is the coset decomposition of a normal subgroup of G.
346
Several Algebraic Theories
I
CHAP . 8
5.15. Let C* be the set of nonzero complex numbers z = re2:ir, U be the set of complex numbers of absolute value 1, and R, C, and R* have their usual meanings. Investigate each of the following mappings-deciding which are homomorphisms, which are isomorphisms, and so on.
(a) f : (R+,
, 1) - - (R, +, 0) (b) f: (C*, , 1) -} (U, , 1)
where f (x) = In x, wheref(z) = eYrip,
(c) f : (R, +, 0) -*- (U, , 1) (d) f: (C*, , 1) ->- (R+, , 1)
where f (,p) = esrip,
wheref(z) _ IzI*'
5.16. If G is a group, elements of the form x .ty =1xy are called commutators.
Prove that the subgroup C generated by the set of all commutators of G is a normal subgroup, that G/C is Abelian, and, if N is any normal subgroup of C such that GIN is Abelian, then C C N.
6. Rings, Integral Domains, and Fields A ring (with identity element) t is an ordered quintuple (R, -}-, , 0, 1),
where R is a set, + and
are binary operations in R, 0 and I are dis-
tinct members of R, and the following conditions are satisfied. Rt. ring).
(R, +, 0) is a commutative group (the additive group of the
R2.
(R, , 1) is a semigroup with identity element (the multi-
plicative semigroup of the ring). Rs. The- following distributive laws hold :
a(b + c) = ab + ac,
(b + c)a = ba + ca.
EXAMPLES 6.1. The statement that (Z, +, , 0, 1) is a ring summarizes many of the basic properties of the system of integers. To be precise, it is a concise formulation of properties (1)-(5), (7), and (8) in Theorem 3.3.1 of this system.
6.2. The system of rational numbers and that of the real numbers provide further models of the theory of rings.
6.3. The set Z[V'5] of all real numbers of the form m + n1'5, where m, n C Z, together with the familiar operations and 0 and 1, is a ring. 6.4. (Zr, +, , 0, 1) (see Examples 3.2 and 3.3) is an example of a finite ring, that is, a ring such that the basic set has a finite number of elements. t The usual definition of a ring does not require the existence of an identity element. However, since those rings which interest us have an identity element, we have incorporated this requirement into our definition at the outset. The assumption that 0 and I are distinct elements of R serves to rule out the extreme and trivial case of a ring such that R consists of a single element.
8.6
Rings, Integral Domains, and Fields
347
6.5. If (B, (, ', 0, 1) is a Boolean algebra, then it is possible to introduce operations in B such that the resulting system is a ring. For addition in B we choose the symmetric difference operation; that is, if a, b C B, we define a + b = (a ( b') U (b (
a').
For multiplication in B we take (1 and henceforth use the customary ring notation ab for a (l b. Then (B, +, , 0, 1) is a ring. The reader is asked to prove this and derive properties of such a ring in the exercises for this section.
Many of the computation rules of ordinary arithmetic carry over to arbitrary rings. First of all, the definitions and properties in Section 2 pertaining to powers of an element and those pertaining to multiplica-
tion apply to the additive group and the multiplicative semigroup, respectively, of any ring. In addition to the earlier rules for multiples we have the rules
n(ab) = a(nb) = (na)b.
(1)
These follow from the general distributive laws
aE;- ibi = E;-iabj, (X:=i b:) a which, in turn, are easily proved by induction. We call attention to the
fact that the multiple na of a ring element a should not be confused with a ring product. However, since we are assuming that a ring always has an identity element, we can write
na = la + la +
+ 1a(n summands) = (1 + 1 +
+ 1)a = (nl)a
and the last is a product. The distributive laws hold for subtraction in a ring: (2)
a(b - c) = ab - ac, (b - c)a = ba - ca.
To prove, for example, the first of these, we must show that a(b - c) + ac = ab. But this follows directly from the first distributive law in Ra,
since (b - c) + c = b. For b = c identities (2) yield the following important properties of the ring element 0: (3)
aO = Oa = 0,
for all a in R. In particular, Oa is equal to the ring element 0 whether "0" in Oa is the ring element or the natural number zero. If in (2) we set b = 0, we get
a(-c) = -ac, (-c)a = -ca,
348
Several Algebraic Theories
I
CHAP. 8
and if in the first of these identities we replace a by -a we obtain
(-a)(-c) _ -(-a)c = (- -a)c, whence
(-a)(-c) = ac. An element a of a ring Rt is called a left (or right) zero-divisor if there exists in R an element b 0 0 such that ab = 0 (or ba = 0). By (3) the element 0 is both a left and right zero-divisor, since by assumption our rings contain more than one element. A proper zerodivisor is a zero-divisor which is different from 0. A ring has a proper zero-divisor if it contains a pair a, b of nonzero elements such that ab = 0. We shall say that a ring is without zero-divisors if it has no proper zero-divisors. Since an element of the ring R is an element of the semigroup (R, , 1), the definition of an inverse of a ring element is at hand. A ring element
is called a unit if it has an inverse. According to Section 2, if a has an inverse, it is unique; the inverse of a will be denoted by a-'. Again according to Section 2, if a and b are units then also a-' and ab are units, which implies that the set of units of a ring form a group. The element 0 is not a member of the group of units of a ring since for every element a
in R,aO=Oa=0F& 1. Various specialized types of rings are obtained by imposing conditions on the multiplicative semigroup at hand. For example, a ring is said to be commutative if its multiplicative semigroup is commutative. A commutative ring R (with identity element) having no proper zero-
divisors is called an integral domain. The latter condition means simply that the set R* of nonzero elements of R is closed under multi-
plication. A ring R is called a division ring (or skew field) if R* is closed under multiplication and (R*, , 1) (where now the domain of is restricted to R* X R*) is a group. Finally, a division ring is called a field if multiplication is a commutative operation. Referred back to the
definition of a ring, the field (R, +, -, 0, 1) is a ring such that the set R* = R - {0} is closed under multiplication and (R*, , 1) is a commutative group. EXAMPLES 6.6. For any ring R we now define the ring Rc2) of 2 X 2 matrices with elements in R. The elements of R(2) are all arrays or matrices (a)
_ (au al)
ali a22/ f Henceforth we shall often call the ring (R, , -h, 0, 1) simply "the ring R."
8.6
I
Rings, Integral Domains, and Fields
349
of two rows and columns with elements a;; in the ring R. The element a;; located at the intersection of the ith row and jth column of (a) will be called the i,j-element of (a). Two matrices (a) and (b) are defined to be equal if aq = b11 for all i and j. Addition of matrices is defined by the formula au aiz1/ /(bit b1z _ all + bu a1 + b12 azj
ass/
\b21
bzz)
azl + bat
1722 + b22
It is easily proved that (R(2), +, 0), where 0 is the matrix all of whose elements are 0, is a commutative group. The negative of (a) is the matrix having -a;; as its Q -element. Multiplication of matrices is defined by the formula
(au \a21
a12\ (b11 azz/ \b21
b12\
= anbll + alzbzl
bzz/
as,bnl + azzb21
anb12 + alzbzz azlblz + azzbn>.
That is, the i, j-element of the product is the sum of the products of the elements of the ith row of (a) and the corresponding elements of the jth column of (b). The matrix (1
`0
0) 1
is an identity element for this operation and (R(2), , 1) is a semigroup (see Exercise 2.4). Further, the distributive laws R3 hold, so Rc2 is a ring. This ring is not commutative, since
(01 00)(0
0)=(0 0) and
(0 0) (0
0O)=(O
0)'
The second equation exhibits two proper zero-divisors in R(z). 0 6.7. A characterization of integral domains among commutative rings can be given in terms of the (restricted) cancellation law for multiplication:
ac = be and c 0 0 imply that a = b. Indeed, if for elements a, b, and c in any ring without zero-divisors, ac = be and c # 0, then (a - b)c = 0 where c 0 0. It follows that a - b = 0, whence a = b. Conversely, if the above cancellation law holds in a ring, then ab = 0 and b 0 0 imply that ab = Ob and b 0 0, whence a = 0. In summary, a commutative ring is an integral domain if the cancellation law for multiplication holds.
The system of integers is an integral domain. This statement summarizes parts (1)-(9) of Theorem 3.3.1. 6.8. A ring R such that every element is idempotent (a'- = a) is commutative and each element is equal to its negative. To prove this we notice that for all elements a and b of such a ring
a+b = (a+b)z=az+ab-l--ba+bz=a+ab+ba+b, whence (4)
ab + ba = 0.
Several Algebraic Theories
350
I
CHAP P. 8
Setting b = a in this identity yields the identity a$ + a$ = 0. Since as = a, it follows that a + a = 0 or, in other words, each element is its own negative. In particular, the negative of ab is ab and this fact, together with (4), implies that ab = ba, thereby completing the proof. 6.9. The ring Z. of integers modulo n is a field if the modulus is a prime (see Example 3.3). 6.10. According to Theorem 3.4.1 the system of rational numbers is a field. This is a restatement of parts (1)-(10) of that theorem. According to Theorem 3.6.1, the system of real numbers is a field.
EXERCISES 6.1. Show that in the definition of a ring R (with identity element 1) the requirement that 0 0 1 may be replaced by the requirement that R contain an element different from 0.
6.2. Prove that the set Z[V'5J of all real numbers of the form a + b/ where a, b E Z, together with addition, multiplication, 0, and 1, is a ring. 6.3. Which of the following sets, together with addition, multiplication, 0, and 1, is a ring?
(a) The set of all real numbers of the form a +
b E Z.
(b) The set of all real numbers of the form a + b/ +cY where a, b,
cCZ. (c) The set of all rational numbers which can be expressed in the form m/n, where m is an integer and n is a positive odd integer.
6.4. Suppose that (R, +, , 0, 1) is a ring and that in R we introduce new operations 0 and 0 by way of the following definitions.
aO+ b=a+b - 1,aOb=a+b-ab. Show that (R, O, 0, 1, 0) is a ring. Describe the ring which results from this ring if new operations are introduced in R by repeating the same definitions.
6.5. Referring to Example 6.5, prove that (B, +, , 0, 1) is a ring, all of whose elements are idempotent. 6.6. By a Boolean ring is meant a ring (with identity), all of whose elements are idempotent. According to Exercise 6.5, a Boolean algebra determines a Boolean ring. Using the results in Example 6.8, show that, conversely, a Boolean ring determines a Boolean algebra upon defining
a U b = a + b + ab, anb = ab. Further, show that the processes of deriving a Boolean algebra from a Boolean ring and of deriving a Boolean ring from a Boolean algebra are inverses of each other. Thereby a one-to-one correspondence between Boolean algebras and Boolean rings is established, a result which was first proved by Stone (1936). 6.7. Prove that a finite integral domain is a division ring.
8.7
(
Subrings and Difference Rings
351
6.8. Referring to Example 6.6, prove that if R is a commutative ring and (a)(b) = 1 for (a), (b) C Rte>, then (b)(a) = 1. 6.9. We assume it known that the set Q of complex numbers forms a field. Show that the set of all matrices of Q(2) having the form ( ab al where X is the complex conjugate of x, forms a division ring which is not a field.
6.10. If a is a ring element, then an element b of that ring, such that ab = 1, is called a right inverse of a. Prove that the following conditions on a are equivalent.
(a) a has more than one right inverse. (b) a is not a unit. (c) a is a left zero-divisor. 6.11. Prove that if a ring element has more than one right inverse, then it has infinitely many. (Hint: Consider the set of ring elements b + (I - ba)a", where ab = I and n = 0, 1, 2, ..)
6.12. Prove that a ring R is an integral domain if for all a, b, and c in R and b -' 0, ba = cb implies that a = c.
7. Subrings and Difference Rings A ring S is a subring of a ring R if S C R and the restriction of addition and multiplication in R to S X S are equal, respectively, to addition and multiplication in S. Having chosen to restrict our attention to rings with an identity element, we shall insist further that a subring S of a ring R contain an identity element. It follows that if a subset S of a ring R is a subring, then (sec the Corollary to Theorem 4.1) it must satisfy the following conditions.
(i) If a, b C S, then a - b C S. (ii) If a, b C S, then ab C S. (iii) There exists an element 1, in S such that 1,x = A. = x for all x in S.
Conversely, it is clear that these conditions are sufficient to insure that a subset S of a ring R form a subring. It is possible for the identity element 1, of a subring S to be different from the identity element 1 of R (see Example 7.4 below). In that event 1, is a zero-divisor of R. For by assumption there exists in R an element a such that 1,a = b 96 a. Since 1,b = 1,(1,a) = 1,a = b,
Several Algebraic Theories
352
I
CHAP P. 8
it follows that l,a = 1,b, and hence l,(a - b) = 0. Thus, 1, is a (proper) zero-divisor of R since a 54- b. As a corollary there is the fact that if R is an integral domain, then the identity element of a subring is necessarily the identity element of R. A field S is a subfield of a field F if S C F and the restriction of addition and multiplication in F to S X S are equal, respectively, to addition and multiplication in S. Since a field is an integral domain, the identity element of S is the identity element of S. This also follows from the fact that the multiplicative group of S must be a subgroup of the multiplicative group of F. This condition, together with the condition that S be a subgroup of the additive group of F, characterizes the notion of a subfield. Hence, S is a subfield of F if the following conditions hold.
(i) a, b C S imply that a- b C S. (ii) a, b C S and b s' 0 imply that ab-I E S. EXAMPLES 7.1. The set of all matrices in R(2 (see Example 6.6) of the form
C 0\ determines a subring of RO). 7.2. The field of rational numbers is a subfield of the field of real numbers. 7.3. The intersection of any nonempty collection of subfields of a field F is a subfield of F. 7.4. Let A and B be rings with identity elements IA and IB, respectively, and
let R be the set of all ordered pairs (a, b) where a C A and b C B. We define operations in R as (a, b) + (a', b') _ (a + a', b + b'), (a, b)(a', b') _ (aa', bb').
It is an easy calculation to prove that R is a ring having (1A, IB) as identity clement. Further it is clear that RA = {(a, 0)Ia C A} is a subring of R having (1A, 0) as identity element. Thus the identity element of RA is distinct from that of R.
The definition of a congruence relation for an algebra (Section 1) takes the following form in the case of a ring. A congruence relation 0 on a ring R is an equivalence relation on R such that for all a, b, and c in R, a 0 b implies that c + a,O c + b, (Cm1) a 0 b implies that ca 0 cb, (Cmr) a 0 b implies that ac 0 bc.
8.7
I
Subrings and Difference Rings
353
The right-hand analogue of (C,) is superfluous since addition is cornmutative. The condition that multiplication preserve equivalent elements has been written in two parts for easy reference. Now (Ca) means that 0 is a congruence relation on the group (R, +, 0). Hence, (i) 0 determines (and, is determined by) the subgroup S = { s C Ris 0 0 } of R (see Example 5.3), (ii) a 0 b if a - b C S, and (iii) 0-equivalence classes are left (= right) cosets a + S of S in R. Addition can be defined in R/S in terms of representatives [that is, (a + S) + (b + S) = (a + b) + S] and (Cmt) and (Cmr) are additional necessary and sufficient conditions that multiplication can be defined similarly. Let us translate these into conditions for S. From (Cmi) we infer that if r E R and s C S, then rs C S, since s C S means s 0 0, and hence rs 0 rO or rs 0 0. Conversely, if a
subgroup S' of R has the property that r C R and s' C S' imply that rs' C S', then the additive congruence relation 0' which S' determines satisfies (Cmi), since if a 0'b then, in turn, a - b C S', c(a - b) C S', ca - cb C S', ca 0 'cb. Similarly, (Cmr) holds for a relation 0 which satisfies (CB) if the subgroup S corresponding to 0 has the property that r C R and s C S imply that sr C S. There follows the existence of a one-to-one correspondence between the congruence relations on R and the subgroups S of the additive group of R such that r E R and s C S imply that rs, sr C R. A subset S of a ring R such that S is a subgroup of the additive group of R and, for all r in R and s in S, both rs and sr are in S, is called an ideal of R. Thus, a nonempty subset S of R is an ideal if
(i) sCSandICSimplythats - ICS, (ii) s C S and r C R imply that rs, sr C S. The results obtained above may now be summarized by the statement that the congruence relations on R are in one-to-one correspondence with the ideals of S. As one might suspect, ideals are the analogue for rings of normal subgroups for groups. Every ring R has at least two ideals, namely, the entire ring and {0}. The ideal R of R corresponds
to the universal relation on R and the ideal 101 corresponds to the equality relation on R.
EXAMPLES 7.5. If R is a commutative ring (with identity element) and a C R, then Ra = {ralr C R} is an ideal called the principal ideal generated by a. Since R = Rl and {0} = R0, both R and {0} are principal ideals. 7.6. A field F has only two ideals, F and {0}, for if I is an ideal of F and
354
Several Algebraic Theories
(
CHAP. 8
I s {0}, then I contains a nonzero element a and hence I contains a -'a = 1, whence I = F. 7.7. If a commutative ring R has only two ideals, then it is a field. For let a be a nonzero element of R and consider Ra. This principal ideal contains la = a and hence is different from {0}; hence, it is equal to R. But this implies that the equation xa = 1 has a solution for every a 0 0. 7.8. We recall that in Section 6.4 we defined the notion of an ideal of a Boolean algebra. It is left as an exercise to prove that the ideals of a Boolean algebra B coincide with the ideals of the corresponding Boolean ring B (see Exercise 6.6).
Now we can get to the whole point of this discussion. Let R be a ring and S be an ideal of R which is distinct from R. Then we know that operations can be introduced in R/S, the collection of cosets a + S of S in R (that is, the 0-equivalence classes where 0 is the congruence relation corresponding to S) by the following definitions:
(a+S)+(b+S) =(a+b)+S, (a + S) (b + S) = ab + S.
Further, we know that (R/S, +, S) is a commutative group. Also, since
S 0 R by assumption, 1+ S 0 S and 1+ S is an identity element for multiplication. Finally, it is a straightforward exercise to prove that (R/S, +, , S, 1 + S) is a ring, the so-called difference (quotient, residue class) ring of R modulo the ideal S. EXAMPLES 7.9. It is an easy matter to determine all ideals of the ring Z of integers. Since an ideal of Z is a subgroup of (Z, +, 0) it has the form [r], that is, the set of all multiples of r (see Section 4). But it is clear that each such subset is an ideal, indeed, the principal ideal Zr. (That is, Zr in ring notation is [r] in group notation.) Since Zr = Z(-r), it follows that Zr for r = 0, 1, 2, exhaust the ideals of Z. The difference ring Z/Zr is Z if r = 0. Since Z1 = Z we exclude the
value I for r. If r > 2, Z/Zr has r elements
0=Zr,I =1+Zr, ,1=r-1+Zr. This is the ring we denoted by Z. earlier. If r is a composite number, say r = mn with m > 1 and n > 1, then m # 0 and n s 0, but MR = i = 0. This shows that Z/Zr is not an integral domain if r is composite. On the other hand, if r is a prime then we know (see Example 3.3) that Z/Zr is a field. 7.10. Some properties of rings carry over to each of their difference rings. For example, if R is a commutative ring then R/S is commutative. But if R is an
8.7
I
Subrings and Difference Rings
355
integral domain, then the same need not be true of a difference ring, as the preceding example shows.
A homomorphism of a ring R onto a ring R' is a mapping f on R onto R' such that for all x and y in R,
f(x + y) = f(x) +f(y'), f(xy) = f(x)AY) If there exists a homomorphism of R onto R', then R' is called a homomorphic image of R. A homomorphism of R onto R' which is one-toone is called an isomorphism and R' is called an isomorphic image of R. If f is an isomorphism of R onto R', then f-1 is an isomorphism of R onto R' and hence each ring is an isomorphic image of the other. In
this event we shall refer to R and R' simply as isomorphic rings. By virtue of the definition of operations in a difference ring it is clear that a
difference ring R/ S of a ring R is a homomorphic image under the natural mapping a --} a + S on R onto R/S. We go on to show next that, conversely, every homomorphic image of a ring R is isomorphic to a difference ring of R. Let f : R --} R' be a homomorphism of the ring R onto the ring R'. Then f is a homomorphism of the additive group R onto the additive group R', and hence (Theorem 5.2) if S is the kernel of f (thus S is the inverse image of the zero element of R'), f = g o p, where p is the natural map on R onto (the additive group) R/S and g is an isomorphism of R/S onto R' [indeed, g(a + S) = f (a) ]. The further property off, that it preserves multiplication, implies that S is an ideal of R. Indeed, if r E R and s C S, then f(rs) = f(r)f(s) = f(r)O' = 0', whence rs C S. Similarly, if r C R and s C S, then sr C S. Hence, g establishes R' as an isomorphic image of R/S. We summarize our results in the next theorem.
THEOREM 7.1. The difference ring R/S of the ring R modulo the ideal S of R is a homomorphic image of R. Conversely, any homomorphic image of R is isomorphic to the difference ring R/S, where S is the kernel of the homomorphism regarded as a homomorphism of the additive group R. We conclude this section with the introduction of some terminology
which will have applications later. A ring R is said to be imbedded in a ring S if S includes an isomorphic image R' of R. If R is imbedded in S then S is called an extension of R. If R is imbedded in S it is possible to construct a ring isomorphic to S which actually includes R as a sub-
ring. One rarely bothers to do this, however, since usually it is not
356
Several Algebraic Theories
I
CHAP P. 8
necessary to distinguish between isomorphic rings. Instead, one "identifies" R with R' which, practically speaking, means that henceforth one regards S as actually including R. Alternatively, one can think of
discarding R, using R' in its place, and appropriating the names of elements of R for use as names of the respective image elements in S. It was this latter point of view which was adopted in Chapter 3 in the successive extensions of the natural number system to the real number system.
EXERCISES 7.1. Prove that the intersection of a nonempty collection of subfields of a field F is a subfield of F. 7.2. If a and b are distinct elements of a field F, we define a new addition
8 and a new multiplication O in F as
x©y=x+y-a,x 0 y = a + (x - a)(y - b)(b - a)-'. Prove that (F, (D, (D, a, b) is a field. 7.3. Prove the assertion made in Example 7.8. 7.4. Prove that under a homomorphism the zero and identity element of a ring map onto the zero and identity element, respectively, of the image ring and that negatives map onto negatives. Remark. For the remaining exercises assume that the definition of a ring is modified by discarding the requirement that an identity element be present. Then, for example, the set of even integers forms a ring. Further, assume that by an integral domain is meant simply a ring (in the above sense) with no proper zero-divisors.
7.5. Show that a ring A can be imbedded in a ring with identity element. Hint: In B = Z X A introduce the operations
(m, a)+(n,b) _ (m+n,a+b), (m, a)(n, b) _ (mn, na + mb + ab), where na and mb are the nth multiple of a and the mth multiple of b, respectively. Prove that B with these operations forms a ring having (1, 0) as identity element
and that A is imbedded in B. 7.6. If the ring A of Exercise 7.5 is an integral domain, then the ring B need not be an integral domain. Establish this fact by taking for A the ring of even integers. Remark. The next three exercises are devoted to proving that it is possible to imbed an integral domain in an integral domain with an identity element. 7.7. Let A be an integral domain, containing elements a and b, with b , 0, such that ab + mb = 0 for some integer m. Prove that ca + me = 0 = ac + me
for all c in A.
8.8
1
A Characterization of the System of Integers
357
7.8. Let A be an integral domain and let B be the ring obtained from A and Z by the construction of Exercise 7.5. The mapping on A into B such that a -* (0, a) demonstrates that A is imbedded in B and the mapping on Z into B such that m -3- (m, 0) demonstrates that Z is imbedded in B. Let us identify
A with its image and Z with its image. That is, we shall write simply a for (0, a) and m for (m, 0). Then, by virtue of the definition of addition in B,
B= {m+almCZand aCA}. Show that C= {bCBiba=Ofor all aEA} is an ideal of B and that B/C is an integral domain with identity element. 7.9. Prove that the set A' = (a + C E B/Cia C A) forms a subring of B/C isomorphic to A.
8. A Characterization of the System of Integers The statement that the system of integers is an integral domain summarizes many properties of, but does not characterize, this system. The latter assertion is substantiated by the existence of finite integral domains (see Example 6.9). An additional property of Z, which one might at least suspect would serve to distinguish it among integral domains in general, is the presence of a simple ordering relation which is preserved under addition and under multiplication by positive integers.
Since this ordering relation can be formulated in terms of the set of positive integers it is natural to consider integral domains which include
a distinguished subset having properties (11)-(13) of Theorem 3.3.1, in connection with an attempt to characterize the system of integers. This is the motivation for our next definition.
An ordered integral domain is an integral domain D which includes a subset D+ with the following properties. O,.
0.
If a,bCD+,then a+bCD+. If a,bCD+, then abC D.
For each element a of D, exactly one of a = 0, a C D+, -a C D+ holds. Os.
The elements of D+ are called the positive elements of D. The elements
a such that -a C D+ are called the negative elements of D. Further, the members of D+ U {01 are called the nonnegative elements of D. The relation less than, symbolized by <, is defined in an ordered domain by
a
As usual, a < b means that a < b or a = b and b > a means that a < b. It is clear that a > 0 iff a C D+ and that a < 0 if -a C D+. In
Several Algebraic Theories
358
I
CHAP P. 8
terms of less than, properties 0,-03 of D can be restated in the following form: 01. 02. 03.
Ifa>0andb>0,then a-{-b>0. Ifa > O and b > 0, then ab > 0. If a E D, then exactly one of a = 0, a > 0, a < 0 holds.
Additional properties of less than include the following.
04 If a < b and b < c, then a < c. 05 For all a and b in D, exactly one of a < b, a = b, b < a holds. 06. 07.
If a < b, then a + c < b + c. Ifa < b and c > 0, then ac < bc.
O6.
If a 96 0, then a2 > 0.
To prove 04 let us assume that a < b and b < c. Then b - a and c - b are positive, and hence, by 01, so is their sum c - a. But this means that a < c. Proofs of 05-07 are left as exercises. To prove O6 let us assume that a 76 0. By 03, either a > 0 or a < 0. If a > 0, then a2 > 0
by 02. If a < 0, then -a > 0 and (-a)2 > 0, by 02. But (-a)2 = 0. for any ring element. So, in all cases, if a 5& 0 then a2 > 0. From 04 and 0s it follows that less than is irreflexive and transitive and hence < is a partial ordering relation. Supplementing this observation with O6, 06, and O7, we infer that < is a simple ordering relation which is preserved by addition and by multiplication with positive elements. It is left as an exercise to show that, conversely, if D is an integral domain which is endowed with a simple ordering relation < which is preserved under addition and under multiplication by elements a such that 0 < a, then D is an ordered domain.
At this point the reader who has studied Chapter 3 will recognize that we have established for the ordering relation in an arbitrary ordered domain all but one of the properties which we proved for the ordering relation in Z. The exception is concerned with the well-ordering of the nonnegative elements.
We continue to imitate the developments in Chapter 3 by defining the absolute value of an element x of an ordered domain as Ixl _
x, if x > 0,
- .-x,ifx<0.
It is left as an exercise to prove that the absolute value function on an
8 .8
I
A Characterization of the System of Integers
359
arbitrary ordered domain has all the properties which hold in the case of familiar ordered domains (see Theorem 3.4.4).
If D and D' are ordered domains and f is a one-to-one mapping on D onto D' which preserves addition and multiplication and maps positive elements onto positive elements, then f is called an orderisomorphism of D onto D'. It is left as an exercise to prove that an order-isomorphism f of D onto D' does preserve ordering, that is,
x < y iff f (x) < f(y), and that f-' is an order-isomorphism of D' onto D. If there exists an order-isomorphism of D onto D' we shall say that D is order-isomorphic to L)' or that D and D' are order-isomorphic. Illustrations of orderisomorphisms occur in Chapter 3, where we proved that Z is order-
isomorphic to a subset of 0 and, in turn, that Q is order-isomorphic to a subset of R. Further, if we stretch the basic definition under consideration a little, we can reformulate Theorem 2.1.8 in terms of an order-isomorphism.
We turn now to the derivation of certain structural properties of
ordered domains which yield as a by-product a characterization of the ordered domain of integers. In preparation for the first result the reader should review the discussion of integral systems in Section 2.1.
THEOREM 8.1. An ordered domain D includes a unique subset consisting of 0 and positive elements which, together with the function s such that xs = x + 1 and 0, forms an integral system. Proof. Setting Do = 101 U D+, we note that (i) 0 E Do, (ii) x E Do implies that xs C Do, (iii) xs v-1 0 for all x in Do, and (iv) xs = ys implies that x = y. Hence, (Do, s, 0) is a unary system which satisfies condition I, (that is, s is a one-to-one mapping on Do into Do - {01) for an integral system. Hence the collection a) of all subsets of Do, which together with s and 0 satisfy I,, is nonempty. Let ND be the intersection of the collection D. Then (ND, s, 0) is a unary system satisfying I,. We claim, further, that (ND, s, 0) satisfies 12, and therefore is an integral system. To prove this, consider any subset M of No
such that 0 E M, and if x C M then xs E M. Clearly, M E a) and therefore No C M, whence M = No. To prove the uniqueness of ND suppose that I is a subset of D consisting of 0 and positive elements and such that (I, s, 0) is an inte-
gral system. Then 1 E 2) and so Nn e I. Since 0 E ND and x C No implies that xs E ND, it follows that ND = I.
360
Several Algebraic Theories
I
CHAP. 8
Since for the integral system (ND, s, 0) defined in the above theorem, addition, multiplication and less than satisfy the defining properties of addition, multiplication, and less than, respectively, in N, it follows from Theorem 2.1.8 that an ordered domain D includes a unique subsystem which is order-isomorphic to the system of natural numbers. This is not the end of the matter. In order to state the final result it is convenient to make a definition. If D is an integral domain and E is a subring of D, then it is clear that E is an integral domain which we shall call a subdomain of D. If D is ordered, then so is E. The refinement of the preceding theorem can now be stated as
THEOREM 8.2. An ordered domain D includes a subdomain order-isomorphic to Z. Proof. Let ND be the subsystem of D which is order-isomorphic to
the system of natural numbers. If a, b C ND, then D contains a - b, the solution of x + b = a. Let ZD = (a - bla, b C ND). Then for all
a - b, c - d C ZD, (1)
a - b=c - d if a+d=b+c,
(2)
(a - b) + (c - d) = (a + c) - (b + d),
(3) (4)
0
(a - b) (c - d) = (ac + bd) - (ad + bc).
a - bCND - (0).
Recalling the definition of an element of Z (see Section 3.3), it follows
from (1) that if a, b C ND and a a' and b -- b' under the isomorphism between ND and N, then the correspondence a - b -} [(a', b')J; is a mapping on ZD into Z. Indeed, it is seen immediately that this
is a one-to-one and onto mapping. Moreover, (2) and (3) imply that this mapping preserves operations, and (4) implies that positive elements map onto positive elements, whence order is preserved. In summary, ZD is order-isomorphic to Z.
THEOREM 8.3. An ordered domain D with the property that the set Do of nonnegative elements of D is well-ordered is orderisomorphic to Z. Proof. Again let ND be the subsystem of D which is order-isomorphic
to N. We shall prove first that, by virtue of the added assumption, ND exhausts the set Do of nonnegative elements of D. Indeed, assume to the contrary that Do - ND 96 0. Then this is a set of positive elements and has a least member a. Now a Pp 1 (since 1 C ND), so a > 1 since 1 is the least positive element in D (see Exercise 8.5
in this section). Then a - 1 E Do - ND, since if a - 1 E ND then
8.9
I
A Characterization of the System of Rational Numbers
361
(a - 1) + I = a C ND, contrary to the choice of a. However, since
a = (a - 1) + I and 1 > 0, it follows that a - 1 < a and this contradicts the fact that a is the least element of Do - ND. Thus, the assumption that Do - ND is nonempty leads to a contradiction, so we may conclude that Do = ND. According to Theorem 8.2, D includes along with ND an ordereddomain ZD which includes ND and is order-isomorphic to Z. Our proof is completed by showing that ZD exhausts D. For this we use the fact that if d C D, then exactly one of d = 0, d > 0, d < 0 holds. In the first two cases d C ND while in the last -d C ND, and therefore - (-d) = d C ZD. Thus, D = ZD. As we learned in the foregoing theorem, the system of integers may be characterized to within isomorphism as the only ordered domain with the property that the set of its nonnegativb elements is wellordered. What amounts to the same, the fourteen properties of Z listed in Theorem 3.3.1 characterize Z to within an order-isomorphism. EXERCISES 8.1. Prove properties 05-07 of the ordering relation in an ordered domain. 8.2. Let D be an integral domain in which there is defined a simple ordering
relation < such that if a < b then a +c < b + c and if a < b and c > O then ac < bc. Prove that D is an ordered domain.
8.3. Let D be an ordered domain. Prove the following properties of the absolute value function on D. (i) Ia + bI 5 IaI + Ibl. (ii) Iabl = IaIIbI
8.4. Prove that if D and D' are ordered domains and f is an order-isomorphism of D onto D', then f[D+] = (D')+, f preserves ordering, and f-' is an order-isomorphism of D' onto D. 8.5. Let D be an ordered domain whose nonnegative elements form a wellordered set. Prove that I is the least positive element of D. 8.6. Prove that a$ = b2 implies that a = b in an ordered domain. 8.7. Prove that the cancellation law for multiplication can be deduced from the other assumptions for an integral domain if the domain is ordered.
8.8. In an ordered domain prove that a2 - ab + b2 >0 for all a and b.
9. A Characterization of the System of Rational Numbers In Section 6 a field was defined as a ring F such that the' set F* of nonzero element is closed under multiplication and (F*, , 1) is a com-
Several Algebraic Theories
362
I
CHAP. g
mutative group. The latter condition implies that each equation of the form bx = a, with both a and b nonzero has a unique solution, namely b-'a (= ab-'). We shall also designate this element by a
or a/b.
b
The equation bx = 0 with b P` 0 also has a unique solution, namely, x = 0, since b is not a zero-divisor. For this reason we make the definition
b(=0/b) =0 ifb00. Computations with field elements written in the form alb may be caried out exactly as with elements of the field of rational numbers. For example,
ac
a
c
b
d -bd
The first of these, for instance, is simply the identity (ab-1)-1 = a 'b written in the new notation. Another important rule is the following: a b
__ c
iffad =bc.
To prove this let us assume first that alb = c/d, that is, that ab-' = cd-1. Multiplication by bd yields ad = be. Conversely, if ad = bc, then multiplication by b-'d-' gives ab-' = cd-' or, otherwise expressed, alb = c/d. Since our only concern with the theory of fields is to obtain a char-
acterization of the field 0 of rational numbers, we turn directly to a consideration, in abstract form, of the relationship of 0 to the ring Z which was used to construct 0. The obvious feature of this relationship
is that 0 is an extension of Z in which division by nonzero elements can be carried out (that is, the equation bx = a has a solution for b P` 0). What conditions if any, we ask, must a ring R satisfy in order that there exist an extension of R in which division by nonzero elements can be carried out? In other words, what rings can be imbedded in some field?
8.9
I
A Characterization of the System of Rational Numbers
363
Obvious necessary conditions are that the ring be commutative and that it have no proper zero-divisors. Collectively, these conditions mean that the ring is an integral domain. We shall prove that, conversely, these conditions are sufficient. Although there is no reason to separate the finite case from the infinite one in proving that an integral domain can be imbedded in a field,
it is worthy of note that there is nothing to prove in the finite case, since a finite integral domain is a field (see Exercise 6.7). Further, we argue, the proof which must be supplied in the infinite case has already been given. Indeed, if the construction in Section 3.4 of the field of rational numbers from Z is reviewed, suppressing all mention of positive elements and positiveness, it will be found that only properties of Z as an integral domain are employed. That is, the construction described in Section 3.4 may be carried out starting with any integral domain D and the result is a field QD [that is, a system having properties (1)-(10) of Theorem 3.4.11, which includes an isomorphic image of D. We interrupt our discussion to state this as our next theorem.
THEOREM 9.1. An integral domain can be imbedded in a field. The extension QD of an integral domain D which is secured by the construction in Section 3.4 is called the field of quotients (or quotient field) of D. An element of QD is an equivalence class of ordered pairs (a, b), where a, b C D and b 0 and the subset of QD which is isomorphic to D consists of those equivalence classes having representatives
of the form (a, 1). The isomorphism in question maps a onto [(a, 1)]. We shall identify a and [(a, 1) ] which implies, since an arbitrary element [(a, b) ] of QD can be written as [(a, 1)][(b, 1)1-', that the elements
of QD consist of all quotients alb where a, b C D with b s 0 and alb = c/d if ad = bc. The field QD is the smallest field in which D is imbedded, in the sense that any field Fin which D is imbedded includes a subfield isomorphic to Qo. To prove this, let us assume that D is imbedded in F. We shall prove that QD is also imbedded in F. Let D' be the isomorphic image of D in F and consider the subset F' of F where
F' = {a'(b')-'Ia', b' C D' and b' ; 01. It is a routine exercise to prove that F' is a subfield of F. Assuming that this has been done, we go on to show that F' is an isomorphic image of Qo under the mapping f on QD onto F' such that
f(a/b) = a'(b')-',
364
Several Algebraic Theories
I
CHAP P. 8
where x' is the image in D' of x in D under the given isomorphism of D onto D'. From the definition of F', f is onto P. Further, f is one-to-one
since if a'(b')-' = c'(d')-' then, in turn, a'd' = b'c', ad = bc, alb = c/d. Finally, we note that alb + c/d = (ad + bc)/bd ->- (ad + bc)'((bd)')-'
_ (a'd' + a'(b')-' + =
b'c')(b')-'(d')-l c'(d')-'
and
(a/b)(c/d) = ac/bd -+- (ac)'((bd)')-' = a'c'(b')-'(d')-' (a'(b')-') (c'(d')-'). Hence, f is an isomorphism of QD onto P. A field is said to be an ordered field if, when considered as an integral domain, it is an ordered domain. In the event that an integral domain D is ordered, then its field of quotients, QD, is an ordered field. That is, QD includes a subset QD which is closed under addition and multiplica-
tion and has the property that if x C QD, then exactly one of x = 0, x C Qv, -x E Qn holds. Our candidate for Qn is {a/b C QDjab > 01. It is closed under addition since if a/b, c/d C Q, , then (ad + bc)bd = abd2 + b2cd > 0,
since ab > 0, cd > 0, and so on, whence alb + c/d E Q+. It is closed under multiplication, since if ab > 0 and cd > 0, then abcd > 0. Finally,
it is immediately seen that if alb C QD, then exactly one of ab = 0, ab > 0, ab < 0 holds. Hence, QD has the three required properties and the field QD is ordered. We note that what we have done is to make use of the given ordering of D to define an ordering of its quotient field. Since we have identified
the element a in D with the element all of QD, it is clear that a is a positive element of D if a is a positive clement of QD. That is, our ordering of the quotient field is an extension of the given ordering of D. We can prove further that the ordering which we have introduced for QD is the only ordering which extends that of D. For this we recall that in an ordered domain a nonzero square is always positive. If the quo
tient alb is positive, then the product (a/b)b2 = ab must be positive) and conversely. Hence, in any ordered field,
alb > 0 if ab > 0. This completes the proof of
8.9
I
A Characterization of the System of Rational Numbers
365
THEOREM 9.2. The quotient field QD of an ordered integral domain D is ordered upon defining alb as positive if ab is a positive element of D. This is the only way in which the ordering of D can be extended to an ordering of Qn. In an ordered field the relation of less than is defined as in any ordered
domain; that is, a < b if b - a is positive. In addition to the properties 0; Os in Section 8, there are the following for the ordering relation of an ordered field.
0<1/aiffa>0. alb < c/d iff abd2 < b2cd.
0 1/a> 1/b.
ai+a2--.. +an>0.
Our next theorem yields a characterization of the field of rational numbers.
THEOREM 9.3. An ordered field F includes a subfield orderisomorphic to the field of rational numbers.
Since an ordered field is an ordered domain, Theorem 8.2 is applicable and we may conclude that an ordered field F includes Proof.
a subdomain D order-isomorphic to Z. From the argument after Theorem 9.1 it follows that F includes an isomorphic image of the quotient field of Z; that is, F includes an isomorphic image of Q.
This result gives a characterization of Q as the smallest ordered field (to within isomorphism, naturally). The statement that 9 is an ordered field summarizes properties (1)-(13) of Theorem 3.4.1. The "smallness" of 0 is the content of (14) of that same theorem. If F is an ordered field, then the ordered subfield of F which is iso-
morphic to 9 is called the rational subfield of F. It should be clear that it consists of just those elements of F having the form
ml /ni, where 1
n$0.
is the identity clement of F and m and n are integers with
We conclude this section with the introduction of one further notion for ordered fields. The ordering of an ordered field F is said to have the Archimedean property if for every pair a, b of elements of F with
366
Several Algebraic Theories
I
C H A P. 8
a > 0, there exists a positive integer n such that na > b. The origin of this definition is the property of the ordered field 0 which is stated in Theorem 3.4.3 and of the ordered field R stated in Theorem 3.6.3. Although in the statement of Theorem 3.4.3, "nr" is interpreted to be a product of field elements, such a product has an interpretation in any field as an nth multiple, and this is the interpretation intended in the general case. Since in the case of 0 the interpretation of nr as a field product and as the nth multiple of r coincide, the ordering of the field of rational numbers has the Archimedean property in the sense of the general definition. If the ordering of an ordered field F has the Archimedean property, we shall refer to F as an Archimedean-ordered field. If F is Archimedcan-ordered, then its rational subfield is dense in F in the same sense that Q is dense in R (Theorem 3.6.2). We prove this next.
THEOREM 9.4. If F is an Archimcdean-ordered field and a and b are in F and a < b, then there exists an element c of the rational subfield Q of F such that a < c < b. Proof. Consider first the case where a > 0. Since b - a > 0, there exists a positive integer n such that n(b - a) > 1, so (1)
nb>na+1.
Also, there exists a positive integer m such that ml > na. Supposing m to be the smallest such positive integer,
ml >na> (m-1)1, since 1 is positive. In view of (1) it follows that
nb> (m-1)1 +1 =ml >na. Hence, b > ml/nl > a, which is the desired conclusion. If a < 0, then there exists a positive integer p such that pl > -a, and then a + pl > 0. By the first part of the proof, there is an element c in Q such that a + pl < c < b + pl. Hence, a < c - p1 < b, where c - p1 E Q. EXERCISES 9.1. Prove those properties of less than which are stated immediately following Theorem 9.2. 9.2. Prove that the positive elements of an ordered field are not well-ordered by the given ordering relation. 9.3. Let P be the set of all sequences (ak) = (ao,
. ., a,, a1,
.
. .)
8.10
I
A Characterization of the Real Number System
367
of rational numbers having only a finite number of nonzero members. We define (ak) = (bk) if ak = bk for all k. We introduce operations into P by the following definitions:
(ak) + (bk) = (Sk)
where Sk = ak + bk,
(ak)(bk) = (Pk) where pk = E
aib;.
i+j-k 0, 1), where 0 = (0, 0, . . ., 0, . . .) and 1 = (a) Prove that (P, (1, 0, . , 0, ), is an integral domain. (b) Defining P+ to be the set of all elements (ak) of P such that the last non-
zero member of (ak) is a positive rational, show that P is an ordered domain. (c) Using Theorem 9.2, the quotient field Qp of P is an ordered field. Show that this ordering does not have the Archimedean property by proving that if x = (0, 1, 0, , 0, ), then for no positive integer n is ni > x.
9.4. Prove that the ordering of an ordered field F has the Archimedean property if for each element a of F there exists a positive integer n such that
nl > a. 9.5. Prove that if F is an Archimedean ordered field, then for each element
a in F there exists a positive integer n such that -nl < a and there exists a positive integer n such that 1/nl < a if a is positive.
10. A Characterization of the Real Number System An ordered field F is called complete if every nonempty subset of F
which has an upper bound has a least upper bound. According to Exercise 1.11.15, an ordered field F is complete if every nonempty subset of F which has a lower bound has a greatest lower bound. Thus the notion of completeness takes a symmetric form which is seemingly lacking in its definition. According to Theorems 3.6.1 and 3.6.4, the real number system is a complete ordered field. In this section we shall prove that these properties of R characterize it to within isomorphism. As the first step in this direction we prove three results about complete ordered fields.
THEOREM 10.1. If F is a complete ordered field, then the ordering has the Archimcdean property. Proof. Assume to the contrary that there exists a pair a, b of elements of F with a > 0 such that for all positive integers n, b > na. Then b is an upper bound of { na C Fln C Z } 1. Since F is complete, this set has a least upper bound c. Then every positive multiple of a is less than or equal to c, so that (m + 1)a < c for every positive integer m.
368
Several Algebraic Theories
I
CHAP P. 8
This implies that ma < c - a, so c - a is an upper bound for Ina C Fln C Z+}. Since c - a < c, this contradicts the property of c of being the least upper bound.
COROLLARY. If F is a complete ordered field, then its rational subfield is dense in F. Proof. This follows from Theorem 9.4.
THEOREM 10.2. Let F be a complete ordered field and Q be its rational subfield. For a member c of F let Ac _ {a C Qla < c} and B,, _ {b E Qjb > c}. Then both the least upper bound of A, and the greatest lower bound of B, exist and lub A. = c = glb Bc. Proof. By the Corollary above there is in Q an element a such that
c - I < a < c, so A. is nonempty. Also, c is an upper bound for A., and hence the least upper bound of A. exists and is less than or equal to c. To prove equality we assume that lub A. < c and derive a contradiction. If lub A, < c, then there exists an a' C Q such that lub A. < a' < c. This is a contradiction since, on one hand, it implies
that a' E A, and, on the other hand, it asserts that a' > lub A,. The proof that the greatest lower bound of B, exists and is equal to c is similar.
We are now in position to prove the main theorem of this section, namely, that to within isomorphism there is only one complete ordered field.
THEOREM 10.3. Any two complete ordered fields are orderisomorphic.
Let F and(F' be complete ordered fields and Q and Q' their respective rational subfields. Then Q and Q' are order-isomorphic since each is order-isomorphic to the field of rational numbers. If f is the isomorphism of Q onto Q' we shall write x' for f(x) and X' for, f [X] if X C Q. Further, we shall denote members and subsets of Q' by primed letters and their counterimages in Q by the same letters. Proof.
without primes.
The strategy of the proof is to define an extension of f having F. as domain and which can be proved to be an order-isomorphism of
8.10
1
A Characterization of the Real Number System
369
onto P. To this end, consider an element c of F. Defining A. and B. as in Theorem 10.2, we know that lub A. = c = glb B0. If b' E B'C,
then for each a' E A',, a' < b' since a < c and c < b. Hence, b' is an upper bound for A', so the least upper bound of A' exists and is less than.or equal to Y. ,Since this holds for each b' in B',, lub A., is a lower bound for B',, and then the greatest lower bound of B', exists and lub A', < glb B'. We establish equality here by showing that the other possibility leads to a contradiction. Indeed, the assumption that lub A', < glb B', implies that there exists a d' in Q' such that
lubA'
a
for every a in A., and every b in B.. Since either d < c or c < d, either d C A, or d C B., which, in view of (1), yields the contradiction d < d. Thus, we have proved that lub A' = glb B".
In case c C Q, it is clear that lub A,' < c' < glb B', and hence tub A', = c' = glb B. (2) In case c C F - Q, we define c' by (2). It is this extension of f which we shall prove is an order-isomorphism of F onto F. We show first that this mapping preserves ordering. Let cl, c2 C F and cl < c2. Then there exist a, b E Q such that
cl
ci
such that al < cl and as < C2. Further, al + as < cl + as < cl + cz, so that al + as < cl + c2. Hence
370
Several Algebraic Theories
i
CHAP. 8
ai + a' = (a, + a2)' < (c, + c2)' and, consequently, a', < (c1 + c2)' - a2'-
Since ai is an arbitrary element of A' we infer that c; = lub A,', < (c, + c2)' - a$) which implies that a2 < (c, + C2)' - Ci for all as in A. Hence, in turn, c$ = lub A' < (c, + c2)' - ci, c' + Cl'
(Cl + C2)'-
A similar argument, in which cl and cs are interpreted as greatest lower bounds, establishes the reverse inequality. Thus, we have proved that Ci + C2 = (CI + c2)'.
The proof that the mapping c -'- c' preserves multiplication is somewhat more complicated. We consider first the case of positive ele-
ments. Suppose that c, and c2 are positive elements of F and let a; and as be positive elements of A', and A', respectively. Then at and at are positive elements of Q such that a, < c, and as < c2. Further, a,a2 < C1C2 < tics, so that alas < c,c2. Hence
a',a2' = (alas)' < (cic2)'. Thus, for each positive element as in A' , a, < (CIC2)'(a2')
for all positive ai in A. Hence c; = lub A',, < (c,c2)'(a2)-1, which implies that as <
(clc2)'(4)-'
for all a2' in Ate, and then c2 = lub A'C. < (c,ct)'(c;)
Thus, C;cg < (c,c2)'. A similar argument, in which c; and c2' are in. terpreted as greatest lower bounds, establishes the reverse inequality Thus (3)
where c, > 0 and C2 > 0.
elcs = (C1C2)',
8.10
f
A Characterization of the Real Number System
371
Finally, we extend (3) to all cl and C2. If one or both of c1 and c2 is equal to 0, then (3) is true trivially. If c, > 0 and c2 < 0, then the restricted version of (3) applies to c, and -c2. This, together with the fact that c c' is an isomorphism of the additive group F onto the additive group F', justifies the following computation:
414 = Ci(-(-C8)) _ -(C'(-Cz)) -(C1(-C2))' _ - (- (C1C2))' = - (- (CIC2)') = (C1C2)'-
The proof of (3) for the case c, < 0 and c2 < 0 is left as an exercise. There are other characterizations of R ; these stem from other methods of extending Q. to obtain a system with the least upper bound property (that is, the existence of least upper bounds for nonempty sets having an upper bound). Before describing one of these we call attention to the
point of view adopted in the constructions of Chapter 3. There, in order to correct a "deficiency" of N, of Z, and of Q, we constructed in turn a new system designed to avoid the deficiency at hand and simultaneously to include a subsystem isomorphic to the parent system. The
characterizations of Z, 0, and B. obtained so far in this chapter establish the fact that in each case we obtain a minimal extension with the
desired property (as asserted in the introduction to Chapter 3). An alternative point of view for these constructions includes taking into account from the outset the desired feature of minimality of the extensions. For instance, in the extension of N to Z, this point of view manifests itself by adjoining to N a suitable disjoint set to serve as the negatives
of the nonzero natural numbers. Similarly, the third extension is approached as the problem of constructing a minimal extension of 0, considered merely as a dense chain, having the least upper bound property. The first step in the solution is the construction of an extension of Q (that is, a dense ordered chain which includes 0 and which preserves the given ordering of the elements of Q), having the least
upper bound property. The second step is the proof that, within isomorphism, there is only one such extension E which is a part of any suitable extension and which has the following two properties : (i) an element of Q which is a least upper bound of a subset S of 0 continues
to be a least upper bound of S in E, and (ii) every element of E is a least upper bound of some set of rationals having an upper bound in Q. Finally, such a minimal set is selected and the operations of addition and multiplication extended to it from its subsystem 0. The result is a complete ordered field and, hence, R.
372
Several Algebraic Theories
I
CHAP P. 8.
Apart from this approach leading to, what is from our viewpoint, a characterization R, it is of interest that there exist extensions of Q which lack either property (i) or (ii) above. Such extensions when equipped with operations become fields which fail to have the Archi. medean property (and so are called non-Archimedian ordered fields).
EXERCISES 20.1. Prove that every Archimedean-ordered field is isomorphic to a subfield of R. 10.2. Supply the missing parts of the proof of Theorem 10.2.
BIBLIOGRAPHICAL NOTES Section 2. A more comprehensive introduction to the theory of semigroups appears in C. Chevalley (1956). Sections 3-5. There are several excellent textbooks devoted to group theory. W. Ledermann (1953) is an introductory account of the theory of finite groups, More complete accounts of the entire theory appear in M. Hall, Jr. (1959) and A. G. Kurosh (1955). Sections 6-7. Accounts of the topics treated appear in every textbook of modern algebra. Sections 8-10. Most of the notions discussed are treated in textbooks devoted to modern algebra. The proof of Theorem 10.3, that any two complete ordered fields are order-isomorphic, is taken from E. J. McShane and T. A. Botts (1959).
CHAPTER
9
First-order Theories
IN THIS CHAPTER we give an introductory account of modern investigations pertaining to formal axiomatic theories-that is, axiomatic theories in which there is explicitly incorporated a system of logic. particular attention is paid to those theories for which the logical base is the predicate calculus of first order. These are described in Section 4 after disposing of a necessary preliminary in Sections 2 and 3, namely, an axiomatization of the first-order predicate calculus. Section 7 gives an account of the notions of consistency, completeness, and categoricity for first-order theories, using results obtained in Section 6. After a brief introduction to recursive functions in Section 8, the notion of decidability for first-order theories is examined in Section 9. In this section there is sketched a proof of the famous theorem, due to Church, which asserts the unsolvability of the decision problem for the first-order predicate calculus. In Section 10 appear two other famous theorems about formal axiomatic mathematics. These are the Godel theorems of 1931. One asserts that a sufficiently rich formal theory of arithmetic is either inconsistent or contains a statement that can neither be proved nor refuted with the means of the theory. The other asserts the impossibility of proving the consistency of such a theory, if, indeed, it is consistent. Such results may be interpreted as establishing definite limitations for the axiomatic method in mathematics. Section 11 is concerned with a brief discussion of the Skolem paradox for a formulation of set theory as a first-order theory.
1. Formal Axiomatic Theories In order to achieve precision in the presentation of a mathematical theory, symbols are used extensively. A formal theory carries symbolization to the ultimate in that all words are suppressed in favor of symbols. Moreover, in a formal theory the symbols are taken to be merely marks
which are to be manipulated according to given rules which depend only on the form of the expressions composed from the symbols. Thus, 373
374
First-order Theories
I
CHAP. 9
in contrast to the usual usage of symbols in mathematics, symbols in a
formal theory do not stand for objects. One further distinguishing feature of a formal theory is the fact that the system of logic employed is explicitly incorporated into the theory.
We require additional properties of the formal theories which we shall discuss. These involve an auxiliary notion which we dispose of first. In nontechnical terms, an effective procedure is a set of instructions that provides a mechanical means by which the answer to any one of a class of questions can be obtained in a finite number of steps. An effective procedure is like a recipe in that it tells what to do at each step and no intelligence is required to follow it. In principle, it is always possible to construct a machine for the purpose of carrying out such instructions. The formal theories with which we shall be concerned are axiomatic theories. In such theories formulas are certain strings (that is, finite sequences) of symbols. We require the following properties of formulas.
(I) The notion of formula must be effective. That is, there must be an effective procedure for deciding, for an arbitrary string of symbols, whether it is a formula. (II) The notion of axiom must be effective. That is, there must be an effective procedure for deciding, for an arbitrary formula, whether it is an axiom. (III) The notion of inference must be effective. That is, there must be an effective procedure for deciding, for an arbitrary finite sequence of formulas, whether each member of the sequence may be inferred from one or more of those preceding it by a rule of inference. In such a formal axiomatic theory the notion of proof is effective; that
is, there is an effective procedure for deciding, for an arbitrary finite sequence of formulas, whether it is a proof. Such an effective procedure does not furnish a method for discovering proofs. It merely enables one to decide whether a purported proof is, in fact, a proof. We do not require the notion of theorem to be effective. If there can be found for a theory an effective procedure for deciding, for an arbitrary
formula, whether it is a theorem, the theory often loses its appeal to mathematicians. For the implication of the notion of theorem being effective is that one can devise a set of preassigned instructions for a
9.2
1
The Statement Calculus
375
machine such that it could check formulas of the theory to determine whether they are theorems. Mathematical logicians have shown that for many interesting axiomatic theories the notion of theorem is not effective. We emphasize that this means the nonexistence of effective procedures for "theoremhood" has been proved for some theories and not merely the nondiscovery to date of effective procedures. It follows that human inventiveness and ingenuity is necessary in mathematics. A problem which must be faced in presenting a formal axiomatic theory is how to specify the system of logic to be used. One obvious way is to give the rules of inference. In all interesting systems the set of rules is infinite, and there arises the problem of how to specify the set in such a way that one can determine whether a particular rule is in the set. The solution we shall employ calls for specifying a finite set of rules of inference and adding logical axioms to those of the axiomatic theory for the purpose of generating theorems which express further logical principles. That is, the solution calls for the fusion of an axiomatized system' of logic with an axiomatic theory to produce a formal axiomatic theory. Of the systems of logic which might be used in this connection, we shall choose the predicate calculus of first order. Our justification for this choice is that it formalizes most of the logical principles accepted by most mathematicians and that it supplies all the logic necessary for
many mathematical theories. In the next two sections we describe an axiomatization.
2. The Statement Calculus as a Formal Axiomatic Theory In view of the role of the statement calculus in a theory of inference (Section 4.4), the goal of an axiomatization is a formal axiomatic theory
in which the theorems are precisely the tautologies. This was first achieved by Frege, in 1879. Since then, many formulations have appeared. That which we shall present is the simplification of Frege's formulation due to -.ukasiewicz. The primitive symbols (or formal symbols) are
a B e a1 B1 e1 ...
The symbols in the second row are called statement variables. Th three dots, which are not symbols, indicate that the list continues without end. We define formula inductively as follows.
First-order Theories
376
I
CHAP. 9
(I) Each statement variable alone is a formula. (II) If A and B are formulas, then (A) - (B) is a formula. (III) If A is a formula, then -1 (A) is a formula. (IV) Only strings of primitive symbols are formulas. A string is a formula only if it is the last line of a column of strings, each either a variable or obtained from earlier strings by (II) or (III).
As in the definition of formula, we shall use capital English letters as
variables for arbitrary formulas. It can be proved that the notion of formula is effective. In applications the statement variables are replaced
by the prime formulas, and hence are interpreted as designating the values of the prime formulas (that is, the truth values T and F). In terms of the definitions made in Section 4.3, a truth value may be assigned to any formula A for a given assignment of values to the variables of A. When writing formulas, the conventions described earlier regarding the omission of parentheses will be followed. Also, we introduce the following abbreviations for certain formulas:
AV B for -,A->B, A A B for (A ---). --I B),
A- B for (A-'B)A(B-'A). The axioms for the theory are the following formulas, where A, B, and C are any formulas:
(PC1) A -' (B -' A), (PC2) (C ---> (A B)) - ((C --> A) --j (C - B)), (PC3) (-, A -> -, B) -> (B -> A).
Writing the axioms with variables for arbitrary formulas means that each of (PCi)-(PC3) includes infinitely many axioms, one for each assignment of formulas to the variables occurring. [This agreement is signaled by referring to each of (PC1)-(PC3) as an axiom schema.] For example, by virtue of (PC1), each of (a
631)
((B
(a -+ (Bi))
is an axiom. Even though there are infinitely many axioms, the notion of axiom is effective, since each axiom must have one of three forms. The only rule of inference is modus ponens (see Example 4.4.6) : From formulas A and A -> B the formula B may be inferred. The exact form which the definition of proof (Section 5.1) takes for
9.2
I
The Statement Calculus
377
the statement calculus is as follows. A (formal) proof is a finite column of formulas, each of whose lines is an axiom or may be inferred from two preceding lines by modus ponens. A (formal) theorem is a formula which occurs as the last line of some formal proof. We shall symbolize the assertion that A is a theorem by I- A.
An illustration of a formal proof is given next. It is a proof of the formula
a a a. It follows that F a --> a. (1)
(2) (3) (4)
(5)
(a --> ((a3 -- a) -- (0) - ((a -> ((B --+
a --+ ((a3 -> a) -> a)
(a -> ((B --> a)) -> (a - a)
a -' (a3 -' (t)
a
a
(0) --j (a - a)) Axiom schema (PC2) Axiom schema (PC1) 1, 2 modus ponens Axiom schema (PC1) 3, 4 modus ponens
When a proof is given, an analysis is usually given in parallel, as above. This is not required, however, because there is an effective procedure for supplying an analysis. We observe that we can just as easily prove I- a3 -+ a3 or F- (e A a) -->
(e A a) by repeating the above sequence of formulas with M or e A a
in place of a. Indeed, if in the above formal proof we substitute any formula A for the statement variable a, we get a formal proof of the formula A -' A. But if, instead, we substitute the variable "A" for a (and, "B" for a3) we get a proof schema of the theorem schema "A - A." A theorem schema, like an axiom schema, has the merit that a theorem results when the same formula is chosen for all occurrences of any letter that appears in it. We now extend- the definition of theorem to that of deduction from assumptions. If r is a (possibly infinite) set of formulas and A is a formula, then we define D(r, A) to be the set of those finite columns X of formulas whose last line is A, such that each line of X is either an axiom or an element of r or else may be inferred from two earlier lines of X by modus ponens. If, for given I' and A, D(P, A) is nonempty, then A is said to be deducible from assumptions I', symbolized
r i- A, and a member of D(r, A) is called a (formal) demonstration of A from I. Basic conditions which these definitions satisfy include the following.
First-order Theories
378
I
CHAP. -9
(i) If there is an effective procedure for deciding whether a given formula is a member of the set r, then, for each A, there is an effective procedure for deciding whether a column of formulas is
or is not a member of D(r, A) [that is, is a demonstration of A from r I. (ii) P F- A whenever A is a member of I' or an axiom.
(iii) if rF-Aand rF-A->B,then rF-B. (iv) If r F- A, then, for each set A of formulas, r u A F- A. (v) If I' F- A and r is the empty set, then F- A.
(vi) If r F- A, then there exists a finite subset r, of r such that r1F-A. Condition (iii), for example, follows from the fact that if X C D(r, A) and Y C D(r, A --> B), then (X, Y, B), the column consisting of the formulas of X in order, followed by those of Y in order, followed by B, is a member of D(I', B). If in a formal axiomatic theory the notion of deducibility is analyzed into simple steps and the axioms (or, axiom schemas) are few in number, then formal demonstrations and formal proofs of even quite an
elementary character tend to become long. However, having once given an explicit definition of what constitutes a deduction from assumptions (and, hence, a formal proof) it is not always necessary to appeal directly to the definition. The alternative is to establish theorems,
called derived rules of inference, which assert the existence of proofs under various conditions. An illustration of such a rule for the statement calculus is provided by (iii) above. A useful instance of (iii) is the derived rule If F- A and F- A --> B, then F- B. An application of this rule or of the generalization [which follows from (iii) and (iv) ],
If r F- AandF- A -'B,thenrI- B, is commonly called "modus ponens" because of the similarity of each to the rule of inference, which has a like form. Another derived rule, one which plays a crucial role in the proof that the formalized statement calculus fulfills its intended role (and which appears later in an extended form), is given in
THEOREM 2.1 (the deduction theorem for the statement calculus). If r is a set of formulas and A and B are formulas, then
r u JAI F- B implies I' F- A --, B.
9.2
I
379
The Statement Calculus
Proof. t
Assume that r, A F- B and let the column X=(C1,C2,...,Q
be a formal demonstration of B from r u { Al. For each i = 1, 2,
, n we define by induction a column Y; as follows. Case 1.
If C; is an axiom or an element of r, let Y; be the column
(A-->C1),A- Q. If C, is A and Case 1 does not hold, let Y. be the column whose lines in order are the proof, given earlier, of A --* A. Case 2.
Case 3. If C, is inferred from two earlier lines C, and C, -- C, of X by modus ponens, and the preceding cases do not hold, and j is the least index for which there is such a C,, let Y, be the column
(Y j, Yk, (A - (Ci `i C,)) - ((A - C,) - (A - C,)), --' (A - C,),A--+C,). (A -, Here k is the least index for which Ch is C, -+ C,.
It is left as an exercise to prove by induction that for each i = 1, , n, Y, is in D(r, A -- Q. Since C. is B, this gives the desired result that r F- A -> B. 2,
COROLLARY. If A1j A2, F- At --* (A2
,
A. F- B, then
(... (Am - B) ...
Repeated application of the theorem gives the corollary. The converse of this result is the next theorem. Its proof is left as an exercise.
THEOREM 2.2. If F- A, -+ (A2
(
(Am --> B)
)), then At,
A2, ... , A. F- B. In view of property (vi) of deducibility, Theorems 2.1 and 2.2 accomplish the reduction of the notion of deducibility to that of provability. A comparison of these theorems with the Corollary to Theorem 4.4.1 shows the parallel between this result and the reduction of the notion of valid consequence to the notion of validity. It follows that if we can show that a formula A is a theorem if it is a tautology, we will have demonstrated the equivalence of the informal and the formal statement calculus, both by themselves and when applied under a set of assumption
formulas. We do this in the next two theorems. First, it may be noted t Hereafter we shall abbreviate "r u JAI F- B" to "r, A F- B" and "JAI, A,,
f- B" to "A,. A,, . ,A, F- B."
, A. 1
380
First-order Theories
I
CHAP. 9
that in the present circumstances we understand a tautology to be a formula such that for each assignment of truth values to its constituent statement variables, it is assigned truth value T in accordance with the truth tables for -i and -'. The theorem which asserts that every tautology is a theorem is an example of a completeness theorem in the positive sense, as discussed in Section 5.4. It can be derived easily from the following lemma.
LEMMA 2.1. Let A be a formula of the statement calculus in which occur only statement variables from the list P1j P2, , Pk. Define p to be Pi or -' Pi according as P; takes the value T or F and A' to be A
or -1A according as A takes the value T or F for an assignment of truth values to P1j P2, , A. Then
P., P2, ...,P' -A'
(1)
for every assignment of truth values to P1, P2,
, P.
The proof is by induction on the number of symbols in A, counting each occurrence of -, or -+ as a symbol. If n = 0, then A is some Pi. Then A' is P; and (1) is immediate. Assume the lemma true for all formulas with less than n symbols and consider A with n Proof.
symbols. Case 1.
(2)
A is of the form -1 B. Then, by the induction hypothesis, PP, P2, ... , P.' I- B'
, P. for all assignments of truth values to P1, P2, Subcase 1.1. B takes the value T. Then A takes the value F, B' is B,
and A' is -, A, that is, -, -s B. Now F- B -' -, -, B (see Exercise 2.3)
and (2) reads P;, P2,
P2i ,P,'F- -i
, P,, F- B; then, by modus ponens, p,
B, which is (1).
Subcase 1.2. B takes the value F. Then A takes the value T, B' is -1 B, and A' is A, that is, -, B. Then (2) gives P,, P2f , P,' FB, which is (1). Case 2. A is B --j C. Then, by the induction hypothesis, (3) (4) Subcase 2.1.
P1, P2, ...,P.F- B', P1, P2, ... , P,r F- C'.
C takes the value T. Then A takes the value T, C' is C and A' is A, that is, B --> C. Hence, (4) is P,, P2, , P,' F- C and (PCI) gives F- C -1 (B --> C), so that, by modus ponens, P;,
P2, ,P,' F- B-C, which is (1). Subcase 2.2.
B takes the value F. Then A takes the value T, B' is
9.2
1
The Statement Calculus
381
, P' , F- -1 B -1 B, and A' is A, that is, B --> C. Hence, (3) is P, Ps, and this, with I- -- B --+ (B --> C) [see Exercise 2.3], yields (1) again by modus ponens. Subcase 2.3. B takes the value T and C takes the value F. Then A takes the value F, B' is B, C' is -i C, and A' is -1 A, that is, (B --a C). , P,', I- B and (4) is P,, Pa, , Pk F- -1 C. Hence, (3) is P;, P2i
These, together with the theorem B --- (-1 C - -1(B -' C)), yield P;, P27
,
Pk 1- -1 (B -- C), which is (1).
THEOREM 2.3 (the completeness theorem for the statement calculus). If A is a tautology, then A is a theorem; that is, if i A, then F- A.
, Pk be the distinct statement variables occurLet P1, P2, ring in A and define P;, Pa, , Pk and A' as in Lemma 2.1. Since A, A' is always A, and then, by Lemma 2.1, P,, P2, , P,t F- A for i Proof.
(5)
, P,. In particular, P,, P2, ... , Pk_ 1, Pk F- A,
(6)
P1, P2, ... , Pk_ 1, -, PA: I- A
every assignment of truth values to P1, P2,
for every assignment of truth values to P1, P2, deduction theorem it follows that P1, P2, ... P.'_ 1 Pk -+ A, (7) (8)
Pi, P2f
, Pk_l. From the
. , P. -I F- -, Pk -> A.
These deductions, together with the theorem (Pk--+A) -' ((--1 Pk -+A)
A),
, Pk _ 1 F- A. Thus the which the reader may prove, give P,, P2, assumption Pk is eliminated. Repeating this process k - 1 times
eliminates all the assumptions, so that F- A.
The converse of the completeness theorem is easily proved as we show next.
THEOREM 2.4. If A is provable, then A is a tautology; that is, if F- A, then r- A. Proof. We observe first that each instance of an axiom schema is a
tautology; that is, the theorem is true for the axioms. Further, by Theorem 4.3.3, if r- A and K A -+ B, then t= B. Since every theorem is either an axiom or comes from the axioms by one or more uses of modus ponens, every theorem is a tautology.
First-order Theories
382
I
CHAP. 9
That is, the notions of validity and provability for the statement calculus are coextensive. This result was proved first in 1921 by the American logician, Emil Post. There is more to be said about the foregoing result. We first remark that we assume it clear that the process provided in the definition (Section 4.3) for determining the truth value of a formula A for a given assignment of truth values to the statement variables in A is effective. Since any A has only a finite number of variables, and hence only a finite number of sets of values of its variables, this leads to an effective procedure for deciding whether A is a tautology or not. Hence, since H A if K A, there is an effective procedure for determining whether a formula of the statement calculus is a theorem; that is, the notion of theorem is effective. More generally, the notion of provability is effective; that is, there is an effective procedure for obtaining a proof of a theorem (which is known to be such because it has been shown to be a tautology). This follows from the fact that the procedures given in the proofs of Theorem 2.3 and Lemma 2.1 are effective. We shall substantiate this, in part, by showing that the proof of Lemma 2.1 provides an effective procedure for finding a proof of A' from assumptions P;, P,, , P. If A has no occurrence of --+, this is provided directly. If A has occurrences of ---, the proof provides directly an effective reduction of the problem of finding a proof of A' to the two problems of finding proofs of B' and C' from assumptions P,, P,, , P. The same reduction can then be repeated upon the latter two problems, and so on. Since the reduction process terminates after a finite number of repetitions, there results an effective proof of A' from PP, P,, P. A similar analysis can be made of the proof of Theorem 2.3. Our next theorem follows directly from the Corollary to Theorem 2.1 and Theorems 2.2-2.4. Its application to obtaining derived rules of inference for the statement calculus is illustrated in the examples which follow.
THEOREM 2.5.
A,, A2,
A.
(AQ
, A. 1- B iff ... (Am --i B) ...)
is a tautology.
EXAMPLES 2.1. Theorem 2.5 enables one to establish derived rules of inference with appropriate tautologies for justification. Below are listed a few such rules, with the tautology which justifies each placed opposite.
9.2
I
The Statement Calculus
383
A f- A V B, A V B, A I- B, A, B I- A,
K A -i, (A V B). A V B -+ (--i A -+ B). A -- (B A).
-, B -' -, A I- A -' B, -,B-> -,A, A l- B,
i
K (- B -- A) --> (A -- B). (-1B- -1A) (A --+ B).
2.2. As an illustration of imbedding a system of logic in an axiomatic theory, an idea which was proposed at the end of Section 1, we outline how the statement calculus can be imbedded in an axiomatic theory. This may be accomplished by
(i) including among the formation rules for formulas of the theory the following :
If A and B are formulas, then so is (A) -* (B), If A is a formula, then so is -,(A); (ii) adding to the axioms of the theory the three axiom schemas we have chosen for the statement calculus (where "formula" is now taken in the extended sense of "formula of the theory"); (iii) adding modus ponens to the rules of inference. Formulas of the theory may then be regarded as formulas of a statement calculus in which the role of the statement letters is played by those formulas which are not of the form (A) -4 (B) or --, (A) (that is, formulas which cannot be decomposed into further formulas using --+ and --, in the way shown). As a result of the imbedding, every tautology will be a theorem of the theory. More important, the statement calculus is available as a theory of inference. This theory is adequate to provide the logical skeleton of various kinds of proofs that are encountered frequently. A few examples follow. (a) To establish that a formula B of a theory in which the statement calculus is imbedded is a theorem, it is sufficient to prove that -1 B -* -,A and A are theorems. This procedure is justified by the fifth instance of Theorem 2.5 in Example 2.1. Similarly the rule -, B --> -,A I- A -' B justifies a proof by contraposition. (b) Let us use "C" to denote a contradiction. In formal terms, the proof of a formula A by contradiction may be stated as
If -1 AF- C,thenI- A. This rule stems from the tautology (-, A -)- C) --j A. In practice, such a proof may take the following form. One shows that -, A F- B and H -1 B and infers that -, A I-- B A -, B, and then F- A. (c) To establish that a conditional A -* B is a theorem with a proof by contradiction, the following rule is often used:
If A, -, B I- C (a contradiction), then l- A -' B. The reader may justify this.
384
First-order Theories
I
CHAP. 9
(d) A "proof by cases" is not uncommon in mathematics. Such a proof of a formula B begins with the enumeration of a finite set A,, A2, , A,,, of formulas which are exhaustive in the sense that I- A, V A2 V V Am. Then proofs of At -+ B, A2 - B, , A,,, -4 B are provided and it is concluded that B is a theorem. The rule at hand is
If I- At V A2 V
V Am, H A, -+ B,
, !- Am_, -+ B,
and H A. -- B, then I- B.
Upon combining Theorem 2.5 and the Corollary to Theorem 4.4.1 we obtain
THEOREM 2.6.
A,, A2,
, A. l B iff A,, A2,
, A. I- B.
As the reader may verify, the implication
(1) IfA,,A2, ,Amt= B, then A,, A2,
A. F- B,
which is included in the theorem is equivalent to the completeness theorem. We wish to show that (1) can be extended to (2) For any set r of formulas, if r K B, then r I- B, which is known as the strong completeness theorem for the statement calculus. We begin with some definitions. A set r of formulas of the statement calculus is called inconsistent if for some formula B we
can deduce both B and -1 B (and, hence, B A -i B) from r. If r is not inconsistent, then it is called consistent. We extend a definition given in Section 4.5 by calling any set r of formulas (simultaneously)
satisfiable if there exists truth-value assignments to the statement variables such that each member of r receives truth value T. In more detail, a truth-value assignment to the formulas of the statement calculus is simply a mapping v on the set of formulas onto IT, F} such that (i) for each formula A, v(-,A) is T or F according as
v(A) is F or T, and (ii) v(A --- B) = F if v(A) = T and v(B) = F. Then r is simultaneously satisfiable if there exists a v satisfying (i), (ii),
and (iii) for all A in r, v(A) = T.t For the case where r consists of a single formula A, satisfiability and validity are connected by (3) K A iff ( -i A } is not satisfiable,
and provability and consistency are connected by t The same description of a truth-value assignment may also be used to clarify the meaning B" used in (2) above.
of the notation "t
9.2
I
The Statement Calculus
385
(4) JA) is consistent if not H -, A. Using (3) and (4), it is easily shown that the completeness theorem is equivalent to (5) Every consistent formula is satisfiable.
In a similar manner we shall prove that the strong completeness theorem (2) is equivalent to (6) Every consistent set of formulas is simultaneously satisfiable.
Further, we shall provide a proof of (6), and thereby (2) will be established.
In order to prove the equivalence of (2) and (6), we shall need the following generalizations of (3) and (4). (7) r K A if r u { A } is not simultaneously satisfiable.
(8) If r is consistent and cc r, then not r F- -, C. To prove (7), assume first that not r K A. Then there exists a truthvalue assignment v such that v(C) = T for each C in r and v(A) = F. Then it is clear that v demonstrates that r u I-, A} is simultaneously satisfiable: For the converse, assume that r K A. If r is a consistent set of formulas, then for each truth-value assignment such that every member of r takes the value T, A also takes the value T, and hence r u { -, A)
is not simultaneously satisfiable. If r is inconsistent, then it is not simultaneously satisfiable (for this the reader is asked to either supply a proof or look ahead to the proof of Theorem 6.1) and, trivially, IF U { -, A } is not simultaneously satisfiable.
The proof of (8) is left as an exercise. We continue by proving the equivalence of (2) and (6). Assume that (2) holds and let r be a consistent set of formulas. If c E r, then, by (8), not r I- -1 C. Hence, by
(2), not r K -, C. From (7) it then follows that r u { -, , c) = r is simultaneously satisfiable. For the converse, we assume that (6) holds and that r r-: B. Then (7) implies that. r u { -, B) is not simultaneously satisfiable, so that, by (6), r u { B} is inconsistent, whence r F- B.
Finally, to complete our objective we prove (6). This is our next theorem.
THEOREM 2.7. If r is a consistent set of formulas of the statement calculus, then r is simultaneously satisfiable. Proof. Since the primitive symbols of our system are denumerable, and its formulas are certain strings of primitive symbols, it is possible
First-order Theories
386
I
CHAP. 9
to enumerate the formulas. Let some enumeration be given, so that we may speak of "the first formula," "the second formula," and so on, referring to this enumeration of the formulas. We shall use this enumeration to derive from r a maximal consistent set of formulas, that is, a set I' such that r is consistent and, if A is any formula such that r U {A} is consistent, then A E P. as follows: Given r, we define an infinite sequence Po, P,, r2, Po = P and, if the (n + 1) th formula is A, then r.+, = Pn U { A) if this is a consistent set. Otherwise r, = P,,. It follows by induction that Po, Ti, P2, . - are consistent sets, since Po is consistent. Let r be the union of the sets Po, P,, P2, . Then P is a consistent set. For the contrary assumption implies the inconsistency of some finite subset of r and hence that of some P;, contrary to what was observed above. Moreover, r is a maximal consistent set. For let A be any formula such that r U JA) is consistent. Say that A is the (n + 1) th formula. The consistency of r U {A} implies that r. u {A} is a consistent set. Hence, by the definition of r.+,, A is a member of Pn+, and hence a member of P. We list next five consequences of the maximal consistency of P.
(i) AEPifTI- A. (ii) If B is any formula, then exactly one of the pair B, -, B is in P;
(iii) If B E P, then A -' B C P for any formula A. (iv) If A a P, then A -* B E P for any formula B.
(v) IfAEPandBFQP,thenA - BvP.
To prove (i), let us assume first that A E P. Then P I- A since A I- A,
For the converse, assume that r I- A. This means that Ti I- A for some finite subset Ti of P. Then the set r U { A } is consistent. For the contrary assumption implies that there exists a finite subset F2
of r and a formula B such that P2, A I- B A -, B. But then P,, P2 I- B A -, B, which contradicts the consistency of P. Finally, the maximal consistency of P implies that A E P. The proofs of (ii)-(v) are left as exercises.
Now consider the mapping v on the set of formulas onto IT, F} such that v(A) = IT
lF
ifAEP, ifAvP.
This qualifies as a truth-value assignment since v(-, A) is T or F according as v(A) is F or T in view of (ii) above, and v(A --' B) = F
.9.3
I
Predicate Calculi of First Order
387
if v(A) = T and v(B) = F in view of (iii)-(v). Thus 1, and consequently the subset r of r, is simultar.eously satisfiable. EXERCISES 2.1. Complete the proof of Theorem 2.1. 2.2. Prove Theorem 2.2. 2.3. Provide a proof of each of the following formulas of the statement calculus (where A and B are any formulas).
(a) -,A -(A -- B). (b)
(d) (A - B) --> (-1B - -,A).
(e) B-' (--C --.
(B
C)).
(B --' A) -' ((-, B - A) --+ A). 2.4. The theorem "If a and b are numbers such that ab = 0, then a = 0 or b = 0" is usually proved by assuming that ab = 0 and a ; 0 and deducing
(c) A - -1
A.
(f)
that b = 0. Show how to obtain a formal proof from such an informal argument.
2.5. Show that the completeness theorem is equivalent to proposition (5). 2.6. Prove proposition (8).
2.7. Referring to the proof of Theorem 2.7, show that I' has properties (ii)-(v).
2.8. Referring again to the proof of Theorem 2.7, it should be clear that the possibility of proving by induction the existence of a maximal consistent set of formulas which includes a given consistent set rests with the assumption that
the set of statement variables is denumerable. Discarding this assumptionthat is, admitting the possibility of an uncountable set of statement lettersprove the existence of a maximal consistent set which includes a given consistent set of formulas using Zorn's lemma.
3. Predicate Calculi of First Order as Formal Axiomatic Theories Predicate logic of first order, in addition to having notations of the statement calculus, also has individual variables (and, possibly, individual constants), quantifiers, and predicate variables or predicate constants. Statement variables are not necessarily included, but there must .be a complete set of connectives for the statement calculus. Various different predicate calculi of first order are distinguished according to just which of these notations are introduced. In this section we shall .
present a particular formulation of each of the predicate calculi of first order. By being sufficiently ambiguous, they can be treated simulEaneously without confusion. Later, certain of these will be assigned special names. Where it is unnecessary to distinguish the va:ious predicate calculi, we speak simply of "the predicate calculus of first order."
388
First-order Theories
I
CHAP. 9
The axiomatization of the predicate calculus of first order which we present is taken from Church (1956). The axioms and rules of inference are essentially those in Russell (1908) but with Russell's axioms for the statement calculus replaced by (PCi)-(PC3) of Section 2. The primitive symbols are and certain sets of symbols as follows.
(i) Individual symbols, some of which are classed as variables and others of which may be classified as constants. The set of variables must be infinite. (ii) Statement symbols, some of which may be classed as variables and the others as constants. (iii) For each positive integer n, a set of n-place predicate symbols, some of which may be classed as variables and the others as constants. Formula is defined inductively as follows.
, xn are indi(I) If P is an n-place predicate symbol and x1, x2, xn) is a formula. (Such a vidual symbols, then P(xi, x2, formula is called prime.) (II) If A and B are formulas, then so is (A) -+ (B). (III) If A is a formula, then so is -, (A). (IV) If A is a formula and x is an individual variable, then (x)A is a formula. (V) Only strings of primitive symbols are formulas. A string is a formula only if it is the last line of a column of strings, each
either a prime formula or obtained from earlier strings by (II)-(IV). As in part (I) of the definition of formula, we shall use lower-case letters, with or without subscripts, from the latter part of the alphabet,
for individual variables and, as in parts (II)-(IV) of the same definition, we shall use capital English letters from the first part of tlis alphabet for arbitrary formulas. It can be proved that the notion at, formula is effective. We make the same conventions regarding the omiss sion of parentheses when writing formulas, and introduce the same abb
breviations for certain formulas as in the statement calculus. Further.
9.3
Predicate Calculi of First Order
389
we introduce (3x)A as an abbreviation for -t (x) -i A. Any occurrence of the variable x in the formula (x)A is called bound. Any occurrence of a symbol which is not a bound occurrence of an individual variable
according to this convention is called free. The valuation procedure of Section 4.8, with the following modification is applicable to formulas:
The statement constants arc to denote one of the truth values T or F
and the statement variables are to have IT, F) as their range. An individual constant (like an individual variable with a free occurrence in a formula) is assigned a value in the domain under consideration and to a predicate constant is assigned a particular logical function. As earlier, the valuation procedure leads to the notion of a valid formula. The axioms for the predicate calculus are given by the axiom schemas (PC1)-(PC3) of the statement calculus, with "A," "B," and "C" now ranging over formulas of the new theory plus at least the following two schemas.
(PC4) (x)(A -+ B) -p (A --> (x)B), where x is an individual variable with no free occurrences in A.
(PC5) (x)A -+ B, where x is any individual variable, y any individual symbol, and B is obtained by substituting y for each free occurrence of x in A, provided that no free occurrence of x is in a part of A that is a formula of the form (y)C. With the applications in mind, it is desirable to include the possibility that there is present in the predicate calculus of first order the formal analogue of the notion of equality. As it is intuitively understood, "x = y" means that x and y are the same object or that "x" and "y" are the names of the same object. For mathematical purposes, all that is required of equality is that (i) it be an equivalence relation, and (ii) it have the following substitution property: If x = y and B is the result of replacing one or more occurrences of "x" in a statement A by occurrences of '!y," then B has the same meaning as A. Now the properties of symmetry and transitivity can be derived from those of reflexivity and substitution. We take this into account in defining a predicate calculus of first order with equality. Such a predicate calculus is one of the sort described thus far with the addition of (i) the 2-place predicate .Constant "_" to the formal symbols, (ii) the clause, "if x and y are
individual symbols, then (x = y) is a formula" to the definition of formula, and (iii) the following axiom schemas.
First-order Theories
390
I
CHAP. 9
(PC6) If x is an individual symbol, then x = x. (PC7) If x and y are individual symbols, and A is a formula, then (x = y) -+ (A - B), where B is obtained from A by replacing some free occurrence of x by a free occurrence of y. When it is necessary to distinguish between a predicate calculus with equality and one in which there is no 2-place predicate constant satis.
fying (PC6) and (PC7), the latter will be called a "predicate calculus without equality." For the predicate calculus of first order there are two formal rules of inference.
Modus ponens: To infer B from any pair of formulas A, A -a B. Generalization: To infer (x)A from A, where x is any individual variable. A (formal) proof is a finite column of formulas, each of whose lineq is either an axiom or may be inferred from two preceding lines by modus ponens or may be inferred from a single preceding line by generalize-
tion. As in the statement calculus, a (formal) theorem is a formula which occurs as the last line of some formal proof. Again we shall symbolize the assertion that A is a theorem by I-A.
In order to extend the earlier definition of a deduction from a set Q assumption formulas to the predicate calculus, we make an auxiliary definition. A column Y of formulas is called a subcolumn of the coQ umn X of formulas if the formulas of Y appear among those of X the same order which they have in Y. Then, if r is a set of formula1 and A is a formula, we define D(r, A) to be the set of those finite col umns X of formulas whose last line is A such that each line of X is eitl 1 an axiom or an element of r, or else may be inferred from two preceding
lines by modus ponens, or may be inferred from a preceding line A of X by generalization on any variable-provided that B is the last line of a subcolumn of X which is a formal proof. If, for given r and A, D(r, A) is nonempty, then A is said to be deducible from assumpp tions r, symbolized
and a member of D(r, A) is called a (formal) demonstration of from r. Basic properties of deducibility include the six listed in Section. 1i (prior to Theorem 2.1) for the same notion at the statement calculul
9.3
I
Predicate Calculi of First Order
391
level. Furthermore, the earlier deduction theorem can be extended to the predicate calculus. This general form, which we establish next, was first proved by J. Herbrand (1930).
THEOREM 3.1 (the deduction theorem for the predicate calculus). If P is a set of formulas and A and B are formulas, then
r, A I- B implies I' I- A -, B. Proof. The proof is that given for Theorem 2.1, with the following additional case inserted after Case 3.
Case 4.
If Ci is inferred from an earlier line C; of X by general-
ization on some variable and C; is the last line of a subcolumn Z of X which is a formal proof (and, if the preceding cases do not hold), let Yi be the column (Z, Ci, Ci -' (A -' Ci), A ---> Q. Of course the presence of this case necessitates an extension of the
final step of the earlier proof-namely, the proof by induction that each Yi is a member of D(r, A -> Ci).
Using Theorem 3.1 we derive next another property of deducibility. For this we first define inductively the conjunction, ATM, of any string A1, A2, , A. of formulas : / \1Ai is A1;
Ai+'Ai is Ai+1 A A4AA,
j = 1, 2,
, m - 1.
LEMMA 3.1. A1, A2, ,AmI-BiffF-/\,Ai-4 B. Proof.
Hints for constructing a proof are given in Exercise 3.1.
We use this lemma to prove the following important result.
THEOREM 3.2. If r F- A and x is an individual variable not free in any formula of r, then r F- (x)A.
Assume that I' I- A and that x is a variable not free in any formula of F. By property (vi) of deducibility (Section 2) there exists a finite subset Pl = {AL, A2, , Am} of P such that Proof.
, A. F- A. Then Al, Ai - A is a formal theorem by Lemma 3.1. Let X be a proof of this theorem. Since x is not free in /\1mA;,
41, A2,
the column
X
/ --) A) x)(ArAi (x)(A Ai -a A) --a (MA i -+ (x) A)
NnAi - (x)A
392
First-order Theories
I
CHAP. 9
is easily seen to be a formal proof. Hence, by Lemma 3.1, rt I- (x)A and, in turn, by property (iv) of deducibility, r I- (x)A, as required. We interrupt our discussion at this point to note that our presentation
of the notion of deducibility ha§ been taken from R. Montague and L. Henkin (1956). Apparently their development was motivated by the
observation that one of the standard definitions of D(r, A) in the literature fails to satisfy property (iv) of deducibility. In this paper the following further result is obtained. Suppose that I-i and 1-s are relations, each satisfying the conditions (ii)-(vi) plus the two conditions of Theorems 3.1 and 3.2. Then r I- t A if r 1- s A for each formula A and each set of formulas. Thereby the relation I- is characterized by this set of seven conditions.
As another aspect of the notion of deducibility we note that if A(x) is a formula in which the variable x has a free occurrence, then in a demonstration which involves A(x) as an assumption formula, one is not permitted to generalize on this x. That is, x is treated as a constant. Intuitively we may say that a free occurrence of a variable in an assumption formula is employed to denote an arbitrary but fixed individual. In informal mathematics, when a variable x is employed in this way, one says that x has the conditional interpretation. In contrast, if x has a free occurrence in a formula A (x) which is an axiom of the theory under consideration, then A(x) is intended to mean the same
as (x)A(x). In this circumstance one says that x has the generalityinterpretation. If A is any formula and its free variables in order of , xn, then by the closuret of A we mean first free occurrence are xt, x2, the formula (X1) (X2) . (xn)A, sometimes abbreviated
VA.
Under the generality interpretation of free variables, A and VA are synonymous.
The deduction theorem can be extended so as to give the generality interpretation to some or all of the variables having free occurrences in one or more assumption formulas. For example, there is the following result : If r, A 1- B, then r H VA -' B. For proof, assume that r, A I- B, Now VA I- A by repeated use (possibly) of (PC5) together with modus ponens. So, by the property of deducibility stated in Exercise 3.2 (b),
r, VA I- B. Hence, by Theorem 3.1, r i- VA -' B. f In harmony with the definition of closure,. a formula containing no free variables, that is a statement according to an earlier definition, is often called a closed formula.
9.3
1
Predicate Calculi of First Order
393
The reduction of the notion of deducibility to that of provability in the case of the predicate calculus can be shown in a manner parallel to the corresponding reduction in the statement calculus, since the corollary to Theorem 2.1 and Theorem 2.2 carry over to the predicate calculus. Or, more simply, Lemma 3.1 may be called upon. It follows that if we can show that K A if F- A, we will have demonstrated the equivalence of the informal and the formal predicate calculus, both by themselves and when applied under a set of assumption formulas. As in the statement calculus, the proof is easy in one direction.
THEOREM 3.3. If F A in the predicate calculus, then t-- A. As in the proof of the corresponding assertion for the statement calculus (Theorem 2.4), we observe first that the assertion is true for each instance of each axiom schema. In this connection, Theorem 4.8.1 is pertinent. Further, by virtue of Theorem 4.3.3 (extended to the predicate calculus) and Theorem 4.8.2, if C is any theorem which Proof.
has been obtained from a theorem B by application of a rule of inference, and i B, then C is valid. Hence, if any formula A is a theorem, then A is valid.
The converse of this result is a consequence of a theorem first proved by K. G6del (1930). Although it is not his most celebrated theorem, it is a remarkable result. We state it as the next theorem. A proof is given in Section 6.
THEOREM 3.4 (G6del's completeness theorem for the predicate calculus). For each formula A in the predicate calculus, if iz A, thenF- A.
We conclude this section with an assignment of names to certain predicate calculi of first order. The pure predicate calculus of first order is that in which the primitive symbols include an infinite list of statement variables and, for each positive integer n, an infinite list of n-place predicate variables, but no statement constants, no individual
constants, and no predicate constants. A predicate calculus of first order in which at least one kind of constant appears is called an applied
predicate calculus of first order. EXERCISES 3.1. Prove Lemma 3.1 by induction. In the inductive step the following tautologies are useful:
First-order Theories
394
I
CHAP. 9'
(A'i A; -> (Al+i -+ B)) - (A{+1 A; -- B),
(1V'
B) -' (IV, A: -' (Ai+l - B)).
3.2. Establish the following additional properties of the relation I- .
(a) If A is a formal theorem and r is any set of formulas, then r I- A. (b) If t I- A and if A I- B for every formula B in t, then A I- A. 3.3. Show that the ordering of lines in a formal proof can be avoided, by proving that the theorems of the predicate calculus constitute the smallest set of formulas containing certain formulas and closed under certain operations.
4. First-order Axiomatic Theories A first-order theory (or, a theory with standard formalization) is, a formal theory for which the predicate calculus of first order suffices as
the logical basis. Those with which we shall be concerned are also axiomatic theories. An intuitive understanding of the essence of such theories is desirable before technical details are discussed. As our starting point for this we take the description in Section 5.3 of an informal theory as one whose primitive notions consist of a set X, certain of its members (individuals) and certain subsets of X" for various choices of n (primitive relations
and operations in X). Now, in place of relations or operations in X, predicates may be used. For example, in place of an n-ary relation p in X we may introduce the n-place predicate P such that a (prime)
, in X to x;, i = 1 , 2,
formula P(xi, xs,
is assigned the value T for an assignment of u , n, iff (u,, uz, , un) E p. In place of an n-ary
operation in X (that is, a function f on Xn into X) we may introduce the (n + 1)-place predicate Q such that a formula Q(xi, x2, - -, xn+1) is assigned the value T for an assignment of u; in X to x; iff f (x,, x2,
, xR)
= xn+1.
It is possible (as we shall show) to cope with n-ary operations in a more natural manner by introducing a further class of primitives, called "operation symbols" ; these are the direct formal analogues of functions whose domain is X". Now, with a specific informal theory in mind, suppose we formulate the applied predicate calculus whose individual symbols are variables and a set of constants in one-to-one correspondence with those individuals of Xwhich are primitive and whose only predicate symbols are constants which, in an interpretation having X as domain',
denote the primitive relations and operations in X. (Alternatively, operation symbols may be used in place of predicates which are intended'
to denote operations in X.) Finally, as axioms we take the axiom
9.4
I
First-order Axiomatic Theories
395
schemas of the predicate calculus with equality together with the formalixations of those (mathematical) axioms of the informal theory. As rules of inference we take those of the predicate calculus. The result is a first-order axiomatic theory! We now turn to a precise description of a first-order theory Z. The primitive symbols are the following.
(I8) An infinite sequence of individual variables, ao, a,, a2, . (II.) A set of logical constants consisting of (a) the logical symbols of the predicate calculus, parentheses, and a comma, (b) the equality symbol (III.) A set of mathematical constants consisting of (a) a set of individual constants, (b) for each positive integer n, a set of n-place predicate (or relation) symbols, (c) for each positive integer n, a set of n-place operation symbols.
The equality symbol, although regarded as a logical constant, is included in the set of 2-place predicate symbols. Statement symbols may also be
included; any such may be regarded as 0-place predicate symbols. In this same spirit, individual constants may be regarded as 0-place operation symbols.
The description of Z further includes the definition of a term. This is given inductively as follows.
(Ii) An individual variable and an individual constant are each terms. (III) If r,, r2, then A(r1, r2,
r are terms and A is an n-place operation symbol, is a term.
,
(III,) The only terms are those given by (Q and (III). Although some repetition is involved we give an inductive definition of formula. (If) If A is an n-place relation symbol and r1, r2, , r, are terms, then A(rl, r2, is a formula. (Such a formula is called
,
prime.) In particular, if r and s are terms, then (r = s) is a prime formula.
First-order Theories
396 (II f)
(
CHAP P. 9
If A and B are formulas, then so are -t (A), and (A) -' (B).
(111t) If A is a formula and x is a variable, t then (x)A is a formula.
(lVf) Only strings of primitive symbols are formulas. A string is a formula only if it is the last line of a column of strings, each either a prime formula or obtained from earlier strings by (IIf) or (III f). We carry over to Z all of the abbreviations, conventions, and definitions employed in the predicate calculus. Further, (r = s) will be abbreviated to r = s and -t (r = s) to r s. The only part of the foregoing for which we did not prepare the reader is the notion of a term. Under the intended interpretation, a term is the name of an object of the domain, that is, an individual. In addition to variables and individual constants being terms, strings composed from variables and individual constants using operation symbols should be terms, since in the intended interpretation they denote function values.
The theory Z becomes an axiomatic theory when the axioms are given and provability is defined. The axioms are of two kinds, logical and mathematical. As the logical axioms for % we take all instances
of the axiom schemas for the predicate calculus of first order with equality, with the following modifications. We now permit as the "y of (PC5) any term r such that when it is substituted for (the free occur-
rences of) x in A, no occurrence of a variable in r becomes a bound occurrence. As the mathematical axioms we select some set of closed formulas (that is, statements) of St; these axioms are intended to provide
the mathematical content of the theory. As rules of inference we take those of the predicate calculus of first order. The definitions of provability and deducibility remain unchanged from the predicate calculus but these notions are strengthened by the added mathematical axioms.
EXAMPLES 4.1. The formulation of group theory given in Exercise 5.4.15 leads to the following description as a first-order axiomatic theory. To the logical constants (including the equality symbol) we adjoin one individual constant e and one 2-place operation symbol - The terms of the theory are defined as follows:
Each variable and each constant is a term, and if r and s are terms, then r is a term. The formulas of the theory are those as defined in a predicate calculus;
plus (r = s), where r and s are terms. The mathematical axioms are f Generally we shall use letters "x," "y," as (metamathcmatical) variables which range over the variables of the theory under discussion.
9.4
1
First-order Axiomatic Theories
397
(x) (y) (z) (x 0 z) = (x y) z), (x) (e
x = x),
(x) (3,v) (y
x = e).
Alternatively, if we start with the formulation which is implicit in Exercise 5.2.7, we are led to the following description. The only mathematical constant is a
binary operation symbol , and the mathematical axioms are (x) (y) (z) (x . (y z) = (x . y) (x) (y) (3z) (x = y z),
(x)(z)(3y)(x = y
z),
z) -
Each of the foregoing is a formulation of the elementary theory of groups. The word "elementary" signals that the first-order predicate calculus is the system of logic employed and that theorems of the theory are restricted to those which can be expressed by first-order formulas. Not all of group theory, as a
mathematician knows this discipline, can be formalized by the elementary theory of groups. The state of affairs is that in any first-order theory one can quantify only with individual variables, and this is inadequate to formalize certain theorems.
4.2. We shall call the arithmetic of the system of natural numbers, when formalized as a first-order theory, elementary number theory, and symbolize it by N. One version (based on Peano's axioms) is the following. The mathematical constants consist of the individual constant 0, two 2-place operation symbols + and -, and the 1-place operation symbol'. The mathematical axioms consist of the following six axioms and one axiom schema. (x)(y)(x = y'-, x = y),
(x) (x + 0 = x), (x)(x
0 = 0),
(X) (X ,96 0),
(x) (y) (x + y' = (x + y)'), (x) (y) (x . y' = x - y + x), A(0) A (x)(A(x) -, A(x')) --> A(x), where x is any variable, A(x) is any formula, and A(0), A(x') are the results of substituting 0, x' respectively for the free occurrences of x in A(x). The intended interpretation of the mathematical constants is the obvious one. It is intended that 0 be the integer zero, that x' be the successor of x, that x + y be the sum of x and y, and that x y be their product. The axiom schema expresses the principle of mathematical induction to the extent possible in a firstOrder theory.
4.3. Since we have assumed that the equality relation is incorporated in a first-order theory, it is possible to replace an n-ary operation symbol in such a theory by an (n + 1)-ary relation symbol. The following example indicates how
First-order Theories
398
I
CHAP. 9
this can be done. Suppose that + is a 2-place operation symbol in a first-order
theory Z. This may be replaced by the 3-place relation symbol S [where S(x, y, z) is to be read "z is the sum of x and y"] and the inclusion of the axioms (x) (Y) (3z)S(x, y, z), (x) (y) (z) (u) (S(x, y, z). A S(x, y, u) - z
= u), which express the existence and uniqueness, respectively, of the sum of any two elements. Thus, for theoretical considerations, we may assume that no operation symbols are present in a first-order theory. In a similar manner, individual constants may be eliminated from a first, order theory. For example, to eliminate the individual constant c, we introduce a new unary relation symbol C and the axioms (3x)C(x),
(x) (y) (C(x) A C(y) -'' x = y) When operation symbols and individual constants are eliminated from a theory, it is necessary to modify formulas in which they appear in an appropriate way. For example, A(c) becomes (x)(C(x) --+ A(x)).
4.4. The agreement that the mathematical axioms of a first-order theory be closed formulas (rather than simply formulas) may seem to entail some loss of generality. That this is not the case is an immediate consequence of the follow, ing result: A formula A of Z is a theorem if its closure is a theorem. Indeed, if A is a theorem, then generalization on each variable, in turn, which is free in A, yields VA as a theorem. Conversely, if VA is a theorem, then any universal quantifier in front of A can be removed, using (PC5). 4.5. The deduction theorem of the predicate calculus of first-order and its converse have the following application to first-order theories: If r is a set of , A. are formulas of Z, then a formula B is deformulas of and A,, A2,
ducible from r u (A,, A2,
, Am) if r F- M A; --' B. In particular, B is a
theorem of T (that is, deducible from the set A of logical axioms and the set of mathematical axioms) if B is deducible from A alone or there exist mathemat, ical axioms A,, A2,
, A. such that A F- / l A; -+ B.
The last fact, taken together with the definition of deducibility, implies that a formula B is a theorem of Z if B is deducible from the set of mathematical axioms (as a set of assumption formulas) in the theory ', which coincides with T except that its axioms are just the logical axioms of Z. In this way the in, vestigation of various properties of first-order theories can be reduced to that of theories having a common set of axioms-to wit, those of the predicate calculus with equality. We shall take advantage of this possibility later.
For first-order theories it is assumed that the mathematical constants
have an interpretation in some nonempty domain D. Roughly, this means that each individual constant is interpreted as denoting a fixed
.9.4
I
First-order Axiomatic Theories
399
member of D, that each individual variable has D as its range, that ,,-place relation symbols have interpretations as subsets of D", and that p-place operation symbols have interpretations as functions on D" into D.
We turn now to a detailed description of this, along with a valuation procedure (which extends that given in Section 4.8). As the starting point for the valuation procedure for a first-order we assume that all mathematical constants of Z can be artheory
,C)
ranged without repetition in an a-termed sequence (Co, C1, , is,, ) for some ordinal a. f Let D be a nonempty set and let (eo, et, be a sequence which has the same number of terms as the foregoing sequence.
The nature of each e, depends on the nature of the corre-
sponding constant C,. If C, is an m-place relation symbol, then e, is a subset of Dm; if C, is an m-place operation symbol, then e, is a function on Dm into D; if, finally, C, is an individual constant, then G, is an element of D. The sequence
z _ (D) eo) Gt, ... a,. ... ) is called an interpretation of Z having D as its domain: this is a precise version of the definition of the same notion given in Section 5.2.
If Z is an interpretation of T with domain D, we wish to define next the circumstances under which a (denumerable) sequence (do, dt,
, d,,,
) with d; C D, in brief, a D-sequence is said to
satisfy a formula A (of T-) in D . For this we need a preliminary concept. To each D-sequence d = (do, dt, , dR, ) and each term r we associate an element r(d) of D by the following recursive rule.
(i) If r is the individual variable at, then r(d) = dk. (ii) If r is the individual constant C;, then r(d) = e,. (iii) If r is the term C;(ri, r2, -, where C; is an n-place operation symbol and ri, r2, , r,, are terms, then
r(d) = e,(rt(d), rs(d), ..., r.(3)). The element r(d) of D is called the value of the term r f o r the D-sequence W. Using this concept, f o r each D-sequence d = (do, dl,
, d,,,
)
and each formula A we specify whether or not d satisfies A by the following recursive rule.
(I,) If A is a prime formula C;(ri, r2,
, r,), where C; is an
, r,, are terms, then n-place relation symbol and rt, r2, satisfies A if (ri(d), r2(d), , r,,(d)) C e;.
t The reader who is not familiar with the notion of an ordinal number may assume, with little loss of generality, that the set of mathematical constants is countable.
First-order Theories
400
I
CHAP. 9
(II.) If A is a prime formula r = s, where r and s are terms, then d satisfies A if r(d) is the same element of D as s(d). (III.) If A is -, B, then d satisfies A if d does not satisfy B. (IV.) If A is B -> C, then d satisfies A if d satisfies C or satisfies neither B nor C. (V.) If A is (ak)B, then d satisfies A if for every d C D we have , d,-,, d, dk+,, ) satisfies B. that (do,
As an illustration of the definition, we apply it to show that if C, is a unary operation symbol and C2 is a binary relation symbol, then for any D, a D-sequenced satisfies (1)
(3a3)C2(as, Cias)
if there exists a d C D such that (d2, eld) C C2. The reader should justify each of our steps. In unabbreviated form, (1) is (2)
-1 (a3)
C2(a2, C,a3).
(do, d,, d2, , d,,, ) satisfies (1) if it does not satisfy Then (a3) -, C2(as, Cia3). This means there exists d C D such that
(do, d,, d2j d, dg, ... )
does not satisfy -,C2(a2, C,a3), and hence satisfies C2(a2, Cia3). Thus (d2, e,d) C e2. The proof of the following theorem is left as an exercise for the reader.
THEOREM 4.1. Ifd = (do,d,,
d,,, )and
d' _ (do, d i, ... , dn, ... ) are D-scquences and if A is any formula such that for every variable ak with free occurrences in A, dk = d1ei then d satisfies A iff d' satisfies A.
From this theorem it follows that if A is a statement, its satisfaction
by a D-sequence does not depend on any element of the sequence; that is, a statement either is satisfied by every D-sequence or by no D-sequence. We shall call a formula true in an interpretation `.J with domain D if it is satisfied by every D-sequence. If A is a statement, then either A is true in `.D or --A is true in Z. If A is true in `D, then we shall say that Z is a model of A. A formula which contains free variables is true in `.J if its closure is true in Z. It follows that the set of all formulas of which are true in `.J is characterized by the statements which it contains. This accounts for the fact that statements occupy a central role in the study of first-order tlicories (and those theories introduced in the next section).
9.5
I
Metamathematics
401
The foregoing amounts essentially to nothing more than an alternative description of the earlier valuation procedure for the predicate calculus- t Agreement with this statement will come as soon as it is recognized that an interpretation `.J with domain D of a theory X includes the equivalent of an assignment of logical functions (relative to D as domain) to the predicate symbols of Z. The circumstances under which a formula A of $ is classified is true in l is a slight extension of
those under which a formula receives truth value T relative to some assignment of logical functions.
Having made contact with the earlier valuation procedure, we shall take over some of the terminology introduced in Sections 4.8 and 4.9. If D is a nonempty set and A is a formula of T, then we shall say that A is valid in D if it is true in every interpretation with D as domain; A is valid, symbolized f= A,
if it is valid in every D. Further, a formula A is said to be a consequence of a set r of formulas, symbolized F i
A,
if for every interpretation D and every D-sequence 7 such that I satisfies each formula of r, we also have that d satisfies A. t In the case where all formulas are statements, r t= A if A is true in every interpretation in which each member of r is true. If we understand by a model of a set r of formulas an interpretation which is a model of each member of r, then r K A if every model of r is a model of A.
EXERCISES 4.1. Formulate the theory of simply ordered commutative groups (see the Exercises for Section 5.3) as a first-order theory. 4.2. Prove Theorem 4.1.
5. Metamathematics The principal reason for formulating intuitive theories as formal axiomatic theories and, in particular, as first-order theories, is that such fundamental notions as consistency and completeness can be discussed in a precise and definitive way. This is possible because the notion of t The justification for presenting two descriptions of a valuation procedure is the author's belief that each is the natural one in the setting in which it finds application. $ We note that a valid formula may be characterized as one which is a consequence of the empty set of formulas.
First-order Theories
402
I
C H A P. 9
proof is made explicit. Before turning to theorems related to such matters it is desirable to have some understanding of how such matters are studied and why such methods are used. In this section we shall describe the admissible methods for the study of formal theories as advocated by
the school of formalists (founded by Hilbert) and then prove some theorems in accordance with these methods. A formal theory is a completely symbolic language built according to certain rules from the alphabet of specified primitive symbols. When
a formal theory becomes the object of study it is called an object language. To discuss it, which includes defining its syntax, specifying its axioms and rules. of inference, and analyzing its properties, another
language-the metalanguage or syntax language-is employed. Our choice of a metalanguage is the English language. In general terms the
contrast between a metalanguage and the object language which is discussed in terms of this metalanguage is parallel to the contrast between the English language and the French language for one whose native tongue is English and who studies French. At the outset, vocab-
ulary, rules of syntax, and so on, are communicated in English (the metalanguage). Later, one begins to write in French. That is, one forms
sentences within the object language. To give a concrete example, consider the elementary theory of groups as formulated in the preceding
section. The statement "The elementary theory of groups is an undecidable theory" is about group theory and written in the English language-that is, in the metalanguage. In contrast, "(a) (b) (c) (a - b = a
c --+ b = c)"
is a statement of group theory-that is, of the object language. A theorem about a formal theory is called a metatheorem and is to be distinguished from a theorem of the theory. It is easy to make this distinction since a theorem of the theory is written in the symbolism of the theory, whereas a metatheorem is written in English. In the preceding paragraph the statement in English regarding group theory is a metatheorem, and that written in terms of , =, and so on, is a theorem of group theory. Since the proof of a metatheorem requires a system of logic, a description of the system of logic should be available for the prospective user of the metatheorem. One possibility is to formalize the metalanguage as we have formalized the predicate calculus. But this entails the use of a metametalanguage, and the beginning of an unending regress is established. The alternative, which was proposed by
9.5
I
Metamathematics
403
Hilbert, may be summarized roughly : In the metalanguage employ an informal system of logic whose principles are universally accepted. More generally, Hilbert took the position that a metatheory (that is, the study of a formal theory in the metalanguage selected) should have the following form. First of all, it should belong to intuitive and in-
formal mathematics; thus, it is to be expressible in ordinary language with mathematical symbols. Further, its theorems (that is, the metatheorems of the formal theory) must be understood and the deductions must carry conviction. To help ensure the latter, all controversial principles of reasoning such as the axiom of choice must not be used. Also, the methods used in the metatheory should be restricted to those called finitary by the formalists. This excludes consideration of infinite sets as "completed entities" and requires that an existence proof provide an effective procedure for constructing the object which is asserted to exist.
Mathematical induction is admissible as a finitary method of proof, since a proof by induction of the statement "For all n, P(n)" shows that any given natural number n has the property expressed by P by reasoning which uses only the numbers from 0 up to n; that is, induction does not require one to introduce the classical completed infinity of the natural numbers. Finally it is assumed that if, for example, the English language is taken as the metalanguage, then only a minimal fragment will be used. (The danger in permitting all of the English language to be used is that one can derive within it the classical paradoxes, for example, Russell's paradox.) By metamathematics or proof theory is meant the study of formal theories using methods which fit into the foregoing framework. In brief, metamathematics is the study of formal theories by methods which should be convincing to everyone qualified to engage in such activities. Before discussing some metamathematical notions and proving some
metatheorems, we outline the reasons which led Hilbert to formulate metamathematics as he did. The introduction of general set theory with its abstractness. and its treatment of notions (such as the completed infinite), which are inaccessible to experience, yet with its fruitful applications to concrete problems of classical mathematics, provided the stimulus for investigations of the foundations of mathematics in the sense that this subject matter is now known. The discovery of contradictions within set theory served to strengthen and accelerate these investigations. The initial reaction to the antinomies of intuitive set theory was a reconstruction of set theory as an axiomatic theory, placing
404
First-order Theories
I
CHAP. 9
around the notion of set as few restrictions to exclude too large sets as appear to be required to prevent the known antinomies (see Chapter 7). Some felt that even if this venture should prove to be successful, it would not provide a complete solution to the problem because, they
argued, the paradoxes raised questions about the nature of mathe. matical proofs and criteria for distinguishing between correct and incorrect proofs for which satisfactory answers had not been provided. Russell, for example, judged the cause of the paradoxes to be that each involves an impredicative procedure. t This led Russell to formulate a system of logic (his ramified theory of type, 1908) in which impredicative procedures are excluded and, with Whitehead, to attempt to develop mathematics as a branch of logic (Principia Mathematica). Both the logistic
school and the advocates of the axiomatic approach to set theory, initiated by E. Zermelo, were in need of proofs of the consistency of their theories. It was recognized that the classical method of providing
a proof-the exhibition of a model within the framework of a theory whose consistency was not in doubt-could not be applied. Further, finite models were clearly inadequate, and no conceptual framework within which an infinite model might be constructed could be regarded as "safe" in view of the antinomies. It was Hilbert who contributed the
idea of making a direct attack upon the problem of consistency by proving as a theorem about each such theory that contradictions could not
arise. Hilbert recognized that in order to carry out such a program, theories would have to be formalized so that the definition of proof would be entirely explicit. To this end he brought the notion of a formal axiomatic theory to its present state of perfection. To prove theorems about such theories-in particular, to attack the problem of consistency -Hilbert devised metamathematics. By restricting the methods of proof to be finitary in character, he hoped to establish the consistency of theories such as N with the same degree of impeachability as is provided by
proofs of consistency via finite models when the latter technique is possible (as in group theory for instance). So much for the raison d'etre of metamathematics. We shall anticipate the results appearing in Section 10 by mentioning now the impossibility of metamathematics fulfilling the role which Hilbert intended for it. This was f A procedure is said to be impredicative if it provides a definition of a set A and a specific object a such that (i) a C A and (ii) the definition of a depends on A. For example, the procedure which leads to Cantor's paradox, is impredicative: The collection a of all subsets of the set A of all sets is both a member of A and depends upon A for its definition.
9.5
I
Metamathematics
405
established as a consequence of theorems proved by Godel (1931). The specific circumstances were these. Hilbert's program slowly took form during the period 1904-1920, and in the 1920's he and his co-workers undertook its execution. Their initial goal was to prove the consistency of elementary number theory. This was a natural objective in view of the fundamental role of elementary number theory plus the possibility of the reduction of other portions of classical mathematics to that of N, via models (see Section 5.4). After some partial successes, the endeavor came to a halt in 1931 with the demonstration by Godel of the impossibility of proving the consistency of any formal theory which includes the formulas of N by constructive methods, "formalizable within the theory itself." Regarding such methods, it suffices for the moment to say that so far as is known, they incorporate all methods which Hilbert was willing to permit in metamathematics. This state of affairs does not
foretell the doom of metamathematics but has served to indicate its limitations. In what follows, we shall occasionally see methods of proof which lie outside the domain of metamathematics. When this is done we shall call attention to the fact. For our first example of a metamathematical notion we choose consistency. The definition in Section 5.4 (a theory is consistent iff for no formula A both A and --1 A are provable) is applicable to any formal theory having the symbol -i for negation. It is metamathematical since
it refers only to the formal symbol -i and the definitions of formula and provability. A metatheorem concerning a class of theories to which the definition is applicable is proved next.
THEOREM 5.1. Let Z be a formal theory which includes the statement calculus. Then T is consistent if not every formula of Z is a theorem.
Suppose that T is inconsistent and that A is a formula such that both F- A and F- --1 A. Now A -+ (--I A -' B) is a theorem for any B since it is a tautology. Hence B (that is, any formula) is a theorem by two applications of modus ponens. For the converse, Proof.
assume that every formula of Z is a theorem. Then if A is any formula, both A and --,A are theorems. Thus, Z is inconsistent.
Henceforth it will be assumed that all formal theories include the statement calculus so that Theorem 5.1 will always hold. Our next result is a metatheorem about the statement calculus.
406
First-order Theories
,
CHAP. 9
THEOREM 5.2 . The statement calculus is a consistent theory. Let A be a theorem. Then, in turn, A is a tautology, -, A is not a tautology, and -,A is not a theorem.
Proof.
The foregoing is a metamathematical proof. To substantiate this assertion we note first that the computation process for filling out a truth table for a given formula (regarded as a truth function) is meta.. mathematical. Hence the property of being a tautology is a metamathe.. matical property of formulas of the statement calculus. It follows that the proof of Theorem 2.4 (if A is a theorem, then A is a tautology) is metamathematical. Since the proof in question relies solely on Theorem 2.4, it also is metamathematical. A similar chain of reasoning (now using Theorem 3.3) gives a proof of the consistency of the predicate calculus as soon as a formula which is
not valid is exhibited. Although the valuation procedure on which the proof of Theorem 3.3 depends is not effective in general, we apply it relative to a fixed interpretation whose domain is finite. Under these circumstances it is admissible in metamathematics, so the proof is meta, mathematical. The idea behind the proof is the fact that an n-place formula, with or without quantifiers, behaves like a statement in the sense that it assumes either the value T or F, when valuated in a domain of just one element. We begin the proof by defining for each formula A
the associated statement calculus formula (a.s.c.f.) as the formula obtained from A by deleting all quantifiers, deleting all individual variables, and treating the predicate variables as statement variables. Now we observe that the a.s.c.f. of each axiom of the predicate cal. culus is a tautology and that the two rules of inference preserve the property of having a tautology as an a.s.c.f. Hence, a formula is prow; able only if it has a tautology as its a.s.c.f. Consider now the formula d1(a) A -i&'(a), where t' is a 1-place predicate variable and a is all individual variable. Its a.s.c.f. is d' A -1d' and is not a tautology, andhence the original formula is not provable. An application of Theorem 5.1 completes the proof. We state this result as
THEOREM. 5.3 . The predicate calculus of first order is a consistent theory.
Sometimes the notion of completeness, in the sense of one or more of the definitions given in Section 5.4, may be treated in metamathematics. For example, Theorem 2.3, which asserts the completeness of the state-
9.5
I
Metamathematics
407
ment calculus in a positive sense (as this was explained in Section 5.4), belongs to metamathematics. On the other hand, Godel's completeness theorem for the predicate calculus is outside the realm of metamathematics. The statement calculus is also complete in a sense which exhibits a negative approach to a sufficiency of theorems. The next result, which belongs to metamathematics, is of this sort.
THEOREM 5.4. If A is any formula of the statement calculus, then either it is a theorem or else an incz)nsistent theory results by adding as additional axioms all formulas resulting from A by substituting arbitrary formulas for its statement variables. Let A be a formula which is not a theorem, and let us augment the axiom schemas of the statement calculus with all formulas resulting from A by substituting arbitrary formulas for its statement variables. Since A is not a theorem, it is not a tautology. Therefore, it takes the value F for some row of its truth table. Referring to one such row, we Proof.
choose an instance of A as follows. Substitute a V -id for the prime formulas of A which are T, and substitute a A -1 a for those prime formulas which are F. The resulting axiom, B, will always take the value F. Then --, B is a tautology, and hence a theorem. Thus, both B and -, B are theorems.
One might apply the definition of negation completeness (given in Section 5.4), with "statement" replaced by "formula," to both the statement calculus and the predicate calculus. Neither is complete in this sense. For the statement calculus this conclusion follows from the consideration of any formula A whose truth table has neither all T's not all F's; for clearly neither A nor -,A is a theorem. This is a reflection of the fact that in the statement calculus no formula corresponds to a particular statement. We may substitute any statement for a statement variable. For a similar reason the predicate calculus is not negation complete. As an example, neither al(a) nor its negation is a theorem because neither is valid. In this case Q'(a) does not stand for a particular statement (which one expects to be true or false) but for any statement in which a' is interpreted as a 1-place predicate and a as an individual. Actually, the metamathematical notion of negation completeness is intended for only formal axiomatic theories such as N and there its
restriction to closed formulas is essential in order that it have the intended significance. For example, in N we would not want either
408
First-order Theories
I
CHAP. 9
(3y) (x = y - y) (which expresses, under the generality interpretation of
the free variable present, "every natural number is a square") or -, (3y)(x = y y) ("every natural number is not a square") to be provable. However, one of (x) (3y) (x = y y) and -, (x) (3y) (x = y - y) should
be true, and hence provable. As background for the final metamathematical notion which we shall discuss, we recall the definition of an effective procedure as given in Section 1. In brief, an effective procedure-or, as it is often called, a decision procedure-is a method which can be described in advance
for providing in a finite number of steps a "yes" or "no" answer to any one of a class of questions. Such a class of questions can be identified with a predicate in the metalanguage in the obvious way. For
example, the predicate "p and q are relatively prime" embraces the class of questions concerning the relative primeness of pairs of integers.
Thus we may speak of a decision procedure for a predicate. (Incidentally, the Euclidean algorithm provides a decision procedure for the predicate mentioned.) The problem of discovering a decision procedure for a predicate is called the decision problem for that predicate and, if a decision procedure is found, the predicate is said to be effectively decidable; if there does not exist such a procedure the predicate is undecidable. Although we require of a formal axiomatic theory that there be a decision procedure for the notion of proof, we do not require the same for provability. In contrast to the question of whether a given sequence of formulas is a proof (which requires merely the examination of a displayed finite object), the question of whether a given formula is a theorem requires looking elsewhere than within the given object for
an answer. Further, the definition of a proof sets no bounds on the length of a proof, and to examine all possible proofs without bound on their length is not a procedure which yields an answer to the question in a finite number of steps in the event the formula is not a theorem. This being the state of affairs, the decision problem for provability has special significance for formal theories. Accordingly, it is often called the decision problem for a theory. A theory for which the decision prob-
lem can be answered in the affirmative is said to be decidable; otherwise, it is undecidable. An example of a decidable theory is the statement calculus, for since a formula is a theorem iff it is a tautology, the
method of truth tables provides a decision procedure. Some other decidable theories are described at the end of Section 9.
So long as attention is restricted to results of a positive character
9.6
Consistency and Satisfiability of Sets of Formulas
I
409
concerning decidability, an intuitive understanding of this concept suffices. It is up to the creator of a theorem which asserts that some theory is decidable to provide and establish a decision procedure. The situation changes radically, however, if one proposes to prove a result of a negative
character, namely, that a theory is undecidable. Clearly, a precise definition of a decision procedure is indispensable in this connection. A definition which is generally agreed on is given in Section 8.
6. Consistency and Satisfiability of Sets of Formulas In this section we derive some properties of a class of theories for which the axioms are those of the pure predicate calculus (without and, later, with the axioms of equality). By being sufficiently ambiguous in the description of these theories we obtain results which have applications to both the predicate calculus and, in view of the remark made in Example 4.5, first-order theories. The applications to first-order theories consist of more definitive results concerning consistency, completeness, and categoricity than were obtained earlier for informal theories.
We begin by fixing our attention on a particular theory Zo which may be the pure predicate calculus of first order, or some first-order theory with the symbol for equality deleted, or some theory in between these extremes. If Sto is not the pure predicate calculus, then it is determined by some definite choice of primitive symbols which includes individual symbols (including a denumerable set of individual variables ), possibly some predicate symbols, and possibly some operation symbols, f but without equality. The definition of an interpretation of Zo may be obtained from that given for this notion in the case of a a0, at, a2,
first-order theory. The definition of satisfaction of a formula by a D-sequence is obtained from the earlier one by deleting references to equality. Then it is clear that the remaining definitions given for firstorder theories apply to Zo. In order to launch our discussion of Zo, further definitions are needed.
These extend some for the statement calculus given in Section 2. A formula A is satisfiable in a nonempty set D if there exists an interpretation `
of To with domain D such that A is satisfied in Z. Notice that
satisfiability of A in D hinges on the possibility of making some assignments of values to the free variables in A such that there results satist Of course, we assume that the union of the set of predicate symbols (some of which may as variables and others as constants) and the set of operation symbols is nonempty.
410
First-order Theories
I
CHAP. 9
faction by a D-sequence which exhibits this choice of values. A formula of To is satisfiable if it is satisfiable in some D. Just as the notion of the
validity of a formula may be regarded as the analogue, for to, of the notion of being a tautology in the statement calculus, so may satis. fiability be regarded as the analogue of not being a contradiction. It is clear that a formula is satisfiable (in a given domain) if its negation is not valid (in that domain), and a formula is valid (in a given domain) if its negation is not satisfiable (in that domain). A set of formulas is simultaneously satisfiable if each formula is satisfiable in some domain by some D-sequence. The definitions of an inconsistent and of a con. sistent set of formulas of To read the same as for the case of the state. ment calculus. It is left as an exercise to prove that a set of formulas is consistent if every finite subset is consistent. The main results which we shall derive in this section concern properties of a set r of formulas of to. Since, for applications to first-order theories, r will be the set of mathematical axioms, and since, as men. tioned in Example 4.4, we may take such formulas to be statements, we shall express most of our results for a set of statements. We begin by extending the result obtained in Section 2 to the effect that the notions of consistency and satisfiability of a set of formulas of the statement calculus are equivalent to the case of a set of statements of To.
THEOREM 6.1. If the set r of statements of to is simultaneously satisfiable, then r is consistent. Proof. Assume that r is an inconsistent set of statements. Then there , Am) of r and a formula B such that exists a finite subset IA,, A2, A,, A2,
, A. I- B A --i B. By Theorem 5.3, m > 0. Then the
deduction theorem and the statement calculus give F- -1 AM;. The reasoning in the proof of Theorem 3.3 may be applied to this result to conclude that -1 A; A; is valid. Hence, A A; is not satisfiable, , Am) is not satisfiable in view of the which means that JAI, A2, definition of conjunction. It follows that r is not satisfiable. The reader is asked to convince himself that this proof is not admixBible in the sense of Hilbert's metamathematics. In fact, the definition of satisfiability of a set of formulas is probably not admissible.
The converse of Theorem 6.1 is a much deeper result. We state it in a sharp form due to L. Henkin (1949). The proof that we shall
9.6
I
Consistency and Satisfiability of Sets of Formulas
411
give-a refinement of Henkin's original proof-appears in a paper by G. Hasenjaeger (1953), who attributes the idea to Henkin. It is an elaboration of that given for Theorem 2.7.
THEOREM 6.2 . If r is a consistent set of statements of Zo, then r is simultaneously satisfiable in a domain whose cardinal number is equal to the cardinal number of the set of primitive symbols of Xa. Proof.
We shall carry out the proof for the case where the set of
primitive symbols of To is denumerable and indicate afterward the modifications needed in the general case. be symbols which do not occur among the symbols Let u,, u2, of to. Let % be the theory whose primitive symbols are those of to augmented with u,, u2, as individual constants. The set of formulas of T is denumerable and there is an effective procedure for listing them. This induces an effective enumeration of the statements of T and, in turn, of those statements of the form (3x)A(x).t Suppose that (3x)Ai(x), for i = 1, 2, , is an enumeration of all such statements.
We shall use this last ordering to construct a consistent set of statements of T that includes P. We begin by defining a sequence of sets of statements of `.slr by induction. Let Po be P. Po, r,, F2, In the list u,, u2, -, let u;, be the first constant that does not occur in (3x)A,(x). Then take Ti to be the set whose members are (3x)A,(x) -+ and the members of Po. Assuming that P; has been defined, let u,,.,,
be the first constant in the list u,, u2, that does not occur in , Ai(u;;), (3x)Ai+,(x). Then take Pi+, to be the set whose Al(u;,), members are (3x)A;+t(x) -a Ai+t(uj,..)
and the members of F. Then each Pi (i = 0, 1, 2, - ) is consistent. For example, to show that Ti is consistent, assume to the contrary that Po, (3x)A,(x) -+
A,(u,,) - B A -1 B for some formula B. Then, by the deduction theorem,
Po 1- ((3x)A,(x) -' Ai(u,,)) --+ B A B. In some demonstration of ((3x)A,(x) -+ A,(u,,)) -+ B A -, B, replace
u,, by a new variable y which does not occur in any formula of the t The notation "A(x)" for the type of formula under consideration is a convenient one for exhibiting the result of substituting some individual symbol for the free occurrences of x.
412
First-order Theories '
CHAP. 9
deduction. Since u,, does not occur in any member of To or in (3x)Al(x),
we then have
Tot- ((3x)Ai(x) -' AI(y)) --' B A -1 B. From this can be inferred, by the machinery of the predicate calculus, ro l- ((3x)Ai(x) - (3y)AI(y)) -+ B A -t B. Then a change of bound variable f gives B A -1 B. ro F- ((3x)AI(x) -' (3x)Ai(x)) But since (3x)Ai(x) -> (3x)Ai(x) is a theorem, we have To l- B A --1 B, contrary to the supposed consistency of To.
Similarly, the consistency of Ti+i follows from that of Ti, and thereby the consistency of each Ti is established by induction. Let F be the union of the sets To (= T), Ti, T2, . Then I' is a consistent set. For the contrary assumption implies the inconsistency of some finite subset of r, and hence that of some Ti, contrary to what was proved above. Next we shall construct a set A of statements of T which includes r (and hence r) and which is maximal consistent in the sense explained in the proof of Theorem 2.7. For this purpose we define an infinite as follows. Let AO be the same as P. sequence of sets Ao, AI, A2, Then, if the (n + 1)th statement A of Z (in the chosen enumeration of these statements) is consistent with A, (that is, if A. U { A } is a
consistent set), let A,,+i be the set whose members are A and the members of A,,; otherwise take An+i to be the same as A,,. It follows immediately by induction that each of these sets is consistent. Let A . Clearly, A includes P. Morebe the union of the sets Ao, A,, 42, over, it has the following two properties, which is all we shall use to show that A, and hence r, is simultaneously satisfiable in a denumerable domain.
(i) A is a maximal consistent set of statements of Z.
(ii) If a formula of the form (3x)A(x) is in A, then for some constant u;, A(u;) is in A.
For the proof of (i) we note first that the consistency of A is shown by the same argument as was used above to establish the consistency f This is an application of a theorem about the predicate calculus which may be stated as follows: If y is an individual variable which is not free in a formula C and x is an individual variable which does not occur in C, if E results from D by substituting x for y in C for an occur-
rence of C in A and if re j- D, then ro 1-- E.
9.6
,
Consistency and Satisfiability of Sets of Formulas
413
of I . Next, let A be any statement that is consistent with A. Suppose that A is the (n + 1)th statement of Z. Then A. U I A) is a consistent set. Therefore, by the definition of O"+1, A is in A,, and hence in A. The proof of (ii) is left as an exercise. Next we mention five further properties of A which stem from (i) and which we will need.
(iii) A statement A is a member of A if A I- A.
(iv) If B is any statement, then exactly one of the pair B, -, B is in A.
(v) If B C A, then A -4 B E A for any statement A. (vi) If A V A, then A -+ B E A for any statement B.
(vii) If AEAand BQA,then A-9BVA. These five properties of a maximal consistent set were listed in the
proof of Theorem 2.7. The earlier proof of the first carries over directly to A. Proofs of the earlier statements of the remaining four (which the reader was asked to provide) also-carry over directly to A. Thus, we feel free to continue.
This we do by introducing an interpretation, Z, of Z. As its domain, D, we take the set of individual constants of Z. We order all constants (individual and predicate) of Z in a sequence (Co, C,, C2, ... )
and then, corresponding to this, we form a sequence (eo, e1, G2,
)
as follows. If C; is an individual constant we take e1 to be C;, and
if C; is an n-place predicate constant we take a to be the n-ary relation in D such that for individual constants dl, d2,
- , do we have E e1 if A E- C;(d1, d2, , Q. The key property of Z is the following: Each statement A of T is true in Z iff A I- A. The proof (sketched for an A containing no operation symbols) is by induction on (d1j d2,
,
the number m of symbols in A, counting each occurrence of -,, -*, and a universal quantifier as a symbol. If m = 0, then A has the form da), where P is a predicate symbol and the d's are in D. If A F- P(dl, d2, ... , 4), then, clearly, every D-sequence satisfies A since (d1, d2, , 6) is a member of the n-ary relation assigned to P. The converse is equally obvious. Assume next that the assertion holds for all statements with fewer than m symbols and consider A with m symbols. P(d,, d2,
Case 1.
,
A is -, B. Assume that A 1- -, B. Then it is not the case
414
First-order Theories
I
CHAP. 9
that A I- B, by (iii) and (iv). From the induction hypothesis it follows that B is not true and, hence, - B is true in Z. The converse,
that -' B is true in Z implies A I- -, B, follows by reversing this argument. Case 2.
A is B -> C. This is disposed of by properties (v)-(vii)
of A. The details are left as an exercise. Case 3. A is (x)B(x). If A F- (x)B(x), then by (PC5) and modus ponens, A F- B(d) where d is any individual constant. The induction hypothesis and clause (V.) of the definition of satisfaction then imply that (x) B(x) is true in Z. For the converse, assume that we do not have A I- (x)B(x). Then -, (x)B(x) and, hence, (3x) -1 B(x)-by the definition of the latter formula together with modus ponens-is in A. From (ii) it follows that there exists u; such that -1 B(u;) C A, so we do not have A I- B(u;). Hence, by the induction hypothesis, B(u;) is
not true in ¶, which implies that (x) B(x) is not true in D by the definition of truth for (x)B(x). In view of the result just proved, all formulas of A are true in and so are simultaneously satisfiable in D. Since r is a subset of A, the theorem is proved for the case of a To whose primitive symbols are denumerable. The only modifications necessary for the proof of
the general case are (i) the replacement of the ui's by symbols u where a ranges over a set with the same cardinal number as the set of primitive symbols of To, and (ii) the selection of some one well, ordering of the formulas of the new T in place of the standard enumeration used above.
The depth of this result may be inferred from the fact that several profound theorems pertaining to both the pure predicate calculus and applied predicate calculi can be derived easily from it. We state as the first result in this category the completeness theorem for such theories.
THEOREM 6.3 (the completeness theorem).
If A is a valid formula of To, then I- A. Proof. Assume that A is valid and consider the closure VA of A. As observed earlier, VA is then valid and, in turn, -, VA is not satisfi, able. Hence, by Theorem 6.2, { -, VA } is inconsistent. Therefore, for some formula B, VA I- B A -, B, and then, by the deduction
theorem and the statement calculus, I- VA. Then, by (PC5) and modus ponens we may clear away any universal quantifiers to obtain I- A.
9.6
I
Consistency and Sati liability of Sets of Formulas
415
If To is taken to be the predicate calculus, then Theorem 6.3 becomes Godel's completeness theorem (Theorem 3.4). Godel, it may be noted, proved the completeness of the pure predicate calculus and then indi-
cated how the method used can be extended to obtain Theorem 6.2 for the pure predicate calculus.' Incidentally, for the pure predicate calculus, Theorem 6.2 may be phrased in the following somewhat more striking form: Every consistent set of statements of the pure predicate calculus is simultaneously satisfiable in the set N of natural numbers. This version follows from the fact that the cardinality of the set of primitive symbols in this case is t4o and, since only the cardinality
of a set matters when it is being considered as the domain of an interpretation, N may be used under the circumstances.
THEOREM 6.4. If P is a set of statements of To which is simultaneously satisfiable, then r is simultaneously satisfiable in a domain whose cardinal number is equal to the cardinality of the set of primitive symbols of To. Proof.
Apply Theorem 6.1 and then Theorem 6.2.
From Theorems 6.1 and 6.2, when stated for the case of sets of arbitrary formulas (instead of statements) of the pure predicate calculus, follows the Skolem-Lowenheim theorem : If a set of formulas of the pure predicate calculus is simultaneously satisfiable, then it is simultaneously
satisfiable in N. Lowenheim first proved this for the case of a single formula. Skolem (1929) generalized this result to the case of simultaneous satisfaction of a countable set of formulas.
THEOREM 6.5. Let r be any set of statements of To such that every finite subset of r is simultaneously satisfiable. Then F is simultaneously satisfiable in a domain whose cardinality is equal to that of the set of primitive symbols of To.
Assume that F is not simultaneously satisfiable. Then r is inconsistent, by Theorem 6.2. Hence, there is a formula B such that both r I- B and F H B. Since the demonstrations of B and -, B from r are finite sequences of formulas, we see that some finite subset of r is already inconsistent. This conclusion is incompatible with the hypothesis, according to Theorem 6.1. Proof.
f However, Godel's proof does not extend to the case of a theory which has uncountably many primitive symbols.
416
First-order Theories
I
CHAP. 9
It is possible to deduce from Theorem 6.2 an extended version of Theorem 6.3, which is known as the strong completeness theorem: For any set r of statements of To, if r K B, then r h- B. Conversely, the strong completeness theorem implies Theorem 6.2 and thereby follows the equivalence of these two results. This equivalence extends one obtained in Section 2 for the statement calculus. We shall establish the strong completeness theorem next and leave the proof of the converse as an exercise.
THEOREM 6.6. For any set r of statements of To, if r
B,
then r F- B. Proof. Assume that r K B. Then the set r u { -, B} is not simultaneously satisfiable. To prove this we note first that if r is inconsistent,
then (Theorem 6.1) it is not simultaneously satisfiable, and hence r u {-,B} is certainly not satisfiable. If r is consistent, then (Theorem 6.2) it is simultaneously satisfiable and any model of IF is a model of B, so again IF U { B } is not satisfiable. From the nonsatisfiability of r U { -, B} follows the inconsistency of this set. Hence,
by the deduction theorem and the statement calculus, r I- B.
Theorem 6.2 holds for a theory Z, like To but with equality, if we replace "a domain whose cardinal number is equal to" by "a domain whose cardinal number is less than or equal to." Before we prove this we note that the definition of "simultaneous satisfaction in a domain D" now includes clause (II.) of the definition of satisfaction of a formula by a D-sequence. That is, the symbol " _" must denote the relation of
equality between individuals of D. (It is because of this inflexible interpretation of the relation of equality that it is classified as a logical constant.) To begin the proof, let E, and E2 be the set of the closures of all instances of the axiom schemas (PC6) and (PC7), respectively, for
equality. Given a set r of statements of T-1, we consider the set r U E, U E2 of statements of the theory To obtained from Z, by dropping the axiom schemas (PC6) and (PC7). Since Theorem 6.2 is applicable to this To, there exists, an interpretation of To with domain D in which r U E, U E2 is simultaneously satisfied, provided that it is consistent in To (which is the case if IF is consistent in a,). To " =" there is assigned by this interpretation some binary relation e k in D. Since from E, and E2 one can deduce in To that (x) (y) (x = y --),y = x) and (x) (y) (z) (x = y A y = z --)' x = z), C k is an equivalence relation on D. The relation ek has the additional property that for any n-ary relation
9.7
I
417
Consistency, Completeness, and Categoricity
ei of the interpretation of to, dlekd,, d2ekd2, , dnekdn imply that i , dn) C ei if ( ', d d dn) C ei. This is guaranteed because (d,, d , , in To we can deduce the formula (1)
x,=x1'Ax2-x$^...A\xn=x.-(Ci(xl, x2, ... , xn) HCi x, x2, ... , X.) ) 1
from assumptions E, U E2. Now let D' be the set of equivalence classes
modulo ek. Then for each subset of D" the canonical mapping on D onto D' determines a subset of (D')n. Consequently, to each constant C, of %, may be assigned a relation in D' in the natural way. By this route we are led to an interpretation of Ti with domain Y. If c is an individual constant to which is assigned d in D, then to c is assigned the equivalence class (element of D') determined by d. Hence, the relation of equality of individuals of D' is assigned to the equality symbol in . ,. Further,
r is simultaneously satisfied in D' since r U E, U E2 is satisfied in D and because of the property (noted above) of ek, which stems from formula (1).
From the foregoing modification of Theorem 6.2 may be inferred Theorem 6.3 in the form: If A is a valid formula of Z1i then 1- A. This result includes Godel's completeness theorem for the pure predicate calculus with equality. Theorems 6.4 and 6.5 also hold for T, when "equal to" is replaced by "less than or equal to." EXERCISES 6.1. Prove that a set of formulas of To is consistent iff every finite subset is consistent.
6.2. Referring to the proof of Theorem 6.2, prove that A has property (ii). 6.3. Referring to the proof of Theorem 6.2, the reader should agree that the proof-outline of the key property of `tJ (that a statement A of Z is true in D if A I- A) is lacking in precision. He can correct matters by proving by induction on the length of A the following LEMMA: For every D-sequence a = (do, d1, d2, ) and every formula A of Z, A(a) C A if a satisfies A, where A(a) is the result of substituting dk for all free occurrences of ak in A, k = 0, 1, 2, . 6.4. Write out an expanded version of the proof given of Theorem 6.3, supplying all missing details.
7. Consistency, Completeness, and Categoricity of First-order Theories In this section we shall discuss the notions mentioned in the section heading for an arbitrary first-order theory Z having a set r of statements
418
First-order Theories
I
CHAP. 9
as its mathematical axioms. The results of Section 6 become available for use in our discussion simply by changing the status of r from that of a set of axioms to a set of assumption formulas. To explain this in detail, let Zi be the theory which coincides with Z except that the axioms of Zi are just the logical axioms of T. Then, as shown in Example 4.5, the theorems of T are precisely those formulas of T1 which are deducible
(in Zi) from r as a set of assumption formulas. The transition from Z to Ti amounts to nothing more than the change in status of r mentioned above. When Z is regarded as %I it qualifies as a theory of the type considered in the latter part of Section 6, so the results obtained there may be applied to Z. When discussing $ in this way, the definition of a model of Z coincides
with that of a model of r. For, by definition, a model of Z is a model of the set of axioms of Z, but this amounts to an interpretation which is a model of r, since the remaining (logical) axioms of Z are true in every interpretation. Likewise, when our earlier definition of the consistency of a theory is applied to Z, it is seen to coincide with the more recent definition of consistency for r. The definitions given earlier in this chapter of negation completeness and of categoricity of an axiomatic
theory may also be applied to r instead of Z. In summary, the notions which are uppermost in our mind now may be formulated at one's or its set of mathematical axioms. Sometimes pleasure for either there are psychological reasons for having a preference. Our first concern is the extension of Godel's completeness theorem and its converse to Z, thereby establishing the correctness and adequacy of the deductive apparatus which is available for Z.
THEOREM 7.1. A model of Z is a model of the set of theorems of X. Proof.
Let B be a theorem of %. Then r I- B in Z1, which means
, A. of r. In turn, that Al, A2, - , A. I- B f o r members A,, A2, A, A; -+ B is a theorem of Z1. If Z is a model of Z, then Z is a model Further, as a theorem of Zi, , A.) and hence of of (A,, A2, /\;A; --* B [which may be reformulated as (-, /\7,A;) V BI is a valid
formula and hence has Z as a model. Thus, Z is a model of B. An alternative version of Theorem 7.1 is : If B is a theorem of T, then B is true in every model of T. We continue by proving the converse statement. Assume that B is true in every model of Z. Then VB is true in every model of X. Thus r v (-1VBJ has no model, and consequently
9.7
1
Consistency, Completeness, and Categoricity
419
is inconsistent by Theorem 6.2. Thus, for some formula C we have r, -1 YB F- C A --i C, and then, by the deduction theorem and the state-
ment calculus, r F- bB and, in turn, r F- B, which completes the proof. Taken together, these two results mean that the theory of inference at hand (that is, that of the predicate calculus) enables one to establish only and all those formulas which are valid conas theorems of sequences of the mathematical axioms of T. We summarize this conclusion in our next theorem.
THEOREM 7.2. A formula B is a theorem of T if B is true in every model of Z.
We take up next the question of the consistency of (the set of mathematical axioms of) T. For such a theory, with its formal definition of deduction, consistency becomes amenable to exact discussion. Indeed, according to Theorems 6.1 and 6.2, Z is consistent if it has a model, thereby establishing that consistency and satisfiability are entirely equivalent notions. We recall that in Section 5.3 we gave a heuristic argument that in informal theories satisfiability implies consistency. Now we have an exact form for both that argument and the meaning of the concepts involved. A further gain that is achieved by formalization is the converse, which is a striking result when stated as, "If the set of axioms of a theory
is not satisfiable, then a contradiction can be derived." Certainly this is by no means clear when operating at the intuitive level. Unfortunately, we feel obliged to detract from these lofty observations by mentioning that although in principle a model exists for every consistent first-order
theory, finding or describing a model may be difficult, and it is a fact of life that many mathematical axiomatic theories are not of first order. We consider next the question of completeness of a first-order theory %. Theorem 7.2 gives an affirmative answer in the sense that validity implies provability, so we turn to the concept of negation completeness. As with consistency, this has a characterization in terms of models.
THEOREM 7.3. Z is negation complete if every statement of T which is true in one model of Z is true in every model of T. Proof. If Z is inconsistent, then the left-hand side of the biconditional is trivially true and the right-hand side is vacuously true. So assume that Z has a model. Breaking the biconditional of the theorem into two conditionals, we shall prove both parts by contraposition.
Suppose there is a statement A and models Za and tz of Z such
420
First-order Theories
I
CHAP. 9
that A is true in Zt and not true in S D2. Then neither A nor -,A is a theorem, by Theorem 7.2, and Z is not negation complete. Conversely,
assume that Z is not negation complete; let A be a statement such that neither A not -,A is a theorem. Adjoin A to the set I' of axioms of Z. The set r u {A} is consistent, for otherwise we could deduce from this set a formula of the form B A -, B, and then, by the deduction theorem and the statement calculus, r i- -,A, which is contrary to assumption. Similarly, r U { -, A } is consistent. Hence, there exist models Zt and X12 of r u { A } and r u { -, A 1, respectively, by
Theorem 6.2. These are also models of r (that is, of Z) and A is true in )i and not true in Z2.
We assume that our earlier discussion of isomorphism is adequate for gathering the meaning of this notion for the case of models of first-
order theories. It is left to the reader to prove that if a statement is true in one model of Z, then it is true in any isomorphic model. We can now prove the following
THEOREM 7.4. If Z is categorical, then Z is negation complete. Proof.
Assume that Z is categorical. Then a statement which is
true in one model of Z is true in every model. Hence, Z is negation complete, by Theorem 7.3. Parenthetically we note at this point that from each of Theorems 7.3 and 7.4 we may infer the completeness of a theory which has no modelthat is, a theory which is inconsistent. (Of course, the completeness of
an inconsistent theory is also an immediate consequence of inconsistency.) This triviality having been uncovered, henceforth we shall consider completeness only for consistent theories. From the next two theorems we may infer that the range of applicabil-
ity of Theorem 7.4 is rather limited, since together they imply that practically no first-order theory is categorical.
THEOREM 7.5. If
has an infinite model, then for every infinite cardinal number c which is greater than or equal to the cardinality of the set of formulas of Z, Z has a model of cardinality c. t Proof. Let Z be an infinite model of Z and let A be a set of cardinality c. Adjoin to one new individual constant a for each element
t When speaking of the cardinal number of a model we have in mind the cardinal numb* of its domain.
9.7
I
Consistency, Completeness, and Categoricity
421
of A and adjoin to the set r of mathematical axioms of 2 all formulas of the form a 76 fg for distinct a and P. Let St' denote this extension
of St and let r' denote the set of its mathematical axioms (thus, r' is the union of r and the set of all axioms of the form a 0 p9). Since the cardinal number of the set of primitive symbols of St cannot exceed c, the cardinal number of the set of primitive symbols of St' is equal to c. Further, S is a model of any finite subset of r', since (i) it is a model of r, and (ii) being infinite, we can assign to any finite number of distinct a's distinct elements of the domain of Z. It follows from Theorem 6.5 (taking into account the presence of the equality relation) that r' has a model V whose cardinalitycall it c'-is less than or equal to c. But since to the equality symbol is assigned the relation of equality of individuals in the domain of `S', c' > c. Hence, `V' is a model of T, having cardinality c.
THEOREM 7.6. If St has models of arbitrarily large, finite cardinality, then it has an infinite model. Proof.
For any positive, finite cardinal number n, the domain of
any model of the formula
... A ao s an_, A a, 5,6 a2 A ... A an-2 0 an-1) has at least n elements. Let us adjoin to the set r all statements of and call the augmented set of axioms , Cn, the sequence C1, C2, Cn: (3ao) (3ai) ... (3an-i) (ao 5,64 ai A ao 0 a2 A
F'. Then, if St has models of arbitrarily large, finite cardinality, each finite subset of F' has a model, and hence r' has a model D by Theorem 6.5. Since to the equality symbol is assigned the relation of equality of individuals in the domain of `aJ, this domain must be infinite if every C is to be true in Z. Since every Cn is true in Z, it is an infinite model.
The two preceding theorems imply that unless a finite upper bound on the cardinality of models of T can be exhibited, then St has models of any preassigned infinite cardinality. Such a theory cannot be categorical, for since isomorphic models always have the same cardinality, the existence of models of St having different cardinalities excludes the possibility of every pair of its models being isomorphic. Even when a finite upper bound on the cardinality of models of T can be found, if models of different cardinalities exist, then St is not categorical for the same reason as above. Hence, a necessary condition for categoricity of
422
First-order Theories
I
CHAP. 9
a first-order theory Z is that every model have the same finite cardinality. But even this condition is not sufficient. To prove this we note first that it is possible to augment the set r of axioms of % with an axiom that restricts the domain of any model to have a preassigned cardinal
number. For example, the conjunction of the formula C (used in the suffices to ensure that proof of Theorem 7.6) and the formula -, the domain of every model has exactly n elements. Suppose now that we adjoin to the mathematical axioms of elementary group theory, as formulated in Example 4.1, the axiom which expresses the fact that there exist exactly four objects. Then every model of this theory has cardinal number 4. After the reader has studied a bit of the theory of groups presented in Chapter 8, he will be able to construct two nonisomorphic models of the theory just defined. So the condition that every model of a theory have the same finite cardinality, which is necessary for categoricity, is not sufficient. Although categoricity has essentially no applications to questions of completeness, the following generalization does lead to significant results
in this area. If c is a cardinal number, a first-order theory is called categorical in power c if any two models of cardinality c are isomorphic. The following result concerning such theories was obtained independently by R. L. Vaught (1953) and J. Loi (1954).
THEOREM 7.7. t If all models of Z are infinite and if, for some infinite cardinal c greater than or equal to the number of formulas of Z, Z is categorical in power c, then T is negation complete. Proof. If T is inconsistent, the theorem is true in a trivial way, so assume that X is consistent. We shall prove that for any given statement of X either it or its negation is a theorem of S, by assuming the contrary and deriving a contradiction. So let S be a statement of Z such that neither S nor -,S is a theorem. Let T' be the theory which results from Z by adjoining S as an axiom and let " be the theory which results from Z by adjoining -i S as an axiom. Since -1 S is not a theorem of Z, V is consistent, and, since S is not a theorem of Z, V" is consistent. Hence, V has a model `Y' and V" has a model
`s", according to Theorem 6.2. Since ¶' and Z" are models of 2 as well, both are infinite by assumption. Let c be an infinite cardinal such that any two models of Z of cardinality c are isomorphic. Then, f (Added in proof.) In M. D. Morley (1962) there is announced the following theorem which meshes very nicely with the above result: If a first-order theory is categorical in one uncountable power, then it is categorical in every uncountable power.
9.7
Consistency, Completeness, and Categoricity
1
423
by Theorem 7.5, V has a model 11' and V" has a model a", both of cardinal number c. Again, (9' and (&-" are also models of T, and consequently they are isomorphic. However, this is impossible, since S is true in t' while -1 S is true in (Y".
This theorem can be used to establish the completeness of a variety of theories. Some examples follow.
EXAMPLES 7.1. The elementary theory of densely ordered sets is a first-order theory in which the 2-place predicate < is the only mathematical constant and whose mathematical axioms are the following. 01. 02.
03. 04. 06. 06.
(x) --i (x < X).
(x)(y)(x 96 y -'x < y Vy <x). (x) (y) (z) (x < y A y < z -' x < z). (x) (y) (3z) (x < y -- x < z A z < y). (3x) (3y) (x < y)
(x) (dY) (3z) (y < x A x < z).
The models of this theory are precisely all simply ordered dense sets of at least two different elements and have neither a least nor a greatest element. Since 01 and B, each with its natural ordering, are models, the theory is not categorical. However, according to the result stated at the beginning of Exercise 2.6.11, any two denumerable models are isomorphic (since each is isomorphic to Q with its natural ordering). It follows that Theorem 7.7 is satisfied with c = No, and thus the theory is negation complete. 7.2. The elementary theory of atomless Boolean algebras is the first-order theory described in Chapter 6 with the axioms given there supplemented by one which implies that each model (that is, each Boolean algebra) has no atoms. All such algebras are infinite, and it can be proved that any two denumerable atomtess algebras are isomorphic. Hence, the theory is negation complete. 7.3. The elementary theory of infinite commutative groups in which every
element different from the identity has a given prime order p is the theory defined in Example 4.1, with the necessary additional axioms to ensure that every model has the distinguishing features stated. For example, among these axioms will appear the formulas C1, Cs, mentioned in the proof of , C,,, Theorem 7.6. It can be shown that any two models of this theory which have the same cardinal number are isomorphic. That is, the second condition of Theorem 7.7 is satisfied for an arbitrary infinite cardinal c. Hence, for each p, the theory is negation complete. 7.4. The elementary theory of algebraically closed fields of given characteristic p may be described as follows. First, the theory of fields as defined in Chapter 8 is formalized as a first-order theory. Then, if p > 0, the formula X, which,
424
First-order Theories
I
CHAP. 9
translated into everyday language, states that the successive addition of any element to itself p times yields the zero element, is added as an axiom. If p = 0, we add instead the sequence of formulas -1X2, -iXs, -iXs, , -,Xp, Finally, to restrict models to fields which are algebraically closed (which means that every polynomial equation with coefficients in the field has a root in the field), we add another infinite sequence of axioms, A2, A3, , A,,, where A. expresses the fact that every polynomial of degree n has at least one
,
root. There are pairs of denumerable algebraically closed fields of any given char-
acteristic which are not isomorphic. However, it is known that any two uncountable algebraically closed fields of the same cardinality and the same characteristic are isomorphic. So, again, the conditions of Theorem 7.7 are satisfied for every c > No and the theory, for each choice of p, is complete.
The above proofs of negation completeness are all due to Vaught; however, the results themselves are known earlier, having been obtained by other methods. We remark further that each of these theories is decidable. This matter is discussed in Section 9. We conclude the section with the unraveling of a paradox that can be derived from two of our earlier theorems. On the one hand, Theorem
2.1.8 seems to imply that the arithmetic of the natural numbers is a categorical theory (since it asserts that any two models are isomorphic),
while on the other hand Theorem 7.5 implies that it cannot be categorical. To bring this conflict into sharp focus we prove a version of Theorem 7.5 which is tailored specially- for the matter at hand: The theory N is not categorical. To prove this we introduce the first-order theory N' which coincides with N except that it has a further individual constant, u, and the following additional mathematical axioms: U9&0
u00+1 u 54 0 + 1+ I +
+ 1 (with n occurrences of "1")
Now let A be any finite subset of the set r' of mathematical axioms of N' and consider the following interpretation of A. As the domain of the interpretation we choose N, and to -}-, , 0, and 1 we assign the familiar meaning, and, if "u Fx- 0 + 1 + + I" (with m occurrence's of "1") is the last member of the above sequence of axioms that occurs in A, then to u we assign m'. Clearly this interpretation is a model of A, Hence, by Theorem 6.5, I", that is, N', has a model. This model is not
9.7
1
Consistency, Completeness, and Categoricity
425
isomorphic to (N, +, -, ', 0, 1) (for what would be the image of u under a proposed isomorphism?). This completes the proof. The fact that this theorem is not in conflict with Theorem 2.1.8 be-
gins to take form when one attempts to formalize the proof of Theorem 2.1.8. It is found that the first-order predicate calculus is inadequate to carry out this proof because use is made of bound occurrences of predicate variables. That is, the formalization requires the so-called
predicate calculus of second order, which, unlike that of first order, admits quantification of both individual and predicate variables. At this point one might conclude that the state of affairs might be summarized by the assertion that when the arithmetic of the natural numbers is formalized as a first-order theory it is not categorical but when formalized as a "second-order theory" it is. Matters are even 'more complicated than this, however, since the latter part of the assertion must be qualified before it becomes correct. The following is an indication of the precise state of affairs.
Suppose that arithmetic is formalized as a second-order theory N". In rough terms this means that we start with the first-order pure pred-
icate calculus with equality, adjoin the constants introduced for N, and alter the definition of formula to admit as a formula (x)A for any formula A and any individual or predicate variable x. Finally, adjoin as the mathematical axioms those introduced for N except that the axiom schema for induction is replaced by a single axiom prefixed with the
quantifier "(A)." The definition of an interpretation is as before. However, a description of the valuation procedure relative to an interpretation with domain D must specify the range of each n-place predicate variable for n = 1, 2,
. We select as this range some non-
empty collection 61n of sets of n-tuples of elements of D. If every formula
of N" is to be meaningful in an interpretation, the sets 6'ri cannot be chosen in an arbitrary mahner. For example, if A is a 1-place predicate variable and A(x) is interpreted as meaning that x is in the set S, then -,A(x) means that x is in the complement of S; hence the range for 1-place predicate variables should be closed under complementation. In general, each method of compounding formulas has associated with it some operation on the sets 6',,, with respect to which these sets must be closed.t We shall assume that the satisfy all such closure conditions. The earlier definition of a model is then applicable to N". If t It is not really necessary to postulate these closure conditions, as is explained in Henkin (1953).
426
First-order Theories
I
CHAP. 9
an interpretation of N" such that, for each n,
sets of n-tuples of D (that is, each n-place predicate variable ranges over all subsets of D") is a model, it is called a standard model. All other models are called nonstandard models. The existence of nonstandard models-indeed, ones in which all of the domains D, Pt, Q2,
are denumerable-can be proved. Finally, we are in a position to describe precisely the meaning of Theorem 2.1.8. It is the assertion that any two standard models of N" are isomorphic ; that is, if only standard models of N" are admitted as models, then N" is categorical. Thus the
formulation of the arithmetic of natural numbers as a second-order theory is stronger than the formulation as a first-order theory. But the existence of nonstandard models of N" means that even this theory is not categorical. This was discovered by Henkin (1950).
EXERCISES 7.1. Formalize the theory of partially ordered sets, using a 2-place relation symbol as the only mathematical constant. Augment the axioms with one that means that there exist exactly three distinct objects, and then show that this theory is not categorical. 7.2. Given any finite set of positive integers, devise a statement such that, when it is adjoined as an axiom to elementary group theory, the cardinal number of any model of the resulting theory is one of the members of the set (and vice versa).
8. Turing Machines and Recursive Functions f Of the metamathematical notions which we have promised to discuss for first-order theories, there remains that of decidability. As we have already pointed out in Section 5, a precise definition of a decision procedure is necessary if one hopes to prove that some theory is undecidable.
In this section we develop a tool for coping with decision problems in general. Then, in the next section, questions of decidability and undecidability are discussed. We begin with a sketch of how the type of metamathematical problem at hand can be recast in arithmetical form. The objects of a formal theory are various symbols, various finite sequences of symbols (the formulas of the theory), and various finite sequences of formulas (such as deductions). Since the set of symbols of those theories with which we are cont In the remainder of the chapter we dd not maintain the level of rigor and degree of selfcontainment exhibited up to this point. Results from without are introduced and some arguments arc purely intuitive in nature.
9.8
1
427
Turing Machines and Recursive Functions
cerned is denumerable, so is (Theorem 2.4.5) the set of all objects. Now suppose that we provide a particular enumeration of the set of all objects of such a theory. If we let a metamathematical statement of the theory refer to the indices in the enumeration instead of to the objects enumerated, a statement of number theory results. More generally, a predicate
of the metalanguage of a first-order theory can be transformed into a number-theoretic predicate, that is, a function on the set N" of all n-tuples of natural numbers into IT, F). Now with each number-theoretic predicate may be correlated a function on N" into N, the so-called characteristic function of the predicate, which takes the value 0 or I
according as the predicate is true or false. If by the computation problem for a number-theoretic function f is understood the problem of discovering a procedure describable in advance for computing the value off for any given argument in a finite number of steps, each deter-
mined by the preassigned recipe, then the decision problem for a predicate in the metalanguage is transformed into the computation problem for some number-theoretic function. Thus, in particular, by way of an arithmetization of the metalanguage of a theory, the decision
problem for that theory reduces to the computation problem for a number-theoretic function. The process of the arithmetization of the metalanguage of a theory, which was devised by Godel for the purpose of establishing the theorems which are discussed in Section 10, is analogous to the arithmetization of Euclidean geometry via the introduction of a coordinate system. A
typical example is afforded by the following arithmetization of the metalanguage of N. The starting point is a correlation of certain natural numbers with the formal symbols of N; for example, the following might be adopted : 3
5
--I
-a
7
9
11
13
15
17
19
21
and, to the ith individual variable, the ith prime greater than 22. Having assigned numbers to symbols, we next assign numbers to formulas as follows. Let n,, n2, , nk be the numbers of the symbols of a formula A in the order in which they occur in A. Let p,(= 2), P2, . -, PA; be the first k primes in order of increasing magnitude. Then the number assigned to A is p' p;' pk'. For example, the numbers of the symbols of the formula
-i (x) -1(x = 0'),
which translates into "0 has a successor," are successively 3, 9, 23
428
First-order Theories
I
CHAP. 9
(assuming that "x" is the first variable), 11, 3, 9, 23, 21, 13, 15, and 11. So the number of the formula is 23 - 39
523
. 711 . 113 - 139. 1783. 1921. 2313. 2915
3111.
The numbers assigned to symbols and formulas in such a way as this
are called the Godel numbers of the symbols and formulas. Every formula has a G6del number but not all numbers are assigned to formulas. For example, the number 4(= 22) is not assigned to any formula. If a number is assigned to a formula, the formula can always be found as follows. Factor the number into its prime factors. Then the number of 2's occurring in the factorization is the number of the first symbol of the formula, the/number of 3's occurring in the factorization is the number of the second symbol, and so on. The fundamental theorem of arithmetic implies that this method of numbering is a one-to-one map-
ping on the set of symbols and formulas of N into N. Finally, to any string of formulas we may correlate a unique number 2"' 3"' pknk, , nk are the successive G6del numbers of the formulas where nt, n2,
of the string. In particular, to every formal proof corresponds a number, the so-called G6del number of the proof. We conclude our example of arithmetization with the observation that the predicate "A is a theorem" is representable by the arithmetic sentence "There exists a number x which is the G6del number of a proof such that the G6del number of A is the power of the largest prime number in the decomposition of x into a product of prime powers."
Returning to the discussion prior to the example we note that if a computation procedure of the sort described can be found for a number-
theoretic function, the function is said to be effectively calculable. It was the close relation between decision problems and the finding of effectively calculable functions that first aroused the interest of workers in the foundations of mathematics in the question of what functions are effectively calculable. On the basis of the foregoing intuitive description of effective calculability we can certainly agree that such functions as x + 1 and xy (committing the "abuse of language" whereby a function is designated by its value-a practice which we shall find convenient to follow in this section) are effectively calculable. But to prove that a given function is not effectively calculable requires an exact definition. We need now to review the historical situation in the 1930's. In his famous paper of 1931 (see Section 10) G6del employed a class of number-
theoretic functions, which are now called the primitive recursive functions (see Section 2.2), and which by their very nature are seen to be
9.8
I
Turing Machines and Recursive Functions .
429
effectively calculable. In 1934 Godel, building on a suggestion of Her-
brand, extended this class of functions to that of general recursive functions, and these, it was agreed, are also effectively calculable. About the same time (1932-1935) Church and Kleene defined a class of functions (the X-definable functions) which, on the basis of their investigations, they proposed might be regarded as embracing all functions which
should be classified as effectively calculable. This proposal took on more significance when Kleene proved that this class of functions is the same as Gbdel's class. In particular, it led Church to formulate the following thesis: Every effectively calculable function is X-definable or, equiv-
alently, general recursive. Since the converse of this statement clearly holds, Church's thesis served to give an exact mathematical meaning to the vague intuitive notion of a number-theoretic function being
calculable by preassigned instructions. A little later (1936-1937) a paper by A. Turing appeared, in which was given an exact definition of a class of functions (we shall call these Turing-computable, or simply computable) along with the proposal that these be identified with those functions which are effectively calculable. Shortly thereafter, Turing proved that his class of functions was the same as the class of
h-definable functions, and hence the same as the class of general recursive functions.t This result, which implies that Turing's thesis is equivalent to Church's, tends to make more reasonable the identification of this class of functions with that of effectively calculable functions. For these reasons almost all research workers in foundations make this identification. Actually, an extension of the above is generally accepted. To describe
this we make a definition. By a partial function is meant a function whose domain is some set of n-tuples of natural numbers and whose values are natural numbers. That is, a partial function is a "partially defined" number-theoretic function. The distinction between partial functions and what we have called number-theoretic functions is often made by calling the latter "total functions." (Note that a partial function may be a total function.) The various classes of functions mentioned above can all be extended to classes of partial functions and the extension of the Church-Turing thesis to partial functions leads to the identification of the class of effectively calculable partial functions with the class of partially recursive functions. Turing's conception of computability arose as a result of an analysis t It should also be noted that E. Post (1936), independent of Turing, formulated a class of functions essentially the same as Turing's.
430
First-order Theories
I
C R A P. 9
of computation procedures (as we know them intuitively) into "atomic" acts, sufficient repetitions of which Turing believed would suffice for any possible computation. Because of the naturalness of this approach
we shall give it preference in our discussion. This centers on giving a mathematical characterization of a class of objects which we shall call Turing machines. These are defined by analogy with physical digital computers. In rough terms, a Turing machine may be described as an imaginary digital computer which is not liable to error and which has a potentially infinite memory. In somewhat more detail, but still at the intuitive level, we imagine a computing machine through which runs a linear tape, assumed to be infinite in both directions and ruled into a two-way infinite sequence of blocks as indicated by the diagram
Initially the input, in the form of a finite number of symbols which the machine "recognizes," with one symbol to a block, is placed on this tape, the other blocks being blank. The "moments" for the operation of the machine are numbered 1, 2, . To the machine is assigned a finite number of "internal states" (in the nature of simple bookkeeping instructions) and the ability to "scan" a single block of the tape at each
moment of operation. The machine is deterministic in the sense that at each moment its next act is completely determined by its internal state at that moment and the symbol printed on the block scanned at that moment. Specifically, in terms of a finite alphabet of symbols which the machine is able to recognize, the machine is capable of the following atomic acts, given an internal state and a symbol on the scanned block.
(i) Erase that symbol, print a new symbol from its alphabet, and (possibly) go'into a different predetermined internal state. (ii) Move one block to the right (that is, scan the block located immediately to the right of the original scanned block) and (possibly) go into another predetermined internal state. (iii) Move one block to the left and (possibly) go into another predetermined internal state. (iv) Come to a complete halt of operations. In order to represent these concepts symbolically, we shall use the symbols ql, qs, to denote internal states of machines. The symbol will be regarded as,the alphabet which various machines are. capable of printing; the symbols R and L will represent a move of one; So, S1,
block to the right and to the left, respectively. By an expression We
9.8
I
Turing Machines and Recursive Functions.
431
shall mean a finite sequence (possibly empty) of symbols chosen from the foregoing. A quadruple is an expression having one of the following forms : qi
(1)
Si
(+
Sk
qL
S2 R qi (2) Si L qt q+ (3) Quadruples serve to specify the next act of a Turing machine when in internal state qj and scanning a block on which appears the symbol S;. q:
One of the form (1) indicates that the next act is to replace S; by St on the scanned block and to enter the internal state qi. One of the form (2) indicates that the next act is a motion of one block to the right followed by the entry into internal state qt. One of form (3) has a similar meaning but with a motion to the left.
We now define a Turing machine to be a nonempty, finite set of quadruples such that no two distinct members have the same first two symbols. (The restriction is to avoid the possibility of a machine assuming a "confused state"!) The q's and S's which appear in the quadruples of a Turing machine are called its internal states and its alphabet, respectively.
In order to motivate the formulation for Turing machines of the formal analogue of a physical machine performing atomic acts in sequence, let us consider an example. Suppose that Mo is the Turing machine which consists of the following quadruples: (4)
qi
So
SS
q2
qi
S1 R
q2
R q2 The following diagram is intended to indicate that Mo is in the initial state ql and scanning a block in which So is entered. The string of S's are initially printed on successive blocks of the tape: S8
q2
Si
S1
So T
(5)
S2
S1
S8
q,
Since there is a quadruple of Mo beginning with qoSo the machine performs an atomic act after which it is in state q2 scanning S, now printed on the same block. This is summarized by the next diagram: S8
(6)
S1
S1
1 q2
S2
SL
S3
432
First-order Theories
!
CHAP. 9
Then, because g2SiRg2 C Mo, there is a move to the right and a continuation of state q2. Because there is no quadruple of the form q2S2in in Mo, the machine "stops."
In place of diagrams of the type used above to describe machine configurations, expressions can be used. For example, the expression (7)
S3
S,
q2
Si
S2
S1
Ss
when interpreted as meaning that the S's shown are printed on the tape and the machine is in state q2 scanning the symbol S1, conveys the
same information as (6). An expression like (7) containing neither R nor L and containing exactly one q and with it not the rightmost symbol,
is called an instantaneous description. If M is a Turing machine and D is an instantaneous description, then D is called an instantaneous description of M if the q that occurs in D is an internal state of M and the S's that occur in D belong to the alphabet of M. An expression composed entirely of members of the letters Si is called a tape expression.
As just defined, a tape expression is a finite sequence of symbols whereas the intuitive analogue is a finite sequence of symbols flanked on either side by an infinite sequence of blank blocks. To obtain the equivalent, in the formal setting, of this additional feature of an intuitive tape expression, we assign the symbol So, which henceforth will also be written as B, the special role of serving as a blank. Then, roughly
speaking, we arrange for the adjunction of a B to an end of a tape expression when the machine is about to run off that end of the (finite)
tape expression. A precise description is embodied in the following definition, which, in its entirety, is the formal analogue of the performance of an atomic act. Let D and E be instantaneous descriptions and M be a Turing machine. Then we shall write
DUE to mean that one of the following alternatives holds. Here, X and Y denote tape expressions. (8) (9) (10) (11) (12)
D is Xq,S,Y, q=S,Skgi C M, and E is Xg1SkY. D is XgIS,SkY, gS,Rga C M, and E is iS,QaS3 Y. D is Xg,S1i q;S1Rga C M, and E is XS1g1B. D is XSkqiSJY, giS5Lga C M, and E is XgLSkSJY. D is qS,Y, q,S1Lgz C M, and E is g1BS,Y.
An operation with input 1), of a Turing machine M is a finite , DA; of instantaneous descriptions such that:
sequence Dl, D2,
9.8
1
Turing Machines and Recursive Functions
433
k - 1. An instantaneous description D is called terminal for M if for no instantaneous description E is D TME. An operation D1, D2, . , DA: of M is called a computation of M with output Dk if Dk is terminal for M. In this terminology the example above employing the machine Mo is the (see Exercise 8.1) computation D; TM D;.+1 for i = 1, 2,
of MO with input S3S1g1SoS2S1Ss and output SsS1S1Q2S2S1Sa. If Mo is mod-
ified by adjoining g2S2Lg1 to it, then it is easily seen that an operation with the same input has no terminal instantaneous description. Thus an operation with this input does not yield a computation. In order to have Turing machines perform numerical computations and, thereby, to define partial functions, it is necessary to introduce a
symbolic notation for natural numbers. For this we shall write the symbol S1 as the, tally mark "I." Then we shall represent the natural numbers by strings of tallies, I for 0, II for 1, III for 2, and so on. Further, with the n-tuple (m1, m2, - - , m.) of natural numbers we shall associate
the tape expression I . . . I BI . . . I BI . . I, where first (from left to right) appears the representation of m1 (as m1 + I tallies), then B, then the representation of m2, - , then B, and then the representation of m,. We shall abbreviate this tape expression by Im1+1BIm2
... BIm.+1.
An operation of a machine M having g1ImI+1BIm=+1 ... BI=-+1
as input will be called an application of M to (ml, m2, , ma). Then, for each positive integer n, we associate with M the partial function TIZ) , m,), if there exists of n variables defined as follows. Given (m1, m2, , Mn) which is a computation, then an application of M to (m1, m2, is equal to the number of tallies in the output of TA,' (m1, m2, , the computation ; if no application of M to (m1, m2, , m,,) is a computation, then Tom' is undefined at (m1, m2, , m,,). A partial function of n variables is partially computable if there exists a Turing machine M such that the function TM' which M defines is equal to f. If, in addition, f is a total function, then f is called computable. -
EXAMPLES 8.1. The successor function is computable. Let M = {Qiiig2}. When M is applied to m, there is the following computation : Q1Im+1
g2Im+1
hence, T};' (m)
= m + 1.
First-order Theories
434
I
CHAP. 9
8.2. Let M = {q'I Bq2, gsBRqs, qal Bq4, gaBBga, g4BRgi}.
Then the function of one variable which M defines is the partial function f such that f(m) = 0 if m is odd and f(m) is undefined if m is even. Indeed, if m is an
odd natural number there is the following computation when M is applied tom: g1Im+t
g2Blm
Bga'm
Bg4Bj'-'
Bmgal
Bmg4B
Bm+lq,B
Since there are no tallies in the output, f(m) = 0. It is left as an exercise for the reader to prove that no application of M to an even natural number is a computation. Thus f is undefined in this case. 8.3. Addition of natural numbers is a computable function. To prove this, consider the machine M = {gllBgl, QiBRgs, gsIRgs, gsBRqs, gsI Bqs}.
It is left to the reader to show that TM)(mi, m2) = ml + ms.
We are relying on the definition of a computable function and the equality of the class of computable functions and that of recursive func-
tions to give the reader some "feeling" for the concept of a recursive function. However, it may be worthwhile to examine the fatter concept
directly. An informal definition of the class of (general) recursive functions is obtained by adding to the schemes listed in Section 2.2 for generating primitive recursive functions (namely, composition and primitive recursion) the following: k(xi, xs, .. , x,) = µyja (xl, x2, .. , , xn, y) = 01,
where the symbol on the right denotes the smallest y such that ) x,,,y) = 0, assuming that for each (x1, xs, - , x,,) there is such a y and that a is any primitive recursive function. This is not the a(xi, Xs,
original Herbrand-Godel definition, but one which was proved by Kleene (1936) to be equivalent to the original. If the assumption that the symbol on the right is defined for all n-tuples is dropped, the result is a definition of the (more extensive) class of partial recursive func-
9.8
!
Turing Machines and Recursive Functions.
435
tions. As is easily imagined, it is a nontrivial matter to show the equality of the class of computable functions and that of recursive functions and, more generally, the equality of the class of partially computable functions and that of partially recursive functions. Accepting these results,
along with the Church-Turing thesis, provides us with the precise concept of computability, or, equivalently, general recursiveness, as a substitute for the intuitive notion of effective calculability. Thereby, in turn, the decision problem for a predicate becomes amenable to exact investigation.
In the terminology of recursive functions, a predicate is called (primitive) recursive if its characteristic function is (primitive) recursive. If a positive solution of the decision problem for a predicate is found, the decision problem for that predicate is called recursively solvable; otherwise the decision problem for the predicate is called recursively unsolvable. We conclude this section with the formulation of some other notions pertaining to formal theories in the language of the theory of recursive functions. A set of natural numbers is called a recursive set if its char-
acteristic function is recursive. A set S of natural numbers is called
recursively enumerable if either S = 0 or S is the range of a recursive function. It can be shown that a set of natural numbers is recursive if it and its complement are both recursively enumerable. An example of a set which is recursively enumerable but not recursive can be effectively constructed. A set S of formulas of a formal theory is called recursive if the set of natural numbers correlated with the members of S by means of a Godel numbering is recursive. The notion of the recursiveness of a set of formulas of a formal theory can be extended
to that of operations on formulas and relations between formulas. In terms of recursiveness, the requirements which we specific in Section 1
for formal axiomatic theories may be restated as (i) thet of formulas must be a recursive set, (ii) the set of axioms must be a recursive set, and (iii) the rules of inference must determine recursively derivability relations.
EXERCISES 8.1. Prove that if D TuE and D TMF, then E = F. Deduce that if there exists a computation of a machine M corresponding to a given input, then it (and hence the output) is uniquely determined. 8.2. Show that the function f of Example 8.2 is undefined for even arguments. 8.3. Show that the function T$;' defined in Example 8.3 is addition.
First-order Theories
436
I
CHAP. 9
8.4. Let n and i be given integers with 1 < i:5 n. Let M be the Turing machine consisting of all quadruples of the form qil Bgsn+i, q,BRq,+,, qsn+,BRq ,
where j ranges over all integers between I and n other than i, together with the following four quadruples: qil Bqi, giBRgs+. it qsn+ilRgsn+i, qsn+iBRgi+l.
Show that Tom) is equal to the identity function Ut defined in Section 2.2.
9. Some Undecidable and Some Decidable Theories The first objective of this section is to sketch a proof of Church's theorem, which asserts the undecidability of the predicate calculus. The initial step is the construction of a number-theoretic function which
is not computable. That such functions exist is clear as soon as the Church-Turing thesis is adopted. For, on the one hand, the set of all possible Turing machines, and hence the set of all computable functions,
is denumerable. On the other hand, Theorem 2.4.6 implies that the set of number-theoretic functions is uncountable. Thus, the illustration is of interest primarily because it is a specific and simple example of such a function. A preliminary for this is the arithmetization of the theory of Turing machines, following the same pattern as that described for first-order theories in Section 8. The starting point is some assignment, such as the following, of certain natural numbers to the symbols of the theory: 3
5
7
9
11
13
15
17
19
R
L
So
q,
S,
qs
Ss
qs
S3
Then, numbers are assigned to expressions in the following way. Let , ni, be the numbers corresponding to the symbols of an nl, ng, expression E in the order in which they occur in E. Then to E is assigned pi' pa` . p,t` where, as before, pi is the ith prime. Numbers assigned to symbols and expressions in this way are called the GOdel
numbers of the symbols and expressions. For example, the Godel number of the quadruple gsBRgs is
218.37.53.717 This method of numbering is a one-to-one mapping on the set of symbols and formulas of the theory of Turing machines into N.
9.9
1
Some Undecidable and Some Decidable Theories
437
By the Godel number of a finite sequence El, E2, , Ek of expressions the number we understand /,0 .Aam Pt yk C1 where gi is the Godel number of E; for i = 1, 2, , k. This assignment
determines a one-to-one mapping on the set of finite sequences of ex
pressions into N. In particular, every computation of each Turing machine has a Godel number and this number uniquely determines the computation. Finally, Godel numbers can be assigned to Turing machines. , Et is any arrangement of the quadruples of a machine M, then the Godel number of this sequence is called a Godel number of M. Although M has a Godel number corresponding to each arrangement of its defining quadruples, each such number uniquely defines M. It may be noted that there exists an effective procedure for obtaining the Godel number of an expression and for obtaining an expression
If E,, Es4
from its Godel number.
In order to define the function promised we introduce the predicate T(m, x, y): m is a Godel number of a Turing machine M such that the application of M to x is a computation having Godel number y. Intuitively this number-theoretic predicate is effectively decidable. For suppose that values of m, x, and y are given. Upon decomposing m into a product of primes we can decide whether it is a Godel number of a machine. If it is not, then the predicate is false for this triple. If m is a Godel number of a machine M, then we decompose y into a product
of primes and determine whether it is the Godel number of a finite sequence of expressions. If it is not, then T(m, x, y) is false. If y is the Godel number of the sequence El, E$, , E. of expressions, then we compare El with D1 = gllx+l. If El 96 D1, then T(m, x, y) is false. If E1 = Dl we then "supply" M with D1 as input and compare each successive instantaneous description of M which can be formed with the corresponding E. By making at most n such comparisons, we can determine whether T(m, x, y) is true or false. Agreement that T(m, x, y) is effectively decidable implies, via the Church-Turing thesis, that its characteristic function is computable. A full treatment of this matter would not make an appeal to the Church-
Turing thesis in order to show the computability of this function; instead, a direct proof that it is primitive recursive would be given. Such a proof appears in Chapter 4 of M. Davis (1958). Each Turing machine determines a partial function of one variable
First-order Theories
438
(
CHAP. 9
in the manner explained prior to Example 8.1. If m is a Godel number of the machine and x is in the domain of the associated function of one variable, we shall denote the value of this function at x by tm(x). We shall also employ this notation for the partial function of two variables whose domain is the set of ordered pairs (m, x) such that m is a Godel number of a machine M and x is in the domain of the function of one variable defined by M. Thus the function tm(x) is defined for given rn and x if there exists a y such that T(m, x, y). Let us symbolize "there exists a y such that T(m, x, y)" by (Ey) T(m, x, y).
Here we intend that the expression "(Ey)" shall symbolize the informal
(and meaningful) phrase "there exists a y such that." Then we may say that tm(x) is defined if (Ey) T(m, x, y). Consider now the total function t such that
ts(x) + 1 if (Ey) T(x, x, y), t(x) = 0 otherwise. [So i(x) = t.(x) + 1 if x is a Godel number of a machine whose application to x yields a computation with output t=(x); otherwise, t(x) = 0,)
We contend that t is not computable; the proof is by contradiction, employing Cantor's diagonal procedure. Assume that t is computable, Then there exists a machine M with Godel number n, let us say, which
computes it. That is, using the notation agreed upon earlier, there exists a function t,,, such that t(x) = t (x) for all numbers x. Hence
t(n) = t (n). Now to say that M computes t implies that for all x there exists a y such that T(n, x, y) and, in particular, there exists a y such that T(n, n, y), that is, (Ey) T(n, n, y). Hence, by the definition of t,
t(n) = t,.(n) + 1, and the two displayed equations furnish the contradiction. We state this result as our next theorem.
THEOREM 9.1. The total function t defined by if 10 otherwise,
t(x) = l t=(x) + 1 is not computable.
(Ey) T(x, x, y),
9.9
!
Some Undecidable and Some Decidable Theories
439
From this result follows easily the undecidability of the predicate (Ey)T(x, x, y). This is our next theorem.
THEOREM 9.2. The predicate (Ey) T(x, x, y) is undecidable. Proof. We shall show that the decidability of (Ey) T(x, x, y) implies the computability of the function t of Theorem 9.1. So assume that (Ey) T(x, x, y) is decidable-that is, there exists a decision procedure such that for each x we can decide whether or not (Ey) T(x, x, y). If for a given x this procedure leads us to the conclusion that (Ey) T(x, x, y),
then we continue the calculation by imitating the application of the machine with Godel number x (this yields a computation by assump-
tion) to compute t=(x) and, finally, add I to the result. If for the given x the assumed decision procedure leads to the conclusion that it is not the case that (Ey) T(x, x, y), then we write 0. Thereby we have a computation procedure for t. An alternative formulation of Theorem 9.2 is: The decision problem for the predicate (Ey) T(x, x, y) is recursively unsolvable. This result is essen-
tially the theorem proved in Church (1936). The only difference is that we have constructed an example in terms of Turing computability whereas Church devised one in terms of X-definability.
We continue with an outline of how Church inferred from this theorem the undecidability of both elementary number theory and the predicate calculus. A prerequisite for clarity in this matter is the introduction of extensive symbolism. Three kinds of symbols are required: Symbols of N (formal symbols), symbols which stand as names of formal symbols (metamathematical symbols), and symbols of intuitive number
theory. The symbols of N consist of the mathematical constants listed
in Example 4.2, the usual logical constants, and a list of individual variables which we take to be
a,g,C,...,
Formulas of N are certain strings of formal symbols. For example, (5) a = 6, (30 (a = 0" c), (6) (7)
(3r) W + a = b)
are formulas of N.
To speak about formulas we shall need metamathematical symbols. As names for variables we shall use X, Xl, X2, ' ' '
3
440
First-order Theories
I
CHAP. 9
and as names for formulas we shall use capital script letters. Further, we shall use a composite notation such as (8)
0'(X1, 22, ... ,
Xn)
instead of "6" for a formula when we are interested in the dependence of 6' on (distinct) variables XI, x2, , X as well as when a substitution is to be performed for some of the variables. Any formula of N can be interpreted as expressing a predicate in intuitive number theory under the usual number-theoretic meaning of the symbols. The intuitive predicate corresponding to the formula (8) we shall denote by (9)
P(xl, x2, ... ,
x are intuitive variables which range over N; we shall say that x; "corresponds to" the formal variable X;, i = 1, 2, - , n. As illustrations, formula (6) expresses a is even [if (6) were designated by then its interpretation would be designated by "E(a)"], and formula (7) expresses a < b. A term of N can be interpreted as expressing an intuitive natural number. The terms 0, 0', 0", , which represent the various natural numbers under the intended interpretation, are called numerals and will be abbreviated by the same symbols "0," "1," "2," as we use
for the natural numbers intuitively. If we introduce an italic letter such as "n" to designate an intuitive natural number, then the corresponding boldface italic letter "n" will designate the corresponding numeral 0'- ' (with n accents). With these specifications about symbolism, let us get on with some definitions. An intuitive number-theoretic predicate P(x1, x2, , is
said to be numeralwise expressible in N if there exists a formula [related to the predicate as (8) is to (9) ] with no free variables other than the distinct variables XI, X2j , x, such that for , x,,) of natural numbers, each n-tuple (x,, x2, 61(91, X2,
(10)
,
if P(xl, x2,
,
is true, then I- 6'(xl, x2,
,
xn),
and (11)
if P(xi, x2,
,
is false, then I- --i 6'(xl, x2,
,
For example, it can be shown that the formula (6) numeralwise expresses a < b and that (5) expresses a = b. The following basic property of primitive recursive predicates first appeared in G6del (1931).
9.9
I
Some Undecidable and Some Decidable Theories
441
LEMMA 9.1. Every primitive recursive predicate is numeralwise expressible in N.
The following remarks are pertinent to the proof of this result. The notion of numeralwise expressibility of number-theoretic predicates has
the following analogue for functions. A number-theoretic function , x,,) is said to be numeralwise representable in N if there A X17 x2, is a formula Q(21, za, , fin, V) containing no variables free other than the distinct variables T1, T2,
X, , z such that for each n-tuple
x,,) of natural numbers, if f(xi, x2, , x, then x) is provable and, moreover, the formal analogue of , x,,, B(xl, x2, "there exists a unique x such that P(xj, x2, - , x,., x)" is provable. It (XI, x2)
,
can be proved that every primitive recursive function is numeralwise repre-
sentable in N. The proof is by induction, first showing how to represent numeralwise the three initial types of functions admissible in a primitive recursive derivation and then showing how to build up formulas which
numeralwise represent functions obtained from initial functions by composition and primitive recursion. Lemma 9.1 follows from the result in italics by an application to the characteristic function of a primitive recursive predicate. The theorem whose proof we would like to sketch may be stated as : If the theory N is consistent, then N is undecidable. However, for technical
reasons, we must settle for a result which is weaker in the sense that undecidability is inferred from the stronger assumption of w-consistency,
a notion introduced in Godel (1931). The theory N, or one which includes the symbolism of N, is called w-consistent if for no variable ir and formula d(z) are all of (12)
-, (X)a(9), a(O), a(1), a(2),
-
provable or, in other words, if not both I- -, (z)a(g) and h a(n) for every natural number n. If N is w-consistent, then it is consistent, for if N is inconsistent then all formulas, in particular those in (12) for some 9 and a(9), are provable in N. However, the converse is false, so w-consistency is stronger than (simple) consistency.
THEOREM 9.3 (Church).
If the theory N is w-consistent, then it is undecidable; that is, if N is w-consistent, then the decision problem for N is unsolvable. Proof.
Our point of departure is the result stated earlier without
proof that the predicate T(m, x, y) is primitive recursive. It follows
First-order Theories
442
I
CHAP. 9
that the predicate T(x, x, y) is primitive recursive, so by Lemma 9.1 there exists a formula d(;7, y), having only 3r and y as free variables, such that if T(x, x, y) is true, then I- d(x, y), (13) and if T(x, x, y) is false, then F d(x, y). (14) We now define B(x) to be (3y)a(x, y)
and note that (15)
if (Ey) T (x, x, y) is true, then - 63(x)
and, conversely, (16)
if
63(x), then (Ey) T(x, x, y) is true.
To establish (15) assume that (Ey) T(x, x, y) is true. Then there exists a y such that T(x, x, y) is true. Applying (13) for x and this y gives I- a(x, y), so, by the predicate calculus, I- (3y)Ct(x, y). That is, I- 63(x). To establish (16), assume that y) is false. Then T(x, x, y) is false for y = 0, 1, 2, - and, hence by (14), I- -, a(x, 0), h -1 d(x, 1), 1- -i d(x, 2), . It follows from the assumed co-consistency of N that -, (y) -1 d(x, y), and hence (3y)a(x, y) is not provable. That is, 63(x) is not provable and (16) follows by contraposition. Using (15) and (16) it is possible to give an indirect proof of the theorem by showing that the assumptions of both the co-consistency and decidability of N yield a contradiction. Indeed, assuming that N is decidable implies that for each x there is a decision procedure for B(x). If 63(x) is provable, then (Ey) T(x, x, y) is true by (16). If 63(x) is not provable, then (Ey) T(x, x, y) is false by (15). That is, there is a decision procedure for (Ey) T(x, x, y) for each x, contrary to Theorem 9.2.
The first step in Church's proof of the undecidability of the predicate
calculus is the translation of N into a theory without the operation symbols +, , and ' and the individual constant 0. This can be done in the manner suggested in Example 4.3. Each of the 2-place operation
symbols is replaced by a ternary relation symbol, ' is replaced by a binary relation symbol, and 0 i4. replaced by a unary relation symbol, together with an appropriate alteration of the mathematical axioms, Then the formula B(x) (defined above) transforms into a formula e(R)
9.9
f
Some Undecidable and Some Decidable Theories
443
of the pure predicate calculus with equality. The next (and major) step is the demonstration that an informal proof of (Ey) T(x, x, y) for an x such that (Ey) T(x, x, y) is true, can be formalized as a deduction from a suitable finite list of closed statements (independent of v) of the predicate calculus. (To lend plausibility to this step we recall an analogous result discussed earlier for applied predicate calculi : A theorem of such a theory can be regarded as a deduction, within the predicate calculus, from a suitable finite list of assumption formulas.) If 1) is the conjunction of this set of formulas, this step may be summarized as (17)
if (Ey) T(x, x, y) is true, then D I- e(z) in the predicate calculus.
Further, a metamathematical proof of the converse of (17) can be given : (18)
if 5 I-- e(x) in the predicate calculus, then (Ey) T(x, x, y) is true
By the deduction theorem and its converse, (17) and (18) yield (19)
(Ey) T(x, x, y) is true if I- SD - e(.V) in the predicate calculus.
The theorem in question follows immediately from (19). For if there were a decision procedure for provability in the predicate calculus with equality, then applying it for given x to decide whether 5) -' e(x) is provable, we could decide in view of (19) whether (Ey) T(x, x, y) is true. But this is contrary to the known undecidability of (Ey) T(x, x, _V).
A similar method of proof yields the undecidability of the predicate calculus without equality. We record these results, which were obtained independently by both Church (1936a) and Turing (1936-1937), as our next theorem.
THEOREM 9.4. The decision problems for the pure predicate calculus and the pure predicate calculus with equality are unsolvable.
COROLLARY. There is no decision procedure for validity in the pure predicate calculus. Proof. This follows from the theorem by way of Theorems 3.3 and 3.4.
Since these initial results of undecidability, the decision problem has been settled in the negative for a great variety of formalized theories. Two different methods of attack have been found to be successful. One of these, which is called the direct method by Tarski (who administers a "school of undecidability" at Berkeley), is essentially based on ideas created by Godel (1931) and is applicable to those theories in which
444
First-order Theories
I
CHAP. 9
considerable number-theoretic apparatus can be developed. The proof of Theorem 9.3, when given in detail, is of this kind. The other method, called the indirect method by Tarski, consists in reducing the decision problem for a theory Z, to that for some other theory Xr2 for which the decision problem has been solved. The proof of Theorem 9.4 is of this sort. With the indirect method the undecidability of a great variety of algebraic theories, including the elementary theories of groups, rings, fields, and lattices, have been proved. In order to present concluding remarks about undecidable theories, several definitions are required. A first-order theory Z, is called a subtheory of a first-order theory Z2 if every theorem of Z, is a theorem of Z2. Under the same circumstances, X-2 is referred to as an extension of Ti. A first-order theory Z is called essentially undecidable if Z is undecidable and the same holds true of every consistent extension of Z which has the same constants as Z. Some undecidable theories have decidable extensions. For example, the predicate calculus with equality becomes a decidable theory upon adding as an axiom (x) (y) (X = y)
On the other hand, in Rosser (1936) it is proved not only that if the theory N is consistent then it is undecidable (thereby strengthening Theorem 9.3) but that it is essentially undecidable. The next definition requires a preliminary remark. For the most part, our discussion of first-order theories has included the assumption either implicitly or explicitly that they are axiomatic theories. However, it is possible to formulate first-order theories without a set of mathematical
axioms. In that event the notion of theorem is replaced by that of valid statement. No uniform method for defining this notion is available. The only general condition which such a definition should fulfill is that any statement that is derivable from valid statements by the rules of inference should be a valid statement. Sometimes it is agreed to consider as valid those and only those statements which are true in a given model. A first-order theory Z in which validity has been defined in some way is said to be axiomatizable if there exists a recursive set S of valid statements of Z such that every valid statement is derivable from the set S; if S is finite, then Z is said to be finitely axiomatizable. Thus, every axiomatic theory is axiomatizable in the sense just defined, and every axiomatizable theory can be represented as an axiomatic theory. There are numerous interrelations among the notions which we now
9.9
I
Some Undecidable and Some Decidable Theories
445
have available for first-order theories. We shall content ourselves with the following, all of which are to be found in Tarski, Mostowski, and Robinson (1953).
THEOREM 9.5 . For a negation complete first-order theory T, the following three conditions are equivalent : (i) T is undecidable, (ii) Z is essentially undecidable, (iii) Z is not axiomatizable. Proof.
The result that (i) implies (iii) for a complete theory is a
consequence of a result due to Kleene (1943). The remaining parts of the theorem follow directly from the definitions of the notions involved.
THEOREM 9.6. A first-order theory Z is essentially undecidable iff Z is consistent and no negation complete extension of Z which has the same constants as Z is axiomatizable. The necessity of the condition follows immediately from Theorem 9.5 and the definitions of the concepts involved. A proof of the sufficiency of the condition is given (in the book just mentioned) in the following equivalent form.
THEOREM 9.7. Every consistent and decidable first-order theory T has a consistent, negation complete, and decidable extension T' which has the same constants as Z. We turn now to a brief survey of decidable theories. From Theorem 9.5 one can infer that a complete and axiomatizable first-order theory is de-
cidable. This provides the justification for our earlier statement that each of the theories defined in Examples 7.1-7.4 is decidable. More generally, any first-order theory Z which satisfies the hypotheses of Vaught's theorem (Theorem 7.7) and, in addition, is axiomatizable, is decidable. If it is assumed that Z is finitely axiomatizable, as well as categorical
in power a for some infinite cardinal c greater than or equal to the number of formulas of Z, then the decidability of Z follows. This result is due to Henkin (1955). This is a modification of Vaught's theorem in the sense that it is not required that all models of Z be infinite. From Henkin's result it follows that the theory considered in Example 7.3 is
still decidable when the requirement that all' models be infinite is dropped (since this theory is finitely axiomatizable).
In Tarski (1951) is presented a decision method for elementary algebra, which is that part of the theory of real numbers which can be
446
First-order Theories
(
CHAP. 9
formalized as a first-order theory. Roughly speaking, this restricts one
to the portion of the general theory of real numbers which can be formulated and established without the help of any set-theoretical devices. For instance, the variables in elementary algebra always stand
for arbitrary real numbers and cannot be supposed to take values in specific sets (such as the set of integers) of real numbers. From the decidability of elementary algebra Tarski inferred the decidability (via the introduction of a coordinate system) of that part of traditional geometry which can be formalized as a first-order theory. This includes most of elementary geometry in the everyday meaning of the term. In. conclusion we mention that from a result of M. Presburger (1930) metamathematical proofs of consistency and completeness, and a decision
procedure, can be given for the first-order theory obtained from elementary number theory by omitting the formation rules and axioms for . In other words, this theory is the elementary theory of addition for natural numbers.
10. Godel's Theorems The theorems in question are the two main results in Godel's paper of 1931. We shall designate them as "Godel's first theorem" and "Godel's
second theorem." In rough terms, the first theorem (which is often referred to simply as "Godel's theorem") asserts for any formal theory T rich enough to include all the formulas of formalized elementary number theory (that is, all formulas of N) that if it is consistent, then it is (negation) incomplete. Defining a closed formula S of a formal
theory as an undecidable formula if neither S nor its negation is a theorem, the theorem asserts, alternatively, the existence of undecidable
formulas in Z if Z is consistent. Godel's second theorem, which is a. corollary of the other, asserts the impossibility of proving the consistency of Z by methods "formalizable within the theory," where the qualifying
clause in quotation marks has a technical meaning which we shall discuss later.
On account of their great importance for the whole program of metamathematics, it is worthwhile to sketch Godel's original proofs and then outline a later proof of his first theorem and a generalization of il,
based on Church's theorem. To simplify the presentation we shall restrict our attention principally to the theory N. Godel's proof of his first theorem, as he himself pointed out, is modeled on the reasoning involved in the logical antinomy known as the Richard
9.10
I
447
Godel's Theorems
paradox, devised by the French mathematician J. Richard in 1905. To discuss this antinomy, which deals with the notion of finite definability,
we consider the English language with (i) the 26 Latin letters, the comma, and the blank space as alphabet, (ii) a preassigned dictionary, and (iii) a preassigned grammar. By an "expression" in this language we understand any finite sequence of these 28 symbols not beginning with a blank space. The set of expressions is denumerable; an enumeration can be given by, for example, specifying that expression E, precedes expression E; if E; contains fewer symbols than does E, and, if they contain the same number of symbols, then precedence is deter-
mined by lexicographic ordering. Upon striking from the specified enumeration of all expressions those which do not define an arithmetic property of natural numbers, we obtain an enumeration (say, ) of those which do. Then, for arbitrary numbers n and p, E0, E,, E2, one of the following cases must occur:
(i) n possesses the property determined by E, or, more simply, Ep is true for n, in which case we write t= E,(n);
(ii) E, is not true for n, in which case we write -1 t= E,(n). Now consider the expression "the natural number n does not have the property determined by the expression which corresponds to n in the ." The mention in this expression of the enumeration Eo, El, E2, can be replaced by an explicit definition of enumeration Eo, E,, E2, it, and if this replacement is made the result is an expression in terms of the given alphabet which defines a property of natural numbers. Hence, there is a q such that the quoted expression is E0. On the other hand, the same expression may be symbolized by n t= E (n). Thus, there exists a q such that for each n, t= EQ(n) iff --it= EE(n). Setting n = q we obtain the contradiction t= EQ(q) if --i t= EQ(q)
Godel's proof that if N is consistent then it is incomplete may be described (with some oversimplification) as the determination of a statement of N which behaves like the quoted expression above with respect to provability; that is, it has the quality that it is provable if its negation is provable. To this end, Godel created the ingenious device, which we described earlier, of an arithmetization of the metalanguage of N. Then he constructed the crucial sentence to be one which, interpreted by a person who knows the enumeration, asserts its own unprovability.
448
First-order Theories
I
CHAP. 9
The proof of the original form of GSdel's first theorem hinges on the next lemma, for which we introduce the following notation. Relative to any specified Godel numbering, for any n which is the Godel number of a formula, let "e." designate the formula. If it is desirable to indicate that this formula contains a free variable z, we may also write
C. as "e (z).
LEMMA 10.1. There is a Godel numbering of the formal symbols of N such that the predicate A(x, y) defined by
A(x, y) : x is the Godel number of a formula e,(z) and y is the Godel number of a proof of the formula C .(x) is numeralwise expressible in N.
The proof consists of showing that the predicate A (x, y) is primitive recursive and then applying Lemma 9.1.
Let a(x, y) be a particular formula which numeralwise expresses A(x, y) for the Godel numbering employed in the lemma, so that if A(x, y) is true, then I- a(x, y), and if A(x, y) is false, then I- -, a(x, y). f Now consider the formula (1)
-t (3y)a(z, y),
which contains x and no other variable free. Let p be the Godel number
of (1). Then (1) is the same as the formula that we have agreed to Since this formula expresses the metamathematical designate as statement that there is no proof of e .(x), the closed formula (2)
-, (3y)a(p, y),
which is C ,(p), expresses the statement that there is no proof of ep(p). That is, (2) expresses its own unprovability. The formal counterpart of this is the first part of the next theorem.
THEOREM 10.1(Godel's first theorem in the original form). If N is consistent, then C,(p) is unprovable, and if N is w-consistent, then -'e,,(p) is unprovable. Thus, if N is w-consistent, then it is negation incomplete, with C,(p) as an example of an undecidable formula. t Electing as we have to outline Godel's original proof leads to some duplication of results stated in Section 9. Indeed, instead of introducing the predicate A(x, y), we could continue with the predicate T(x, x, y). It is because of this fact that we have chosen the same designation for a formula which numeralwise expresses A(x, y) as we did for a formula which expresses T(x, x, y). Further, we call attention to the fact that formula (1), in the symbolism of Section 9, is -iB(T).
9.10
1
Godel's Theorems
449
To establish the first assertion we assume that N is consistent and that C,(p) is provable, and derive a contradiction. The provability of (2) implies the existence of a proof of it; let k be the Godel number of some proof. Then A(p, k) is true and, in turn, I- a(p, k). Hence, by the predicate, calculus, I- (3y)a(p, y), that is, -, a (p) is provable. This contradicts the assumed consistency of N. To prove the second assertion, assume that N is w-consistent. Then N is consistent, and hence, by the foregoing, C,(p) is unprovable. This implies that no natural number is the Godel number of a proof of en(p); that is, for every natural number n, A(p, n) is false. Hence, Proof.
for every natural number n, I- -, ct(p, n). But then the assumed w-consistency of N implies that -, (y) -, U(p,y) is unprovable. By the predicate calculus it follows that -, C ,(p) is unprovable. Rosser (1936), using a more complicated example of an undecidable formula, proved that consistency alone implies the incompleteness of N. We state this as
THEOREM 10.2 (Rosser's form of Godel's first theorem). If N is consistent, then it is negation incomplete.
It was this form of the theorem that we had in mind when describing the heuristic motivation. We have emphasized the original form of the theorem because the proof is intuitively simpler. The example of an undecidable formula which Rosser's proof employs may be interpreted as asserting that for any proof of it there exists a proof of its negation
with an equal or smaller Godel number. With the help of the same formula the hypothesis of Theorem 9.3 can also be simplified to "If N is consistent."
Next we shall sketch the derivation of Godel's second theorem from Theorem 10.1. We begin by assuming that there exists an informal proof of the consistency of N. If to this we append the proof which we gave-that the unprovability of e ,,(p) follows from the consistency of N-the composite proof will be one of the unprovability of C,(p) from scratch. Upon replacing the symbols and formulas of N in this proof by their Godel numbers, it could be transformed into one, in informal
number theory. Now we ask whether this proof in informal number theory could be formalized in N. If it could, then the formula e ,(p) would itself be the formalized version of the resulting theorem, that is, that G;(p) is unprovable. Thus, a formal proof that G,(p) is unprovable
450
First-order Theories
I
CHAP. 9
would be a formal proof of ep(p). By Theorem 10.1 such a proof cannot exist if N is consistent. That is, if N is consistent, then a formal proof within N having the form of (i) a proof of the consistency of N extended by (ii) a proof of the unprovability of ep(p) if N is consistent, does not exist. By showing that part (ii) does exist, we have a method of showing that part (i) does not; that is, there exists no proof within N of the consistency of. N, if N is consistent.
To make the foregoing argument convincing, the first step is to devise a formula of N which expresses the consistency of N. (This is an easy matter for one who is familiar with the technique of Godel numbering of formulas.) Let us call one such formula, "Consis." The second step is the formalization in N, via the Godel numbering, of the metamathematical proof of "N is consistent implies that Cp(p) is unprovable." [This is a long and tedious affair; an account is given in Hilbert and Bernays (1939).] The result is 1- Consis -a C,(p). Finally, the following metamathematical proof by contradiction is supplied. Suppose that I- Consis. Then, by the statement calculus, we infer that F- e,(p). But this is impossible by Theorem 10.1 if N is consistent. Hence not F- Consis. We state this result as
THEOREM 10.3 (Godel's second theorem). If N is consistent, then there is no proof of its consistency by methods formalizable within the theory. Since we have already discussed the significance of consistency theorems for Hilbert's program of metamathematics, we shall merely add a few further remarks at this point. According to Theorem 10.1, the formulation of elementary number theory that we have given is not adequate to ensure that every formula or its negation can be deduced by explicitly stated rules from explicitly stated axioms. It is natural to ask if this deficiency could not be corrected by extending the set of axioms. For instance, if the formula C,(p) were adjoined as an axiom,
then Theorem 10.1 would have no significance. Godel proved that completeness cannot be achieved in this way. The pertinent result is as follows. So long as the axioms of N are extended by a set of formulas whose Godel numbers constitute a recursive set, the resulting theory is incomplete if it is consistent. As for extensions of N with sets of axioms whose Godel numbers do not form a recursive set, they are unacceptable
9.10
I
Godel's Theorems
451
since there is then no effective procedure for deciding whether a given formula is an axiom. f Theorem 10.1 demonstrates the incompleteness of N in another sense, namely, that there are expressible in it statements which are true on finitary grounds but unprovable formally. The formula e ,(p) serves to bear this out. The consistency of N can be proven using transfinite induction up to a sufficiently great ordinal. This was shown by G. Gentzen (1936, 1938).
We conclude this section with the proof of a generalized form of Godel's first theorem. Results of this sort, which rely on the ChurchTuring thesis, are due to Kleene (1936, 1943). As background we recall that in the proof of Theorem 9.3 there was introduced a formula 03(9) of N corresponding to the predicate (Ey) T(x, x, y) such that h (B(x) iff (Ey) T(x, x, y) is true. The theorem which we shall present is applicable to first-order theories in which there can be found a formula which "expresses" (Ey) T(x, x, y) in essentially this way.
THEOREM 10.4. Let Z be a first-order theory which includes enough of the symbolism of N so there can be found a formula a(g) such that, for each natural number x, (i) if I- B(x), then (By) T(x, x, y) is true, and (ii) if f- -' B(x), then (Ey) T(x, x, y) is false. Then there exists a number q such that (Ey) T(q, q, y) is false and neither B(q) nor -1 B(q) is provable. Proof. With X a first-order theory, there is an effective procedure for listing its proofs. Assuming, as we are, that contains a formula B(9) having properties (i) and (ii), we can set up the following computation procedure. Given x, search in order through the proofs of Z for one of -, B(x) and, if such a proof is found, write 0.
By the Church-Turing thesis there exists a Turing machine M with Gddel number q, let us say, to carry out this procedure. Now apply M to q as argument. Then {- --i B(q) iff M applied to q computes a value. Since by the definition of (Ey) T(q, q, y), M applied to q computes a value if (Ey) T(q, q, y) is true, it follows that (3)
(Ey) T(q, q, y) is true if f- -, B(q).
t Actually axiom sets which are only recursively enumerable are acceptable since it is known that if a theory has a recursively enumerable axiomatization, then it has a recursive axiomatization.
452
First-order Theories
I
CHAP. 9
Now assume that 1- -1 (q). Then, by assumption (ii), (Ey) T(q, q, y) is false, and hence, by (3), -1 03(q) is not provable, contradicting our assumption. So by reductio ad absurdum - (B(q) is not provable. In turn, it follows from (3) by contraposition that (Ey) T(q, q, y) is false. In turn, by assumption (i) 63(q) is not provable.
11. Some Further Remarks about Set Theory It should be clear that Zermelo-Fraenkel set theory (Chapter 7) can be formalized as a first-order theory having only one mathematical constant, namely, the two-place relation symbol E. Indeed, the axioms
and axiom schemas are stated in Chapter 7 in a form which makes their translation into symbols form a routine matter. The von Neumann-
Bernays-Godel theory of sets can also be formalized as a first-order theory with the same relation symbol as its only mathematical constant. This may be done by admitting class variables X, Y, Z, , along with set variables x, y, z, - and including as prime formulas (in addition to
those of the form xCx,xCy, ,yCz),xCX,xCY, ,j'CZ,
Set variables are tacitly assumed to range over individuals of a special kind, called sets, and therefore the corresponding quantifiers, such as (x) and (3x) must be interpreted as abbreviations for (x) (S(x) -, .) and (3x) (S(x) A . . . ), respectively. Here the predicate S can be taken to be defined by S(x) H (3X)(x E X). .
The only existing results pertaining to the consistency of these two theories are of a relative nature. In Godel (1940) it is proved that if von Neumann set theory without the axiom of choice is consistent, then consistency is preserved when the axiom of choice as well as Cantor's generalized continuum hypothesis are adjoined as axioms. This is, in other words, a proof of the relative consistency of the axiom of choice
and the generalized continuum hypothesis with the other axioms of von Neumann's set theory. Results concerning the relative "strengths" of these two theories of sets with their respective axioms of choice neglected have been obtained. In I. L. Novak (1950) and Rosser and Wang (1950) it is proved that von Neumann set theory is relatively consistent to Zermelo-Fraenkel set theory; that is, if Zermelo-Fraenkel set theory is consistent, then so is, the von Neumann theory. In the same
paper Rosser and Wang prove that any theorem of von Neumann set theory which involves only set variables is a theorem of Zermelo-Fraenkel
9.11
I
Some Further Remarks about Set Theory -
453
set theory. Taking these two results into account, Rosser and Wang conclude that the two theories are of "essentially equal strength." If the set theories under consideration are consistent, then they are incomplete. This conclusion is a consequence of Godel's first theorem, since elementary number theory can be derived within each of them. In turn, it follows that neither theory is categorical. This is also a direct
consequence of the existence of an infinite model for each theory. Finally, since elementary arithmetic is essentially undecidable and can be developed in both theories, if they are consistent then they too are essentially undecidable. The von Neumann theory, being finitely axiomatizable, thereby establishes the existence of essentially undecidable and finitely axiomatizable theories. In conclusion we shall discuss Skolem's paradox for Zermelo-Fraenkel
set theory. We choose this theory of sets because we have given its axioms; the paradox applies equally well to von Neumann set theory. From the assumption that Zermelo-Fracnkel set theory, which we shall symbolize Cam, is consistent, it follows, using results appearing in Section 6, that ( has a model whose domain D is a countable set. From the ob-
servations made at the end of Section 7.2, the axioms rule out the possibility that D is finite, and consequently D is denumerable. But one axiom of ( postulates the existence of an infinite set and another the existence corresponding to any set, of a set which includes all subsets of that set. From Cantor's theorem there follows the existence of an uncountable set of sets. In summary, e is a theory which, on the one hand, has a denumerable model and, on the other hand, contains a theorem which asserts the existence of uncountably many sets. This is Skolem's paradox (1922-1923). An explanation of sorts can be given. We begin with the observation that within e5 one can define only those subsets of a given set which can be constructed by operations or singled out in the set by properties (in other words, predicates). Now the basic operations for set formations, together with the processes for constructing predicates which are pro-
vided by the axioms, are countable in number; hence their iteration provides the means for defining only a denumerable collection of subsets of a given set. Thus it appears possible to have a denumerable model of Cam.
Now suppose that 9)1 is such a model and that D is its domain. Further,
let x be the set of all subsets of some infinite set defined in e. Since an enumeration of the elements of D can be given, there is an enumeration f of those elements of D which represent the elements of x within the
454
First-order Theories
I
CHAP. 9
model 9J2. One escape from an outright contradiction at this point is for it to turn out that the enumeration f, which (being a function) is a set, is not definable within e. That is, we are suggesting the possibility of the set x of all subsets of a given infinite set definable within t being denumerable from without a while being uncountable within e, because no enumeration is among the sets definable with S. If we accept this "explanation" of Skolem's paradox, then we are faced with the following alternatives. One is that any axiomatization of set theory as a first-order theory must fail to capture fully the notions of the set of subsets of a given set, one-to-one correspondence, and uncountability. Consequently, these concepts must be given a prior status independent of axiomatic theories. If this conclusion is disagreeable (as it may well be in view of the classical set-theoretic paradoxes), then we must be content with the set theory which can be explicitly characterized within the framework of first-order theories. This brings us to the second alternative. Set-theoretic notions such as uncountability must be accepted as relative in nature; a set which is uncountable in a given
axiomatization may prove to be denumerable in another. In brief, such a notion as absolute uncountability is nonexistent. This relativization of set theory was proposed by Skolem. Finally, we mention another explanation of Skolem's paradox: There is no collection of objects which satisfies the axioms of e. This would imply the inconsistency of S and hence the existence of a contradiction within S. As yet, one has not been found.
BIBLIOGRAPHICAL NOTES Sections 1-3. There exist a great variety of formulations of the statement calculus as an axiomatic theory. Those of the first-order predicate calculus are not as numerous. In this connection A. Church (1956) should be consulted. For extended treatments of the statement calculus and the predicate calculus there are several excellent texts available. In addition to the book by Church just cited we mention those by S. C. Kleene (1952), Rosser (1953), and Hilbert and Ackermann (1950). Even better than the English edition of Hilbert and Ackermann is the third German edition (1949). Section 4. A concise description of first-order theories appears in A. Tarski, A. Mostowski, and R. M. Robinson (1953). A variety of algebraic systems, formulated as first-order theories, appears in A. Robinson (1951) and A. Robinson (1956). Section 5. The magnum opus of the Hilbert school of formalism is Hilbert and Bernays (1934, 1939). Less comprehensive but adequate accounts of meta-
Bibliographical Notes
455
mathematics appear in Kleene (1952) and A. A. Fraenkel and Y. Bar-Hillel (1958).
Section 6. This section is essentially an account of Henkin's paper of 1949. Many of the definitions and theorems also appear in Church (1956). For an account of some applications of Henkin's principal theorem to problems in modern algebra, see Henkin (1953). Section 7. Church (1956) and Kleene (1952) were used as references for this section. Although the idea of nonstandard models of a theory originated with Skolem, Henkin was the first to investigate them in a systematic way. For this see Henkin (1947) and (1950) or the account in Church (1956). Sections 8-9. Self-contained and complete accounts of the material discussed in these sections are to be found in Kleene (1952) and, from a somewhat
different viewpoint, in M. Davis (1958). An informative and nontechnical account of Turing machines and recursive functions appears in H. Rogers, Jr. (1959). For proofs of the undecidability of a variety of algebraic theories, see Tarski, Mostowski, and Robinson (1953).
Section 10. A complete account of Godel's theorems and their consequences is given in Kleene (1952). Another high-level development is given in Mostowski (1952). For semitechnical accounts, see Rosser (1939) and G. Kreisel (1952-1953). For a semipopular account, see E. Nagel and J. Newman (1958). Our presentation draws on that which appears in mimeographed notes entitled Sets, Logic, and Mathematical Foundations by Kleene.
REFERENCES ACKERMANN, W.
1937. "Die Widerspruchsfreiheit der allgemeinen Mengenlehre," Math. Ann., 114: 305-315. BACHMANN, H.
1955. Transfinite Zahlen. Springer, Berlin-Gottingen-Heidelberg. BERNAYS, P.
1937-1954. "A system of axiomatic set theory," I-VII. J. Symbolic Logic, 1, 2 (1937): 65-77; II, 6 (1941): 1-17; 111, 7 (1942): 65-89; IV, 7 (1942): 133-145; V, 8 (1943): 89-106; VI, 13 (1948): 65-79; VII, 19 (1954): 81-96.
AND FRAENKEL, A. A. 1958. Axiomatic Set Theory. North-Holland, Amsterdam. BIRKHOFF, G. 1948. Lattice Theory. (2nd ed.). Amer. Math. Soc. Coll. Publ., Vol. 25, New York. BOURBAKI, N. 1939. Theorie des ensembles. Hermann, Paris.
BURALI-FORTI, C.
1897. "Una questione sui numeri transfiniti," Rend. Circ. Mat. Palermo, 11: 154-164 and 260. BYRNE, L. 1946. "Two brief formulations of Boolean algebras," Bull. Amer. Math. Soc.,
52: 269-272. CANTOR, G. 1915. Contributions to the Founding of the Theory of Transfinite Numbers. English tr.
by P. E. B. Jourdain. Chicago, 1915. Reprinted by Dover, New York. 1932. Cesammelte Abhandlungen mathematischen and philosophischen Inhalts. E.
Zermelo (ed.), Springer, Berlin. CHEVALLEY, C. 1956. Fundamental Concepts of Algebra. Academic, New York. 457
458
References
CHURCH, A.
1936. "An unsolvable problem of number theory," Amer. J. Math., 58: 345-363. 1936a. "A note on the Entscheidungsproblem," J. Symbolic Logic., 1: 40-41 and 101-102 (a correction). 1956. Introduction to Mathematical Logic, Vol. 1, Princeton University Press, Princeton, N.J.
COPT, I. M. 1954. Symbolic Logic. Macmillan, New York.
DAVIS, M. 1958. Computability and Unsolvability. McGraw-Hill, New York.
DEDEKIND, R. 1888. Was sind and was sollen die Zahlen? Vieweg, Braunschweig. 1932. Gesammelte mathematische Werke. Vieweg, Braunschweig.
EXNER, R. M. AND ROSSKOPF, M. F. 1959. Logic in Elementary Mathematics. McGraw-Hill, New York.
FRAENKEL, A. A. 1922. "Zu den Grundlagen der Cantor-Zermeloschen Mengenlehre," Math. Ann., 86: 230-237. 1961. Abstract Set Theory. North-Holland, Amsterdam. AND BAR-HILLEL, Y. 1958. Foundations of Set Theory. North-Holland, Amsterdam.
FREGE, G. 1884. Die Grundlagen der Arithmetik, Eine logischmathematische Untersuchung fiber
der Begriff der Zahl. Breslau. Reprinted Marcus, Breslau, 1934.
GENTZEN, C.
1936. "Die Widerspruchsfrieheit der reinen Zahlentheorie," Math. Ann., 122: 493-565.
1938. "Die gegenwartige Lage in der mathematischen Grundlagenforschung," in Forschungen zur Logik and zur Grundlegung der exakten Wissenschaften
(n.s.), No. 4, Hirzel, Leipzig, pp. 5-18. GODEL, K. 1930. "Die Volistandigkeit der Axiome des logischen Funktionenkalkiils," Monatsh. Math. Physik, 37: 349-360.
1931. "uber formal unentscheidbare Satze der Principia Mathematica and verwandter Systeme. I," Monatsh. Math. Physik, 173-198.
References
459
1940. The Consistency of the Axiom of Choice and of the Generalized Continuum-
hypothesis with the Axioms of Set Theory. Annals of Mathematics Studies, No. 3. Princeton, N.J. (3rd printing 1953).
HALL, JR., M. 1959. The Theory of Groups. Macmillan, New York.
HALMOS, P. R. 1956. "The basic concepts of algebraic logic," Amer. Math. Monthly, 63: 363-387. 1960. Naive Set Theory. Van Nostrand, Princeton, N.J.
HALPERN, J. D. 1961. "The independence of the axiom of choice from the Boolean prime ideal theorem," Notices, Amer. Math. Soc., 8: 279-280.
HAMILTON, N. T. AND LANDIN, J. 1961. Set Theory. Allyn and Bacon, Boston, 1961. HARTOGS, F.
1914. "Uber das Problem der Wohhordnung," Math. Ann., 76: 438-443. HASENJAEGER, G.
1953. "Eine Bemerkung zu Henkin's Beweis fur die Vollstandigkeit des Pradikatenkalkuls der ersten Stufe," J. Symbolic Logic, 18: 42-48. HENKIN, L.
1947. "The completeness of formal systems," unpublished. Ph.D. thesis, Princeton University. 1949. "The completeness of the first-order functional calculus," J. Symbolic Logic, 14: 159-166. 1950. "Completeness in the theory of types," J. Symbolic Logic, 15: 81-91. 1953. "Some interconnections between modern algebra and mathematical logic," Trans. Amer. Math. Soc., 74: 410-427. 1955. "On a theorem of Vaught," J. Symbolic Logic, 20: 92-93. HERBRAND, J. 1930. Recherches sur la Thiorie de la Demonstration. Travaux de la Societ6 des
Sciences et des Lettres de Varsovie, Classe III, Sciences math6matiques et physiques, No. 33. HILBERT, D. 1899. Grundlagen der Geometric. 7th ed. (1930), Teubner, Leipzig and Berlin. English tr. by E. J. Townsend, The Foundations of Geometry, Open Court, Chicago, 1902; La Salle, Ind., 1962.
460
References
AND ACKERMANN, W. 1928. Grundzgge der theoretischen Logik. Springer, Berlin. 2nd ed., 1938. Re.
printed by Dover, New York, 1946. 3rd ed., Springer-Berlin-GottingenHeidelberg, 1949. English tr. of the 2nd ed., Principles of Mathematical Logic.
Chelsea, New York, 1950.
HILBERT, D. AND BERNAYS, P. 1934. Grundlagen der Mathematik, Vol. 1. Springer, Berlin. Reprinted by J. W. Edwards, Ann Arbor, Mich., 1944. 1939. Grundlagen der Mathematik, Vol. 2. Springer, Berlin. Reprinted by J. W. Edwards, Ann Arbor, Mich., 1944. HUNTINGTON, E. V. 1904. "Sets of independent postulates for the algebra of logic," Trans. Amer. Math. Soc., 5: 288-309.
RLEENE, S. C. 1936. "General recursive functions of natural numbers," Math. Ann., 112: 727-742. 1943. "Recursive predicates and quantifiers," Trans. Amer. Math. Soc., 53: 41-73. 1952. Introduction to Metamathematics. Van Nostrand, Princeton, N.J.
KREISEL, O.
1952-1953. "The diagonal method in formalized arithmetic," British J. Philos. Sci., 3: 364-374.
KUROSH, A. O. 1955. The Theory of Groups. Chelsea, New York. LANDAU, E.
1930. Grundlagen der Analysis. Akademische Verlagsgesellschraft M.B.H., Leipzig. LEDERMANN, W. 1953. Introduction to the Theory of Finite Groups. Interscience, New York.
Los, J. 1954. "On the categoricity in power of elementary deductive systems," Colloq. Math., 3: 58-62. AND
1954-1955. "Effectiveness of the representation theory for Boolean algebras," Fund. Math., 41: 49-56.
References
461
MCCOY, N. H. 1960. Introduction to Modern Algebra. Allyn and Bacon, Boston.
MCKINSEY, J. C. C. 1935. "On the independence of undefined ideas," Bull. Amer. Math. Soc., 41: 291-297.
MCSHANE, E. J. AND BOTTS, T. A. 1959. Real Analysis. Van Nostrand, Princeton, N.J. MIRIMANOFF, D. 1917. "Les antinomies de Russell et de Burali-Forti et le probleme fundamental de la th6orie des ensembles," Enseignement Math., 19: 37-52.
MONTAGUE, R. AND HENKIN, L.
1956. "On the definition of `formal deduction,' " J. Symbolic Logic, 21: 129-136. MORLEY, M. D. 1962. "Categoricity in power," Notices, Amer. Math. Soc., 9: 281. MOSTOWSKI, A. 1952. Sentences Undecidable in Formalized Arithmetic. An Exposition of the Theory
of Kurt Godel. North-Holland, Amsterdam.
NAGEL, E. AND NEWMAN, J. 1958. Godel's Proof. New York University Press, New York. VON NEUMANN, J.
1925. "Eine Axiomatisierung der Mengenlehre," J. Reine Angew. Math., 154: 219-240; 155 (1926) : 128 (corrections). 1928. "Die Axiomatisierung der Mengenlehre," Math. Z., 27: 669-752..
1928a. "Uber die Definition durch transfinite Induktion and verwandte Fragen der allgemeinen Mengenlehre," Math. Ann., 99: 373-391. 1929. "Uber.eine Widerspruchsfreiheitsfrage in der axiomatischen Mengenlehre," J. Reine Angew. Math., 160: 227-241. NOVAK, I. L.
1950. "Construction of models for consistent systems," Fund. Math., 37: 87-110. PEANO, G. 1889. Arithmetices Principia. Bocca, Turin.
462
References
POST, EMIL L. 1921. "Introduction to a general theory of elementary propositions," Amer. J. Math., 43: 163-185. PRESBURGER, M.
1930. "Uber die Vollstandigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt," in C.R.I. Congres des Math. des Pays, pp. 92-101, 395. Slaves, Warsaw. ROBINSON, A. 1951. On the Metamathematics of Algebra. North-Holland, Amsterdam. 1956. Complete Theories. North-Holland, Amsterdam.
ROBINSON, R. M.
1937. "The theory of classes. A modification of von Neumann's system," J. Symbolic Logic, 2: 29-36.
ROGERS, JR., H. 1959. "The present theory of Turing machine computability," J. Soc. Indust. Appl. Math., 7: 114-130. ROSENBLOOM, P. C. 1950. The Elements of Mathematical Logic. Dover, New York.
ROSSER, J. B. 1936. "Extensions of some theorems of Godel and Church," J. Symbolic Logic,
1: 87-91. 1939. "An informal exposition of proofs of Godel's theorem and Church's theorem," J. Symbolic Logic, 4: 53-60. 1953. Logic for Mathematicians. McGraw-Hill, New York. AND WANG, H.
1950. "Non-standard models for formal logics," J. Symbolic Logic,
15:
113-129.
RUSSELL, B. 1902. "Theorie ggnerale des series bien-ordonn6es," Rev. Math. Pores Appl.,
8: 12-43. 1908. "Mathematical logic as based on the theory of types," Amer. J. Math., 30: 222-262. SIKORSKI, R. 1960. Boolean Algebras. Springer, Berlin-Gottingen-Heidelberg. SKOLEM, T. 1922. "Eire Bemerkungen zur axiomatischen Begrundung der Mengenlehre,"
References
463
Wiss. Vortrage gehalten auf dem 5 Kongress der skandinaa. Math. in Helsingfor
1923, pp. 217-232.
1929. "Uber einige Grundlagen fragen der Mathematik," Skrifter Utgitt au det Norske Videnskaps-Akademi i Oslo, No. 4. STABLER, E. R.
1953. An Introduction to Mathematical Thought. Addison-Wesley, Reading, Mass.
STONE, M. H. 1936. "The theory of representations for Boolean algebras," Trans. Amer. Math. Soc., 40: 37-111.
1937. "Applications of the theory of Boolean rings to general topology," Trans. Amer. Math. Soc., 41: 321-364. 1938. "The representation of Boolean algebras," Bull. Amer. Math. Soc., 44: 807-816.
SUPPES, P. 1957. Introduction to Logic. Van Nostrand, Princeton, N.J. 1960. Axiomatic Set Theory. Van Nostrand, Princeton, N.J. TARSKI, A.
qui equivalent AL 1'axiome de choix," Fund. 1923. "Sur quelques Math., 5: 147-154. 1924. "Sur les ensembles finis," Fund. Math., 6: 45-95. 1941. Introduction to Logic, Oxford University Press, New York.
1951. (With the assistance of J. C. C. McKinsey.) A Decision Method for Elementary Algebra and Geometry. 2nd ed., University of California Press, Berkeley and Los Angeles. MOSTOWSKI, A., AND ROBINSON, R. M. 1953. Undecidable Theories. North-Holland, Amsterdam. TURING, A. M. 1936-1937. "On computable numbers, with an application to the Entscheidungsproblem," Roc. London Math. Soc., Ser. 2, 42: 230-265; 43 (1937): 544-546 (correction).
VAUGHT, R. L. 1954. "Applications of the Lowenheim-Skolem-Tarski theorem to problems of completeness and decidability," Indag. Math., 16: 467-472. WANG, H.
1957. "The axiomatization of arithmetic," J. Symbolic Logic, 22: 145-158.
464
References
WHITEHEAD, A. N. AND RUSSELL, B. 1910-1913. Principia Mathematica, Cambridge University Press, Cambridge.
WILDER, R. L. 1952. Introduction to the Foundations of Mathematics. Wiley, New York.
ZERMELO, E. 1904. "Beweis, dass jede Menge wohlgeordnet werden kann." Math. Ann., 59: 514-516.
1908. "Untersuchungen fiber die Grundlagen der Mengenlehre. I," Math. Ann., 65: 261-281.
1930. "Uber Grenzzahlen and Mengenbereiche," Fund. Math., 16: 29-47. ZORN, M. 1935. "A remark on method in transfinite algebra," Bull. Amer. Math. Soc., 41: 667-670. 1944. "Idempotency of infinite cardinals," Univ. California Publ. Math. (n.s.),
VoL 2, No. 1, pp. 9-12.
SYMBOLS AND NOTATIONS Page
Symbol
2
9, $
2
Symbol
f:X--Y Yx
2
ix
2 2 2
f/A
2
xCA x1, x2, ,xxCA
4
xQA
7+
(xl) X2,
,
nx XA
37
X"
gof
4
f-1 Va
37 38
5
na
41 43 44 45 47
4
ACB
ACB
10 10
(xCAIP(x)}
35 35 36 36 37
7 8 9
{xIP(x) }
Page
N X(A11C1) lub A g1bA D,x
11
ACB
53 53 59 79
6'(A)
11
7
A U B
12
card A
80 80
B2A
0
13
AB
13 13
940
85
13 24 25 26
14
92
x ft, Y
98 99
a*
101
s(a)
R.
106 120
1x
26 26 26
9(A)
121
[A]
26
x Nay
133 134 138 139
A(B X-A
A+B (x, y) (zi, xg,
b, .x x Y
, xn)
31
IP
31 35
A
w
[x];
x'- y [x],
81
82
Symbols and Notations
466
Symbol
Page
Symbol
Page
x icy
146 149
x+
298
ord a
311
162 162 162 162 162
Zy (G:H)
331
[x]t
-, p PAQ
PvQ
P--+ Q P+-+ Q T F
164 164
I- A
Pi- A D(r, A)
A,,A2, ,A. B
339 377
377 377 379
V A
392
KA
172
N
397
A eq B
173
432
A,, As, ... , A. K B (Vx)
180 195 196 265
DT ME T(m, x, y)
(3x)
B/I
(Ey)
437 438
A(x,y)
448
AUTHOR INDEX Ackermann, W., 297, 454
Bachmann, H., 320 Bar-Hillel, Y., 290, 320, 455 Bernays, P., 318-320, 450, 454 Bernstein, F., 81 Birkhoff, G., 55, 287 Bolyai, J., 222-223 Borel, E., 111 Botts, T. A., 372 Bourbaki, N., 113 Cantor, G., 1-6, 55, 86, 88, 92, 98, 128, 159, 315
Cayley, A., 223 Chevalley, C., 372
Copi, I. M., 220 Church, A., 429, 439, 441-443, 454-455 Davis, M., 437, 455 Dedckind, R., 59, 87, 129, 159
Kreisel, G., 455 Kurosh, A. G., 372
Landau, E., 129, 159 Landin, J., 55 Ledermann, W., 372 Lobachevsky, N., 222-223 Loi, J., 286, 422 McCoy, N. H., 159 McKinsey, J. C. C., 244 McShane, E. J., 372 Mirimanoff, D., 304 Montague, R., 392 Morley, M. D., 422 Mostowski, A., 454-455 Nagel, E., 455 Neumann, J. von, 80, 289, 303-306, 318 Newman, J., 455 Novak, I. L., 452
Euclid, 221-222
Exner, R. M., 220 Fraenkel, A., 129, 289, 290, 303, 320, 455 Frege, G., 80, 160
Padoa, A., 244 Peano, G., 59, 228 Pieri, M., 226, 243 Post, E., 382, 429 Presburger, M., 446
Gentzen, G., 451 Godel, K., 318-320, 393, 405, 415, 427-428, 441, 443, 447, 450
Hall, Jr., M., 372 Halmos, P. R., 287, 292, 320 Halpern, J. D., 286 Hamilton, N. T., 55 Hartogs, F., 129 Hasenjaegor, G., 411 Henkin, L., 392, 410, 425, 426, 445, 455 Herbrand, J., 391, 429 Hilbert D., 226, 237, 243, 403-405, 450, 454 Huntington, E. V., 250 Kleene, S., 172, 429, 451, 454-455 Klein, F., 223
Quine, W. V., 320
Richard, J., 447 Robinson, A., 454 Robinson, R. M., 306, 313 Rogers, Jr., H., 455 Rosenbloom, P. C., 287-288 Rosser, J. B., 90, 129, 228, 449, 452-455 Rosskopf, M. F., 220 Russell, B., 9, 80, 160, 404 Ryll-Nardzewski, C., 286 Schroder, E., 81 Sierpinski, W., 129 Sikorski, R., 287-288 467
468 Skolem, T., 289, 303, 455 Stabler, E. R., 247, 287 Stone, M., 271, 284, 287 Suppes, P., 220, 290, 320
Author Ind Vaught, R. L., 422, 424 Wang, H., 129, 452-453 Whitehead, A., 160, 404 Wilder, R. L., 247
Tarski, A., 87, 125-126, 129, 220, 247, 320,
443, 445-446, 454-455 Turing, A., 429-430, 443
Zermelo, E., 101, 111, 289, 298, 305, 404 Zorn, M., 121, 129
SUBJECT INDEX Abelian group, 204 Absolute complement, 13 Absolute value, 142, 358 Absorption laws, 20 Affine geometry, 230, 233 Aleph, 124 Algebra, 323; Lindenbaum, 274; of sets, 16, 248; quotient, 266, 324 Antecedent, 161 Antisymmetric relation, 48 Applied predicate calculus, 393
Binary relation, 25 Bolyai-Lobachevsky geometry, 222-223, 237 Boolean algebras: atom of, 267; atomic, 271;
Archimedean property: for ordered fields, 365; for rational numbers, 141; for real
principle of duality for, 251; quotient alge-
numbers, 152 Archimedean-ordered field, 366 Assignment, 206 Associative law, 73 Associative operation, 73 Atom, 267 Atomic Boolean algebra, 271 Axiom: of choice, 92, 112 if., 302; of extension, 291; of infinity, 298; of pairing, 292; of power set, 296; of regularity, 305; of replacement, 303; of subsets, 291; of union, 295
Axiom schema, 291, 376; of replacement, 303; of restriction, 305; of subsets, 291 Axioms: dependent set, 243; independent set, 243
Axioms for affine geometry, 230 Axioms for Boolean algebras, 249, 255 Axioms for densely ordered sets, 423 Axioms for fields, 348 Axioms for groups, 229, 329-331, 397 Axioms for integral domains, 348 Axioms for natural numbers, 58-59, 397 Axioms for ordered integral domains, 357 Axioms for partially ordered sets, 202 Axioms for predicate calculi, 389-390 Axioms for rings, 346 Axioms for set theory, 291-303 Axioms for statement calculi, 376
Biconditional, 162; truth table for, 165 binary operation, 37
complete, 271; congruence relation for, 259-262, proper, 262; filter of, 280, maximal, 280, principal, 280; formulation of, 249, 255; free, 276; homomorphic image, 263; homomorphism of, 263-266, kernel of, 265; ideal of, 264; maximal, 269, proper, 264, unit, 264, zero, 264; isomorphic im-
age of, 263; ordering relation for, 252; bra of, 266; representation of, 267-272; two-valued homomorphism of, 282; unit element of, 249; zero element of, 249 Boolean logic, 280 Boolean ring, 350 Burali-Forti's paradox, 128, 310, 319 Cancellation property, 132 Canonical mapping, 40, 343 Cantor's paradox, 128, 317-318 Cantor's theorem, 86 Cardinality of a model, 420 Cardinal numbers, 80, 317; addition of, 95; exponentiation for, 97; finite, 83; infinite, 85; multiplication of, 95; natural numbers as, 83; ordering of, 82, 120, 317
Cartesian product, 26, 297; of a family of sets, 47
Categoricity in power, 422 Cauchy convergence principle, 156 Cauchy sequences of rational numbers, 143;
addition of, 144; multiplication of, 144; positive, 146 Cauchy sequences of real numbers, 154 Chain, 50
Characteristic function, 37; of a predicate, 427
Choice function, 113 Church's theorem, 443 Church-Turing thesis, 429, 435 Class theorem, 319 Classes, 319; proper, 319; residue, 31 Closed formula, 392 ALA
Subject Index
470 Collection of sets, 5; disjoint, 13
Commutative group, 204, 331; simply ordered, 235 Commutative law, 78 Commutative operation, 78 Complement, 249; absolute, 13; relative, 13, 295
Complete Boolean algebra, 271 Complete ordered field, 367 Complete set, 300 Completeness theorem, 414; for the predicate calculus, 393, 415, 417; for the statement calculus, 381 Composite, 73 Composite function, 38 Composite sentence, 161 Composition of functions, 38, 75 Computable function, 429 Computation problem, 427 Conditional, 161; truth table for, 165 Congruence relations: on a Boolean algebra,
216 ff.; on a group, 341-342; on an algebra, 323 Conjunction, 161, 391; truth table for, 165 Consequence, 180, 215 Consequent, 161 Consistent theory, 2 36 Constant: individual, 193, 388; logical, 395; mathematical, 395 Continuum hypothesis, 94; generalized, 121 Continuum problem, 94 Contradiction, 188 Converse relation, 49 Coset, 339 Coset decomposition, 339 if. Countable set, 87 Cover, 50 Cyclic group, 335
Decidable predicate, 408 Decidable theories, 408, 445 if. Decision problem: for a predicate, 408; for a theory, 408 Decision procedure, 408 Deduction from assumptions, 377, 390 Deduction theorem: for the predicate calculus, 391; for the statement calculus, 378 Defining property, 7 Definition by induction, 72, 76 Definition by transfinite induction, 103 Demonstration, 377, 390 DeMorgan laws; 20, 46
Dense chain, 101 Densely ordered set, 423 Denumerable set, 87 Difference group, 343 Difference ring, 354 Disjoint collection of sets, 13 Disjoint sets, 13 Disjunction, 161; truth table, 165 Division algorithm, 70 Division ring, 348 Domain, 205; of a relation, 26 Domination, 81 D-sequence, 399 Dual, 18 Dual ideal, 280 Dyadic expansion, 93
Effective procedure, 374 Effectively calculable function, 428 Effectively decidable predicate, 408 Effectively decidable theory, 408 Elementary number theory, 397 Elementary theory of groups, 397 Empty set, 11, 294. Enumeration, 87 Epsilon number, 315 Equivalence class, 30 Equivalence relation, 29 Existential quantifier, 196 Expression, 430 Family, 45
Field, 348; Archimedean-ordered, 366; of quotients, 363; ordered, 364 Filter, 280 Finite cardinal, 83 Finite set, 83, 301 First coordinate, 24 First-order theory, 394; axiomatizable, 444; essentially undecidable, 444; finitely axiomatizable, 444; formula of, 395-396; interpretation of, 399; logical axioms of, 396; logical constants of, 395; mathematical axioms of, 395; mathematical constants of, 395; terms of, 395 Formal proof, 224, 377, 390 Formulas (of a first-order theory), 395-396; consequence of a set of, 401; model of, 400-
401; satisfaction by a D-sequence, 399; satisfiable, 410, 415; satisfiable in a domain, 401; true, 400; valid, 401; valid in a domain, 401
Subject Index Formulas (of the predicate calculus), 201, 388; closed, 392; composite, 201; consequence of a set of, 215, 401; model of, 400;
prime, 200, 388; substitution in, 209, admissible, 212; true in an interpretation, 400; valid, 208, 401; valid in a domain, 208, 401
Formulas (of the statement calculus), 169, 375-376; arithmetical representation of, 178; composite, 169; consequence of a set of, 180; consistent set of, 384; deducible
from assumptions, 377; denial of, 176; equivalent, 173; inconsistent set of, 384; maximal consistent set of, 386; prime, 169; satisfiable set of, 188, 384; string of, 182; valid (tautological), 172 Free Boolean algebra, 276 Function, 34; argument of, 34; characteristic, 37; choice, 113; composite, 38; composition, 39, 75; computable, 429; effectively calculable, 428; extension of, 36; identity, 36; inverse, 41; X-definable, 429; logical, 205; number-theoretic, 75; one-to-one, 36;
on X into Y, 35; on X onto Y, 35; order preserving, 52; partial, 429; partial recursive, 434-435; partially computable, 433; primitive recursive, 75; recursive, 434; restriction of, 36; successor, 57; total, 429; truth, 170 Fundamental theorem of arithmetic, 71 Generalized continuum hypothesis, 121 Godel numbers, 428, 436 Gi del's completeness theorem for the predicate calculus, 393 Gddel's first theorem, 446, 448-449 G3del's second theorem, 446, 449-451 Graph of a relation, 27 Greatest lower bound, 53 Greatest member, 53 Group, 329 ff.; Abelian, 204, 331; commutative, 331; cyclic, 335; difference, 343; homomorphic image of, 343; homomorphism of, 343; infinite, 331; isomorphic, 335; of one-to-one-transformations, 331; order of, 331; quotient, 342; transformation, 337 Group theory, 229, 232, 397
Hartog's theorem, 125 Hausdorff's maximal principle.. 116 Hilbert's axiom, 118
471
Homomorphism: of Boolean algebras, 263; of groups, 343; kernel of, 265, 344; of rings, 355; two-valued, 282
Ideal: dual, 280; of a Boolean algebra, 264; of a ring, 353 Idempotence: of intersection, 19; of union, 19 Identity function, 36 Identity relation, 26 Image, 34
Inclusion, 9-10; proper, 10 Inconsistent theory, 236 Independent primitive term, 244 Index, 45, 339 Index set, 45 Indexed set, 45 Individual constant; 193, 388 Individual symbol, 388 Individual variable, 193, 388; bound occurrence of, 203, 389; conditional interpretation of, 392; free occurrence of, 203, 389; generality interpretation of, 392; substitution for, 209 Individuals, 290 Induction: definition by strong, 77; definition by weak, 72; proof by strong, 70; proof by weak, 70; see also Transfinite induction Infinite cardinal, 85 Infinite set, 85, 301 Infinum, 53 Informal axiomatic theories, 227; categorical, 241; consequence of, 242; consistent, 236; deductively complete, 239; definitions of, 233 ff.; formulation of, 242; inconsistent, 236; interpretation of, 231; language of, 230; model of, 231; negation complete, 239; representation problem for, 245; representation theorems for, 245; sentence of, 230; statement of, 230 Informal theories, 227 Injection mapping, 36 Integer, 134; negative of, 135; positive, 134 Integral domain, 348: ordered, 357 Integral rational number, 139 Integral system, 58 if. Interpretation, 231; of first-order theory, 399 Intersection, 13, 294; of a collection of sets, 44 Invalid argument, 187 Inverse function, 41 Inverse image, 42 Inversion of functions, 41 Invertible element, 326
472
Irreflexive relation, 49 Isomorphism, 240-241; of Boolean algebras, 263; of groups, 335; of integral systems, 59, 68; of ordered domains, 359; of partially ordered sets, 52; of rings, 355 Kernel, 265, 364
a-definable functions, 429 Lattice, 253-254 Least member, 53 Least upper bound, 53 Left coset, 339; decomposition, 339 Limit, 155 Lindenbaum algebra, 274 Logical constant, 395 Logical function, 205 Lower bound, 53; greatest, 53
Mapping: see Function Mathematical constant, 395 Matrix, 348 Maximal member, 53 Membership, 4, 9, 289 Metalanguage, 402 Metamathematics, 401 if. Metatheorem, 402 Minimal member, 53 Model, 231-236, 273, 297, 401; cardinality of, 420; nonstandard, 426; standard, 436 Modus ponens, 185, 376, 378, 390
n-ary operation, 37 n-ary relation, 25 Natural mapping, 40, 343 Natural numbers, 299 ff.; as cardinal numbers, 83; definition of, 299; Peano's axioms for, 59; recursive set of, 435; recursively enumerable set of, 435; see also Natural number sequence Natural number sequence, 57 ff.; definition of, 61; ordering relation for, 62; addition in, 63; multiplication in, 66 Negation, 161; truth table for, 165 von Neumann set theory, 318, 452-453 Neutral element, 324 Nonstandard model, 426 Normal subgroup, 342 n-place operation symbol, 395 n-place predicate symbol, 388, 395
Subject Index Number-theoretic function, 75; computation problem for, 427; numeralwise representable, 441
Number-theoretic predicate, 427 Numeral, 440 Numeralwise-expressible predicate, 440 Object language, 402 w-consistency, 441 One-to-one correspondence, 36 One-to-one function, 36
Operation: associative, 73; binary, 37; com-
mutative, 78; induced, 323; n-ary, 37; neutral element of, 324; ternary, 73; unary, 37
Operation symbol, 395 Operators, 322 Order: of a group, 331; of a group element, 336
Ordered field, 364; rational subfield of, 365; complete, 367 Ordered integral domain, 357; absolute value in, 358; order-isomorphic, 359; order-isomorphism of, 359 Ordered n-tuple, 25 Ordered pair, 24, 296 Ordered set: partially, 50; simply, 50 Order-isomorphic domains, 359 Order-preserving function, 52 Order types, 99; addition of, 100; multiplication of, 100 Ordinal numbers, 105 if., 307 ff.; exponentiation of, 315; limit, 109; natural numbers as, 105; of the first kind, 107; of the second kind, 107; ordering of, 105, 309; product of, 107, 314; series of, 315; sum of, 107, 313; transfinite, 105, 312 Ordinally similar chains, 98
Padoa's method, 244 Pair, 293; ordered, 24, 296 Paradox: Burali-Forti's, 128, 310, 319; Cantor's, 128, 317-318; Richard's, 446-447; Russell's, 9, 127, 291; Skolem's, 453-454 Partial function, 429 Partial ordering relation, 48 Partial recursive function, 434-435 Partially computable function, 433 Partially ordered set, 50, 202; dual of, 54; product of, 54; self-dual, 54 Partition, 13
Subject Index
473
Peano's axioms, 58-59, 69, 228, 231, 299, 397 Power set, 11, 296
Quotient set, 31
Predecessor, 76
Range of a relation, 26 Rational number, 38; integral, 139; positive,
Predicate: decision problem for, 408; effectively decidable, 408; n-place, 194, 200; number theoretic, 427; numeralwise expressible, 440; primitive recursive, 435; recursive, 435; recursively solvable, 435; recursively unsolvable, 435; undecidable, 408
Predicate calculus of first order, 200 if., 387 ff.; applied, 393; axioms for, 389; completeness theorem for, 393, 415, 417;
deduction theorem for, 391; pure, 393; rules of inference for, 390; with equality, 389 Predicate calculus of second order, 425
Predicate letter, 200 Predicate symbol, 388, 395 Preordering relation, 49 Prime formula: of first-order theory, 395; of predicate calculus, 200, 388; of statement calculus, 169 Prime sentence, 161 Primitive recursion, 74 Primitive recursive derivation, 75 Primitive recursive function, 75, 434 Primitive recursive predicate, 435 Principle: of abstraction, 6; of definition by
induction, 72-78; of definition by transfinite induction, 103; of duality, 19, 251;
of proof by induction, 70; of proof by transfinite induction, 103 Projection, 47
Proof: by contradiction, 189; formal, 224, 377, 390; indirect, 189; by induction, 70; by transfinite induction, 103 Proof schema, 377 Proof theory, 403 Proper inclusion, 10 Proper subset, 10 Proper subtraction, 76 Provable statement, 224 Pure predicate calculus, 393 Quadruple, 431
Quantifier, 192; existential, 196; scope of, 202; universal, 195 Quotient algebra, 266, 324 Quotient field, 363 Quotient group, 342
139
Rational real number, 150 Real number, 149; positive, 149; rational, 150
Recursive function, 434; partial, 434-435; primitive, 75 Recursive predicate, 435; primitive, 435 Recursive set, 435 Reflexive relation, 29 Relation, 23 ff.; antisymmetric, 48; binary,
25, 297; converse of 49; domain of, 26; equivalence, 29; from X to Y, 26; graph of, 27; identity, 26; in X, 26; irreflexive, 49; n-ary, 25; partial ordering, 48; preordering, 49; range of, 26; reflexive, 29; simple ordering, 50; symmetric, 29; ternary, 25; transitive, 29; universal, 26; void, 26
Relative complement, 13, 295 Residue class, 31 Restricted quantification, 214 p-relatives, 27 Richard's paradox, 446-447 Right coset, 340; decomposition, 340
Ring, 346 ff.; Boolean, 350; commutative, 348; difference, 354; division, 348; extension of, 355; homomorphic image of, 355; honwmorphism of, 355; ideal of, 353; imbedded, 355; isomorphic image of, 355 Rosser's form of GBdel's first theorem, 449
Rules of inference, 181; ep, 185; derived, 378; detachment, 185; eg, 218; es, 218; generalization, 390; modus ponens, 185, 376, 378, 390; p, 182; t, 182; ug, 216 Russell's paradox, 9, 127, 291
Satisfiable formula, 410; in a domain, 409
Satisfiable set: of formulas, 410; of statements, 188 Schroder-Bernstein theorem, 81 Second coordinate, 24 Second-order theory, 425 Self-dual formula, 20 Semigroup, 324-328; Abelian, 325; commutative, 325; identity element of, 325; neutral element of; 324; unit element of, 325; zero element of; 325
Subject Index
474 Sentential connectives, 161 Sequence, 45 Sequence function, 103 Series of ordinals, 315 Set of descendents, 59 Similar sets, 79 if. Simple ordering relation, 50 Simply ordered set, 50 Simply ordered commutative groups, 235 Skolem's paradox, 453-454 Standard model, 426
Statement, 6, 204; refutable, 239 Statement calculi, 273 if. Statement calculus, 160 if., 375 ff.; axioms for, 376; completeness theorem for, 284-
Total function, 429
Transfinite induction: definition by, 103; proof by, 103 Transformation group, 337 Transitive relation, 29 Triple, 295 Truth function, 170 Truth table, 164 Truth value, 164 Turing machine, 431; alphabet of, 431; computation of, 433; instantaneous description
of, 432, terminal. 433; internal state of, 431; operation of, 432; tape expression for, 432
285, 381; deduction theorem for, 378; Lindenbaum algebra of, 274; rule of inference for, 376 Statement variables, 375 Stone's theorem, 271 String, 182 Strong completeness- theorems, 384, 416-417 Subalgebra, 323 Subfield, 352
Subgroup, 333; generated, 334; improper, 334; normal (invariant), 342; proper, 334 Subring, 351 Subset, 10; proper, 10 Successor function, 57 Successor set, 298 Supremum, 53 Symbols: defined, 225; primitive, 225, 375, 388, 395 Symmetric difference, 13, 262, 296 Symmetric relation, 29
Tautology, 172 Term, 194, 395; value of, 399 Ternary operation, 73 Ternary relation, 25 Theorem, 224, 377, 390 Theorem schema, 377 Theory: consistent, 236; decidable, 408, 445 ff.; first-order, 394 ff.; inconsistent, 236; second-order, 425; with standard formalization, 394; undecidable, 408, 436 if.
Unary operation, 37 Unary system, 58 Uncountable set, 92 Undecidable formula, 446 Undecidable predicate, 408 Undecidable theory, 408, 436 If. Union, 12, 295; of a collection of sets, 43 Unit, 348 Unit set, 5, 293 Universal quantifier, 195 Universal relation, 26 Universal set, 13 Upper bound, 53; least, 53
Valuation procedure, 206 Venn diagram, 14-15 Void relation, 26
Well-ordered sets, 53, 102 ff.; initial segment
of, 103; ordinal product of, 314; ordinal sum of, 313
Zermelo-Fraenkel set theory, 289 if., 452453; axioms of, 291, 292, 295, 298, 302, 303; prime formulas of, 290, 296 Zorn's lemma, 116 if.
A CATALOG OF SELECTED
DOVER BOOKS IN SCIENCE AND MATHEMATICS
m
CATALOG OF DOVER BOOKS
Astronomy BURNHAM'S CELESTIAL HANDBOOK, Robert Burnham, Jr. Thorough guide to the stars beyond our solar system. Exhaustive treatment. Alphabetical by constellation: Andromeda to Cetus in Vol. 1; Chamaeleon to Orion in Vol. 2; and Pavo to Vulpecula in Vol. 3. Hundreds of illustrations. Index in Vol. 3. 2,000pp. 6% x 9%. Vol. I: 0-486-23567-X Vol. 11: 0-486-23568-8 Vol. III: 0-486-23673-0
EXPLORING THE MOON THROUGH BINOCULARS AND SMALL TELESCOPES, Ernest H. Cherrington, Jr. Informative, profusely illustrated guide to locating and identifying craters, rills, seas, mountains, other lunar features. Newly revised
and updated with special section of new photos. Over 100 photos and diagrams. 240pp. 8% x 11.
0-486-24491-1
THE EXTRATERRESTRIAL LIFE DEBATE, 1750-1900, Michael J. Crowe. First detailed, scholarly study in English of the many ideas that developed from 1750 to 1900 regarding the existence of intelligent extraterrestrial life. Examines ideas of Kant, Herschel, Voltaire, Percival Lowell, many other scientists and thinkers. 16 illustrations. 704pp. 5% x 8%.
0-486-40675-X
THEORIES OF THE WORLD FROM ANTIQUITY TO THE COPERNICAN REVOLUTION, Michael J. Crowe. Newly revised edition of an accessible, enlightening book recreates the change from an earth-centered to a sun-centered conception of the solar system. 242pp. 5% x 8%. 0-486-41444-2 A HISTORY OF ASTRONOMY, A. Pannekoek. Well-balanced, carefully reasoned study covers such topics as Ptolemaic theory, work of Copernicus, Kepler, Newton, Eddington's work on stars, much more. Illustrated. References. 521pp. 5% x 8%. 0-486-65994-1
A COMPLETE MANUAL OF AMATEUR ASTRONOMY: TOOLS AND TECHNIQUES FOR ASTRONOMICAL OBSERVATIONS, P. Clay Sherrod with Thomas L. Koed. Concise, highly readable book discusses: selecting, setting up and maintaining a telescope; amateur studies of the sun; lunar topography and occul-
tations; observations of Mars, Jupiter, Saturn, the minor planets and the stars; an introduction to photoelectric photometry; more. 1981 ed. 124 figures. 25 halftones. 37 tables. 335pp. 6% x 9%.
0-486-40675-X
AMATEUR ASTRONOMER'S HANDBOOK, J. B. Sidgwick. Timeless, comprehensive coverage of telescopes, mirrors, lenses, mountings, telescope drives, micrometers, spectroscopes, more. 189 illustrations. 576pp. 5% x 8%. (Available in U.S. only.) 0-486-24034-7
STARS AND RELATIVITY, Ya. B. Zel'dovich and I. D. Novikov. Vol. 1 of Relativistic Astrophysics by famed Russian scientists. General relativity, properties of matter under astrophysical conditions, stars, and stellar systems. Deep physical insights, clear presentation. 1971 edition. References. 544pp. 5% x 8%. 0-486-69424-0
CATALOG OF DOVER BOOKS
Chemistry THE SCEPTICAL CHYMIST: THE CLASSIC 1661 TEXT, Robert Boyle. Boyle defines the term "element," asserting that all natural phenomena can be explained by the motion and organization of primary particles. 1911 ed. viii+232pp. 5% x 8%. 0-486-42825-7
RADIOACTIVE SUBSTANCES, Marie Curie. Here is the celebrated scientist's doctoral thesis, the prelude to her receipt of the 1903 Nobel Prize. Curie discusses establishing atomic character of radioactivity found in compounds of uranium and thorium; extraction from pitchblende of polonium and radium; isolation of pure radium chloride; determination of atomic weight of radium; plus electric, photographic, luminous, heat, color effects of radioactivity. ii+94pp. 5% x 8%. 0-486-42550-9
CHEMICAL MAGIC, Leonard A. Ford. Second Edition, Revised by E. Winston Grundmeier. Over 100 unusual stunts demonstrating cold fire, dust explosions, much more. Text explains scientific principles and stresses safety precautions. 128pp. 5% x 8%.
0-486-67628-5
THE DEVELOPMENT OF MODERN CHEMISTRY, Aaron J. Ihde. Authoritative history of chemistry from ancient Greek theory to 20th-century innovation.
Covers major chemists and their discoveries. 209 illustrations. Bibliographies. Indices. Appendices. 851pp. 5% x 8%.
14 tables. 0-486-64235-6
CATALYSIS IN CHEMISTRY AND ENZYMOLOGY, William P. Jencks. Exceptionally clear coverage of mechanisms for catalysis, forces in aqueous solution, carbonyl- and acyl-group reactions, practical kinetics, more. 864pp. 5% x 8%. 0-486-65460-5
ELEMENTS OF CHEMISTRY, Antoine Lavoisier. Monumental classic by founder of modem chemistry in remarkable reprint of rare 1790 Kerr translation. A must for every student of chemistry or the history of science. 539pp. 5% x 8%. 0-486-64624-6
THE HISTORICAL BACKGROUND OF CHEMISTRY, Henry M. Leicester. Evolution of ideas, not individual biography. Concentrates on formulation of a coherent set of chemical laws. 260pp. 5% x 8%. 0-486-61053-5
A SHORT HISTORY OF CHEMISTRY, J. R. Partington. Classic exposition explores origins of chemistry, alchemy, early medical chemistry, nature of atmosphere, theory of valency, laws and structure of atomic theory, much more. 428pp. 5% x 8%. (Available in U.S. only.)
0-486-65977-1
GENERAL CHEMISTRY, Linus Pauling. Revised 3rd edition of classic first-year text by Nobel laureate. Atomic and molecular structure, quantum mechanics, statistical mechanics, thermodynamics correlated with descriptive chemistry. Problems. 992pp. 5% x 8%.
0-486-65622-5
FROM ALCHEMY TO CHEMISTRY, John Read. Broad, humanistic treatment focuses on great figures of chemistry and ideas that revolutionized the science. 50 illustrations. 240pp. 5% x 8%.
0-486-28690-8
CATALOG OF DOVER BOOKS
Engineering DE RE METALLICA, Georgius Agricola. The famous Hoover translation of greatest treatise on technological chemistry, engineering, geology, mining of early modern times (1556). All 289 original woodcuts. 638pp. 631 x 11. 0-486-60006-8
FUNDAMENTALS OF ASTRODYNAMICS, Roger Bate et al. Modern approach developed by U.S. Air Force Academy.. Designed as a first course. Problems, exercises. Numerous illustrations. 455pp. 5% x 8%.
0-486-60061-0
DYNAMICS OF FLUIDS IN POROUS MEDIA, Jacob Bear. For advanced students of ground water hydrology, soil mechanics and physics, drainage and irrigation engineering and more. 335 illustrations. Exercises, with answers. 784pp. 6% x 9%. 0-486-65675-6
THEORY OF VISCOELASTICITY (Second Edition), Richard M. Christensen. Complete consistent description of the linear theory of the viscoelastic behavior of materials. Problem-solving techniques discussed. 1982 edition. 29 figures. xiv+364pp. 6% x 9%.
0-486-42880-X
MECHANICS, J. P. Den Hartog. A classic introductory text or refresher. Hundreds
of applications and design problems illuminate fundamentals of trusses, loaded beams and cables, etc. 334 answered problems. 462pp. 5% x 8%.
0-486-60754-2
MECHANICAL VIBRATIONS, J. P. Den Hartog. Classic textbook offers lucid explanations and illustrative models, applying theories of vibrations to a variety of practical industrial engineering problems. Numerous figures. 233 problems, solutions. Appendix. Index. Preface. 436pp. 5% x 8'%.
0-486-64785-4
STRENGTH OF MATERIALS, J. P. Den Hartog. Full, clear treatment of basic material (tension, torsion, bending, etc.) plus advanced material on engineering methods, applications. 350 answered problems. 323pp. 5% x 8%.
0-486-60755-0
A HISTORY OF MECHANICS, Rene Dugas. Monumental study of mechanical principles from antiquity to quantum mechanics. Contributions of ancient Greeks, Galileo, Leonardo, Kepler, Lagrange, many others. 671pp. 5% x 8%. 0-486-65632-2
STABILITY THEORY AND ITS APPLICATIONS TO STRUCTURAL MECHANICS, Clive L. Dym. Self-contained text focuses on Koiter postbuckling analyses, with mathematical notions of stability of motion. Basing minimum energy principles for static stability upon dynamic concepts of stability of motion, it develops asymptotic buckling and postbuckling analyses from potential energy considerations, with applications to columns, plates, and arches. 1974 ed. 208pp. 534 x 8%. 0-486-42541-X
METAL FATIGUE, N. E. Frost, K.J. Marsh, and L. P. Pook. Definitive, clearly written, and well-illustrated volume addresses all aspects of the subject, from the historical development of understanding metal fatigue to vital concepts of the cyclic stress that causes a crack to grow. Includes 7 appendixes. 544pp. 5% x 8%. 0-486-40927-9
CATALOG OF DOVER BOOKS ROCKETS, Robert Goddard. Two of the most significant publications in the history of rocketry and jet propulsion: "A Method of Reaching Extreme Altitudes" (1919) and "Liquid Propellant Rocket Development" (1936). 128pp. 53A x 8%. 0-486-42537-1
STATISTICAL MECHANICS: PRINCIPLES AND APPLICATIONS, Terrell L. Hill. Standard text covers fundamentals of statistical mechanics, applications to fluctuation theory, imperfect gases, distribution functions, more. 448pp. 5% x 8%. 0-486-65390-0
ENGINEERING AND TECHNOLOGY 1650-1750: ILLUSTRATIONS AND TEXTS FROM ORIGINAL SOURCES, Martin Jensen. Highly' readable text with more than 200 contemporary drawings and detailed engravings of engineering projects dealing with surveying, leveling, materials, hand tools, lifting equipment, transport and erection, piling, bailing, water supply, hydraulic engineering, and more. Among the specific projects outlined-transporting a 50-ton stone to the Louvre, erecting an obelisk, building timber locks, and dredging canals. 207pp. 8% x 11%. 0-486-42232-1
THE VARIATIONAL PRINCIPLES OF MECHANICS, Cornelius Lanczos. Graduate level coverage of calculus of variations, equations of motion, relativistic mechanics, more. First inexpensive paperbound edition of classic treatise. Index. Bibliography. 418pp. 5% x 8%.
0-486-65067-7
PROTECTION OF ELECTRONIC CIRCUITS FROM OVERVOLTAGES, Ronald B. Standler. Five-part treatment presents practical rules and strategies for circuits designed to protect electronic systems from damage by transient overvoltages. 1989 ed. xxiv+434pp. 6% x 9%. 0-486-42552-5
ROTARY WING AERODYNAMICS, W. Z. Stepniewski. Clear, concise text covers aerodynamic phenomena of the rotor and offers guidelines for helicopter performance evaluation. Originally prepared for NASA. 537 figures. 640pp. 614 x 9'%. 0-486-64647-5
INTRODUCTION TO SPACE DYNAMICS, William Tyrrell Thomson. Comprehensive, classic introduction to space-flight engineering for advanced undergraduate and graduate students. Includes vector algebra, kinematics, transformation of coordinates. Bibliography. Index. 352pp. 5% x 8%. 0-486-65113-4
HISTORY OF STRENGTH OF MATERIALS, Stephen P. Timoshenko. Excellent historical survey of the strength of materials with many references to the theories of elasticity and structure. 245 figures. 452pp. 5% x 8%. 0-486-61187-6
ANALYTICAL FRACTURE MECHANICS, David J. Unger. Self-contained text supplements standard fracture mechanics texts by focusing on analytical methods for determining crack-tip stress and strain fields. 336pp. 6% x 9%.
0-486-41737-9
STATISTICAL MECHANICS OF ELASTICITY, J. H. Weiner. Advanced, selfcontained treatment illustrates general principles and elastic behavior of solids. Part 1, based on classical mechanics, studies thermoelastic behavior of crystalline and polymeric solids. Part 2, based on quantum mechanics, focuses on interatomic force laws, behavior of solids, and thermally activated processes. For students of physics and chemistry and for polymer physicists. 1983 ed. 96 figures. 496pp. 5% x 8%. 0-486-42260-7
CATALOG OF DOVER BOOKS
Mathematics FUNCTIONAL ANALYSIS (Second Corrected Edition), George Bachman and Lawrence Narici. Excellent treatment of subject geared toward students with background in linear algebra, advanced calculus, physics and engineering. Text covers introduction to inner-product spaces, normed, metric spaces, and topological spaces; complete orthonormal sets, the Hahn-Banach Theorem and its consequences, and many other related subjects. 1966 ed. 544pp. 616 x 9%. 0-486-40251-7 ASYMPTOTIC EXPANSIONS OF INTEGRALS, Norman Bleistein & Richard A. Handelsman. Best introduction to important field with applications in a variety of scientific disciplines. New preface. Problems. Diagrams. Tables. Bibliography. Index. 448pp. 5% x 8'%. 0-486-65082-0
VECTOR AND TENSOR ANALYSIS WITH APPLICATIONS, A. I. Borisenko and I. E. Tarapov. Concise introduction. Worked-out problems, solutions, exercises. 0-486-63833-2
257pp. 556 x 8'%.
AN INTRODUCTION TO ORDINARY DIFFERENTIAL EQUATIONS, Earl A. Coddington. A thorough and systematic first course in elementary differential equations for undergraduates in mathematics and science, with many exercises and problems (with answers). Index. 304pp. 5% x 8'b.
0-486-65942-9
FOURIER SERIES AND ORTHOGONAL FUNCTIONS, Harry F. Davis. An incisive text combining theory and practical example to introduce Fourier series, orthogonal functions and applications of the Fourier method to boundary-value problems. 570 exercises. Answers and notes. 416pp. 5% x 8'6.
0-486-65973-9
COMPUTABILITY AND UNSOLVABILITY, Martin Davis. Classic graduatelevel introduction to theory of computability, usually referred to as theory of recurrent functions. New preface and appendix. 288pp. 5% x 8'6.
0-486-61471-9
ASYMPTOTIC METHODS IN ANALYSIS, N. G. de Bruijn. An inexpensive, com-
prehensive guide to asymptotic methods-the pioneering work that teaches by explaining worked examples in detail. Index. 224pp. 5% x 8'%
0-486-64221-6
APPLIED COMPLEX VARIABLES, John W. Dettman. Step-by-step coverage of fundamentals of analytic function theory-plus lucid exposition of five important applications: Potential Theory; Ordinary Differential Equations; Fourier Transforms; Laplace Transforms; Asymptotic Expansions. 66 figures. Exercises at chapter ends. 0-486-64670-X
512pp. 5% x 831.
INTRODUCTION TO LINEAR ALGEBRA AND DIFFERENTIAL EQUATIONS, John W. Dettman. Excellent text covers complex numbers, determinants,
orthonormal bases, Laplace transforms, much more. Exercises with solutions. Undergraduate level. 416pp. 5% x 8%.
0-486-65191-6
RIEMANN'S ZETA FUNCTION, H. M. Edwards. Superb, high-level study of landmark 1859 publication entitled "On the Number of Primes Less Than a Given Magnitude" traces developments in mathematical theory that it inspired. xiv+315pp. 5% x 851.
1
0-486-41740-9
CATALOG OFDOVER BOOKS
CALCULUS OF VARIATIONS WITH APPLICATIONS, George M. Ewing. Applications-oriented introduction to variational theory develops insight and promotes understanding of specialized books, research papers. Suitable for advanced undergraduate/graduate students as primary, supplementary text. 352pp. 5% x 8%. 0-486-64856-7
COMPLEX VARIABLES, FrancisJ. Flanigan. Unusual approach, delaying complex
algebra till harmonic functions have been analyzed from real variable viewpoint. Includes problems with answers. 364pp. 5% x 8%.
0-486-61388-7
AN INTRODUCTION TO THE CALCULUS OF VARIATIONS, Charles Fox. Graduate-level text covers variations of an integral, isoperimetrical problems, least action, special relativity, approximations, more. References. 279pp. 5% x 8%. 0-486-65499-0
COUNTEREXAMPLES IN ANALYSIS, Bernard R. Gelbaum and John M. H. Olmsted. These counterexamples deal mostly with the part of analysis known as "real variables." The first half covers the real number system, and the second half encompasses higher dimensions. 1962 edition. xxiv+198pp. 5% x 8%. 0-486-42875-3
CATASTROPHE THEORY FOR SCIENTISTS AND ENGINEERS, Robert Gilmore. Advanced-level treatment describes mathematics of theory grounded in the work of Poincare, R. Thom, other mathematicians. Also important applications to problems in mathematics, physics, chemistry and engineering. 1981 edition. References. 28 tables. 397 black-and-white illustrations. xvii + 666pp. 6% x 9%. 0-486-67539-4
INTRODUCTION TO DIFFERENCE EQUATIONS, Samuel Goldberg. Exceptionally clear exposition of important discipline with applications to sociology, psychology, economics. Many illustrative examples; over 250 problems. 260pp. 5% x 8%. 0-486-65084-7
NUMERICAL METHODS FOR SCIENTISTS AND ENGINEERS, Richard Hamming. Classic text stresses frequency approach in coverage of algorithms, polynomial approximation, Fourier approximation, exponential approximation, other topics. Revised and enlarged 2nd edition. 721pp. 5% x 8'%. 0-486-65241-6
INTRODUCTION TO NUMERICAL ANALYSIS (2nd Edition), F. B. Hildebrand. Classic, fundamental treatment covers computation, approximation, interpolation, numerical differentiation and integration, other topics. 150 new problems. 669pp. 5% x 8%.
0-486-65363-3
THREE PEARLS OF NUMBER THEORY, A. Y. Khinchin. Three compelling puzzles require proof of a basic law governing the world of numbers. Challenges concern van der Waerden's theorem, the Landau-Schnirelmann hypothesis and Mann's theorem, and a solution to Waring's problem. Solutions included. 64pp. 53/. x 8'/,. 0-486-40026-3
THE PHILOSOPHY OF MATHEMATICS: AN INTRODUCTORY ESSAY, Stephan KSrner. Surveys the views of Plato, Aristotle, Leibniz & Kant concerning
propositions and theories of applied and pure mathematics. Introduction. Two appendices. Index. 198pp. 5% x 8%.
0-486-25048-2
CATALOG OF DOVER BOOKS INTRODUCTORY REAL ANALYSIS, A.N. Kolmogorov, S. V. Fomin. Translated by Richard A. Silverman. Self-contained, evenly paced introduction to real and functional analysis. Some 350 problems. 403pp. 5% x 8'l. 0-486-61226-0
APPLIED ANALYSIS, Cornelius Lanczos. Classic work on analysis and design of finite processes for approximating solution of analytical problems. Algebraic equations, matrices, harmonic analysis, quadrature methods, much more. 559pp. 534 x 8'/.. 0-486-65656-X
AN INTRODUCTION TO ALGEBRAIC STRUCTURES, Joseph Landin. Superb self-contained text covers "abstract algebra": sets and numbers, theory of groups, theory of rings, much more. Numerous well-chosen examples, exercises. 247pp. 5% x 8'6. 0-486-65940-2
QUALITATIVE THEORY OF DIFFERENTIAL EQUATIONS, V. V. Nemytskii and V.V. Stepanov. Classic graduate-level text by two prominent Soviet mathematicians covers classical differential equations as well as topological dynamics and ergodic theory. Bibliographies. 523pp. 5% x 8'%.
0-486-65954-2
THEORY OF MATRICES, Sam Perlis. Outstanding text covering rank, nonsingularity and inverses in connection with the development of canonical matrices under the relation of equivalence, and without the intervention of determinants. Includes exercises. 237pp. 5% x 8'6.
0-486-66810-X
INTRODUCTION TO ANALYSIS, Maxwell Rosenlicht. Unusually clear, accessible coverage of set theory, real number system, metric spaces, continuous functions, Riemann integration, multiple integrals, more. Wide range of problems. Undergraduate level. Bibliography. 254pp. 5% x 8%.
0-486-65038-3
MODERN NONLINEAR EQUATIONS, Thomas L. Saaty. Emphasizes practical solution of problems; covers seven types of equations. "... a welcome contribution to the existing literature...."-Math Reviews. 490pp. 5% x 8%.
0-486-64232-1
MATRICES AND LINEAR ALGEBRA, Hans Schneider and George Phillip Barker. Basic textbook covers theory of matrices and its applications to systems of lin-
ear equations and related topics such as determinants, eigenvalues and differential equations. Numerous exercises. 432pp. 51 x 81l. 0-486-66014-1 LINEAR ALGEBRA, Georgi E. Shilov. Determinants, linear spaces, matrix algebras, similar topics. For advanced undergraduates, graduates. Silverman translation. 387pp. 5% x 8'%.
0-486-63518-X
ELEMENTS OF REAL ANALYSIS, David A. Sprecher. Classic text covers fundamental concepts, real number system, point sets, functions of a real variable, Fourier 0-486-65385-4 series, much more. Over 500 exercises. 352pp. 5% x 8'!.
SET THEORY AND LOGIC, Robert R. Stoll. Lucid introduction to unified theory of mathematical concepts. Set theory and logic seen as tools for conceptual understanding of real number system. 496pp. 5% x 8'b. 0-486-63829-4
CATALOG OF DOVER BOOKS
TENSOR CALCULUS, J.L. Synge and A. Schild. Widely used introductory text covers spaces and tensors, basic operations in Riemannian space, non-Riemannian spaces, etc. 324pp. 5% x 8'b.
0-486-63612-7
ORDINARY DIFFERENTIAL EQUATIONS, Morris Tenenbaum and Harry Pollard. Exhaustive survey of ordinary differential equations for undergraduates in
mathematics, engineering, science. Thorough analysis of theorems. Diagrams. Bibliography. Index. 818pp. 5% x 8'h.
0-486-64940-7
INTEGRAL EQUATIONS, F. G. Tricomi. Authoritative, well-written treatment of extremely useful mathematical tool with wide applications. Volterra Equations,
Fredholm Equations, much more. Advanced undergraduate to graduate level. Exercises. Bibliography. 238pp. 5% x 8%.
0-486-64828-1
FOURIER SERIES, Georgi P. Tolstov. Translated by Richard A. Silverman. A valuable addition to the literature on the subject, moving clearly from subject to subject and theorem to theorem. 107 problems, answers. 336pp. 5% x 8'l. 0-486-63317-9
INTRODUCTION TO MATHEMATICAL THINKING, Friedrich Waismann. Examinations of arithmetic, geometry, and theory of integers; rational and natural numbers; complete induction; limit and point of accumulation; remarkable curves; complex and hypercomplex numbers, more. 1959 ed. 27 figures. xii+260pp. 5% x 8'%. 0-486-63317-9
POPULAR LECTURES ON MATHEMATICAL LOGIC, Hao Wang. Noted logician's lucid treatment of historical developments, set theory, model theory, recursion theory and constructivism, proof theory, more. 3 appendixes. Bibliography. 1981 edi0-486-67632-3
tion. ix + 283pp. 5% x 8'%.
CALCULUS OF VARIATIONS, Robert Weinstock. Basic introduction covering isoperimetric problems, theory of elasticity, quantum mechanics, electrostatics, etc. 0-486-63069-2
Exercises throughout. 326pp. 5% x 8'/.
THE CONTINUUM: A CRITICAL EXAMINATION OF THE FOUNDATION OF ANALYSIS, Hermann Weyl. Classic of 20th-century foundational research deals with the conceptual problem posed by the continuum. 156pp. 5% x 8'4. 0-486-67982-9
CHALLENGING MATHEMATICAL PROBLEMS WITH ELEMENTARY SOLUTIONS, A. M. Yaglom and I. M. Yaglom. Over 170 challenging problems on probability theory, combinatorial analysis, points and lines, topology, convex polygons, many other topics. Solutions. Total of 445pp. 5% x 8'%. Two-vol. set. Vol. I: 0-486-65536-9 Vol. II: 0-486-65537-7
INTRODUCTION TO PARTIAL DIFFERENTIAL EQUATIONS WITH APPLICATIONS, E. C. Zachmanoglou and Dale W. Thoe. Essentials of partial differential equations applied to common problems in engineering and the physical sciences. Problems and answers. 416pp. 5% x 8'%.
0-486-65251-3
THE THEORY OF GROUPS, HansJ. Zassenhaus. Well-written graduate-level text acquaints reader with group-theoretic methods and demonstrates their usefulness in mathematics. Axioms, the calculus of complexes, homomorphic mapping, p-group theory, more. 276pp. 5% x 8''h.
0-486-40922-8
CATALOG OFDOVER BOOKS
Math-Decision Theory, Statistics, Probability ELEMENTARY DECISION THEORY, Herman Chernoff and Lincoln E. Moses. Clear introduction to statistics and statistical theory covers data processing, probability and random variables, testing hypotheses, much more. Exercises. 364pp. 5% x 8'1.
0-486-65218-1
STATISTICS MANUAL, Edwin L. Crow et al. Comprehensive, practical collection of classical and modern methods prepared by U.S. Naval Ordnance Test Station. Stress on use. Basics of statistics assumed. 288pp. 5% x 8'lz. 0-486-60599-X
SOME THEORY OF SAMPLING, William Edwards Deming. Analysis of the problems, theory and design of sampling techniques for social scientists, industrial managers and others who find statistics important at work. 61 tables. 90 figures. xvii 0-486-64684-X +602pp. 5% x 8'6.
LINEAR PROGRAMMING AND ECONOMIC ANALYSIS, Robert Dorfman, Paul A. Samuelson and Robert M. Solow. First comprehensive treatment of linear programming in standard economic analysis. Game theory, modem welfare economics, Leontief input-output, more. 525pp. 5% x 8Y,.
0-486-65491-5
PROBABILITY: AN INTRODUCTION, Samuel Goldberg. Excellent basic text covers set theory, probability theory for finite sample spaces, binomial theorem, much more. 360 problems. Bibliographies. 322pp. 5% x 811.
0-486-65252-1
GAMES AND DECISIONS: INTRODUCTION AND CRITICAL SURVEY, R. Duncan Luce and Howard Raiffa. Superb nontechnical introduction to game theory, primarily applied to social sciences. Utility theory, zero-sum games, n-person games, decision-making, much more. Bibliography. 509pp. 5% x 8'%. 0-486-65943-7
INTRODUCTION TO THE THEORY OF GAMES, J. C. C. McKinsey. This comprehensive overview of the mathematical theory of games illustrates applications to situations involving conflicts of interest, including economic, social, political, and military contexts. Appropriate for advanced undergraduate and graduate courses; advanced calculus a prerequisite. 1952 ed. x+372pp. 5% x 8'l,. 0-486-42811-7
FIFTY CHALLENGING PROBLEMS IN PROBABILITY WITH SOLUTIONS, Frederick Mosteller. Remarkable puzzlers, graded in difficulty, illustrate elementary 65355-2 and advanced aspects of probability. Detailed solutions. 88pp. 5% x 8'k. PROBABILITY THEORY: A CONCISE COURSE, Y. A. Rozanov. Highly readable, self-contained introduction covers combination of events, dependent events, Bernoulli trials, etc. 148pp. 5% x 8;4.
0-486-63544-9
STATISTICAL METHOD FROM THE VIEWPOINT OF QUALITY CONTROL, Walter A. Shewhart. Important text explains regulation of variables, uses of statistical control to achieve quality control in industry, agriculture, other areas. 192pp. 5% x 8%.
0-486-65232-7
CATALOG OF DOVER BOOKS
Math-Geometry and Topology ELEMENTARY CONCEPTS OF TOPOLOGY, Paul Alexandroff. Elegant, intuitive approach to topology from set-theoretic topology to Betti groups; how concepts of topology are useful in math and physics. 25 figures. 57pp. 5% x 8%. 0-486-60747-X
COMBINATORIAL TOPOLOGY, P. S. Alexandrov. Clearly written, well-organized, three-part text begins by dealing with certain classic problems without using the formal techniques of homology theory and advances to the central concept, the Betti groups. Numerous detailed examples. 654pp. 5'/, x 8%.
0-486-40179-0
EXPERIMENTS IN TOPOLOGY, Stephen Barr. Classic, lively explanation of one of the byways of mathematics. Klein bottles, Moebius strips, projective planes, map coloring, problem of the Koenigsberg bridges, much more, described with clarity and 0-486-25933-1
wit. 43 figures. 210pp. 5% x 8%.
THE GEOMETRY OF RENE DESCARTES, Rene Descartes. The great work founded analytical geometry. Original French text, Descartes's own diagrams, together with definitive Smith-Latham translation. 244pp. 5% x 8%. 0-486-60068-8
EUCLIDEAN GEOMETRY AND TRANSFORMATIONS, Clayton W. Dodge. This introduction to Euclidean geometry emphasizes transformations, particularly isometries and similarities. Suitable for undergraduate courses, it includes numerous examples, many with detailed answers. 1972 ed. viii+296pp. 6% x 9%. 0-486-43476-1
PRACTICAL CONIC SECTIONS: THE GEOMETRIC PROPERTIES OF ELLIPSES, PARABOLAS AND HYPERBOLAS,J. W. Downs. This text shows how to create ellipses, parabolas, and hyperbolas. It also presents historical background on
their ancient origins and describes the reflective properties and roles of curves in design applications. 1993 ed. 98 figures. xii+100pp. 6'h x 9%.
0-486-42876-1
THE THIRTEEN BOOKS OF EUCLID'S ELEMENTS, translated with introduction and commentary by Sir Thomas L. Heath. Definitive edition. Textual and linguistic notes, mathematical analysis. 2,500 years of critical commentary. Unabridged. 1,414pp. 5% x 8'h.. Three-vol. set. Vol. I: 0-486-60088-2 Vol. II: 0-486-60089-0
Vol. III: 0-486-60090-4
SPACE AND GEOMETRY: IN THE LIGHT OF PHYSIOLOGICAL, PSYCHOLOGICAL AND PHYSICAL INQUIRY, Ernst Mach. Three essays by an eminent philosopher and scientist explore the nature, origin, and development of our concepts of space, with a distinctness and precision suitable for undergraduate students and other readers. 1906 ed. vi+l48pp. 5% x 8%.
0-486-43909-7
GEOMETRY OF COMPLEX NUMBERS, Hans Schwerdtfeger. Illuminating, widely praised book on analytic geometry of circles, the Moebius transformation, and two-dimensional non-Euclidean geometries. 200pp. 556 x 8%.
0-486-63830-8
DIFFERENTIAL GEOMETRY, Heinrich W. Guggenheimer. Local differential geometry as an application of advanced calculus and linear algebra. Curvature, transforma0-486-63433-7 tion groups, surfaces, more. Exercises. 62 figures. 378pp. 5% x 8%.
CATALOG OF DOVER BOOKS
History of Math THE WORKS OF ARCHIMEDES, Archimedes (T. L. Heath, ed.). Topics include the famous problems of the ratio of the areas of a cylinder and an inscribed sphere; the measurement of a circle; the properties of conoids, spheroids, and spirals; and the quadrature of the parabola. Informative introduction. clxxxvi+326pp. 5% x 8%. 0-486-42084-1
A SHORT ACCOUNT OF THE HISTORY OF MATHEMATICS, W. W. Rouse Ball. One of clearest, most authoritative surveys from the Egyptians and Phoenicians through 19th-century figures such as Grassman, Galois, Riemann. Fourth edition. 0-486-20630-0
522pp. 5% x 8%.
THE HISTORY OF THE CALCULUS AND ITS CONCEPTUAL DEVELOPMENT, Carl B. Boyer. Origins in antiquity, medieval contributions, work of Newton, Leibniz, rigorous formulation. Treatment is verbal. 346pp. 5% x 8%. 0-486-60509-4
THE HISTORICAL ROOTS OF ELEMENTARY MATHEMATICS, Lucas N. H. Bunt, Phillip S. Jones, and Jack D. Bedient. Fundamental underpinnings of modern arithmetic, algebra, geometry and number systems derived from ancient civilizations. 320pp. 5% x 8%. 0-486-25563-8
A HISTORY OF MATHEMATICAL NOTATIONS, Florian Cajori. This classic study notes the first appearance of a mathematical symbol and its origin, the competition it encountered, its spread among writers in different countries, its rise to popularity, its eventual decline or ultimate survival. Original 1929 two-volume edition presented here in one volume. xxviii+820pp. 5% x 8%. 0-486-67766-4
GAMES, GODS & GAMBLING: A HISTORY OF PROBABILITY AND STATISTICAL IDEAS, F. N. David. Episodes from the lives of Galileo, Fermat, Pascal, and others illustrate this fascinating account of the roots of mathematics. Features thought-provoking references to classics, archaeology, biography, poetry. 1962 edition. 304pp. 5% x 8%. (Available in U.S. only.)
0-486-40023-9
OF MEN AND NUMBERS: THE STORY OF THE GREAT MATHEMATICIANS, Jane Muir. Fascinating accounts of the lives and accomplishments of history's greatest mathematical minds-Pythagoras, Descartes, Euler, Pascal, Cantor, many more. Anecdotal, illuminating. 30 diagrams. Bibliography. 0-486-28973-7
256pp. 5% x 8%.
HISTORY OF MATHEMATICS, David E. Smith. Nontechnical survey from ancient Greece and Orient to late 19th century; evolution of arithmetic, geometry, trigonometry, calculating devices, algebra, the calculus. 362 illustrations. 1,355pp. 5% x 8%. Two-vol. set.
Vol. I: 0-486-20429-4
Vol. II: 0-486-20430-8
A CONCISE HISTORY OF MATHEMATICS, DirkJ. Struik. The best brief history of mathematics. Stresses origins and covers every major figure from ancient Near East to 19th century. 41 illustrations. 195pp. 5% x 8%.
0-486-60255-9
CATALOG OFDOVER BOOKS
Physics OPTICAL RESONANCE AND TWO-LEVEL ATOMS, L. Allen andJ. H. Eberly. Clear, comprehensive introduction to basic principles behind all quantum optical resonance phenomena. 53 illustrations. Preface. Index. 256pp. 5% x 8'%. 0-486-65533-4 QUANTUM THEORY, David Bohm. This advanced undergraduate-level text presents the quantum theory in terms of qualitative and imaginative concepts, followed by specific applications worked out in mathematical detail. Preface. Index. 655pp. 5% x 811. 0-486-65969-0 ATOMIC PHYSICS (8th EDITION), Max Bom. Nobel laureate's lucid treatment of kinetic theory of gases, elementary particles, nuclear atom, wave-corpuscles, atomic structure and spectral lines, much more. Over 40 appendices, bibliography. 495pp. 5% x 8'k. 0-486-65984-4
A SOPHISTICATE'S PRIMER OF RELATIVITY, P. W. Bridgman. Geared toward readers already acquainted with special relativity, this book transcends the view of theory as a working tool to answer natural questions: What is a frame of reference? What is a "law of nature"? What is the role of the "observer"? Extensive treatment, written in terms accessible to those without a scientific background. 1983 ed. xlviii+172pp. 5% x 8'b.
0-486-42549-5
AN INTRODUCTION TO HAMILTONIAN OPTICS, H. A. Buchdahl. Detailed account of the Hamiltonian treatment of aberration theory in geometrical optics. Many classes of optical systems defined in terms of the symmetries they possess. Problems with detailed solutions. 1970 edition. xv + 360pp. 5% x 8'%. 0-486-67597-1
PRIMER OF QUANTUM MECHANICS, Marvin Chester. Introductory text examines the classical quantum bead on a track: its state and representations; operator eigenvalues; harmonic oscillator and bound bead in a symmetric force field; and bead in a spherical shell. Other topics include spin, matrices, and the structure of quantum mechanics; the simplest atom; indistinguishable particles; and stationarystate perturbation theory. 1992 ed. xiv+314pp. 6'i x 9'b. 0-486-42878-8 LECTURES ON QUANTUM MECHANICS, Paul A. M. Dirac. Four concise, brilliant lectures on mathematical methods in quantum mechanics from Nobel Prizewinning quantum pioneer build on idea of visualizing quantum theory through the use of classical mechanics. 96pp. 5% x 8'%.
0-486-41713-1
THIRTY YEARS THAT SHOOK PHYSICS: THE STORY OF QUANTUM THEORY, George Gamow. Lucid, accessible introduction to influential theory of energy and matter. Careful explanations of Dirac's anti-particles, Bohr's model of the atom, much more. 12 plates. Numerous drawings. 240pp. 5% x 831. 0-486-24895-X
ELECTRONIC STRUCTURE AND THE PROPERTIES OF SOLIDS: THE PHYSICS OF THE CHEMICAL BOND, Walter A. Harrison. Innovative text offers basic understanding of the electronic structure of covalent and ionic solids, simple metals, transition metals and their compounds. Problems. 1980 edition. 582pp. 614 x 911.
0-486-66021-4
CATALOG OF DOVER BOOKS HYDRODYNAMIC AND HYDROMAGNETIC STABILITY, S. Chandrasekhar. Lucid examination of the Rayleigh-Benard problem; clear coverage of the theory of instabilities causing convection. 704pp. 5% x 8%.
0-486-64071-X
INVESTIGATIONS ON THE THEORY OF THE BROWNIAN MOVEMENT, Albert Einstein. Five papers (1905-8) investigating dynamics of Brownian motion and evolving elementary theory. Notes by R. Fiirth. 122pp. 5% x 81A. 0-486-60304-0
THE PHYSICS OF WAVES, William C. Elmore and Mark A. Heald. Unique overview of classical wave theory. Acoustics, optics, electromagnetic radiation, more. Ideal as classroom text or for self-study. Problems. 477pp. 5% x 8%. 0-486-64926-1
GRAVITY, George Gamow. Distinguished physicist and teacher takes readerfriendly look at three scientists whose work unlocked many of the mysteries behind the laws of physics: Galileo, Newton, and Einstein. Most of the book focuses on Newton's ideas, with a concluding chapter on post-Einsteinian speculations concerning the relationship between gravity and other physical phenomena. 160pp. 5% x 8%. 0-486-42563-0
PHYSICAL PRINCIPLES OF THE QUANTUM THEORY, Werner Heisenberg. Nobel laureate discusses quantum theory, uncertainty, wave mechanics, work of Dirac, Schroedinger, Compton, Wilson, Einstein, etc. 184pp. 5% x 8%. 0-486-60113-7
ATOMIC SPECTRA AND ATOMIC STRUCTURE, Gerhard Herzberg. One of best introductions; especially for specialist in other fields. Treatment is physical rather than mathematical. 80 illustrations. 257pp. 5% x 8%.
0-486-60115-3
AN INTRODUCTION TO STATISTICAL THERMODYNAMICS, Terrell L. Hill. Excellent basic text offers wide-ranging coverage of quantum statistical mechanics, systems of interacting molecules, quantum statistics, more. 523pp. 5% x 81b. 0-486-65242-4
THEORETICAL PHYSICS, GeorgJoos, with Ira M. Freeman. Classic overview covers essential math, mechanics, electromagnetic theory, thermodynamics, quantum mechanics, nuclear physics, other topics. First paperback edition. xxiii + 885pp. 5% x 8%.
0-486-65227-0
PROBLEMS AND SOLUTIONS IN QUANTUM CHEMISTRY AND PHYSICS, Charles S. Johnson, Jr. and Lee G. Pedersen. Unusually varied problems,
detailed solutions in coverage of quantum mechanics, wave mechanics, angular momentum, molecular spectroscopy, more. 280 problems plus 139 supplementaFy exercises. 430pp. 6% x 9%.
0-486-65236-X
THEORETICAL SOLID STATE PHYSICS, Vol. 1: Perfect Lattices in Equilibrium;
Vol. II: Non-Equilibrium and Disorder, William Jones and Norman H. March. Monumental reference work covers fundamental theory of equilibrium properties of perfect crystalline solids, non-equilibrium properties, defects and disordered systems. Appendices. Problems. Preface. Diagrams. Index. Bibliography. Total of 1,301pp. 5% Vol. 1: 0-486-65015-4 Vol. II: 0-486-65016-2 x 8%. Two volumes. WHAT IS RELATIVITY? L. D. Landau and G. B. Rumer. Written by a Nobel Prize physicist and his distinguished colleague, this compelling book explains the special
theory of relativity to readers with no scientific background, using such familiar objects as trains, rulers, and clocks. 1960 ed. vi+72pp. 5% x 8%.
0-486-42806-0
CATALOG OFDOVER BOOKS
A TREATISE ON ELECTRICITY AND MAGNETISM, James Clerk Maxwell. Important foundation work of modern physics. Brings to final form Maxwell's theory of electromagnetism and rigorously derives his general equations of field theory. 1,084pp. 5% x 8'/r. Two-vol. set.
Vol. 1: 0-486-60636-8 Vol. II: 0-486-60637-6
QUANTUM MECHANICS: PRINCIPLES AND FORMALISM, Roy McWeeny. Graduate student-oriented volume develops subject as fundamental discipline, opening with review of origins of Schrodinger's equations and vector spaces. Focusing on main principles of quantum mechanics and their immediate consequences, it concludes with final generalizations covering alternative "languages' or representations. 1972 ed. 15 figures. xi+155pp. 5% x 8%.
0-486-42829-X
INTRODUCTION TO QUANTUM MECHANICS With Applications to Chemistry, Linus Pauling & E. Bright Wilson, Jr. Classic undergraduate text by Nobel
Prize winner applies quantum mechanics to chemical and physical problems. Numerous tables and figures enhance the text. Chapter bibliographies. Appendices. Index. 468pp. 5% x 8'h.
0-486-64871-0
METHODS OF THERMODYNAMICS, Howard Reiss. Outstanding text focuses on physical technique of thermodynamics, typical problem areas of understanding, and significance and use of thermodynamic potential. 1965 edition. 238pp. 5% x 8'%. 0-486-69445-3
THE ELECTROMAGNETIC FIELD, Albert Shadowitz. Comprehensive undergraduate text covers basics of electric and magnetic fields, builds up to electromagnetic theory. Also related topics, including relativity. Over 900 problems. 768pp. 5% x 8''h.
0-486-65660-8
GREAT EXPERIMENTS IN PHYSICS: FIRSTHAND ACCOUNTS FROM GALILEO TO EINSTEIN, Morris H. Shamos (ed.). 25 crucial discoveries: Newton's laws of motion, Chadwick's study of the neutron, Hertz on electromagnetic waves, more. Original accounts clearly annotated. 370pp. 5% x 8'h. 0-486-25346-5
EINSTEIN'S LEGACY, Julian Schwinger. A Nobel Laureate relates fascinating story of Einstein and development of relativity theory in well-illustrated, nontechnical volume. Subjects include meaning of time, paradoxes of space travel, gravity and its effect on light, non-Euclidean geometry and curving of space-time, impact of radio astronomy and space-age discoveries, and more. 189 b/w illustrations. xiv+250pp. 8% x 9'h.
0-486-41974-6
STATISTICAL PHYSICS, Gregory H. Wannier. Classic text combines thermodynamics, statistical mechanics and kinetic theory in one unified presentation of thermal physics. Problems with solutions. Bibliography. 532pp. 51 x 8'/,.
0-486-65401-X
Paperbound unless otherwise indicated. Available at your book dealer, online at www.doverpublications.com, or by writing to Dept. GI, Dover Publications, Inc., 31 East 2nd Street, Mineola, NY 11501. For current price information or for free catalogues (please indi-
cate field of interest), write to Dover Publications or log on to www doverpubl%cations.com and see every Dover book in print. Dover publishes more than 500 books each year on science, elementary and advanced mathematics, biology, music, art, literary history, social sciences, and other areas.
(continued from front flap) INTRODUCTORY REAL ANALYSIS,
A.
(0-486.61226-0)
N. Kolmogorov and S.
V.
Fomin.
SPECIAL FUNCTIONS AND THEIR APPLICATIONS, N. N. Lebedev. (0-486-60624-4)
CHANCE, LUCK AND STATISTICS, Horace C. Levinson. (0-486-41997-5) TENSORS, DIFFERENTIAL FORMS, AND VARIATIONAL PRINCIPLES, David Lovelock and
Hanno Rund. (0-486-65840-6) SURVEY OF MATRIX THEORY AND MATRIX INEQUALITIES, Marvin Marcus and Henryk
Minc. (0-486-67102-X) ABSTRACT ALGEBRA AND SOLUTION BY RADICALS, John E. and Margaret W.
Maxfield. (0-486-67121-6) FUNDAMENTAL CONCEPTS OF ALGEBRA, Bruce E. Meserve. (0-486-61470-0) FUNDAMENTAL CONCEPTS OF GEOMETRY, Bruce E. Meserve. (0-486-63415-9) FIFTY CHALLENGING PROBLEMS IN PROBABILITY WITH SOLUTIONS, Frederick Mosteller.
(0-486-65355-2) NUMBER THEORY AND ITS HISTORY, Oystein Ore. (0-486-65620-9)
MATRICES AND TRANSFORMATIONS, Anthony J. Pettofrezzo. (0-486-63634-8) THE UMBRAL CALCULUS, Steven Roman. (0-486-44139-3) PROBABILITY THEORY: A CONCISE COURSE, Y. A. Rozanov. (0-486-63544-9) LINEAR ALGEBRA, Georgi E. Shilov. (0-486-63518-X) ESSENTIAL CALCULUS WITH APPLICATIONS, Richard A. Silverman. (0-486-66097-4) INTERPOLATION, J.F. Steffensen. (0-486-45009-0) A CONCISE HISTORY OF MATHEMATICS, Dirk J. Struck. (0-486-60255-9) PROBLEMS IN PROBABILITY THEORY, MATHEMATICAL STATISTICS AND THEORY OF RANDOM
FUNCTIONS, A. A. Sveshnikov. (0-486-63717-4) TENSOR CALCULUS, J. L. Synge and A. Schild. (0-486-63612-7) MODERN ALGEBRA: TWO VOLUMES BOUND As ONE, B.L. Van der Waerden.
(0-486-44281-0) CALCULUS OF VARIATIONS WITH APPLICATIONS TO PHYSICS AND ENGINEERING, Robert
Weinstock. (0-486-63069-2) INTRODUCTION TO VECTOR AND TENSOR
ANALYSIS,
Robert
C.
Wrede.
(0-486-61879-X) DISTRIBUTION
THEORY
AND
TRANSFORM
ANALYSIS,
A.
H.
Zemanian.
(0-486-65479-6)
Paperbound unless otherwise indicated. Available at your book dealer, online at www.doverpublications.com, or by writing to Dept. 23, Dover Publications, Inc., 31 East 2nd Street, Mineola, NY 11501. For current price information or for free catalogs (please indicate field of interest), write to Dover Publications or log on to www.doverpublications.com and see every Dover book in print. Each year Dover publishes over 500 books on fine art, music, crafts and needlework, antiques, languages, literature, children's books, chess, cookery, nature, anthropology, science, mathematics, and other areas. Manufactured in the U.S.A.