P(Φ_R | Φ_{n+1}) > P(Φ_R | Φ_n) ↔ P(ℱ_{n+1} | Φ_n) < 1. This completes the proof of T. Let us now see what this theorem amounts to in plain words. "P(Φ_R | t) > 0" says that the probability that a random property in the universe of properties is true of all things in the range R is greater than 0. "P(Φ_R | Φ_{n+1}) > P(Φ_R | Φ_n)" says that the probability that a random property is true of all things in the range R, is greater, given that it is true of
G. H. VON WRIGHT
those of the first n + 1 things in the world which fall in this range, than given (only) that it is true of those of the first n things which fall in this range. "P(ℱ_{n+1} | Φ_n) < 1", finally, says that the probability that a random property is true of the (n+1)st thing in the world, if this thing belongs to the range R, is smaller than 1, given that this property is true of those of the first n things in the world which fall in that range. The theorem as a whole thus says the following: If the probability that a random property in the universe of properties is true of all things in the range R is not minimal (0), then the probability that this property is true of all things in the range is greater, given that it is true of those of the first n + 1 things which fall in the range, than given (only) that it is true of those of the first n things which fall in the range, if and only if the probability that it is true of the (n+1)st thing, if this belongs to the range, is not maximal (1), given that it is true of those of the first n things which belong to the range R. Now apply the theorem to the individual property F. To say that F is true of all things in the range R is tantamount to saying that the generalization that all A are B is true in the range R. To say that F is true of those of the first n (or n+1) things in the world which are also things in the range amounts to saying that the first n (or n+1) things afford confirming instances of the generalization that all A are B in the range R. To say that F is true of the (n+1)st thing, if this thing belongs to the range, finally, comes to saying that this thing affords a confirming instance of the generalization that all A are B in the range R.
When applied to the individual property F, the theorem as a whole thus says the following: If, on tautological data ("a priori"), the probability that all A are B in the range R is not minimal, then the probability of this generalization is greater on the datum that the first n + 1 things in the world afford confirming instances of it than on the datum that the first n things afford confirming instances, if and only if the probability that the (n+1)st thing affords a confirming instance is not maximal on the datum that the first n things afford confirming instances. It follows by contraposition that, if this last probability is maximal (1), then the new confirmation of the generalization in the (n+1)st instance does not increase its probability. The new confirmation is, in this sense, irrelevant to the generalization. 6. Now assume that the thing x_{n+1} actually does not belong to the range
THE PARADOXES OF CONFIRMATION
of relevance R of the generalization that all A are B. In other words, assume that ¬Rx_{n+1}. It is a truth of logic (tautology) that ¬Rx_{n+1} → (Rx_{n+1} → Xx_{n+1}). Since "X" does not occur in the first antecedent, we can generalize the first consequent in "X". It is a truth of logic, too, that ¬Rx_{n+1} → (X)(Rx_{n+1} → Xx_{n+1}). By definition, ℱ_{n+1}(X) can replace Rx_{n+1} → Xx_{n+1}. Thus it is a truth of logic that ¬Rx_{n+1} → (X)ℱ_{n+1}(X). From this it follows trivially that ¬Rx_{n+1} → (X)(Φ_n(X) → ℱ_{n+1}(X)). According to axiom A1 of probability, (X)(Φ_n(X) → ℱ_{n+1}(X)) entails that P(ℱ_{n+1} | Φ_n) = 1 - provided that at least one property has the (second-order) property Φ_n. The existential condition is satisfied, since the property R trivially has the property Φ_n. Φ_n(R) means by definition the same as (Rx_1 → Rx_1) & ... & (Rx_n → Rx_n), which is a tautology. Herewith it has been proved that, if it is the case that ¬Rx_{n+1}, i.e. if the (n+1)st thing in the world does not belong to the range R, then it is also the case that P(ℱ_{n+1} | Φ_n) = 1, i.e. then the probability that this thing will afford a confirmation of any generalization to the effect that something or other is true of all things in this range, is maximal. This probability being maximal, the confirmation which is trivially afforded by the thing in question is irrelevant to any such generalization in the sense that it cannot contribute to an increase in its probability. And this constitutes a good ground for saying that a thing which falls outside the range of relevance of a generalization can be said to afford only a "vacuous" or "spurious" or "paradoxical", and not a "genuine", confirmation of the generalization in question. 7. After all these formal considerations we are in a position to answer such questions as this: Is it possible to confirm genuinely the generalization that all ravens are black through the observation, e.g., of black shoes or white swans?
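Von Wright's result can be checked by brute force in a toy model. The following Python sketch (mine, not from the text) takes four things, a range of relevance R that omits one of them, and the uniform distribution over all 2⁴ properties, a property being identified with the set of things it is true of. It verifies both halves of the theorem: the out-of-range thing gives P(ℱ_{n+1} | Φ_n) = 1 and leaves the probability of the generalization unchanged, while the in-range thing gives P(ℱ_{n+1} | Φ_n) < 1 and raises it.

```python
from itertools import chain, combinations

THINGS = [0, 1, 2, 3]
R = {0, 1, 3}   # thing 2 falls outside the range of relevance

# The universe of properties: all 16 subsets of THINGS, uniformly weighted.
properties = [set(c) for c in chain.from_iterable(
    combinations(THINGS, k) for k in range(len(THINGS) + 1))]

def phi_n(X, n):
    # Phi_n(X): X is true of those of the first n things which fall in R
    return all(t in X for t in THINGS[:n] if t in R)

def phi_R(X):
    # Phi_R(X): X is true of all things in the range R
    return R <= X

def F_next(X, n):
    # F_{n+1}(X): Rx_{n+1} -> Xx_{n+1}, for the (n+1)st thing THINGS[n]
    return THINGS[n] not in R or THINGS[n] in X

def cond(pred, given):
    # P(pred | given) under the uniform measure on properties
    sat = [X for X in properties if given(X)]
    return sum(1 for X in sat if pred(X)) / len(sat)

p2 = cond(phi_R, lambda X: phi_n(X, 2))                   # after things 0, 1
p3 = cond(phi_R, lambda X: phi_n(X, 3))                   # thing 2: outside R
p4 = cond(phi_R, lambda X: phi_n(X, 4))                   # thing 3: inside R
f3 = cond(lambda X: F_next(X, 2), lambda X: phi_n(X, 2))  # P(F_3 | Phi_2)
f4 = cond(lambda X: F_next(X, 3), lambda X: phi_n(X, 3))  # P(F_4 | Phi_3)
```

Here f3 comes out maximal (1), and correspondingly p3 equals p2: the out-of-range thing affords only a vacuous confirmation; f4 is less than 1, and p4 exceeds p3, exactly as theorem T requires.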
The answer is that this is possible or not, depending upon which is the range of relevance of the generalization, upon what the generalization "is about". If, say, shoes are not within the range of relevance of the generalization that all ravens are black, then shoes cannot afford genuine confirmations of this generalization. This is so, because no truth about shoes can then affect the probability of the generalization that, in the range of relevance in question, all things which are ravens are black. So what is then the range of relevance of the generalization that all ravens are black? Here it should be noted that it is not clear by itself which is the range of relevance of a given generalization such as, e.g., that all ravens are black. Therefore it is not clear either which things will afford genuine and
which only paradoxical confirmations. In order to tell this we shall have to specify the range. Different specifications of the range lead to so many different generalizations, one could say. The generalization that all ravens are black is a different generalization, when it is about ravens and ravens only, and when it is about birds and birds only, and when it is - if it ever is - about all things in the world unrestrictedly. As a generalization about ravens, only ravens are relevant to it, and not, e.g., swans. As a generalization about birds, swans are relevant to it, but not, e.g., shoes. And as a generalization about all things, all things are relevant - and this means: of no thing can it then be proved that the confirmation which it affords is maximally probable relative to the bulk of previous confirmations and therefore incapable of increasing the probability of the generalization. When the range of relevance of a generalization of the type that all A are B is not specified, then the range is, I think, usually understood to be the class of things which fall under the antecedent term A. The generalization that all ravens are black, the range being unspecified, would normally be understood to be a generalization about ravens - and not about birds or about animals or about everything there is. I shall call the class of things which are A the natural range of relevance of the generalization that all A are B. It would be a mistake to think that, when the range of relevance of a generalization is unspecified, it must be identified with the natural range. If it strikes one as odd or implausible to regard the genus bird, rather than the species raven, as the range of relevance of the generalization that all ravens are black, this is probably due to the fact that the identification of birds as belonging to this or that species is comparatively easy.
But imagine the case that species of birds were in fact very difficult to distinguish, that it would require careful examination to determine whether an individual bird was a raven or a swan or an eagle. Then the generalization that all birds which are (upon examination turned out to be) ravens are black might be an interesting hypothesis about birds. Perhaps we can imagine circumstances too under which all things, blankets and shoes and what not, would be considered relevant to the generalization that all ravens are black. But these circumstances would be rather extraordinary. (We should have to think of ourselves as beings who, as it were, put their hands into the universe and draw an object at random.) Only in rare cases, if ever, do we therefore intuitively identify the unspecified range with the whole logical universe of things. It would also be a mistake to think that the range of a generalization must be specified at all. But even when the range is left unspecified we may
have a rough notion of what belongs to it and what does not - and therefore also a rough idea about which things are relevant to testing (confirming or disconfirming) the generalization. No ornithologist would ever dream of examining shoes in order to test the hypothesis that all ravens are black. But he may think it necessary to examine some birds which look very like ravens, although they turn out actually to belong to some other species. 8. In conclusion I shall say a few words about the alleged conflict between the so-called Nicod Criterion and the Equivalence Condition (cf. above, section 1). The Nicod Criterion, when applied to the generalization that all A are B, says that only things which are both A and B afford genuine confirmations of the generalization. Assume now that the range of relevance of the generalization in question is A, i.e. assume that we are considering this generalization relative to what we have here called its natural range. Then, by virtue of what we have proved (sections 4-6), anything which is not-A cannot afford a genuine confirmation of the generalization. In other words: Within the natural range of relevance of a generalization, the class of genuinely confirming instances is determined by Nicod's Criterion. But is this not in conflict with the Equivalence Condition? This condition, as will be remembered, says that what shall count as a confirming (or disconfirming) instance of a generalization cannot depend upon any particular way of formulating the generalization (out of a number of logically equivalent formulations). Do we wish to deny then that the generalization that all A are B is the same generalization as that all not-B are not-A? We do not wish to deny that "all A are B" as a generalization about things which are A expresses the very same proposition as "all not-B are not-A" as a generalization about things which are A.
Generally speaking: when taken relative to the same range of relevance, the generalization that all A are B and the generalization that all not-B are not-A are the same generalization. But the generalization that all A are B with range of relevance A is a different generalization from the one that all not-B are not-A with range of relevance not-B. If we agree that, range of relevance not being specified, a generalization is normally taken relative to its "natural range", then we should also have to agree that, the ranges not being specified, the forms of words "all A are B" and "all not-B are not-A" normally express different generalizations. The generalizations are different, because their "natural" ranges of relevance are different. This agrees, I believe, with how we naturally tend to understand the two formulations.
Speaking in terms of ravens: The generalization that all ravens are black, as a generalization about ravens, is different from the generalization that all things which are not black are things which are not ravens, as a generalization about all not-black things. But the generalization that all ravens are black as a generalization about, say, birds is the very same as the generalization that all things which are not black are not ravens as a generalization about birds. (For then "thing which is not black" means "bird which is not black".) Within its natural range of relevance, the generalization that all A are B can become genuinely confirmed only through things which are both A and B and is "paradoxically" confirmed through things which are B but not A, or neither A nor B. Within its natural range of relevance the generalization that all not-B are not-A can become genuinely confirmed only through things which are neither A nor B and is "paradoxically" confirmed through things which are both A and B, or B but not A. Within the natural range of relevance, Nicod's Criterion of confirmation is necessary and sufficient. Within another specified range of relevance R, the generalization that all A are B may become genuinely confirmed also through things which are B but not A, or neither A nor B. And within the same range of relevance R, the class of things which afford genuine confirmations of the generalization that all A are B is identical with the class of things which afford genuine confirmations of the generalization that all not-B are not-A. Thus, in particular, if the range of relevance of both generalizations is all things whatsoever, i.e. the whole logical universe of things of which A and B can be significantly predicated, then everything which affords a confirming instance of the one generalization also affords a confirming instance of the other generalization, and vice versa, all confirmations being "genuine" and none "paradoxical".
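The upshot of sections 6-8 can be put as a small decision procedure. In this hypothetical Python sketch (the representation of a thing as the set of predicates true of it is my own device, not von Wright's), a confirmation is vacuous when the thing falls outside the chosen range of relevance, and any non-falsifying instance inside the range counts as genuine:

```python
def classify(thing, A='raven', B='black', rng='raven'):
    """Classify a thing as an instance of 'All A are B' relative to a
    range of relevance rng (None = the whole logical universe of things).
    A thing is modelled as the set of predicates true of it."""
    if rng is not None and rng not in thing:
        return 'vacuous'        # outside the range: only "paradoxical" confirmation
    if A in thing and B not in thing:
        return 'disconfirming'  # a counter-instance
    return 'genuine'            # inside the range and not falsifying

black_raven = {'raven', 'bird', 'black'}
white_swan  = {'swan', 'bird'}
black_shoe  = {'shoe', 'black'}
```

With the natural range ('raven'), only black ravens confirm genuinely, as Nicod's Criterion requires; with the wider range 'bird', the white swan becomes a genuine confirming instance; and with an unrestricted range (None), even the black shoe does.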
ASSIGNING PROBABILITIES TO LOGICAL FORMULAS

DANA SCOTT, Stanford University, Stanford, Calif., and PETER KRAUSS, University of California, Berkeley, Calif.*
1. Introduction. Probability concepts nowadays are usually presented in the standard framework of the Kolmogorov axioms. A sample space is given together with a σ-field of subsets - the events - and a σ-additive probability measure defined on this σ-field. When the study turns to such topics as stochastic processes, however, the sample space all but disappears from view. Everyone says "consider the probability that X ≥ 0", where X is a random variable, and only the pedant insists on replacing this phrase by "consider the measure of the set {ω ∈ Ω : X(ω) ≥ 0}". Indeed, when a process is specified, only the distribution is of interest, not a particular underlying sample space. In other words, practice shows that it is more natural in many situations to assign probabilities to statements rather than sets. Now it may be mathematically useful to translate everything into a set-theoretical formulation, but the step is not always necessary or even helpful. In this paper we wish to investigate how probabilities behave on statements, where to be definite we take the word "statement" to mean "formula of a suitable formalized logical calculus". It would be fair to say that our position is midway between that of Carnap and that of Kolmogorov. In fact, we hope that this investigation can eventually make clear the relationships between the two approaches. The study is not at all complete, however. For example, Carnap wishes to emphasize the notion of the degree of confirmation, which is like a conditional probability function. Unfortunately the mathematical theory of general conditional probabilities is not yet in a very good state. We hope in future papers to comment on this problem. Another question concerns the formulation of
* This work was partially supported by grants from the National Science Foundation and the Sloan Foundation.
interesting problems. So many current probability theorems involve expectations and limits that it is not really clear whether consideration of probabilities of formulas alone really goes to the heart of the subject. We do make one important step in this direction, however, by having our probabilities defined on infinitary formulas involving countable conjunctions and disjunctions. In other words, our theory is σ-additive. The main task we have set ourselves in this paper is to carry over the standard concepts from ordinary logic to what might be called probability logic. Indeed ordinary logic is a special case: the assignment of truth values to formulas can be viewed as assigning probabilities that are either 0 (for false) or 1 (for true). In carrying out this program, we were directly inspired by the work of Gaifman [1964] who developed the theory for finitary formulas. Aside from extending Gaifman's work to the infinitary language, we have simplified certain of his proofs making use of a suggestion of C. Ryll-Nardzewski. Further we have introduced a notion of a probability theory, in analogy with theories formalized in ordinary logic, which we think deserves further study. In Section 2 the logical languages are introduced along with certain syntactical notions. In Section 3 we define probability systems which generalize relational systems as pointed out by Gaifman. In Section 4 we show how, given a probability system, the probabilities of arbitrary formulas are determined. In Section 5 we discuss model-theoretic constructs involving probability systems. In Section 6 the notion of a probability assertion is defined, which leads to the generalization of the notion of a theory to probability logic. In Section 7 we specialize and strengthen results for the case of finitary formulas. In Section 8 examples are given. An appendix (by Peter Krauss) is devoted to the mathematical details of a proof of a measure-theoretic lemma needed in the body of the paper. 2.
The languages of probability logic. Throughout this paper we will consider two different first-order languages, a finitary language ℒ^(ω) and an infinitary language ℒ. To simplify the presentation both languages have an identity symbol = and just one non-logical constant, a binary predicate R. Most definitions and results carry over with rather obvious modifications to the corresponding languages with other non-logical constants, and we will occasionally make use of this observation when we give specific examples. The language ℒ^(ω) has a denumerable supply of distinct individual variables v_n, for each n < ω, and ℒ has distinct individual variables v_ξ, for each ξ < ω₁, where ω₁ is the first uncountable ordinal. Both languages have logical
constants ∧, ∨, ¬, ∀, ∃, and = standing for (finite) conjunction, disjunction, negation, universal and existential quantification, and identity as mentioned before. In addition the infinitary language ℒ has logical constants ⋀ and ⋁ standing for denumerable conjunction and disjunction respectively. The expressions of ℒ are defined as transfinite concatenations of symbols of length less than ω₁, and the formulas of ℒ^(ω) and ℒ are built from atomic formulas of the forms Rv_ξv_η and v_ξ = v_η in the normal way by means of the sentential connectives and the quantifiers. Free and bound occurrences of variables in formulas are defined in the well-known way. (For a more explicit description of infinitary languages see the monograph Karp [1964].) A sentence is a formula without free variables. We will augment the non-logical vocabulary of our languages with various sets T of new individual constants t ∈ T and denote the resulting languages by ℒ^(ω)(T) and ℒ(T) respectively. It is then clear what the formulas and sentences of ℒ^(ω)(T) and ℒ(T) are. For any set T of new individual constants let 𝒮 and 𝒮(T) be the set of sentences of ℒ and ℒ(T) respectively, and let 𝒪(T) be the set of quantifier-free sentences of ℒ(T). We adopt analogous definitions for the language ℒ^(ω). If Σ is a set of sentences and φ is a sentence, then φ is a consequence of Σ if φ holds in all models in which all sentences of Σ hold, and we write Σ ⊨ φ. φ is valid if it is a consequence of the empty set, and we write ⊨ φ. For both languages ℒ^(ω) and ℒ we choose standard systems of deduction, and we write Σ ⊢ φ if φ is derivable from Σ. φ is a theorem if φ is derivable from the empty set, and we write ⊢ φ. (For details concerning the infinitary language we again refer the reader to Karp [1964].) By the well-known Completeness Theorem of finitary first-order logic we have for every Σ ⊆ 𝒮^(ω) and every φ ∈
two sentences
0. Thus there are uncountably many p's such that μ(p) > 0. Thus for some n < ω there are uncountably many
ξ < ω₁ such that μ(p_{ξn}) > 0. Since μ([p_{ξn} ∧ p_{ξ′n}]) = 0 for ξ ≠ ξ′, this is a contradiction. Obviously every complete and consistent set of sentences of an infinitary propositional language has a model. In infinitary propositional logic the trouble therefore arises from the fact that the Prime Ideal Theorem fails for Boolean σ-algebras. Naturally the question arises: Does every complete and consistent set Σ ⊆ 𝒮 have a model? The answer is again no, and a counter-example is due to Professor C. Ryll-Nardzewski. Interestingly enough the counter-example produces a probability model of the complete consistent set of sentences under consideration. The question of whether every complete consistent set Σ ⊆ 𝒮 has a probability model can, however, be settled by a similar counter-example, and we shall discuss both of these examples in a form slightly modified from Ryll-Nardzewski's original suggestion. Let ℒ be an infinitary language with countably many one-place predicates P_j, for each j < ω, and define a probability model 𝔄 = ⟨A, R_j, 𝒜, m⟩_{j<ω} as follows: Let A = ω, and let 𝒜 be the Borel sets of the product space (2^ω)^ω; that is, the σ-field of subsets of (2^ω)^ω generated by all sets of the form {ξ ∈ (2^ω)^ω : ξ(i)(j) = 1}, where i, j < ω. Let m be the product measure on 𝒜 determined by m({ξ ∈ (2^ω)^ω : ξ(i)(j) = 1}) = ½ for all i, j < ω. Finally, for j < ω, define R_j(i) = {ξ ∈ (2^ω)^ω : ξ(i)(j) = 1} for all i ∈ A. (Note: strictly speaking 𝔄 is not a probability model since ⟨𝒜, m⟩ is not a measure algebra. Thus we would have to consider the quotient algebra 𝒜/I of 𝒜 modulo the σ-ideal I = {x ∈ 𝒜 : m(x) = 0}, and lift m up to a strictly positive probability on 𝒜/I. In this example, however, all sup's and inf's in 𝒜 that have to be taken into consideration are countable; clauses (vi) and (vii) of the definition of the valuation function h make sense; and everything comes out just the same. We can omit the tedious details.)
Then let T = {t_i : i ∈ A} be a set of new individual constants such that t_i ≠ t_{i′} if i ≠ i′. Now we observe that for every φ ∈ 𝒮, the element h(φ) ∈ 𝒜 is invariant under all finite permutations of the second coordinate in (2^ω)^ω. By the well-known 0-1 Law (Hewitt and Savage [1955] p. 496) m is two-valued on h(φ). Thus the set Σ = {φ ∈ 𝒮 : m(h(φ)) = 1} is a complete and consistent theory of ℒ. We wish to show that Σ has no model. Indeed, suppose 𝔅 = ⟨B, S_j⟩_{j<ω} is a model of Σ. Since B must be non-empty, let b ∈ B. For j < ω define formulas Q_j(v) = P_j(v), if b ∈ S_j; while Q_j(v) = ¬P_j(v), if b ∉ S_j. Then ∃v[⋀_{j<ω} Q_j(v)] holds in 𝔅. However, as a
straightforward computation shows, m(∃v[⋀_{j<ω} Q_j(v)]) = 0, which is a contradiction. On the other hand, by its very construction 𝔄 is a probability model of Σ; that is, μ_𝔄(φ) = 1 for all φ ∈ Σ.
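The "straightforward computation" can be reconstructed as follows (my reconstruction; the ε_j notation is not in the text). Writing ε_j = 1 if b ∈ S_j and ε_j = 0 otherwise, h(∃v ⋀_{j<ω} Q_j(v)) is the union over rows i of the event that row i realizes the fixed 0-1 pattern (ε_j)_{j<ω}; since the coordinates are independent events of measure ½ under the product measure,

```latex
m\Bigl(\exists v \bigwedge_{j<\omega} Q_j(v)\Bigr)
  = m\Bigl(\bigcup_{i<\omega}\,\bigcap_{j<\omega}
      \{\xi \in (2^\omega)^\omega : \xi(i)(j) = \varepsilon_j\}\Bigr)
  \le \sum_{i<\omega}\ \prod_{j<\omega} \tfrac12
  = \sum_{i<\omega} 0
  = 0 .
```

Each row event is an intersection of countably many independent halving events and so is null, and a countable union of null sets is null.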
For our second example we let 𝔄′ = ⟨A, R′_j, 𝒜′⟩_{j<ω} be that Boolean-algebraic model where A = ω, where 𝒜′ = 𝒜/J, the ideal J being the σ-ideal of all first-category sets in the Borel algebra 𝒜, and where R′_j(i) = R_j(i)/J for all i, j < ω. Since A is countable, we note that the valuation h′ for 𝔄′ is such that h′(
A′ is a tautology, and therefore it has probability 1, for any probability function. And it follows from this that any probability function P such that P(B_i) = 1 for i = 1, ..., n also assigns P(A′) = 1. But it follows directly from the definition of a probability function that if C is any formula and C′ is its 'associated' material conditional (or C′ = C, if C is truth-functional), then P(C) = 1 if and only if P(C′) = 1. Hence, if P(B_i) = 1, for i = 1, ..., n, then P(A) = 1, and therefore A is a strict consequence of S′, and hence of S. This concludes the proof of 1.1. Proof of 1.2. This follows immediately from Theorem 1.1 just proven. For, if S ⊩ A, but not S ⊢ A, then A is not a tautological consequence of S, and therefore there is a truth-assignment to the formulas of the language under which all formulas of S have the value 'true', but A has the value 'false'. Again, a probability function can be defined as in the proof of 1.1,
ERNEST W. ADAMS
such that P(B) = 1 for all B in S, but P(A) = 0, from which it follows trivially that A cannot be a reasonable consequence of S. This proves 1.2. Proof of 1.3. That if A is in S, then S ⊩ A is obvious. To prove part (ii), suppose that S and S′ are both finite, that S′ ⊩ B for all B in S, and S ⊩ A. For any ε > 0 there exists δ > 0 such that if P(B) > 1−δ for all B in S, then P(A) > 1−ε. Since S′ ⊩ B for all B in S, there exists δ_B for all B in S such that if P(C) > 1−δ_B for all C in S′, then P(B) > 1−δ. Since S is finite, there exists a minimum δ_B for all B in S, which is positive; let δ₀ be this minimum. Clearly, then, if P(C) > 1−δ₀ for all C in S′, then P(B) > 1−δ for all B in S, and therefore P(A) > 1−ε. Hence S′ ⊩ A, as was to be shown. Part (iii) follows directly from the fact that, for any probability function P′ of ℒ it is possible to construct another probability function P of ℒ such that P(B) = P′(B′) for all formulas B and B′ of ℒ, where B′ results from B by replacing all occurrences of α in B by φ. The construction of P is elementary and will not be described here. Assuming this construction, it follows directly that if not S′ ⊩ A′, then not S ⊩ A. For, if there were some ε > 0 such that for all δ > 0 there existed a probability function P′ such that P′(B′) > 1−δ for all B′ in S′, but P′(A′) ≤ 1−ε, then it would also be the case that P(B) > 1−δ for all B in S but P(A) ≤ 1−ε, and hence not S ⊩ A. This concludes the proof. Theorem 1.1 shows that the notion of strict consequence is of no formal interest, since it is equivalent to tautological consequence. The intuitive significance of Theorem 1.1 is that it suggests that we should not 'get in trouble' in analyzing logical relations among conditional statements by treating them as material conditionals, so long as the premises of our arguments can be asserted with logical certainty.
That is, where we may expect trouble in applications of standard logic is in situations in which we are reasoning from premises which are not known with certainty. Theorem 1.3 is significant in showing that the reasonable consequence relation has at least some minimal properties of deduction relations, and therefore justifies calling this a 'consequence' relation, at least as applied to finite sets of premises. That the probabilistic consequence relation is not a deduction relation where its domain is extended to include infinite sets of formulas is seen from the fact that it fails to satisfy the compactness condition: i.e. there are infinite sets of formulas, S, and formulas A, such that S ⊩ A, but not S′ ⊩ A for any finite subset, S′, of S. An example of a set S and formula A having this property is as follows. Let S be the set of all formulas B_i = 'a_i ∨ a_{i+1} → a_{i+1} & ¬a_i' for i = 1, 2, ... (where the 'a_i' are distinct atomic formulas), and let A = 'a_1 → F'. Now it is a trivial consequence of the axioms of probability that if P(B_i) > ⅔ for all i = 1, 2, ..., then P(a_i) ≤ ½P(a_{i+1}) for all i,
PROBABILITY AND THE LOGIC OF CONDITIONALS
from which it follows that P(a_1) must be 0, hence P(a_1 → F) = 1. Clearly, therefore, S ⊩ A, since an arbitrarily high probability for A can be guaranteed by requiring that all formulas of S have probability of at least ⅔. On the other hand, the same argument shows that for any finite subset S′ of S, an assignment P(a_1) > 0, and therefore P(a_1 → F) = 0, is consistent with assigning arbitrarily high probabilities to all formulas of S′, so it is not the case that S′ ⊩ A.
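The finite-subset half of the argument can be checked numerically. The following Python sketch (mine; it assumes Adams' identification of the probability of a conditional with the corresponding conditional probability) puts mass proportional to r^i on sample points w_1, ..., w_{k+1}, with a_i true exactly at w_i; every B_i then has probability r/(1+r), which exceeds ⅔ for r > 2, while P(a_1) stays positive, so P(a_1 → F) = 0.

```python
from fractions import Fraction

def finite_model(k, r=10):
    """Mass proportional to r**i on points w_1..w_(k+1); a_i true only at w_i.
    Returns P(a_1) and the probabilities of B_1..B_k, where
    P(B_i) = P(a_{i+1} & ~a_i | a_i v a_{i+1})."""
    w = [Fraction(r) ** i for i in range(1, k + 2)]
    total = sum(w)
    P = [wi / total for wi in w]   # P[i-1] = P(a_i)
    # a_i v a_{i+1} covers w_i and w_{i+1}; a_{i+1} & ~a_i is just w_{i+1}
    p_B = [P[i] / (P[i - 1] + P[i]) for i in range(1, k + 1)]
    return P[0], p_B

p_a1, probs = finite_model(5)
```

Increasing r drives every P(B_i) = r/(1+r) as close to 1 as desired while P(a_1) remains positive, which is exactly why no finite subset of S forces the conclusion a_1 → F.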
In what follows we shall be concerned exclusively with the reasonable consequence relation restricted to finite sets of premises. It will be proved that the reasonable consequence relation restricted to finite sets is equivalent to several other conditions with intuitive significance, and in fact it is possible to give a system of rules of inference within a natural deduction system such that a conclusion, A, follows from a finite set, S, of premises if and only if A is derivable from S by those rules. These rules will be given in the following section, in the definition of the relation of 'probabilistic consequence', and it will be shown that derivability in accordance with these rules is a sufficient condition for a conclusion to be a reasonable consequence of premises. The proof that probabilistic consequence is also a necessary condition for reasonable consequence (the completeness proof) is more difficult, and requires further preliminaries. 3. Probabilistic consequence. We now give a set of rules for deriving 'probabilistic consequences' from sets of formulas, S, and show that if a formula, A, is a probabilistic consequence of S, then A is a reasonable consequence of S. The rules for deriving probabilistic consequences form the clauses of Definition 6, below. DEFINITION 6. Let S be a set of formulas. Then the set of probabilistic consequences (abbreviated 'p.c.s.') of S is the smallest set S′ having S as a subset such that for all truth-functional formulas φ, ψ, and χ: PC1. if
0 there exists δ > 0 such that for all regular probability functions P, if P(B) > 1−δ for all B in S, then P(A) > 1−ε. We will now consider briefly and informally the consequences of this modified criterion of reasonableness, confining attention exclusively to finite sets of premises. r-reasonable consequence is a relation intermediate in strength between reasonable consequence and tautological consequence: i.e., for premises S and conclusion A, if S ⊩ A then S ⊩_r A, and if S ⊩_r A then S ⊢ A. The relation ⊩_r also satisfies two conditions for deduction relations stated in Theorem 1.3, but it fails to satisfy the substitutivity condition: if S ⊩ A, and S′ and A′ result from S and A, respectively, by replacing their atomic formulas by other formulas, then S′ ⊩ A′. In particular, the relation p → F ⊩_r F holds, but F → F ⊩_r F does not. Since S ⊩ A entails S ⊩_r A, all of the basic rules of probabilistic consequence (Definition 6) are also valid for r-reasonable consequence. One rule not valid for reasonable consequence in general, but valid for r-reasonable consequence, is the following "stop rule": PCr. If φ does not tautologically imply F, then φ → F ⊩_r F. It can be proved that, in fact, if PCr is added to rules PC1–PC8 for probabilistic consequence, then any consequence A follows from a (finite) set of premises S according to the augmented rules if and only if S ⊩_r A. Along with the notion of r-reasonable consequence, it is possible to define the notion of an rP-ordering, which is a P-ordering (Definition 7.1) ≤ such that for all truth-functional φ, φ ≤ F holds only if φ is tautologically false. And, as would be expected, all rP-orderings are associated with uniform sequences of regular probability functions. The most important fact about the modified criterion of reasonableness is this: for any formula A and set of formulas S, if S ⊩_r A but not S ⊩ A then (1) S ⊩_r F, (2) not S ⊩ F, and (3) for some non-tautologically false formula φ, S ⊩ φ → F.
The significance of the foregoing is this: if a formula A is an r-reasonable consequence of S, but not a reasonable consequence of S, then S must be 'r-reasonably inconsistent'. And the reason for this must be that some formula of the form φ → F is derivable from S, where φ is not tautologically false. We have just seen that the only case in which r-reasonable consequence differs from reasonable consequence is that in which the set of premises, S, has a reasonable consequence of the form φ → F, where φ is not tautologically false. Under those circumstances S ⊩ᵣ F, and therefore S ⊩ᵣ A for all formulas A. What can be said about sets of premises S which do have formulas of the form φ → F as reasonable consequences? The following can be proven: S entails such a formula if and only if it has a subset S' which is 'empirically inconsistent' in the sense next defined.
316
ERNEST W. ADAMS
A set S' of formulas is empirically inconsistent if and only if: (1) there is a truth assignment under which at least some members of S' are verified or falsified, and (2) every truth assignment which verifies some member of S' falsifies some other member of S'. A pair of formulas which are empirically inconsistent are p → q and p → -q, since it is possible to verify or falsify them, but any truth assignment verifying one falsifies the other. A single formula which is empirically inconsistent is p → -p, since it is possible to falsify it, but not to verify it. The intuitive motivation for calling situations like those described above 'empirically inconsistent' is that they represent conditional assertions such that, should any of the conditions of the assertions be 'realized', then at least one of the assertions would be falsified (and it must be possible to realize the conditions). Thus, one may 'get away with' asserting both p → q and p → -q simply because it happens that p proves to be false, so neither assertion is 'contradicted by the facts'. Nevertheless, the assertions do 'conflict' in that they both give information about a contingency (that p will prove true), but that information is inconsistent: we certainly couldn't rely on these assertions to guide us in making provision for the contingency that p might be true. The foregoing argument constitutes further intuitive justification for adding the rule of inference PCr to the rules which are valid for reasonable consequence (for instance, rules PC1–PC8). What the argument suggests is that any premises which do have a formula of the form
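For small truth-functional examples, the definition of empirical inconsistency can be checked by brute force over truth assignments. In this sketch (the encoding of a conditional as an antecedent/consequent pair of Python predicates is an illustrative assumption, not the paper's formalism), a conditional is verified by an assignment making both antecedent and consequent true, and falsified by one making the antecedent true and the consequent false:

```python
from itertools import product

# Brute-force test of empirical inconsistency for a set of conditionals.
# Each conditional is encoded (an assumption of this sketch) as a pair
# (antecedent, consequent) of predicates over truth assignments.

def assignments(atoms):
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def verified(cond, v):
    ante, cons = cond
    return ante(v) and cons(v)       # condition realized, assertion borne out

def falsified(cond, v):
    ante, cons = cond
    return ante(v) and not cons(v)   # condition realized, assertion refuted

def empirically_inconsistent(conds, atoms):
    vs = list(assignments(atoms))
    # (1) some assignment verifies or falsifies at least one member
    decidable = any(verified(c, v) or falsified(c, v)
                    for v in vs for c in conds)
    # (2) every assignment verifying some member also falsifies some member
    clash = all(any(falsified(c, v) for c in conds)
                for v in vs
                if any(verified(c, v) for c in conds))
    return decidable and clash

p = lambda v: v['p']
q = lambda v: v['q']
not_p = lambda v: not v['p']
not_q = lambda v: not v['q']

print(empirically_inconsistent([(p, q), (p, not_q)], ['p', 'q']))  # True
print(empirically_inconsistent([(p, not_p)], ['p']))               # True
print(empirically_inconsistent([(p, q)], ['p', 'q']))              # False
```

For the single formula p → -p, clause (2) holds vacuously, since the formula can never be verified; clause (1) holds because the assignment making p true falsifies it, matching the example in the text.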
SUBJECT INDEX
acceptability of a generalization, 11, 10
- of singular hypotheses, 15, 18
acceptance of a generalization, rule for, 11, 13
accepted-information model of science, 97
adequate evidence, 2
atomistic universe, 130
attention, mechanism of, 64
-, principle of, 63
axioms of order, 158
background knowledge, 177, 180, 186
Bayes' theorem, 56
Bayesian approach, 198
- - to the paradoxes of confirmation, 195, 198
- model, 28, 37
- - with memory of length one, 38
- theory, 22, 27
- viewpoint, 52, 61, 202
betting theory, 67
Boolean algebra, 244
- σ-algebra, 238
Carnapian inductive logic, 35
causal laws, 198, 204, 207
certainty, 307
chess, 23, 43
cognitive model, 36
commonness, measure of, 87
-, principle of maximum, 88, 94
compactness condition, 276
- theorem, 248
completeness, 292
- proof, 277
- theorem, 221, 297
comprehensive, 179, 186
- instances, 177
-, of a universal implication, 176
concept formation, 21
- identification, 22, 24, 29
conditionals, accidental, 189, 190, 191, 193
-, connected, 189, 193
-, counterfactual, 188
-, indicative, 188
-, logic of, 265
-, subjunctive, 188
conditioning, 26
- model, generalized, 41, 46
confirmation, 176, 178, 180, 181, 182, 184, 185, 186, 196, 197
-, degree of, 219
- function c* (Carnap), 73, 102, 123, 127, 128, 133, 149, 154
- - c+ (Carnap), 123
-, paradoxes of, 175, 198, 208
- theory, 201
consistency and logical closure as conditions of acceptability, 4, 12
consistency and logical closure as epistemic principles, 3
constituent, 7, 99, 114, 133, 160
-, attributive, 7, 99, 133, 158
-, depth of a, 153
constituent, graph of a, 160, 173
-, subordinate, 151
constituents and structure-descriptions, 150
content, informative, 99, 100
contingency table, 205
contraposition, 299
-, law of, 269
contrapositive, 186, 190, 192, 194, 196, 198
-, asymmetry between it and its original, 178
- instances, 177, 196
- instance, restricted, 182
-, probabilistic, 206
-, restricted, 179, 191, 192
-, of a universal implication, 175
corroboration, 107, 111
-, degree of, 109, 110
countable consistent set, 237
Ct-predicate, 7, 99, 133
decision procedure, 292
- theory, 21, 200
deduction relation, 298
- theorem, 311
deductive-nomological model of science, 97
degree of covering, 84, 94
- -, principle of maximal, 94
denumerably infinite model, 240
depth, 134
- of a constituent, 153
- of a sentence, 141, 158, 162
detachment, rule of, 52
- in probabilistic inference, 50
deterministic hypotheses, 205
direct product, 230
directed union theorem, 251
distributive normal form, 9, 99
Doob's separability theorem, 258
downward Löwenheim-Skolem theorem, 251
effort of description, 71
eliminating quantifiers, 241
empirical evidence, 182, 183, 184, 185
- support, 178, 184, 185, 186
- -, as a pragmatic concept, 183
empirically inconsistent, 316
entropy, 66, 70, 76, 78
entropy, information and, 77
epistemic utility, 96, 98, 107
equivalence condition, 175, 209, 217
evidence, 58
-, concept of total, 49
evidential strength, 81
-, information-theoretical measure of, 83
- -, probabilistic measure of, 83
expected utility, 98, 107, 108, 110
- -, classical theory of maximizing, 63
finitely additive probability, 222, 244
Gaifman condition, 224, 225, 256, 257
generalization, strong, 98, 99
-, weak, 98, 99
generalizations, degree of confirmation of, 6, 109, 115, 124, 126, 130, 150, 168
- and knowledge, 18
grammar, 42
graph, 168
- of a constituent, 160, 173
-, matrix associated with a, 162
-, strongly connected, 161, 162, 169
hypothesis, range of a, 179
hypothetical syllogism, 268
independent union, 227
inductive logic, 202
- -, Carnapian, 35
inference, non-deductive principle of, 56
- rule, non-reasonable, 312
infinitary formula, 220
information, 77
-, absolute, 82, 101, 106
-, amount of, 81
- and entropy, 77
- overlap, 84
- - or covering, principle of, 85
- -, principle of maximum, 94
-, principle of minimum added, 94
-, processing, 202
- -, problem of, 62
-, its relation to probability, 82, 100, 106, 107
-, relative, 82, 101, 106
-, semantic, 96, 98
-, - theoretical measure of evidential strength, 83
instance-confirmation, 7
inverse probability, 195
Kelley property, 237
knowledge, background, 177, 180, 186
-, definition of, 1
-, generalizations and, 18
- and probability, 1, 5, 12
Kolmogorov axioms, 219, 272
ℒ, probability function for, 273
-, SD-set of, 272
-, state description for, 271
λ-continuum of inductive methods (Carnap), 75, 113, 115, 119, 122, 129
language, 270
-, finitary, 220, 244
-, finite, 271, 283, 291, 308
-, infinitary, 236
lawlikeness, 180, 181
length of a state description, 67
limited relevance, principle of, 175, 176
logical closure as a condition of acceptability, 4, 12
lottery paradox, 4, 17
-, avoided, 17
Löwenheim-Skolem theorem, downward, 251
-, upward, 252
Markov process, 78
material implication, 265
- -, fallacy of, 307
- -, its role in the paradoxes of confirmation, 180, 186, 188
mathematical learning theory, 41
matrix, 162
- associated with a graph, 162
-, Jordan form of, 165, 171
-, permutation, 165, 166
maximum likelihood principle, 89
memory, 34
- of a computer, 203
-, finite, 34
- of length one, Bayesian model with, 38
- of minimal length, 38
-, restricted, 37
Nicod's criterion, 208, 209, 217
noncontingent reinforcement, 36
non-deductive principle of inference, 56
non-reasonable inference rule, 312
numerical quantifier, 157
ordered universe, 155
orderings for sets of formulas, standard, 288
P-ordering, 282
paired-associate experiment, 24
paradox, lottery, 4, 17
- of universal relevance, 176
paradoxes of confirmation, 175, 198, 208
- -, Bayesian approaches to the, 195, 198
- -, possible solutions to, 186, 195, 199
parameter α (Hintikka), 13, 103, 104, 113, 115, 116, 118, 130
- - as an index of caution, 117, 131
- - and inductive behavior, 127
- -, infinite, 122
- -, zero, 123, 126
- λ (Carnap), 13, 75, 76, 77, 113, 115, 118, 122, 127
- - as a function of the number of instantiated Q-predicates, 119
parameters α and λ, relation of, 118
permutation matrix, 165, 166
polynomial inequality, 249
prior distribution, 203
probabilistic consequence, 277, 288
- contrapositive, 206
- inference, 50
- -, rule of detachment in, 50
- measure of evidential strength, 83
probability assertion, 233
-, finitely additive, 222, 244
- function for ℒ, 273
- -, regular, 314, 315
-, inverse, 195
- and knowledge, 1, 5, 12
- law, 234
- logic, 220
- model, 235, 238
- -, denumerable, 244, 252
- -, denumerable symmetric, 252
- -, with strict identity, 248
- -, symmetric, 242, 253
- -theoretic concept, 227
probability, principle of maximum, 91
- systems, symmetric, 231, 252
- theory, 220
product, direct, 230
-, ultra-, 230
property space, 44
propositional calculus, 265
psychological model, 24, 25
pure prudence, rule of, 63
Q-predicate, 76, 99, 114, 133
quantifiers, eliminating, 241
-, numerical, 157
Rasiowa-Sikorski lemma, measure-theoretic generalization of the, 259
range of relevance, 211, 215
- -, natural, 216, 217, 218
rational behavior, 23, 35
- change in belief, 60
rationality, 23, 67
-, theory of, 63
reasonable consequence, 269, 274, 300, 302, 308
reasonableness, criterion of, 267
-, modified criterion of, 314
reduction sequence, 288
redundancy, 77, 78
requirement of the variety of instances, 125
simplicity, 66, 70, 76, 78, 96, 163, 167
-, principle of, 66, 73, 79
- of a state description, 67, 73
singularity, 140, 141, 146, 149
-, existence of, 147, 151, 153
-, observed, 146, 149
-, unobserved, 146, 149
space of properties, 44
state description for ℒ, 271
-, length of a, 67
-, simplicity of a, 67, 73
statistical independence, assumption of, 202
- syllogism, 49, 57
stimulus-association model, 24, 25
- - sampling model, 35
- - theory, 39
straight rule, 123
strict consequence, 274
strong entailment, 297, 300, 302
structure description, 71
substitutivity condition, 315
tautologically valid, but not reasonable, 311
truth condition, 265
ultraproduct, 230
uniform sequence, 284
uniformity, 76
upward Löwenheim-Skolem theorem, 252
utility, epistemic, 96, 98, 107
valuation function, 223