S U B J U N C T I V E REASONING
PHILOSOPHICAL STUDIES SERIES IN PHILOSOPHY Editors: WILFRIDSELLARS,University of Pittsburgh KEITHLEHRER,University of Arizona
Board of Consulting Editors: JONATHAN BENNETT,University of British Columbia A L A N G I B B A R DUniversity , of Pittsburgh ROBERTSTALNAKER, Cornell University ROBERTG . T U R N B U L LOhio , State University
VOLUME 8
JOHN L . POLLOCK University of Rochester
SUBJUNCTIVE REASONING
D . R E I D E L PUBLISHING C O M P A N Y DORDRECHT-HOLLAND/BOSTON-U . S . A .
Library of Congress Cataloging in Publication Data Pollock, John L Subjunctive reasoning. (Philosophicalstudies series in philosophy ; v. 8) Bibliography: p. Includes index. 1. Conditionals (Logic) 2. Reasoning. 3. Counterfactuals (Logic) 4. Probabilities. I. Title. BC199.C56P64 160 76-19095 ISBN 90-277-0701 -4
Published by D. Reidel Publishing Company, P.O.Box 17, Dordrecht, Holland Sold and distributed in the U.S.A., Canada, and Mexico by D. Reidel Publishing Company, Inc. Lincoln Building, 160 Old Derby Street, Hingham, Mass. 02043, U.S.A.
All Rights Reserved Copyright @ 1976 by D. Reidel Publishing Company, Dordrecht, Holland No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any informational storage and retrieval system, without written permission from the copyright owner Printed in The Netherlands
TO CAROL who puts up with me
T A B L E OF C O N T E N T S
PREFACE I. INTRODUCTION
1. Subjunctive Reasoning 2. The Linguistic Approach 3. The 'Possible Worlds' Approach 4. Conclusions Notes 11. F O U R KINDS O F CONDITIONALS
1. Introduction 2. The Four Kinds 3. 'Even i f Subjunctives 4. 'Might Be' Conditionals 5. Necessitation Conditionals 6. Simple Subjunctives 7. The Axiomatization of Simple Subjunctives 8. Conclusions Notes 111. S U B J U N C T I V E G E N E R A L I Z A T I O N S
1. 2. 3. 4. 5.
Introduction Rudiments of an Analysis Strong Generalizations Weak Generalizations Conclusions Notes
IV. T H E BASIC ANALYSIS O F SUBJUNCTIVE CONDITIONALS
1. Introduction 2. The Analysis of M 3. Simple Propositions
VIII
TABLEOFCONTENTS
4. Counter-Legal Conditionals 5. Subject Preference Notes V. Q U A N T I F I C A T I O N , MODALITIES, A N D C O N D I T I O N A L S
1. 2. 3. 4. 5.
Referential Opacity Transworld Identity Kripke's Observation Quantified Modal Logic Conditionals Notes
VI. THE FULL T H E O R Y
1. Syntax 2. Semantics 3. Infinitary Operators 4. The Introduction of Sets 5. Some Consequences of the Analysis Note VII. C A U S E S
1. Introduction 2. The Ontology of Causes 3. Some Causal Relations 4. Causal Sufficiency 4.1. Nomic Subsumption 4.2. Contingently Sufficient Conditions 4.3. Causal Sufficiency and Subjunctive Conditionals 5. Remarks on the Analysis 6. The Logic of Causes Notes VIII. PROBABILITIES
1. Introduction 2. Indefinite Probabilities 2.1. Relative Frequencies 2.2. Subjunctive Indefinite Probabilities
104
TABLE OF C O N T E N T S
2.3. Strong Indefinite Probabilities 2.4. Weak Indefinite Probability Statements 3. The Redefinition of M 3.1. Strong and Weak Definite Probability 3.2. Probabilistic Laws 3.3. The Analysis of M 4. Simple Subjunctive Probability 4.1. The Variety of Probabilities 4.2. Simple Subjunctive Probability 4.3. A Probability Algebra 4.4. Simple Subjunctive Probability and Simple Subjunctive Conditionals 4.5. Simple Indefinite Probabilities Notes IX. DISPOSITIONS
1. Introduction 2. Absolute Dispositions 3. Probabilistic Dispositions Notes BIBLIOGRAPHY INDEX
PREFACE
I am indebted to many people for the help they gave me in the writing of this book. I owe a large debt to David Lewis and Robert Stalnaker, on both general and specific grounds. As becomes apparent from reading the notes, the book would not have been possible without their pioneering work on subjunctive conditionals. In addition, both were kind enough to provide specific comments on earlier versions of different parts of the book, and Stalnaker read and commented on the entire manuscript. Closer to home, I am indebted to my colleagues Rolf Eberle and Henry Kyburg, Jr., my erstwhile colleague Keith Lehrer, and numerous graduate students for their helpful comments on various parts of the manuscript. Some of the material contained herein appeared first in the form of journal articles, and I wish to thank the journals in question for allowing the material to be reprinted here. Chapter One contains material taken from 'The "Possible Worlds" Analysis of Counter-factuals', published in Phil. Studies 29 (1976), 469 (Reidel); Chapter Two contains material much revised from 'Four Kinds of Conditionals', Am. Phil. Quarterly 12 (1975), and Chapter Three contains much revised material from 'Subjunctive Generalizations', Synthese 28 (1974), 199 (Reidel).
CHAPTER I
INTRODUCTION
There exists quite a variety of statements which are in some sense 'subjunctive'. The best known of these are the so-called 'counterfactual conditionals' which state that if something which is not the case had been the case, then something else would have been true. An example is 'If Kennedy had been president in 1972, the Watergate scandal would not have occurred'. Ordinary people use counterfactuals all the time, and philosophers use them freely in ordinary situations. However, when they are being careful, philosophers have traditionally felt uncomfortable about counterfactuals and eschewed their employment in philosophical analysis. Such philosophical squeamishness is on the whole meritorious and results from the recognition that counterfactual conditions have themselves stubbornly resisted philosophical analysis. Although counterfactual conditionals constitute the kind of subjunctive statement which comes most readily to the mind of a philosopher, it is far from being the only philosophically important kind of subjunctive statement. Philosophers have long recognized that laws of nature cannot really be formulated using universally quantified material conditionals, but they have not usually been prepared to go the extra distance of admitting that statements expressing such laws are really subjunctive. It turns out that laws of nature must be formulated using a special kind of subjunctive statement, herein called a 'subjunctive generalization'. Subjunctive generalizations prove to be of pre-eminent importance in discussing inductive confirmation. Causal statements constitute another category of statements which are in the requisite sense 'subjunctive'. The analysis of causal statements has always been recognized to be an extraordinarily difficult philosophical problem, but I think it has rarely been appreciated that the main source of this difficulty lies in the subjunctive nature of causal statements.
2
CHAPTER I
A traditional philosophical problem is the analysis of probability statements. There are in fact a number of different concepts equally deserving of being called 'probability'. There is not just one legitimate concept of probability. Philosophers have succeeded in sorting out and distinguishing between a number of different probability concepts, including 'degree of confirmation', 'degree of belief', 'degree of rational belief', and others. However, there are many probability statements which cannot be formulated using any of these recognized 'indicative' probability concepts. It will turn out that there are some extremely important probability concepts that are in essential ways 'subjunctive' and which have been almost entirely overlooked by philosophers bent upon the avoidance of suspicious (to them) subjunctive reasoning. A problematic concept which has usually been recognized to have a subjunctive core is that of a disposition. Dispositions constitute an important tool of philosophical analysis. Particularly in the philosophy of mind, philosophers have felt that through the use of dispositions they could clarify the structure of interesting concepts. But, of course, the extent of such clarification has been strictly limited by the apparent need for clarifying dispositions themselves. These various subjunctive concepts map out areas of what might be called 'subjunctive reasoning'. Subjunctive reasoning in general has been deemed philosophically problematic because it seems to presuppose a strange metaphysically suspicious sort of logically contingent necessity. T o say that the Watergate scandal would not have occurred had Kennedy been president in 1972, seems to be to assert some kind of necessary connection between those two states of affairs. If there were no such connection, how could the occurrence of the one possibly effect the occurrence of the other? This same kind of necessity rears its ugly head repeatedly throughout subjunctive reasoning. The necessity in question is clearly not logical necessity, but what other kind is there? Surely this is a very suspicious notion, and philosophers would be well advised to avoid subjunctive concepts at all cost! But upon sober reflection, it must be admitted that this attitude is preposterous. Subjunctive concepts do make sense - we use them all the time. The problem cannot be whether they make sense, but what sense they
INTRODUCTION
3
make. The real issue must be how they are to be analyzed, and not their legitimacy. We cannot in good conscience deny that there is this logically contingent kind of necessity which is involved in subjunctive reasoning. We cannot expunge it from our conceptual framework without leaving that framework seriously impoverished. What we must seek is an understanding of subjunctive reasoning, an analysis of subjunctive concepts in terms of other clearer concepts. Thus the task of this book will be to provide an analysis of the above subjunctive concepts in terms of relatively less problematic indicative concepts. If successful, such analyses will free philosophers to use subjunctive concepts with impunity in the course of philosophical analysis. There can be no doubt that subjunctive concepts, clearly understood, would constitute an extremely powerful logical tool. It will turn out that, in important respects, the key to understanding subjunctive reasoning in general is to understand subjunctive conditionals. There are a number of reasons why subjunctive conditionals have so stubbornly resisted philosophical analysis, not the least of which is that our language systematically conflates two importantly different kinds of subjunctive conditionals. However, in recent years the first chinks have appeared in the armor protecting subjunctive conditionals from philosophical understanding. These chinks have resulted in large part from the philosophical onslaughts of Robert Stalnaker (1968, 1972) and David Lewis, (1972, 1973, 1973a). Building upon the insights which they have wrought, I believe that it may prove possible to achieve a con~pleteunderstanding of subjunctive conditionals, and thereby of subjunctive reasoning in general. Historically, there have been two popular approaches to the analysis of subjunctive conditionals. These are the older 'linguistic' approach, and the more recent 'possible worlds' approach initiated by Stalnaker and Lewis. In the rest of this introductory chapter I will examine briefly both of these approaches and use this examination to set the stage for my own analysis which builds upon them both. The plan of the book will then be to complete the analysis of subjunctive conditionals, in the course of which I will propose an analysis of subjunctive generalizations, and then to apply our results on subjunctive conditionals to the analysis of causes, probabilities, and dispositions.
4
CHAPTER I
This is a traditional approach to the analysis of subjunctive conditionals according to which they are to be explained ultimately in terms of physical laws. Let us symbolize the subjunctive conditional 'If it were true that P, then it would be true that Q1 as " P > Q1. Then according to the linguistic theory, " P > Q1 is true just in case there is a physical law, or conjunction of physical laws L, and there are true circumstances C, such that the conjunction " C & L & P1 logically entails Q. For example, suppose we have a dry match under ordinary circumstances. If it were struck ( P ) , it would light ( 0 ) .The reason P > 0"' is true, according to this theory, is that there are physical laws regarding the chemical structure of the match which, when taken together with the circumstances in which we find the match (e.g., the match is dry, there is plenty of oxygen, etc.), and taken together with the match's being struck, entail that the match will light. As we have stated it, the linguistic theory would make too many subjunctive conditionals true. Without some restriction on what we can put into the circumstances C, it would turn out that whenever the material conditional r P Q ~ 1 is true, so is the subjunctive conditional P > 0'. This is because if r P =1 Q1 is true, we could put it among the circumstances C, and then C together with P (without even making use of any laws) would automatically entail 0. Thus some restrictions must be put upon the contents of C. It is surprisingly difficult to see what restrictions are called for. One popular view regarding what restrictions should be placed on C is that it is entirely a matter of convention. This was advocated by Chisholm (1955). Chisholm suggested that we are free to select any true statements we want to put into C, calling the selected ones our 'presuppositions', and maintained that the subjunctive conditional is correspondingly ambiguous. He supported his contention as follows: Let us suppose a man accepts the following statements, taking the universal statements to be law statements: (1) All gold is malleable; (2) No cast-iron is malleable; (3) Nothing is both gold and cast-iron; (4) Nothing is both malleable and not malleable; (5) That is cast-iron; (6) That is not gold; and (7) That is not malleable. We may contrast three different situations in which he asserts three different counterfactuals having the same antecedents. First, he asserts, pointing to an object his hearers don't know to be gold and don't know not to be gold, 'If that were gold, it would be malleable'. In this case, he is
INTRODUCTION
5
supposing the denial of (6); he is excluding from his presuppositions (S), (6), and (7); and he is concerned to emphasize (1). Secondly, he asserts, pointing to an object he and his hearers agree to be cast-iron, 'If that were gold, then some gold things would not be malleable'. He is again supposing the denial of (6); he is excluding (1) and (6),but he is no longer excluding (5) or (7); and he is concerned to emphasize either (5) or (2). Thirdly, he asserts, 'If that were gold, then some things would be both malleable and not malleable'. He is again supposing the denial of (6): he is now excluding (3) and no longer excluding (1). (5), (6), or (7); and he is now concerned to emphasize ( I ) , (2). or (5). Still other possibilities readily suggest themselves (Chisholm, 1955, p. 103).
Chisholm is certainly right that the first two conditionals he mentions are both true. But the third cannot be true. There are no circumstances under which one could reasonably assert, 'If that were gold, then some things would be both malleable and not malleable.' Similarly, one could not reasonably assert of the cast-iron. 'If that were gold, it would not be malleable' (emphasizing (7) and excluding (1) from our presuppositions). Apparently there have to be some constraints on what goes into C. It cannot be entirely a matter of convention. But the view that the contents of C are at least partly conventional is one that has recurred many times in the discussion of counterfactuals. Stalnaker (1968) calls the ambiguity stemming from this supposedly conventional aspect of C 'pragmatic ambiguity'. The view that subjunctive conditionals are subject to pragmatic ambiguity seems to be supported by the fact that one could assert either of Chisholm's first two conditionals: (A)
If that were gold, it would be malleable.
(B)
If that were gold, then some gold things would not be malleable.
It seems that the set C must differ from (A) to (B). But, popular though this view is, I think it is wrong. Although (A) and (B) look, at least at first, like they have the same antecedent, they do not. The antecedents consist of the same words in the same order, but there is a difference in what word is emphasized, and that difference in emphasis makes a difference in meaning. It seems eminently reasonable to paraphrase (B) as follows: (B*)
If some gold things were like that (i.e., had the properties that has), then some gold things would not be malleable.
6
CHAPTER I
The antecedent of (B*) is quite different from that of (A), and it is not unreasonable to maintain that (B*) is what we mean when we write (B). There are many other pairs of conditionals which have traditionally been thought to illustrate the pragmatic ambiguity of subjunctive conditionals. One fruitful source consists of what have been called 'counter-identicals', which are conditionals whose antecedents are false identity statements. For example, the following two conditionals appear to have logically equivalent antecedents and incompatible consequents:
(c)
If Richard Nixon were Golda Meir, he would be a woman.
(D)
If Golda Meir were Richard Nixon, she would be a man.
But these conditionals do not really have equivalent antecedents. Contrary to initial appearance, they are not counter-identicals at all. This can be seen by contrasting them with the following conditional which really is a counter-identical: If Richard Nixon and Golda Meir were really one and the same person (who had been fooling us by flying back and forth between countries and wearing makeup, etc.), then the United States would be less friendly towards the Arab nations. Unlike (El, it seems-that ( C ) and (D) can be paraphrased roughly as follows:
(E)
(C*)
If Richard Nixon had the non-relational properties presently possessed by Golda Meir, he would be a woman.
(D*)
If Golda Meir had the non-relational properties presently possessed by Richard Nixon, she would be a man.
It is worth noting that not all putative counter-identicals can be paraphrased in the same way. For example, (F)
If I were Gerald Ford, I would be athletic.
can be paraphrased as:
(F*)
If I had the non-relational properties presently possessed by Gerald Ford, I would be athletic.
INTRODUCTION
7
But (G)
If I were Gerald Ford, I would sign the education bill.
is paraphrased differently as: (G*)
If I were in the role of Gerald Ford, I would sign the education bill.
That (F) and (G) really d o have different antecedents is indicated by the fact that they do not jointly imply:
(H)
If I were Gerald Ford, I would be athletic and sign the education bill.
which they would imply if their antecedents had the same meaning. Another pair of examples is provided by Goodman (1955): (1)
If New York City were in Georgia, then New York City would be in the South.
(J)
If Georgia included New York City, then Georgia would not be entirely in the South.
Initially, (I) and (J) appear to have equivalent antecedents, but they can be paraphrased in such a way that it is evident they d o not: (I*)
If New York City were included within the present bounds of Georgia, then New York City would be in the South.
(J*)
If Georgia included the area presently occupied by New York City, then Georgia would not be entirely in the South.
Many subjunctive conditionals are ambiguous. The above examples illustrate this. But what is important is whether subjunctive conditionals are subject to a special kind of 'pragmatic' ambiguity arising out of a conventional element in their meaning, or whether the ambiguity in subjunctive conditionals is just the normal kind of ambiguity present in all natural language which arises naturally from our tendency to say less than we mean. The availability of the above paraphrases suggests that the latter is the case. The ambiguity can be resolved simply by being more careful and saying precisely what we mean.
8
CHAPTER I
However, it must be admitted that some of the paraphrases employed above are not as clear as we might desire. This is primarily true of those which proceed in terms of the 'non-relational properties' of an object. Just what properties are these? I believe that the above paraphrases are essentially correct, but they do not tell the whole story. In particular, they suggest that the ambiguity in subjunctive conditionals is simply a matter of ambiguity in their antecedents, because in all of the above examples, the ambiguity is resolved in that way. If we allow ourselves to talk without further elucidation about the non-relational properties of an object, then the ambiguity in the above examples can be resolved in this way. But it will be found in Chapter IV that some of these examples are really special cases of a more general phenomenon which I will call 'subject preference'. Subject preference generates a kind of ambiguity in subjunctive conditionals which is often resolved in oral communication through the use of emphasis, e.g., by saying 'If that were gold. . .' rather than 'If that were g o l d . . .'. It will be found that not all examples of ambiguity arising from subject preference can be resolved so simply as that in the above examples. In some cases we must regard the conditional itself as genuinely ambiguous. However, this ambiguity is rule-governed and has nothing to do with any supposedly conventional element in the meaning of subjunctive conditionals. Thus it will turn out that there is still no reason to regard subjunctive conditionals as pragmatically ambiguous in the sense of Chisholm and Stalnaker. Having allowed ourselves the freedom to paraphrase subjunctive conditionals as much as we have, it might be supposed that we can resolve the problem of counterfactuals altogether through the use of paraphrase. It might be supposed that counterfactuals are really just enthymematic conditionals. O r to be more precise, it might be supposed that when we assert a subjunctive conditional like 'If this match had been struck, it would have lit', we do not completely express our meaning - what we write is short for a longer conditional whose antecedent contains as explicit conjuncts those statements in the set C which we use in getting from the supposition 'This match is struck' to the conclusion 'It lights'. This proposal is very much in the spirit of Chisholm's proposal. According to this proposal, the set C consists
INTRODUCTION
9
merely of statements that were implicitly part of our meaning in formulating the antecedent of the conditional but which we did not bother to state precisely because we assumed that outaudience would understand what we had in mind. This implies that there is really no problem regarding what we can put into C and what'we cannot put into C. Membership in C is simply a matter of what we mean when we assert the conditional. This account would be very attractive if it worked, but it doesn't work. Although it seems indisputable that in many cases paraphrase is required to make explicit what a conditional means, such paraphrase cannot resolve the proplem of membership in C. This can be seen by coming down heavily on the requirement that the paraphrase must really mean what the original conditional meant. For example, consider the conditional, 'If this match had been struck, it would have lit'. The set C must contain whatever conditions are necessary for a match to light when struck. Among these are its being dry, there being plenty of oxygen present, etc. The difficulty is that I do not know how to fill out the 'etcetera'. I am confident that there are conditions present which would result in the match's lighting if it were struck, and because of this I am confident that the match would light if struck, but I cannot enumerate those conditions. If I cannot enumerate them, they certainly cannot be part of my meaning when I say, 'If this match were struck it would light'. Consequently, membership in C cannot be simply a matter of what one means when he asserts a conditional. I have argued that subjunctive conditionals are not subject to the kind of pragmatic ambiguity philosophers have often supposed. Thus there must be precise conditions determining what is included in C and what is not. Membership in C does not consist merely of implicit premises enthymematically omitted from the antecedent in formulating the conditional. Rather, our problem is one of understanding a principle of detachment for subjunctive conditionals. When r(P& R. ^> Q)' expresses a lawlike connection (obtained by instantiation in a law), under what conditions can we detach R from the antecedent to infer the conditional ^ P > Q1? It is not enough to require that R be true, but it is not clear what more should be required. Goodman (1955) has made a suggestion regarding the solution to
10
CHAPTER I
this problem. His observation is that in deciding.what would be the case if some proposition P were true, we cannot put just any true statement R into C - at the very least we must require that R would not be false if P were true. So Goodman defines R to be cotenable with P just in case ^-(P > - R y is true, and then his proposal is that C consists of the set of true statements cotenable with P. Goodman has certainly found a necessary condition for a statement to be included in C, but it is not a sufficient condition. It is not enough to require that R would not be false if P were true - we must require that R would be true if P were true. Perhaps Goodman was supposing that these are the same thing - that ^-(P> R)' is logically equivalent to ^ ( P > R)'. It is probably true that most philosophers have supposed the negation of a subjunctive conditional to be equivalent to the subjunctive conditional which results from negating the consequent. For example, at first it seems that to deny that if my car had a full tank of gas I could drive all the way to Boston is to affirm that if my car had a full tank of gas I still could not drive all the way to Boston. But I think that this is wrong. It is quite possible for both ^ ( P > -R)' and r ( P >R)' to be false. For example, suppose there were two kinds of gasoline, long-mileage-gasoline (LMG) and short-mileage-gasoline (SMG), and that a tankful of LMG would get me to Boston but a tankful of SMG would not. Then there are two ways in which the antecedent 'My car has a full tank of gas' could be true, and there need be no way to choose between them and say that if my car had a full tank, it would definitely be full of one of these kinds of gas rather than the other. Under these circumstances we can conclude neither that if my car had a full tank of gas I could drive all the way to Boston, nor that if my car had a full tank I still could not drive all the way to Boston. Rather, we are forced to say that if my car had a full tank I might be able to drive to Boston, and also if my car had a full tank I might not be able to drive to Boston. This introduces a new kind of subjunctive conditional that I have not mentioned before - the 'might be' conditional. Let us symbolize "It might be true that Q if it were true that P1 as "QMP1. Perhaps it is not obvious how 'might be' conditionals are related to 'would be' conditionals, but I think it is at least clear that "QMP1 and "(P> - Q V are incompatible. If 0 would be false if P were true, it cannot be that Q
-
11
INTRODUCTION
-
might be true if P were true. Thus ^QMP1 entails ^-(P > Q)', and similarly ^(-Q)MP1 entails ^-(P> 0)'. Hence if we have both QMP1 and "(-Q)MP1, it follows that ^ P > Q1 and ^ P > -C1 are both false. And we have seen an example in which we do have both "QMP1 and r(-Q)MP1 true. Therefore there is a difference between requiring that '--(P > -0)' be true and requiring that ^P> Q1 be true. Given this difference, I think it is clear that Goodman's requirement of cotenability is too weak. This is demonstrated by seeing that it would lead us right back to a special case of the principle that if - ( P > -Q)l is true then ^ P > Q1 is true. More precisely, Goodman's proposal implies that whenever P is false and "-(P>-Q)' is true, then ^ P > Q1 is true. This implication is established as follows. First, we need two obvious principles regarding subjunctive conditionals:
(2.1)
If ^ P > Q1 is true and Q entails R, then ^P> R1 is true.
(2.2)
If rP > (P 3 0 ) " is true, then ^P> Q1 is true.
2.1 is so obvious as to need no defense. 2.2 holds because if rP 2 Q1 would be true if P were true, then both P and ^ P=1 Q1 would be true if P were true, and hence Q would have to be true if P were true. Given these principles, let us suppose, with Goodman, that truth and cotenability are all that is required for inclusion in C. Suppose P is false, and - ( P > -0)' is true. Then by 2.1 '-(P> (P & -0))" is true, and so - ( P > - ( P 3 Q))l is true. But as P is false, "PI Q1 is true', and it follows from Goodman's proposal that ^ P > ( P = > Q)' is true. Then from 2.2 it follows that ^ P > Q1is true. But the example regarding the two kinds of gasoline is a counter-example to this conclusion just as much as it is a counter-example to the more general principle which does not assume that P is false. Thus Goodman's proposal is unacceptable. What is required for inclusion in C is not merely that "-(P> -R)' be true, but rather that R would still be true even in P were true, i.e., that " P > R1 is true. If we do have ^ P > R1, then it seems clear that we are safe in including R in C. This results directly from the intuitive validity of the following principle regarding subjunctive conditionals:
But it is equally clear that once we have made this our condition for
12
CHAPTER I
inclusion in C, we are no longer giving an analysis of subjunctive conditionals. This is because we are now using subjunctive conditionals to define the class C. Thus the linguistic theory runs into a rather severe stumbling block. This is not to say that the linguistic theory cannot be salvaged, but if salvage is to be accomplished, membership in C must be characterized in a very different way. In effect, this is the task that will be undertaken in Chapter IV. There are two fundamental problems for the linguistic theory. The first is that of membership in C, which we have been discussing. But even if we could solve this first problem, there would remain another problem. The linguistic theory proposes to characterize subjunctive conditionals partly in terms of physical laws, but the latter concept seems just as problematic as the concept of a subjunctive conditional. Philosophers have sometimes supposed that any true universally quantified material conditional (henceforth: 'material generalization') is a law, but that is rather obviously unsatisfactory. There is a clear intuitive difference between what philosophers have often called accidental and non-accidental generalizations. The material generalization 'All three thousand pound human beings have wings' is true, because there are no three thousand pound human beings. But a generalization that is true simply because it is vacuous will certainly not support subjunctive conditionals. Nor will it suffice merely to rule out those material generalizations that have vacuous antecedents. For example, if there were just one eleven-toed English- mathematician whose mother is from Dublin, and he were named 'Charlie', then it would be true that all eleven-toed English mathematicians whose mothers are from Dublin are named 'Charlie', but this generalization would be true 'purely by accident' and no one would want to count it as a law. To be contrasted with this are generalizations like 'All pulsars are neutron stars' or 'All creatures with hearts are creatures with kidneys' which, if true, are in some sense 'nomic' and would be considered laws. The problem is to say what distinguishes laws from accidental generalizations. I think that a partial answer to this question is that it is a mistake to suppose that laws can be adequately formulated using material conditionals. On the contrary, laws are fundamentally subjunctive in nature. A law tells us not just that all actual pulsars are neutron stars, but also
INTRODUCTION
13
that any pulsar there could be would be a neutron star. Similarly, the law is not just that all actual creatures with hearts are creatures with kidneys, but also that any creature with a heart would be a creature with a kidney. Let us call generalizations of the form "Any A would be a B' subjunctive generalizations. Then it seems to be the case that laws must be formulated using subjunctive generalizations. This may suffice to distinguish laws from accidental generalizations, but it makes it all the more difficult to give an adequate analysis of the concept of a law. It now appears that even if we could analyze subjunctive conditionals in terms of laws, we would only have succeeded in analyzing one class of subjunctive statements in terms of another class of subjunctive statements, whereas the basic problem is to analyze subjunctive statements in terms of non-subjunctive statements. This makes the prospects for the linguistic theory look rather bleak. Fortunately, I do not think they are as bleak as they look. Although laws are subjunctive, it will turn out that they are a much simpler variety of subjunctive statement than are subjunctive conditionals. In Chapter 111 I will argue that it is possible to give an analysis of subjunctive generalizations in terms of non-subjunctive statements. Once that is accomplished, there will be no circularity in using subjunctive generalizations in analyzing subjunctive conditionals. So the only remaining problem for the linguistic theory will be to characterize membership in the class C.I think that the key to the solution to this problem lies in the 'possible worlds' approach to the analysis of subjunctive conditionals, so let us turn now to that approach.
This approach is due originally to Robert Stalnaker (1968). Stalnaker constructed a formal semantics for subjunctive conditionals based upon the well known formal semantics for modal logic. The idea behind Stalnaker's semantics is that "(P > Q)^ is true just in case, if we add P to our stock of truths and modify them so as to make them consistent with P, but make the modification as small as possible, then the resulting sets of statements entails 0.Putting this in terms of possible worlds, we look at that world in which P is true which is most like the
14
CHAPTER I
real world, and we see whether Q is true in it. Formally, we suppose we have a two-place function f such that for each sentence P and possible world a , f ( a , P ) is that world containing P which is most like the world a. Then r(P> Q)' is true in a iff Q is true in f ( a , P). In case P is logically inconsistent, and so true in no possible world, f ( a , P) would be undefined, so we pick an arbitrary object 7, which we designate 'the impossible world', to be the value of f ( a , P) in this case. Generating a formal semantics, we can take a model to be an ordered quadruple (a, K, f, y) where K represents the set of all possible worlds, a e K, and f is the selection function which, to each p in K and sentence P assigns a member of K 2 . We will want to put some constraints on the selection function, but subject to those constraints we will say that a sentence involving conditionals is valid iff it is true in every model. Intuitively, not just any function f will constitute a reasonable selection function. It seems clear that the following three requirements should be placed upon f :
(3.1)
For any P and a , P is true in f(a, P).
(3.2)
For any P and a , f ( a , P ) = y only if there is no possible world in which P is true.
(3.3)
For all P and a , if P is true in a then f ( a , P) = a.
Principle 3.3 reflects the fact that if P is already true, then we do not have to make any changes to the set of truths in order to accommodate P. Stalnaker proposes that f is based upon a simple ordering of possible worlds with respect to their resemblance to a . This leads him to add the following:
(3.4)
For all P, 0, and a , if P is true in f(a,'Q) and Q is true in f ( a , P), then f(a, P ) = f ( a , 0).
It might be felt that if one is going to defend a possible worlds analysis of subjunctive conditionals, he should first defend an ontology which contains possible worlds. Stalnaker (1972) and David Lewis (1973, p. 84) both endorse platonistic views of possible worlds. I d o not want to say that such a view is wrong, but I do find it somewhat mysterious and I would prefer not to commit myself to it. There are
INTRODUCTION
15
other alternatives. I prefer to identify a possible world with the set of propositions true in it; or more precisely, to define a possible world to be a maximal consistent set of propositions. Lewis argues that the only way to make sense of 'consistent' here is in terms of possible worlds, but I would hope that he is wrong. I have suggested an alternative account in Pollock (1974), but even if that account is found wanting, I suspect that some account can eventually be given. My point here is not to defend any particular account of possible worlds, but merely to argue that ontological questions about possible worlds can be safely separated from the question how to analyse subjunctive conditionals. The philosopher who approaches the latter question in terms of possible worlds is not auton~aticallysubject to criticism for doing so. The conditions that Stalnaker has placed upon f are clearly not sufficient to pick it out uniquely. There will be many different functions satisfying these conditions. Stalnaker allows for the possibility that there might be additional formal constraints that could plausibly be imposed on f, but he denies that we could ever have conditions which would determine f uniquely. He maintains that there are going to be many different equally good selection functions, and that the choice between them is not determined by semantical considerations, but rather by the pragmatics of language. Here Stalnaker is harking back to the traditional view that subjunctive conditionals are subject to some kind of thorough-going ambiguity. It is supposed that the meaning of the conditional cannot resolve the ambiguity; rather, it is to be resolved by pragmatic considerations having to do with the occasion of utterance. Thus Stalnaker calls this kind of ambiguity 'pragmatic ambiguity'. It is noteworthy that Stalnaker gives no new examples of ambiguity. He bases his view that conditionals are pragmatically ambiguous on the discussions of Chisholm, Goodman, and others which were considered in Section 2. I have urged that those discussions were unconvincing. Thus Stalnaker's position on pragmatic ambiguity is subject to rebuttal on the same grounds. Those grounds do not constitute an argument to the effect that subjunctive conditionals are not pragmatically ambiguous, but rather consist of an attempt to undercut the reasons that have been given for thinking that they are pragmatically ambiguous. Claims of pragmatic ambiguity were based upon an appeal to certain kinds of
16
CHAPTER I
examples, and I have argued that those examples are more plausibly interpreted as illustrating an ambiguity in the antecedent of the conditional rather than a kind thoroughgoing ambiguity in the conditional connective itself. Perhaps the main reason Stalnaker is led to his doctrine of pragmatic ambiguity is that, on his semantics, the following principle is valid: '[(P > Q) v ( P > - Q)]'. According to this principle, for any P and Q, either Q would be true if P were true, or Q would be false if P were true. But there are many choices of P and Q for which we seem unable to decide which of these alternatives is correct. Stalnaker's answer is that the choice is to be determined by pragmatic considerations. I believe that this is manifestly false. Let P be 'The temperature outside is not 30'' and let Q be 'The temperature outside is 40''. Is it true that if it weren't 30' out now then it would be 40¡,o is it true that if it weren't 30Âout now it would not be 40° I should think that neither of these is true. Nor will it help to bring in pragmatic considerations. Pragmatic considerations are irrelevant in deciding whether it would be 40' out now. Rather than affirm either r ( P > Q)' or ^ ( P >-Q)', we should deny both and say that if it weren't 30' out now then it might be 40' out, but also it might be something else. Stalnaker's analysis ignores 'might be's'. The principle '[(P> Q ) v ( P > -Q)F should not be valid. I argued in Section 1 that "0might be true is P were true1 is incompatible with ^(P > -Q)^, and there are cases like the above in which we want to say that Q might be true if P were true, and also that Q might be false if P were true. Rather than an appeal to pragmatic considerations to resolve those cases where we cannot decide which of ^ ( P > Q)' and ^(P > -0)' should be true, we should deny that one of them always has to be true. What is required is a modification of the semantics so that ^[(P > Q ) v (P > -@)I1 is no longer valid. As we will see, this is precisely what David Lewis has accomplished. There are further difficulties for Stalnaker's semantics. Stalnaker suggests that his semantical rules (possibly augmented slightly) exhaust the semantical features of subjunctive conditionals, and that the choice of a selection function within these constraints is a matter of pragmatics. But most such selection functions would be totally preposterous. By judicious choice of a selection function, we could make virtually
INTRODUCTION
17
any conditional either true or false as we wish. For example, we could choose a function which selects the closest world in which I drop a piece of chalk (which in fact I did not drop) to be one in which Richard Nixon jumps over the moon. Then it would be true that if I had dropped that piece of chalk, Richard Nixon would have jumped over the moon. But that conditional just isn't true. In general, given any sentences P and 0 , if P is false and does not logically entail "-Q1, there will be a selection function satisfying Stalnaker's formal constraints which will make r(P> Q)"' true. But this is absurd. It might possibly be true that there is some ambiguity built into subjunctive conditionals, but no one can believe that they are subject to this kind of rampant ambiguity which would result from Stalnaker's position. Perhaps part of the difficulty is that in talking about pragmatics, Stalnaker's words (but not Stalnaker himself) suggest that this is just a matter of "pragmatic considerations" - appeals to practicality or some such thing. But the examples Stalnaker gives of the pragmatics of language are not at all like that. Pragmatics is supposed to govern such things as the reference of indexical expressions. Pragmatics, so construed, has nothing to do with practicality. It functions according to definite rules of language having to do with, for example, pointing to fix the referent of 'he'. There is nothing hit or miss about this, and nothing 'to be decided' by considerations of practicality. It would not be unreasonable to include pragmatics as part of semantics. Whether we do that or not would seem to be nothing more than a matter of terminology. But however we choose to classify pragmatic rules, there is no reason to think that there are no precise linguistic rules which uniquely determine the selection function. The fact that most potential selection functions are obviously ruled out by our understanding of the conditional (which presumably means they are ruled out by the rules of language governing conditionals) suggests that all but one may be ruled out, and it certainly indicates that there are many more constraints on the selection function than indicated by ~ t a l n a k e r . ~ Regardless of whether there is just one selection function or many selection functions, Stalnaker's semantics requires revision to rid it of the offending theorem " [ ( P> Q) v (P > Q)]'. Such revision has been provided by Davis Lewis, (1972, 1973, 1973a). Lewis begins by proposing that the selection function be understood in terms of our
-
18
CHAPTER I
ordinary notion of comparative similarity. What is at issue is whether a world a is more similar to a given world than is a world 8. Lewis writes that he means precisely the same relation of comparative similarity as when we judge the comparative similarity of cities or people or philosophies. Thus not just any old selection function will do. A proper selection function has to be built out of this notion which we take as basic for the sake of the analysis. This is certainly a step in the right direction, although as I will urge, it will not quite do. Starting with this notion of comparative similarity, Lewis observes that Stalnaker's semantics assumes there will always be just one world making P true which will be more like a given world than will any other world in which P is true. But this is implausible. Why couldn't there be ties? This seems to be just what happens when we judge that if Bizet and Verdi were compatriots they might both be French, and they might both be Italian. A world in which they are both French is just as much like the real world as one in which they are both Italian. Thus, rather than having f(a, P) be a single world, it should be a whole set of worlds. This leads Lewis to the analysis he numbers 'Analysis 2' according to which, if f(a, P ) is the set of all worlds in which P is true which are at least as similar to a as are any other worlds in which P is true, then r(P > Q)' is true iff 0 is true in every world in f(a, P). This analysis embodies the 'limit assumption' according to which there are worlds in which P is true which are maximally similar to a. This rules out the possibility that worlds might get indefinitely closer to a without limit. Lewis rejects the limit assumption on the basis of examples like the following. He draws a line slightly less than an inch long, and then he says Suppose we entertain the counterfactual supposition that at this point there appears a line more than an inch long.. . . There are worlds with a line 2" long; worlds presumably closer to ours with a line I$'' long; worlds presumably still closer to ours with a line l$' long; worlds presumably still closer. . . . But how long is the line in the closest worlds with a line more than an inch long? If it is 1 + x " for any x however small, why are there not other worlds still closer to ours in which it is 1 +&", a length still closer to its actual length? The shorter we make the line (above I"), the closer we come to the actual length; so the closer we come, presumably, to our actual world. Just as there is no shortest possible length above I", so there is no closest world to ours among the worlds with lines more than an inch long (Lewis, 1973, pp. 20-21).
INTRODUCTION
19
This leads Lewis to propose the following analysis: ( P > Q)' is true in a iff there is some world in which r(P& Q)' is true which is more like a than is any world in which '(PA -0)' is true.
Lewis' argument seems quite persuasive, but it leads to a most peculiar result. On his diagnosis, for each x greater than zero there are worlds in which the length of the line is between 1" and 1+ xt' which are closer to the actual world than are those worlds in which it is 1+ x t f . It follows that the conditional 'If the line were more than an inch long, it would not be 1+ x inches long' is true for each x greater than zero. But if the line would not be 1+x" long for any x, it follows that the line would not be more than an inch long (if it is more than an inch long, then it is more by some non-zero amount x). That is, if the line were more than an inch long, then it would not be more than an inch long. But that is certainly false. What has happened? First it should be pointed out that the above informal argument employs a principle which is not valid on Lewis' semantics. Unfortunately, this does not help Lewis because the principle ought to be valid. Let us define, where F is a set of sentences and P and Q are sentences:
(3.5)
(a) FÑ P iff every possible world making all of the sentences in 'I true makes P true.
' is the entailment relation. Entailment by a single sentence may or may not be represented by an operator in the object language, but of course the general concept of entailment by a set of sentences cannot be represented in that way. The following Consequence Principle is valid on Lewis' semantics:
(3.6)
If r(P> Q)' is true and Q -^ R, then r(P> R)' is true.
It seems clear intuitively that this principle ought to hold, so it is gratifying that it does. Because adjunctivity also holds (that is, if ( P > 0)' and r(P> R)" are true, so is " P > ( Q & R)'), we can generalize the consequence principle to obtain the Finite Consequence Principle : (3.7)
If F is a finite set of sentences, and for each QeF, ( P > 0)' is true, and F + R, then r ( P > R)" is true.
20
CHAPTER I
On intuitive grounds, it seems that the following Generalized Consequence Principle should also hold:
(3.8)
If f is a set of sentences, and for each 0 e F, "(P> 0)' is true, and F-+ R, then r ( P > R)' is true.
The generalized consequence principle seems just as obvious as either the original consequence principle or the finite consequence principle, and indeed should be true for exactly the same reason. It is surprising then that on Lewis' semantics this principle does not hold. The example about the length of the line can be turned into a proof of this fact. As the generalized consequence principle ought to be valid, what happens if we impose on the similarity relation the condition that it is valid? Surprisingly enough, this is equivalent to the limit assumption.4 Thus we are led back to Analysis 2. This might seem surprising. What are these closest worlds? They are the worlds that might be the actual world if P were true. We can make this precise as follows. Where a is a world, let us define
(3.9)
a M P iff -(3Q)[^P> -Q1 is true & Q is true in a ] .
We then prove two theorems:
(3.10)
If 3.8 holds and " - ( P > -R)' is true in a).
is true, then (3a}(aMP & R
Proof: Let F = {R}U {S; "(P> S)' is true}. Suppose there is no possible world a in which all the sentences in f are true. Then F is inconsistent, i.e., {S; r ( P > S)' is true} + ^R1. Then by 3.8, ^ ( P > -R)l is true, contrary to supposition. Therefore there is a world a in which all the sentences in F are true. For any sentence Q, if ^(P> 0)' is true then "-Q1e F, so "-Q1 is true in a , and hence Q is not true in a . Thus aMP. And as R e F, R is true in a .
-
(3.11)
If 3.8 holds, then "(P > Q)' is true iff (Va)[aMP 3 Q is true in a]. /N/
Proof: Suppose r ( P > 0 ) ' is true. Suppose aMP. Then ^(P>/--Q)' is true (by 3.8), so "-Q' is not true in a , and hence Q is true in a. Conversely, suppose " ( P > Q)l is not true. By theorem 3.10, ( 3 a ) ( a M P & Q is not true in a ) . Hence -(Va)(aMP=i Q is true in a ) .
INTRODUCTION
21
Theorem 3.1 1 tells that the closest worlds in which P is true are those worlds that might be actual if P were true. Thus the generalized consequence principle leads us inexorably back to Analysis 2. What went wrong with Lewis' argument which led him to reject Analysis 2 and adopt his more complicated analysis instead? I think that the error lies in supposing that the relation we use in selecting possible worlds is our ordinary relation of comparative similarity. Lewis' own example about the line illustrates this nicely. H e is surely right in supposing that a world in which the line is I$' long is more like the real world than one in which the line is 2" long. If comparative similarity were the relation that is operative in generating counterfactuals, it would follow that if the line were more than an inch long then it would not be two inches long. But that is false. On the contrary, if the line were more than an inch long then it might be two inches long. It might also be three inches long, or four inches long, etc. None of these alternatives is ruled out, although they clearly do generate worlds that differ in their similarity to the real world. Apparently comparative similarity is not the relation that is involved in determining which worlds we must look at in deciding whether ( P > 0)' is true. This can be seen more directly by considering what the operative relation is. What is involved in counterfactuals is the notion of a minimal change being made to the actual world in order to accommodate the counterfactual supposition. We must change the world in some way in order to make the antecedent true, but we are constrained to make the change as small as possible. The operative notion here is that of the minimality of change. Two different minimal changes can result in worlds that differ in their similarity to the real world. For example, a change yielding a world in which the line is 2" long is no greater than a change yielding a world in which the line is ly long. Each change is minimal, because each amounts .merely to changing the length of the line to accommodate the counterfactual supposition and then making whatever additional changes are required for the sake of consistency. But the resulting worlds differ in their degree of similarity to the real world. Apparently the notion of the magnitude of a change is not the same as that of the comparative similarity of worlds. Once this is recognized, we can build our semantics on the former notion rather than the latter.
22
CHAPTER I
Does this make any difference to the formal structure of the semantics? Lewis made the reasonable assumption that comparative similarity relative to any particular world constituted a simple ordering." Unfortunately, if we consider the ordering of worlds in terms of the magnitude of change required to generate them from the real world, it seems that we do not have a simple ordering but only a partial ordering. That is, the ordering is not connected. Sometimes it makes perfectly good sense to say that one change is greater than another, i.e., when the one change contains the other. But it is not true in general that we can compare changes in worlds and say either that they are the same magnitude or that one is greater than the other. This can be illustrated as follows. Let T, R, and 5 be three unrelated false sentences, e.g., the sentences 'My car is painted black', 'My maple tree died', 'My garbage can blew over'. Let P be ^ T v ( R & S)', and let Q be " T v R1. Now consider what happens when we entertain P and Q as counterfactual suppositions. In general, if we have a disjunction whose disjuncts are unrelated, then if the disjunction were true, either disjunct might be true - neither disjunct would have to be false. In the case of P and Q we have (1) that ^-(P > T)' and "-(P > -(R & S))^ are true, and (2) that "-(Q > - T ) ^ and "-(Q > -R)' are true. By (1) and Theorem 3.11, there are worlds a and 0 such that a M P and /3MP and T is true in a and '(R & S)' is true in 6. a results from making minimal changes to the real world so as to make T true, and /3 results from making minimal changes so as to make ^ ( R & S)' true. But then we will also have aMQ, because making T true is also a way of making Q true by making minimal changes. If connectedness holds, so that it always makes sense to compare the magnitudes of two changes, as we have both a M P and BMP, a and 6 must result from changes of the same magnitude. But then as aMQ, the magnitude of change involved in a must be the minimal change possible in making Q true. But (on the assumption of connectedness), Q involves the same amount of change, and Q is true in 6 (because R is true in f i ) , so we must also have PMQ. But we should not have PMQ. In constructing p we have changed the real world more than necessary to make 0 true, because we have made both R and S true when we only needed to make R true. Although a and p both involved minimal changes relative to the counterfactual supposition P, only a and not 6 involves a minimal
-
INTRODUCTION
23
change relative to the counterfactual supposition 0. Thus connectedness fails. It does not make sense to ask whether a and involve 'the same amount of change'. It only makes sense to ask whether they involve minimal changes necessary to make a certain counterfactual supposition true. The picture that emerges from this is that although it makes perfectly good sense to say that one change is larger than another when the first contains the second as part of it, it does not in general make sense to try to compare the magnitudes of changes. Changes form a lattice rather than a simple ordering. To say that a change is minimal relative to making a certain antecedent true is to say that we cannot make the antecedent true by making just part of that change. Two changes may each be minimal relative to making a certain antecedent true, but only one of them minimal with respect to making another antecedent true. Thus we cannot assume that the ordering of worlds which is generated by our selection of minimally different worlds is a simple ordering. It is only a partial ordering. As I will argue in Chapter 11, Lewis' assumption to the contrary leads him to embrace as valid certain theorems which should not be valid.
I have argued that there are some serious problems for the possible worlds approach as it has been developed to date. But, by Theorem 3.11, it follows that the basic insights are correct. As long as we agree that the Generalized Consequence Principle holds, it follows that the possible worlds approach can be made to work. But one may wonder whether the possible worlds approach really involves any advance over the linguistic theory. The possible worlds theory proceeds in terms of the notion of the minimal change to the real world necessary to make a certain sentence true. No one would hold this notion up as a model of clarity. Before we can regard the possible worlds theory as giving us anything that deserves to be called an 'analysis' of counterfactual conditionals, we must have an analysis of this notion of a minimal change. And upon reflection, the latter task seems to involve all of the
24
CHAPTER I
problems that arose for the linguistic theory. First, it seems that laws must play an important role in the notion of a minimal change. Most people's intuitions seem to agree that in making a counterfactual supposition true, we change laws only as a last resort. For purposes of counterfactuals, laws are more immutable than particular facts. Thus a clear account of the notion of a minimal change must presuppose an account of what a law is. Second, the question which particular facts can be changed and which cannot in a minimal change seems to be the same as the question what truths can be put into the set C in the linguistic theory. At this point, it may look as if the possible worlds approach has accomplished nothing. The same problems have arisen all over again in different guise. But this is misleading. As we will see in Chapter IV, the possible worlds approach provides a helpful new perspective on the problem which, in the end, facilitates its solution. However, before we can attempt to provide that solution, we must return to some of the same old problems. In Chapter I1 we will undertake to sort out various logical facts about subjunctive conditionals which have obscured the solution to the general problem of providing an adequate analysis. In Chapter 111 we will examine the notion of a law, and we will attempt to provide an analysis of it. Then in Chapter IV, having laid the groundwork, we will return to the task of analyzing the notion of a minimal change and will provide what I hope is an adequate analysis, and thereby provide an analysis for counterfactual conditionals. NOTES I If it is agreed that ^ P & O1entails rP>Q', a principle which will be defended later, all by itself and we do not need the assumption then -(P>-Q)-' entail:> "P*' that P is false. Stalnaker also included an alternativeness relation, but I omit it for the sake of simplicity. It has no direct effect on the logic of conditionals. Of course, these constraints are most likely not 'formal' in the same sense as are those which Stalnaker lists. More precisely, it is equivalent given the assumption that comparative similarity relative to any world is at least a partial ordering. More precisely, he assumed a simple ordering of the equivalence classes of worlds equally similar to the given world. D a v i d Lewis denies this, but I think that his denial stems from his supposing that comparative similarity is the notion operative in generating counterfactuals.
CHAPTER I1
FOUR KINDS O F CONDITIONALS
There are certain logical problems that must be discussed prior to undertaking the analysis of subjunctive conditionals. The discussion of these problems will clarify the nature of these conditionals and thereby facilitate their analysis. The problem o f analyzing subjunctive conditionals has been exacerbated by'the fact that there are actually several different kinds of subjunctive conditionals, with different properties and correspondingly different analyses, and philosophers have not been sufficiently careful in distinguishing between them. In this chapter I will distinguish between four main kinds of subjunctive conditionals and explore their interrelations. It will turn out that these conditionals are all interdefinable, so an analysis of one will yield analyses of all the others.
Let us say that a simple subjunctive is a conditional of the form "If it were true that P then it would be true that Q1. We will symbolize this as ^ P > Q'. It is the simple subjunctive that has been discussed most frequently in the literature and which was under discussion in Chapter I. There is a more or less traditional assumption about subjunctive conditionals that has been uniformly rejected in the recent literature. This is the assumption that a subjunctive conditional asserts the existence of a connection between the antecedent and consequent. Certainly, some simple subjunctives are true because such a connection exists, but this is not invariably the case. The existence of such a connection is a sufficient, but not a necessary, condition for the truth of a simple subjunctive. This is easily seen by considering examples. First, there are obvious examples of a simple subjunctive being true
26
C H A P T E R 11
because of the existence of such a connection. We say, 'If this match were struck, it would light', because we believe that, in some sense, striking the match would bring about its lighting. Similarly, we say 'If the bird you saw had been a raven, it would have been black', because a bird's being a raven in some sense requires (without logically entailing) its being black. In each of these examples, the truth of the antecedent in some sense makes the consequent true, or requires the consequent to be true. Let us say that in these cases the antecedent necessitates the consequent. Notice that necessitation is not always causal necessitation. We cannot say that the bird's being a raven causes it to be black. It might conceivably be true that all (contingent) necessitations are ultimately explicable in terms of causes, but it is clear that there are cases in which we cannot say simply that the truth of the antecedent causes the truth of the consequent. No one is inclined to doubt that simple subjunctives are often true because the antecedent necessitates the consequent. But it has not always been recognized that a simple subjunctive can also be true when there is no such necessitation. For example, we might say of a witch doctor, 'It would not rain if he did not do a rain dance, but it would not rain if he did either.' This conjunction of two simple subjunctives expresses the lack of a connection rather than the presence of one. Contrary to the traditional assumption, it seems clear that simple subjunctives do not express a relation of necessitation between their antecedent and consequent. Rather, the presence of such a connection is just one ground for asserting a simple subjunctive. It seems that there are basically two ways that a simple subjunctive can be true. On the one hand, there can be a connection between the antecedent and consequent so that the truth of the antecedent would bring it about, i.e., necessitate, that the consequent would be true. On the other hand, a simple subjunctive can be true because the consequent is already true and there is no connection between the antecedent and consequent such that the antecedent's being true would interfere with the consequent's being true. The latter case is typically expressed by 'even if' subjunctives: 'Even if the witch doctor were to do a rain dance, it would not rain'.
FOUR KINDS OF CONDITIONALS
27
There is a special kind of English conditional - the 'even if7subjunctive - whose function is to express the second way in which the simple subjunctive can be true. It would be convenient if there were also a special subjunctive conditional whose function is to express the first way in which the simple subjunctive can be true - i.e., conditionals which express necessitation. If there weren't such conditionals, it would be worthwhile to introduce them. However, I believe that there are several locutions in English which can be used to express these conditionals. The most straightforward way of expressing necessitation is with the locution, "If it were true that P, then it would be true that Q since it was true that P1. For example, we might say of the match, 'If it were struck, it would light since it was struck', but deny of the witch doctor, 'If he did a rain dance, it would fail to rain since he did the dance'. In using this locution, one must not fall into the common error of supposing that 'since' statements are always causal. For example, the bird is black since it is a raven. We might say that these statements are always 'necessitory', but there are kinds of necessitation which are at least not directly causal. Another locution which can often be used to express necessitation consists of using 'couldn't' in place of 'wouldn't': 'If it were true that P, it couldn't be false that Q1. Thus, for example, the simple subjunctive 'If the witch doctor were to do a rain dance, it wouldn't rain' is true, but 'If the witch doctor were to do a rain dance, it couldn't rain' is false. The latter is false because it expresses necessitation, and there is no necessitation in this case. On the other hand, both 'If this match were struck, it wouldn't fail to light' and 'If this match were struck, it couldn't fail to light' are true, because here the antecedent does necessitate the consequent. Let us symbolize these necessitation conditionals as ' P Ã 0'. It is not invariably the case that the English locution "If P were true, 0 couldn't be false1 expresses necessitation. For example, suppose we have a match which has been soaked in water. We could reasonably say of such a match, 'If this match were struck, it couldn't light', but certainly its being struck would not necessitate its not lighting. What is happening here is that the modal statement 'This match couldn't light' is true, and the above conditional really has the force of an 'even if'
28
CHAPTER II
conditional: 'Even if this match were struck, it couldn't light'. The 'couldn't' attaches to the consequent rather than, as in necessitation statements, expressing a relation between antecedent and consequent.' To further confuse matters, we sometimes use the English locution "If it were true that P then it would be true that Q1 to express necessitation rather than the simple subjunctive. For example, suppose we have a broken-down old car whose engine is beyond repair. We might reasonably assert of this car (and equally of any car), 'Its engine would not run if it had a broken piston, but it is not the case that its engine would not run if the car had a flat tire'. But, shifting gears mentally, we may also affirm, 'The engine would not run if the car had a flat tire', because we know that the engine will not run no matter what, and hence would not run even if the car had a flat tire. We seem to be contradicting ourselves here. We are both affirming and denying 'The engine would not run if the car had a flat tire'. But we are not really contradicting ourselves. What we are saying is that (1) the engine would not run even if the car had a flat tire, but (2) the engine's not running would not be necessitated by the car's having a flat tire. Apparently the English words 'If it were true that P, then it would be true that Q' are used ambiguously, most often to express simple subjunctives, but sometimes to express necessitation. These ambiguities in the English locutions used to express subjunctive conditionals have, I suspect, contributed significantly to the difficulties in analyzing subjunctive conditionals. They mean, in particular, that we cannot introduce '>' and '>>' as simple paraphrases of the English locutions If it were true that P then it would be true that Q1 and "If it were true that P, it could not be false that Q1. This is no particular hardship, because the concepts themselves seem to make good intuitive sense. We have no difficulty recognizing that the English words do on occasion mean different things. We have distinguished between simple subjunctives, 'even if conditionals and necessitation conditionals. These are three of the four kinds of conditionals referred to in the title of the chapter. The remaining kind of conditional is the 'might be' conditional: "If it were true that P, it might be true that Q1. In the following sections each of these conditionals will be discussed in turn, and the attempt will be made to make each clearer and explore its relation to the others.
.
FOUR KINDS OF CONDITIONALS
29
Let us begin with 'even if subjunctive conditionals. Let us symbolize Q even if P1 as '"QEP1. What does it mean to say that Q would be true even if P were true? At the very least, this implies that Q is true now. One might think it also implies that P is false, because we would not ordinarily say ' 0 would still be true even if P were true' if we knew that P is true. But this is one of Grice's conversational implicatures rather than a logical implication. We frequently have occasion to judge that "QEP1 is true in cases in which we do not know whether P is true. If we later discover that P was true, this does not show that we were wrong in thinking that Q would still be true even if P were true. On the contrary, this would seem to show that we were right. Consequently, "QEP1 does not entail "-P1. Q E P 1 entails 0,but obviously it entails much more besides. To get at what else it entails, consider some examples:
(1)
My car would still be white even if the maple tree in my front yard died.
(2)
Match m would still be dry even if it were struck.
(3)
-Match m would not light even if it were struck.
(4)
-This bird would still be black even if it were not a raven.
(5)
-The Japanese current would still run alongside Japan even if Japan were only fifty miles from Alaska.
The reason my car would still be white even if my maple tree were to die is that the death of the tree does not enter into anything which would necessitate my car's not being white. Analogously, match m would still be dry even if it were struck, because striking it does not enter into anything which would necessitate its not being dry. On the other hand, striking match m does enter into something which would necessitate its not being the case that it does not light, so it is false that match m would not light even if struck. These examples suggest, as a first approximation, that "QEP1 is equivalent to
But this is surely too strong. It is always possible to find some R which,
30
CHAPTER I1
together with P, would necessitate "-0'. For example, R could be "(P 2 -Q)'. A natural suggestion to remedy this would be to require that R is something that would be true if P were true:
Q & -(3R)[(P & R. >>-Q) & ( P > R)]. But this does not work for cases (4)-(6). If I point to a raven and say 'This bird would still be black even if it were not a raven', I am wrong. If it were not a raven, it might be a cardinal, or a bluejay, or all sorts of non-black birds. But there is nothing that would be true of the bird which, together with its not being a raven, would necessitate its not being black. On the contrary, it would be black if, for example, it were a crow. The reason it is false that the bird would still be black even if it were not a raven is that there are all sorts of things that might be true of it (e.g., being a cardinal) which, together with its not being a raven, would necessitate its being non-black. Analogously, it need not be the case that the Japanese current would still run alongside Japan even if Japan were only fifty miles from Alaska, because there are different ways in which Japan might be only fifty miles from Alaska. If this resulted from the Pacific Ocean's being only fifty miles wide, then perhaps the Japanese Current would still run alongside Japan (this would depend upon general oceanographic laws which are probably unknown at this time). But if Japan were only fifty miles from Alaska but a thousand miles from Asia, then the Japanese current would not run alongside Japan. In both of these cases, we infer "-(QEP)' because "(3R)[(P & R. Ã -Q) & (R might be true if P were true)]' is true. These examples support the following equivalence, which I think is correct:
(3.1)
rQEP' is equivalent to "Q & -(3R)[(P & R. >> -Q) & (R might be true if P were true)]'.
There are certain example which appear to be counter-examples to this analysis. For example, we might say of a person 'He would be fired even if he drank just a little' without meaning to imply that he will be fired. This appears to be an example in which '^QEP1 does not entail Q . ~However, I do not think that this is a real counter-example. 'He would be fired even if he drank just a little' is a shortened form of 'If he drank, he would be fired even if he drank just a little'. The latter
FOUR KINDS OF CONDITIONALS
31
can be symbolized as " D > (FEL)'. Of course, one might reply that this doesn't make any difference. Whether 'He would be fired even if he drank just a little' is short for something else or not, it is not in accord with our analysis. If one wishes to take that line, then the reply is simply that we are not attempting to analyze all possible uses of 'even if'. We are merely analyzing what is in some sense 'the standard use' of 'even if'. Now let us consider the logical properties of 'even if'. It seems clear that all of the following principles ought to hold:
(3.2)
QEP&(Q-^R).3REP.
(3.3) (3.4)
QEP & REP. 3 (0 & R ) E P .
(3.5) (3.6)
QEP&RE(P&Q).=REP. PEQ & PER. 3 P E ( Q v R ) . (P&Q)->QEP.
However, these cannot be proven until we know something about the logical properties of 'might be'.
We were forced to use conditionals of the form " 0 might be true if P were true1 in analyzing "QEP1, so let us turn to their analysis next. We can symbolize them as "QMP1. To say that 0 might be true if P were true seems to be the same as saying that it is not the case that 0 would definitely be false if P were true, which suggests the following analysis:
(4.1)
( Q M P )= -(P > -0).
This would be unexceptionable were it not that many philosophers have thought that from the falsity of '^Q would be true if P were true1 it follows that Q would be false if P were true:
Certainly "QMP1 does not entail " ( P > Q)', so either 4.1 fails, or 4.2 does not hold. Which is the culprit? I think it is fairly easy to see that 4.2 is false. If 4.2 were true, then
32
C H A P T E R I1
( P > 0 )v ( P > - Q ) l would be a truth of logic. Clearly, ^QMP1 is incompatible with " ( P > -a)-'.If Q would be false if P were true, then it is not the case that Q might be true if P were true:
Consequently:
Thus if 4.2 were true, it would be impossible to have both "QMP1 and ' ( - Q ) M P 1 true. But as we have seen, this is not impossible. For example, it is true both that if the temperature outside now were not 30' then it might be 40°and that it might be something other than 40'. Thus 4.2 cannot be true. It is not difficult to see why 4.2 fails. What ^ P > Q1 says is, roughly, that whatever might be the case if P were true, Q would be true.3 Insofar as there are different things that might be the case if P were true, it can happen that neither 0 nor r-Q1 would be true in all the circumstances that might occur if P were true. Should we conclude then that 4.1 is true? 4.1 seems right except possibly for one case. It seems that the following entailment should hold:
This will result from 4.1 except in the case where we have both P > Q1 and " P > - Q 1 true. It will be seen later that this can happen only when P is logically impossible. My intuitions fail me in this case. We could ensure the truth of 4.5 by analyzing "QMP1 as: (4.6)
-
( Q M P ) = [ ( P> 0 )v -(P > Q ) ]
rather than as in 4.1, but I see no clear way to decide which analysis is correct. I suspect that our use of the English phrase 'might be' is simply undetermined in the case where both " P > Q1 and ^ P > - Q 1 are true, in which case it makes no difference which way we analyze ^QMP1. Furthermore, it will result that it makes no difference to the analysis of 'even if which analysis of 'might be' is adopted. Consequently, I propose to adopt the simpler analysis and take 4.1 as defining ^QMP1.
FOUR KINDS OF CONDITIONALS
33
Now let us turn to conditionals expressing necessitation. How can we analyze ^PÃ Q1? Clearly, this entails ^P> Q1, but equally clearly, this is not a sufficient condition for the truth of ^PÃ Q1. This is due to the fact that ^P> Q1 may be true just because Q is already true and P's being true is irrelevant to the truth of Q and hence would not interfere with the truth of Q. This suggests that what is required for the truth of P Ã Q1 is not just that "P > Q1 is true, but that "P > Q1 would still be true even if Q weren't already true:
However, this is too strong a requirement. The difficulty arises in the case in which P and Q are both true. Characteristically, P only necessitates Q because certain other propositions are true and satisfy whatever other conditions are required for the validity of detachment in necessitation conditionals. For example, consider a match which lit (L) since it was struck (S). So "S Ã L1, S, and L are all true. However, a necessary condition for the match to light when struck is (we can suppose) its being dry. Consequently, if the match had not lit, this could be either because it was not struck or because it was not dry. That is, ^(-D)M-L1 is true. But had the match not been dry, then it would not have lit if struck, i.e., r S > L1 would have been false. Thus which might have been true if the match there is something (^-p) h a d v t and which is such that had it been true then "S > L1 would have been false. So ^ ( S > L)E-L1 is false, and hence this is too strong a requirement for the truth of "S Ã L1. The reason that ^ ( P> Q)E -Q1 will not in general be true if ^ P Ã Q1 is true is that r P > Q1 is only true because certain other propositions are also true which, in collaboration with P, bring about the truth of Q. On the hypothesis that Q is false, all that we can conclude is that either P is false or one of these collateral truths is false, and in the latter case ^ P > Q1 would not normally remain true. We can handle this difficulty by, in effect, building into our counterfactual hypothesis an explanation for why Q is false. My proposal is that our analysis should be:
34
CHAPTER I1
Whereas the hypothesis '-Q1 leaves open whether it is P or one of the collateral truths that is false, the hypothesis '-P & -Q1 tells us which is false. For example, if the match were not struck and did not light, then it would still have been dry, and so had it been struck it would have lit. Thus it seems to me that 5.1 captures precisely what is meant by the necessitation conditional. Notice that what 5.1 requires is just that '"P> Q1 would be true even if it were a counterfactual. There appears to be a certain circularity in the analyses of necessitation conditionals and 'even if' conditionals. We have analyzed necessitation conditionals in terms of 'even if' conditionals and the simple subjunctive, but in analyzing 'even if' conditionals we used necessitation conditionals. Fortunately, this circularity can be untangled. It will be shown in Section 6 that the above analyses entail simpler analyses that use only the simple subjunctive, thus avoiding the circularity. Next let us consider what logical inferences ought to be valid for necessitation conditionals. Logical entailment is a special case of necessitation. If P entails 0 , then certainly P's being true would necessitate Q's being true: (5.2) (P-^('?)~(PÈ~) Obviously a necessitation conditional entails a simple subjunctive: (5.3) (PÈQ)~(P>Q) It will be a consequence of this analysis that all counterfactual conditionals express necessitation:
This is in accord with the intuition that there are just two ways for a simple subjunctive to be true: (1) the conclusion may be true already, and the antecedent's being true would not interfere with that; (2) the antecedent's being true would necessitate the conclusion's being true. If the conditional is counterfactual, so that the conclusion is false, then possibility (1) is eliminated and the only way the conditional can be true is if the antecedent necessitates the consequent. A surprisingly wide variety of standard principles fail for necessitation conditionals. To begin with, transitivity fails: (5.5)
It is not true in general that if '"(P Ã 0 ) & ( 0 Ã R)' is true then ^(PÃ R ) l is true.
FOUR KINDS OF CONDITIONALS
35
For example, suppose we have a stick of old dynamite which would explode (E) if dropped (D), but which would not explode if first soaked in water (W). Then we have "(W & D. >> D)' and "(D >> E)', but we do not have "(W & D. ÈE)' Nor do we have the Consequence Principle that if P necessitates 0 , then P necessitates anything entailed by 0 : (5.6)
It is not true in general that if " ( P Ã 0 ) & ( 0 + R)' is true then "(P >> R)' is true.
For example, it is presumably true that my car would still be white even if the maple tree in my front yard were to die. Thus if that maple tree were to die, the conjunction 'My car is white and the maple tree in my front yard died' would be true. This is a counterfactual conditional, so it is an instance of necessitation. The consequent entails 'My car is white', but clearly this is not necessitated by the maple tree's dying. The problem is that P may necessitate a conjunction (i.e., bring it about that the conjunction is true) by making one conjunct true, where the other conjunct is already true and would still be true even if P were true; but just because P necessitates the conjunction, it does not follow that P necessitates each conjunct. In light of 5.6, it might be objected that our analysis of necessitation is defective. It may be felt that insofar as P necessitates a conclusion, it must necessitate all of that conclusion. One can certainly define such a notion of 'strong necessitation' in terms of our present notion of necessitation: (5.7)
(P>>>Q)=(R)[(Q+ R)^(P>> R)].
However, I suspect that strong necessitation is not a useful concept. For example, it is not the case that if the truth of P would cause the truth of 0 , then P strongly necessitates 0 . This is because 0 will characteristically entail all sorts of things to which P is irrelevant. For example, my pushing a button may cause a certain stick of dynamite to explode. The statement 'Stick d of dynamite explodes' entails 'There is dynamite in the world', but the latter is certainly not necessitated by my pushing the button. The notion of necessitation that I am trying to analyse here is that of the truth of one statement 'bringing it about' that another statement is
36
CHAPTER II
true. Characteristically, the truth of the second statement will be brought about by making true whatever parts of it are not already true and leaving unchanged those parts that are already true. Then the latter parts will not be necessitated by the first statement. In connection with 5.6, one should realize that necessitations are not 'necessary connections', except perhaps in a very weak sense. It can be completely accidental that one statement necessitates another, because it is accidental that the unnecessitated part of the second statement is true. Perhaps the term 'necessitation' is inappropriate for the notion I have in mind here, but I have been unable to find a better term. Contraposition fails for the same sort of reason that the Consequence Principle fails:
(5.8)
It is not true in general that if " ( P Ã Q)' is true then is true.
( - 0 Ã -P)'
For example, although my tree's dying ( D ) would necessitate the conjunction that my tree died and my car is white ( r D & W ) , its being false that both my tree died and my car is white would not necessitate that my tree did not die, i.e., " - ( D & W) Ã - D l is not true. This is 'My tree died because if ^D & W were true, then if the conj~~nction and my car is white' were false, it might be false in either of two ways: it might be false that my tree died, and it might be false that my car is white. Only the first of these requires that my tree did not die, so the falsity of the conjunction does not necessitate that my tree did not die. Surprisingly, adjunctivity fails:
(5.9)
It is not true in general that if '"(PÃ 0 )& ( P Ã R)' is true, then " P >> ( Q & R)' is true.
The best way to muster intuitions against the principle of adjunctivity is to concentrate on those instances of necessitation which are causal. Our intuitions seem to be clearest in those cases. Those cases occur when P's being true would be causally sufficient for Q to be true. Making this precise, P is causally sufficient for Q iff P's being true would cause Q to be true if P and Q weren't already true. There is a wide variety of cases in which P would necessitate Q iff P would be causally sufficient for 0. The following counterexample to the principle of adjunctivity is such a case. Suppose we have a button and two lights
FOUR KINDS O F CONDITIONALS
37
A and B. If light A is off, pushing the button results in light A's being on and light B's being off; if light A is on, pushing the button results in light A's remaining on and light B's being on. Suppose both lights are on. The button's being pushed is causally sufficient for light A to be on. Furthermore, because light A is on, pushing the button is also causally sufficient for light B to be on - that is, if light B were not on, pushing the button would cause it to come on. But pushing the button is not causally sufficient for both lights to be on; if they were not both on, then light A might be off, in which case pushing the button would cause only light A to be on. Although adjunctivity fails, it will turn out later that a weakened version of adjunctivity does hold:
The principle of dilemma also fails for necessitation conditionals: (5.11)
It is not true in general that if then "(P v Q) >> R1 is true.
>> R ) & (Q Ã R ) l is true
For example, consider a complicated circuit consisting of a light and three switches A, B, and C. If switches A and B are both closed then the light is on, and if switch C is closed then the light is on. Switches A and B can be operated independently of one another, however there is an interlock system which prevents switch C from being closed unless switch A is already closed. Suppose in fact that all three switches are closed and the light is on. Switch C's being closed necessitates that the light is on, i.e., "C >> L1 is true. Because switch A is closed, switch B's being closed also necessitates that the light is on, i.e., ^ B Ã L1 is true. But " ( Bv C) >> L1 is false. This is because if the light were off and switches B and C were both open (i.e., if ^-(B v C) & -L1 were true), then as switch C would be open, switch A would also have to be open, making "B > L1 false, and hence r(B v C)> L1 would be false. Once again, it will turn out that a weakened version of the principle of dilemma is true:
38
CHAPTER 11
It is apparent that 'Èis a peculiar kind of conditional. We must be quite careful about assuming that it satisfies standard logical laws when it does not.
We analyzed necessitation in terms of 'even if7 and simple subjunctives, we analyzed 'even if in terms of necessitation and 'might be', and we analyzed 'might be' in terms of the simple subjunctive. I have promised to untangle the circularity between necessitation and 'even if', so what remains is to give an analysis of simple subjunctives. That task will be relegated to the next three chapters. In the meantime, and preparatory to giving a full-fledged analysis, we can state some principles regarding when simple subjunctives are true and when they are false, but these principles merely relate simple subjunctives to one another rather than defining them in terms of something new. Thus, in effect, all we are doing is providing some axioms for simple subjunct i v e ~ Nevertheless, .~ I think that these axioms will throw considerable light on simple subjunctives, and will yield some interesting results concerning the relations between our four kinds of conditionals. Furthermore, the axioms will provide a condition of adequacy against which a putative analysis of simple subjunctives can be tested. If such an analysis does not make the axioms true, this is a reason for doubting the analysis. It is possible to say a great deal about the circumstances under which simple subjunctives are true. First, suppose someone asserts ^ ( P> Q)', and upon investigation we find that P and 0 are both true. This verifies the assertion that if P were true, 0 would be true; because P is true, and sure enough, 0 is true too. For example, we might affirm of a political candidate, 'If he were elected, he would end the war'. If the candidate is subsequently elected, and does end the war, this shows that we were right. So we have:
Admittedly, there is something odd about asserting a subjunctive conditional when we already know that the antecedent and consequent
FOUR KINDS OF CONDITIONALS
39
are true. If we already know that they are true, there is no good reason to use the subjunctive mood. But, to paraphrase Grice, this is a remark about conversation and not about logic. And I suspect that a further source of reluctance to accept 6.1 results from confusing the simple subjunctive with necessitation conditionals. 6.1 clearly fails for the latter. It seems evident that entailments generate subjunctive conditionals:
And it seems clear that adjunctivity ought to hold for simple subjunctives:
(6.3)
( P > Q ) & ( P > R ) . ^[P > ( Q & R ) ] .
Dilemma ought to hold too: if R would be true if P were true, and R would be true if Q were true, then R would be true if either P or Q were true:
(6.4)
( P > R ) & ( Q > R ) . 3 [ ( P vQ ) > R ] .
If Q would be true if P were true, then anything logically entailed by Q would be true if P were true:
(6.5)
( P > Q ) & ( Q -+ R ) . 2 ( P > R ) .
Obviously:
(6.6)
( P > Q ) 3 ( P 3 0).
Taking '-' to symbolize logical equivalence:
Simple subjunctives are 'defeasible' in the sense that we can have ( P > 0)' true, but for some R , ^ ( P & R . > Q)' false. However, there is at least one case in which conjoining R with P cannot defeat the conditional. If R would be true if P were true, then, in some sense, '"(P & R)' being true is not a different circumstance from P being true, so if 0 would be true if P were true, then Q would also be true if " ( P & R)' were true:
40
CHAPTER I1
There are some natural laws that do not hold for simple subjunctives. Transitivity fails. Consider our stick of old dynamite again. If it were dropped, it would explode; and if it were soaked in water and then dropped, it would be dropped; but it is not the case that if it were soaked in water and then dropped, it would explode. Similarly, contraposition fails. It might be true that it would not rain if the witch doctor did a rain dance, but false that if it were to rain he would not have done a rain dance. From 6.1-6.8 we can derive a series of interesting theorems. First, we obtain theorems which enable us to disentangle the circularity in the definitions of ' E ' and 'È'
Proof: Suppose "P Ã Q1 is true. Then " ( P >Q)E(-P & -Q)' so " P > Q 1 is true.
is true,
Proof: Suppose ^P & Q. > R1 is true. R entails " 0 3 R1, so by 6.5, "(P & Q )> (0=i R)" is true. " ( P & -Q)l entails " Q 3 R1, so by 6.2, "(P& -0)> ( 03 R)' is true. Thus by 6.4, "(P & -Q. v .P & Q )> ( Q ^R)" is true. So by 6.7, " P > .Q ^R^ is true. to symbolize logical necessity: Using '0'
Proof: Suppose 'UQ1 is true. Then Q is true. Suppose "(P & R ) Ã -Q1 is true. Then by 6.9, "(P & R)>-Q1 is true. As "DQ1 is true, - Q 1 entails "-R1, so by 6.5, "(P & R ) > - R 1 is true. By 6.10, " P > .R 3 -R1 is true, so by 6.5, " P > -R1 is true. Therefore, "QEP1 is true.
Proof: " P + Q1 entails "P > Q1, so "(P+ Q ) ^D(P > Q)' is true ( I assume that logical necessity satisfies the axioms of S4), By 6.11, D ( P > Q ) 3 (P > Q ) E Q' is true, so "(P + Q )2 ( P >> Q)' is true.
-
(6.13)
(QEP)s[Q & (P > Q ) ] .
FOUR KINDS OF CONDITIONALS
41
Proof: Suppose "QEP1 is true. Then Q is true. and (VR)'T(P & R. >> -Q)^(P> -R)I1 is true. But "P & -Q1 entails '-Q', so by 6.12, ^(P & -0) >> -Q1 is true. Thus " P > --Q1 is true, and hence by 6.5, ^ P > Q1 is true. Conversely, suppose " Q & ( P > Q)' is true. Suppose " ( P & R ) >> -Q1 is true. By 6.9, "(P & R ) > -Q1 is true. So by 6.10, ^ P > .R 3 -Q1 is true. By 6.3, " P > [ Q & ( R 3-Q)]' is true, so by 6.5, P > - R 1 is true. Thus "QEP1 is true. Theorem 6.13 allows us to break the circle between necessitation and 'even if and define both in terms of the simple subjunctive. 6.13 gives us the definition of 'even if', and consequently we have:
It is now a simple matter to derive principles 5.2-5.5, 5.10 and 5.12 which were listed above for necessitation conditionals, and principles 3.2-3.6 for 'even if'. Although we introduced the simple subjunctive in terms of axioms rather than by giving a definition of it, we can now prove that it is definable in terms of our other kinds of conditionals:
Proof: Suppose "P > Q' is true. Then "P 3 Q1 is true, and as Q entails P 3 Q1, rP > (P =3 Q ) l is true. Thus by 6.13, " ( P =3 Q)EP1is true. And "P & (P 3 0)' entails Q, so by theorem 6.12, '"[P& (P 3 Q ) ]>> Q1 is true. Conversely, suppose "REP & (P & R. Ã Q ) l is true. By (6.13), P > R 1 is true, By 6.9 and 6.10, " P > ( R =3 Q)' is true, so by 6.3 and 6.5, " P > Q1 is true. From 6.14 we immediately obtain:
This means that a counterfactual always expresses necessitation. This, together with the fact that counterfactuals are those subjunctive conditionals that philosophers have most often thought about, explains the pervasive view that subjunctive conditionals always express necessitation.
42
C H A P T E R 11
From 6.13 and 6.16 we can see that the simple subjunctive is just the disjunction of necessitation and 'even if':
Thus there are just these two ways that the simple subjunctive can be true. Either Q is made true by P, or Q is already true and P would not disrupt this. This is a very illuminating theorem. It explains why the logic of '>' is so peculiar, being, as it is, a mixture of two such different concepts. Each of our four kinds of conditionals is explicitly definable in terms of each of the others. This follows from the fact, already established, that each kind of conditional is definable in terms of the simple subjunctive, together with the following theorem according to which the simple subjunctive is definable in terms of each of the other kinds of conditionals: (6.18)
" P > Q1 is equivalent to each of the following: (i)
"-[(-
(ii)
"(P 3 Q)EP1;
Q)MPll;
(iii) "P >> (P 3 0)'. Consequently, if we can provide an analysis of any of these kinds of conditionals, analyses of the others will follow.
Principles 6.1-6.8, in effect, constitute an axiomatization of simple subjunctives. However, principles 6.2, 6.5, and 6.7 employ the concept of entailment, and thus require a modal logic for the underlying logic rather than just the propositional calculus. We can instead replace those principles by rules of inference in a more restrictive language in which entailment cannot be expressed. Let SS be the formal theory whose axioms and rules are as follows: A1
All tautologies.
42
(P>Q)&(P>R).^[P>(Q&R)].
A3
( P > R ) & (Q>R).^[(PvQ)>R].
FOUR KINDS OF CONDITIONALS
R1
If P and "(P10)'
R2
If "(P 2 Q)' is a theorem, so is " ( P > Q)'.
R3
If "(Q 3 R)' is a theorem, so is '(P > Q ) 2 ( P > R)'.
R4
If '(P = Q)' is a theorem, so is "(P > R )
are theorems, so is 0 .
(0> R)'.
I conjecture -that SS contains as theorems all true principles regarding simple subjunctives that can. be formulated in this language. It is of interest to compare SS with other well known theories of subjunctive conditionals. The best known such theories are C l of Lewis (1972) and CQ of Stalnaker (1968). SS is contained in C l which is contained in CQ. CQ contains the theorem " ( P > Q ) v ( P > -Q)l, which we have rejected. SS is weaker than C l . C l can be obtained from SS by adding the following axiom:
Unfortunately, this axiom is false. This axiom would be valid only if the ordering of possible worlds according to magnitude of change were connected, and we saw in Chapter I that it is not. We can construct counterexamples to 7.1 using the same constructions that showed the ordering not to be connected. Let S, T, and U be any three unrelated false statements, e.g., 'My car is painted black', 'My garbage can blew over', and 'My maple tree died'. The following is a substitution instance of 7.1:
From 7.2 we readily obtain the principle:
The color of my car and the state of my garbagecan are irrelevant (we can suppose) to the state of my tree, so my tree would not die even if either my car were painted black or my garbage can blew over; hence U M ( S v T)' is false. But the antecedent of 7.3 is true. Disjunctions
44
C H A P T E R I1
whose disjuncts are unrelated to one another cannot necessitate either disjunct. If we know that the disjunction is true, that leaves open which disjunct is true. In particular, it is not true that if either my car were painted black or my garbage can blew over then my car would be painted black; and it is not true that if either my car were painted black and my maple tree died, or my garbage can blew over, then my garbage can would have blown over. Thus 7.3 is false. Consequently, as we would expect from its semantics, C l is too strong. We have seen that the stronger brethren of SS contain invalid axioms. Is it true then that SS is complete? We are not really in a position to answer this question, because we do not have a satisfactory semantics for subjunctive conditionals. However, it is possible to construct a semantics for SS based upon the Stalnaker-Lewis semantics. Stalnaker has pointed out5 that SS is complete on a semantics which is identical to Lewis' 'Analysis 2' except that the ordering of equivalence classes of worlds is only a partial ordering rather than a simple ordering. In light of the discussion of the ordering of worlds in Chapter I, this is at least suggestive of the 'real completeness' of SS.
We have distinguished between four different kinds of subjunctive conditionals and explored their logical properties. Of these, it is particularly important to distinguish between the simple subjunctive and the necessitation conditional. Philosophers have often rejected theses about the simple subjunctive by appealing to intuitions only appropriate to necessitation. It has been shown that our other kinds of conditionals are definable in terms of the simple subjunctive, so if we can provide an analysis of the simple subjunctive we will have an analysis of them all. And we have a set of axioms for the simple subjunctive. These will be of importance in testing any putative analysis. If such an analysis does not make these axioms valid, this is at least a reason for being suspicious of the analysis. NOTES
'
The modality expressed here by 'couldn't', as in 'The match couldn't light', is an intriguing one. Although, as is well known, a modality can be defined in terms of
F O U R KINDS O F C O N D I T I O N A L S
45
subjunctive conditionals (as rP>(Q & -Q)'), this modality is not the same as 'couldn't'. It will result from the discussion of Chapter IV that if rP > (Q & W Q )is~true, then P is necessarily false. Thus this cannot be a correct analysis OF 'couldn't' as it occurs in this context. A different analysis will be proposed in Chapter I11 in terms of 'actual necessity'. * I am indebted to David Lewis for this example. O r more precisely, Q is true in every world that might be actual if P were true. In the choice of these axioms, I have been influenced by the work of David Lewis. In correspondence.
CHAPTER I11
SUBJUNCTIVE GENERALIZATIONS
An understanding of laws is fundamental to an understanding of subjunctive conditionals. I argued in Chapter I that laws are themselves subjunctive. A law says not just that all actualA's are B's, but that any A would be a B. Statements of this latter form will be called subjunctive generalizations. They are to be contrasted with material generalizations which have the form ^(x)(Ax3Bx)'. Our problem in this chapter is to understand subjunctive generalizations. Subjunctive generalizations look initially like universally quantified subjunctive conditionals of some sort. There is more than one kind of subjunctive conditional, but the most obvious candidate is the simple subjunctive. According to this proposal, the subjunctive generalization "Any A would be a B1 is equivalent to "(x)(Ax > BX)'.' There are several difficulties for this proposal. The first difficulty is that the simple subjunctive is automatically true if both antecedent and consequent are true, i.e., ' ( P & Q)' entails ^ ( P > 0)'. It would seem to follow that ^(x)(Ax & Bx)' entails "(x)(Ax > Bx)l. But it is quite clear that "(x)(Ax & Bx)' does not entail "Any A would be a 6'. For example, consider Black Bart who dislikes everything except redheaded women who like him. Regrettably, there are no redheaded women who like Black Bart, although were there any he would like them. If we let A be a tautological predicate and B be the predicate 'is disliked by Black Bart', then ^(x)(Ax & Bx)' is true. But it is certainly not true that anything at all would be disliked by Black Bart. On the contrary, we know of something that would not be disliked by Black Bart - namely, redheaded women who like him. Thus it appears that this analysis of subjunctive generalization fails. Apparently a subjunctive generalization cannot be expressed as "(x)(Ax > Bx)'. We have a genuinely different subjunctive here. It is still natural to symbolize it as a universally quantified conditional, but
47
SUBJUNCTIVE GENERALIZATIONS
we must use some conditional other than the simple subjunctive. Let us use '$' to symbolize "Any A would be a B1 as "(^(Ax 3 Bx)'. But this leads to unexpected difficulties. What do the quantifiers range over? They cannot be taken as ranging over just existent objects, because rAny A would be a B1 entails the simple subjunctive '(Aa > Ba)1 even for non-existent objects. For example, if I know that any raven would be black, I can conclude that if there were a raven in the next room, it would be black. This is connected with the problem just discussed for symbolizing the subjunctive generalization as . problem seems to be in part that the subjunctive ( x ) ( A x > B X ) ~The generalization is not just about all actual objects, and hence it cannot be symbolized as r(x)(Ax3Bx)' because the quantifier in this formula does range only over actual objects. A natural suggestion is that the quantifier in a subjunctive generalization ranges over all physically possible objects. For example, when we say that any raven would be black, we are claiming not just that any actual object would be black if it were a raven, but that any possible raven would be black. Putting the suggestion this way makes it sound metaphysically outrageous. What is this domain of possible objects over which the quantifiers are supposed to range? However, the suggestion can be rephrased in a way that makes it appear much more reasonable. Let us say that a proposition is physically possible, and symbolize this as ^OPT,just in case there is no physical law that is P
inconsistent with it. We then define physical necessity, rUP',
as
P
meaning r-O-P1.
Talk about quantification over physically possible
P
objects can then be replaced by talk about the physical necessity of the result of quantifying over actual objects. In other words, the proposal is that "Any A would be a B1 can be analyzed as rO(x)(Ax 3 Bx)l. P
A serious difficulty for this proposal is that we have defined physical necessity in terms of the notion of a physical law, but physical laws are subjunctive generalizations, and so it may (and in fact will - see below) turn out that this proposal is ultimately circular. However, there is another more serious difficulty - even if we could ignore the circularity, the proposed analysis is false. This can be seen by considering subjunctive generalizations whose antecedents are physical impossible. Let us
48
CHAPTER 1111
call these counter-legals, following Nelson Goodman (1955). For example, suppose it is true, as has been conjectured, that any pulsar would be a neutron star. Then it is true (non-vacuously) that any pulsar which is not a neutron star would emit periodic bursts of radiation, but it is false that any pulsar which is not a neutron star would be a neutron star. In other words, we distinguish between true and false counter-legals. They are not all vacuously true merely because the antecedents are false of all physically possible objects. Furthermore, such counter-legals entail subjunctive conditionals about physically impossible objects. For example, the above generalization entails 'If there were a pulsar just to the left of Alpha Centauri which was not a neutron star, it would emit periodic bursts of radiation'. This seems to indicate that these subjunctive generalizations are not just about physically possible objects after all. What they are about seems to be a function partly of the nature of the antecedent. I think that the attempt to analyze subjunctive generalizations in this way as universally quantified subjunctive conditionals is essentially bankrupt, for the reasons just outlined. I believe it will prove more fruitful to turn directly to an examination of the way subjunctive generalizations work rather than trying to reduce them to something else. Once we have an account of the way subjunctive generalizations work, it will prove possible to characterize them in terms of physical necessity and quantification in ways more complicated than those considered above. However, such characterizations will still not constitute analyses because of the circularity involved in appealing to physical necessity.
To simplify our writing tasks, let us symbolize "Any A would be a B1 as "Ax 3 Bxl. This is intended merely as a symbolization, and in no way reflects an analysis of these subjunctive generalizations., Notice that in this symbolization, '3' is not a conditional - it connects predicates, or more generally open formulas, rather than sentences. It is actually a binary variable-binding operator, binding, as it does, the free variables in "Ax1 and "Bxl. More generally, whenever (p and (//
SUBJUNCTIVE GENERALIZATIONS
49
are open formulas containing the same n free variables, "(p 3 says that any n-tuple satisfying
50
C H A P T E R 111
different kinds of subjunctive generalizations. When we think of subjunctive generalizations, we think most naturally of laws. Those laws that we confirm inductively must have physically possible antecedents (otherwise there would be no confirming instances for them), so the problem of interpreting the apparent quantifiers in them is greatly simplified. Insofar as "0('3x)Ax1 is true, we can think of "A 3 B1 as P
saying simply that any physically possible A would be a B. These law statements, which are in some sense about all physically possible objects, constitute one kind of subjunctive generalization. But philosophers have often overlooked the fact that there is another kind of subjunctive generalization. For example, Chisholm (1955) g'ives as an example of a law, 'Anyone who drank from that bottle would be poisoned', said of a certain bottle containing.poison. But this is obviously not a law. It is not the least bit plausible to regard Chisholm's generalization as saying that laws of nature dictate that any person who drank from that bottle would be poisoned. It would be quite possible to have someone drink from the bottle without being poisoned - all we would have to do would be to wash the bottle out first and fill it with water. And although we are well aware of this, we still regard Chisholm's generalization as true. We have distinguished between the subjunctive generalization '^(Ax^Bx)^ and the universally quantified subjunctive conditional ^(x)(Ax> B x y . It might be supposed that Chisholm's generalization actually has the latter form rather than being a true subjunctive generalization. That it does not can be seen as follows. Suppose there were no living creatures. Under these circumstances, there would be nothing which logically could be a person who drinks from the bottle. I am supposing here that it is a necessary feature of a non-living thing that it not be a person. The basic sortals under which non-living things fall are such as to logically preclude their being persons. This is not to say that a non-livingthing could not turn into a person, but this would be a substantial change and would involve the original object ceasing to exist and the person coming into existence. If a person is created out of clay, the clay ceases to exist. We do not say that the clay 'is now a person7. Assuming this is correct, in a world in which there are no living creatures, r(x)\Z\-Axl is true (where "Ax1 is x' is a person who
SUBJUNCTIVE GENERALIZATIONS
51
drinks from this bottle1). It is a consequence of our previous account of subjunctive conditionals that the following is valid: ' 0 - P 2 (P > Q)'. Presumably then, we also have ^(x)D-Ax =i (x)(Ax> Bx)' for any predicate B at all. But this would make both 'Anyone who drank from this bottle would be poisoned' and 'Anyone who drank from this bottle would fail to be poisoned' vacuously true in a world in which there are no living things. However, they would not be vacuously true; the first would be true, and the second false. Consequently, Chisholm's generalization does not have the form '(x)(Ax> B x ) l . It is a genuine subjunctive generalization. Therefore, there are true subjunctive generalizations that are not laws. Although the generalization 'Anyone who drank from this bottle would be poisoned' is not about all physically possible persons who might drink from this bottle, it is in some sense about all 'actually possible' persons who might drink from that bottle. The reason we regard the generalization as true is that although we know a way to enable a person to drink from the bottle without being poisoned namely, wash the bottle out - we also know that that is not going to happen. The bottle is our container for rat poison, and it is where we are going to continue to keep the rat poison. It is because of these facts that anyone who drank from the bottle would be poisoned. And it is because of these facts that, although it is physically possible for a person to drink from the bottle without being poisoned, it is not 'actually possible' for a person to do so. This notion of actual possibility is not a modality with which philosophers are very familiar. It has been largely overlooked, although I suspect that it may be of some importance in explaining various locutions that philosophers have always found puzzling. And I think that, although it is less familiar, it is no more yspect than the notion of physical possibility. As we will see in section:%f it can be clearly defined in a manner which should make it philosophically respectable. Apparently there are these two distinct kinds of subjunctive generalizations, one kind being, in some sense, about all physically possible objects, and the other being only about all actually possible objects. Let us reserve ' 3 'for symbolizing the first kind, and let us calllthem strong subjunctive generalizations; let us use '3'to symbolize the second kind, and call them weak subjunctive generalizations. In
52
C H A P T E R 111
analyzing subjunctive generalizations, we would like to analyze both of these kinds of subjunctive generalizations. The basic scheme for analyzing them would seem to be the same. In each case we begin by confirming a basic set of generalizations inductively, and then others are derived logically from the basic ones. It will be apparent below that the inferences employed in deriving new generalizations from the basic ones are the same for both kinds of generalizations. Thus the difference between them must arise ultimately from the way the basic ones are inductively confirmed. We do not have to look far to find the source of that difference. Inductive confirmation is defeasible. A statement of evidence E may confirm a generalization H, and yet when an additional statement E' (e.g., one containing a counter-example to H ) is conjoined with E, the resulting conjunction may no longer confirm H. We say that E' is a defeater, and that it defeats the confirmation of H by E. Explaining how confirmation works is a matter of explaining both what statements constitute evidence and what statements constitute defeaters4. I think that the difference between strong and weak generalizations lies in what statements are defeaters for their inductive confirmation. There is a simple defeater for the confirmation of "A 3 B1 which is not a defeater for "A+B1. The former is about all physically possible objects, but the latter is not. Consequently, a defeater for "A 3 B1 which is not a defeater "A ^> B1 is "It is physically possible for there to be an A which is not a B1. To define the difference between '9'(law statements) and '^>' in terms of this defeater as I have just stated it is circular, because we defined physical possibility in terms of laws. But the circularity can be avoided by restating this defeater in terms of the grounds we have for thinking that it is true. These grounds are basically inductive. We discover inductively that certain kinds of things are always possible. For example, we might confirm inductively that the 9: 14 train from Boston is always late. This is a weak subjunctive generalization rather than a strong subjunctive generalization. It is certainly not physically impossible for a 9 : 14 train from Boston to arrive on time. For example, suppose the reason the train is always late is that the railroad has only old broken-down equipment. If the equipment were replaced by new equipment, the train would be on time. These observations
SUBJUNCTIVE GENERALIZATIONS
53
constitute a defeater for the strong generalization according to which it would be a law that the 9 : 14 train from Boston is always late, and it is because of this defeater that we only affirm the weak generalization. The way in which this defeater works is the following. We begin by ascertaining that it is always physically possible to replace old brokendown equipment with new equipment. We can ascertain this because it follows from our definition of physical possibility that whatever is true is possible. Thus if we know of a number of cases in which brokendown equipment has been replaced, we also know of a number of cases in which it is physically possible to replace broken-down equipment. So we have a number of confirming instances for the generalization that it is always physically possible to replace broken-down equipment. Of course, we also know of many cases in which broken-down equipment has not been replaced, but this is irrelevant to the induction. The only thing that would be relevant to the induction as a counterexample to what is being confirmed (that it is always physically possible to replace broken-down equipment) would be a case in which we know it is physically impossible to replace broken-down equipment. To have such a case, we would have to have confirmed a law which entails that the equipment could not be replaced in some particular case. We do not know of such a law, so we have confirmation for the generalization that it is always physically possible to replace broken-down equipment. This entails that it is physically possible to replace the broken-down equipment used on the 9 : 14 train from Boston. We are supposing we know that if the equipment were replaced, the train would be on time. And I assume the following principle regarding simple subjunctives:
Thus we can conclude that it is physically possible for the 9 : 14 train from Boston to be on time. Hence we have a defeater for the strong generalization that the 9 : 14 train is always late, and are left with only the weak generalization. This illustrates the basic scheme by which we defeat strong generalizations by having physically possible counter-examples. I will not attempt to give a general characterization of this scheme here, but that should not be too difficult to do. What has been said should be
54
C H A P T E R 111
sufficient to indicate the source of the difference between strong and weak subjunctive generalizations.
Now let us turn to the task of elucidating the inferences that allow us to derive new generalizations from those that we confirm directly by induction. Let us begin with strong generalizations. Those confirmed directly constitute the set N. Then additional generalizations are derived from these. There are inferences valid in the predicate calculus that are not valid here. For example, although ^(x)(Ax& Bx)' entails ( x ) ( A x=1 Bx)', we have seen that it does not entail '""(A 3 B)'. Upon reflection, however, it may seem that all those inferences in the predicate calculus which proceed exclusively from universally quantified conditionals to universally quantified conditions are valid. In other words, letting "Vip' stand for the universal closure of the open formula ip, it might be supposed that the following principle holds: (3.1)
If the universally quantified material conditionals 'V(P, => Q& . . ,rV(Pn3 Q x imply rV(R 3 S)' in the predicate calculus, then the strong generalizations Ql)', . . . ,'""(P,, 3 On)' entail ' ( R 3 S)'.
It can be shown that a necessary and sufficient condition for 3.1 to hold is for the following to hold:
At first, it may seem that 3.2-3.7 are unexceptionable. But they
SUBJUNCTIVE GENERALIZATIONS
55
cannot all hold. Jointly they entail: (3.8)
(P^Q)3(P&-Q.^>R).
Principle 3.8 says that a physically impossible antecedent implicates anything at all. There are clear counterexamples to 3.8. For example, suppose it is a law of nature that no human being can live for more than 200 years, and also (probably contrary to fact) that a human being acquires more and more knowledge the longer he lives. From this we can conclude: A three hundred year old human being would know more than a 100 year old human being. On the other hand, we cannot conclude: A three hundred year old human being would know less than a 100 year old human being. And yet both of these inferences would be licensed by 3.8. Thus 3.8 must be false. It is rather obvious what is wrong with 3.8. Principles 3.2-3.7 all hold as long as the antecedents of the generalizations are physically possible, i.e., are compatible with all physical laws. But if the antecedent of a generalization is incompatible with some physical law, then we cannot use that law in deciding what would happen if the antecedent were true. For example, if ' ( P 3 0)"' is true, we cannot assume this law in deciding what characteristics would be possessed by something satisfying '(P & -0)'. In particular, we cannot conclude that '[(P & - 0 ) 3 QI1is true. In deciding what an antecedent implicates, we must rule out all laws incompatible with that antecedent. For example, let us suppose that it is a law that all white dwarf stars must have radii smaller than that of the sun. We cannot conclude from this that any white dwarf larger than the sun would be smaller than the sun. On the other hand, we can non-vacuously conclude that a white dwarf having a radius larger than that of the sun would have a mass greater than that of the sun. This is because the laws used in deriving the latter conclusion are compatible with the supposition of a white dwarf having a radius larger than that of the sun. More generally, an antecedent may conflict with several laws conjointly but none individually. To illustrate, suppose we have the law
56
CHAPTER III
statements
For example (to compound our tale of science fiction) these might be the laws, 'Any white dwarf is a stellar object of radius smaller than the sun', 'Any stellar object of radius smaller than the sun is of luminosity less than Lo' (where Lo is some fixed luminosity), 'Anything of luminosity less than Lo cannot be seen at a distance greater than one billion light years', and 'Any white dwarf would be more massive than the sun'. Then "(P & -R)' does not conflict with any of these statements individually. Assuming these laws, we can conclude "(P & -R. 3 T)' (i.e., 'Any white dwarf of luminosity no less than Lo would be more massive than the sun'). This comes directly out of the law ( P ^> T)', which is compatible with the antecedent "(P & -R)'. But (i.e., 'Any white dwarf of we cannot conclude " ( P & -R. luminosity no less than Lo could not be seen at a distance greater than one billion light years'). This is because, in order to get this, we must use all three of '(P 3 Q)', "(Q 3 R)', and "(R 9 SIT, and although ( P & -R)' is compatible with each of these individually, it is incompatible with the set of them. How do we characterize which of these inferences are valid? We begin with the set N of basic laws that we confirm directly by induction. Then we want to obtain derivative laws from those in N. The above picture suggests that '(P 3 0 ) ' follows from N just in case there is some subset N p of N which is consistent with P and which, by some reasonable principles of inference, implies '(P 3 0)'. However, this is still not quite right. The difficulty is that there may be more than one such set Np, and different such sets might lead to different conclusions. For example, suppose we have in N the two laws "(A 3 B)' and " ( C a -B)l. The above principle would allow us to derive both '(A & C. 3 B)' and '(A & C. 3 -B)' from N. But this is clearly wrong. Under these circumstances, we should not be able to derive either conclusion from N. Or to take another example, suppose N is the set
We want to know whether '(P & T. 9 S)l follows from this. For
SUBJUNCTIVE GENERALIZATIONS
57
example, the first three laws might be as before, and the fourth might be 'Any pulsar is of luminosity at least Lo'. Intuitively, we cannot conclude that any white dwarf which is a pulsar could not be seen from a distance of one billion light years. This is because, although its being a white dwarf favors that conclusion, its being a pulsar blocks the normal way of inferring this conclusion from the fact that the object is a white dwarf. What is required for a conditional "(P 3 Q)' to be inferrable from N is not for there to be some subset N p consistent with P from which we can infer r ( P 3 Q)l, but for every maximal subset of N consistent with P to be such that we can infer "(P 3 QY from it. In other words, there may be different ways to render N consistent with P by omitting minimally many things from N. The supposition that P is true amounts to supposing that one of those minimally altered subsets of N constitutes the actual set of basic laws, but no particular such subset is favored over any other. Any of them might constitute the actual set of laws, and so it is only when every such set allows us to infer ^(P3 Q ) l that we can conclude that "(P 3 0)"' is true given N as it actually is. Let us say that a set is P-consistent just in case it is logically consistent with "(SxI), . . . (3xn)P1 (supposing P to have n free variables). If X is a set of sentences, and Y is a subset of X, let us say that Y is a maximal-P-consistent subset of X just in case (i) Y is P consistent, and (ii) there is no Z such that Y Ã § = Z c and Z is P-consistent. Thus the maximal-P-consistent subsets of N are those sets that result from making minimal deletions in N so as to render it consistent with P. Then the above account of strong generalizations amounts to saying that "(P3 Q)' is true iff "(P3 Q)' can be inferred from every maximal-P-consistent subset of N. But we have not yet explained what sorts of inferences are allowable in inferring r(P3 Q)' from the maximal-P-consistent subsets. Suppose N* is such a subset. As P is consistent with N*, there would seem to be nothing wrong with using 3.2-3.7 in making such inferences. These would seem to constitute a minimal set of rules of inference allowable here. But there is also a rather obvious upper bound to what inferences are allowable. It seems clear that no inference from strong generalizations to strong generalizations is valid when the corresponding inference from material generalizations to material generalizations
58
C H A P T E R 111
is invalid in the predicate calculus. The actual set of allowable inferences must lie somewhat between what is derivable from 3.2-3.7 and this upper bound. But, as was remarked earlier, 3.2-3.7 are conjointly equivalent to principle 3.1, which is the strongest principle that does not transgress this upper bound. Consequently, the upper bound limits us exactly to 3.2-3.7, or equivalently, to 3.1. The inferences that are allowable in getting from N* to "(P 3 Q)' are precisely those inferences that are allowable in the predicate calculus in going from material generalizations to material generalizations. This means that we can give a very simple characterization of what strong generalizations are true in terms of the set N. Let VN be the set of material generalizations corresponding to the strong generalizations in N. Then we have: (3.9)
"(P 3 Q)^ is true iff every maximal-P-consistent subset of VN entails "V(P 2 Q)'.'
Before proceeding further, we must be a bit more precise about the basic generalizations that make up the set N. So far all we have said about them is that they must be projectible, but some additional restrictions are required if our account of the way in which laws are derived from N is to work. The difficulty is that projectible laws need not be; in the Appropriate sense, 'basic'. For example, if "(A 3 B)', "(B3 C)'e N, then "(A 3 C)' which is derivable from them is also projectible.6 However, we should not count "(A 3 C)' as basic for the purposes of deriving additional generalizations. If we did treat "(A 3 C)' as basic, then we could infer the truth of "(A & -B. 3 C)'. But we should not be able to infer this because part of the reason "(A 3 C)' is true is that "(A 3 B)"' is true, and the antecedent "(A & B)' conflicts with the latter law. Thus we should require that in order for "(A 3 C)' to be in N, there cannot be open formulas Bo, . . . , Bn such that "(A 3 Bo)", "(Bo 3 B,)', . . . ,"(a,, 3 C)'e N. Similarly, we should preclude the case in which "(A 3 Bo)"', "(A 3 Bi)', . . . ,"(A 3 Bn)', "(Bo & . . . & B,,. 3 C)'e N. Combining these restrictions we obtain the general constraint:
-
(3.10)
If "(A 3 C ) ' e N, then there cannot be finite sets Qo, . . . , Qn of predicates such that: (i) for each
59
SUBJUNCTIVE GENERALIZATIONS
,,
there is a F c U such (ii) for each i < n and
*
(3.11)
N = \J {X; X c N+ and X satisfies constraint 3.10 and for any generalization " ( A 3 B ) " in N+, VX entails 'V(A 2 B)'}
Given this definition of N, I believe that our account of derived subjunctive generalizations, and hence our definition of ^ ( P + Q)', works correctly. Given our definitions, we can explore the logical properties of strong generalizations. We have been saying that a sentence is physically possible just in case it is consistent with all physical laws. But by 3.9, this is the same thing as being consistent with VN. Thus: (3.12)
^OPT is true iff P is consistent with VN. P
Let us also define, in case P is an open formula: (3.13)
"OPT is true iff "O(Sx,). . . (3xn)P1 is true. P
P
60
C H A P T E R III
We define in the conventional way:
(3.14)
"UP1 is true iff VN entails P. P
Our definitions of physical possibility and physical necessity make reference to N. It would be nice to be able to define these modalities simply in terms of strong generalizations without referring to N. It follows from 3.9 that most of the.natura1 ways of defining modalities in terms of conditionals yield logical necessity rather than physical necessity:
(3.15)
( P 3 .Q & - Q ) e D - P .
(3.16)
( P 3-P)-0-P.
However, there is one normal way of defining a modality which does give us physical necessity:
We can list a number of theorems that result from our definition of
'3'and the physical modalities: (3.18)
( P 3 Q ) & ( Q + R ) . 3 ( P ^> R ) .
(3.19)
( P + 0 )=> ( P 3 0).
Certain normal principles hold only with the additional assumption \ that antecedents of generalizations are physically possible:
61
SUBJUNCTIVE GENERALIZATIONS
Now that we have an account of the way in which strong subjunctive generalizations work, we can return to the task of characterizing them in terms of physical necessity and quantification. A natural proposal was that "(A 3 B)' is equivalent to "UV(A => B)'. This seemed to be P
correct for non-counter-legal generalizations, but it failed for counterlegals. It is now easy to see both that it is correct for non-counterlegals and why it fails for counter-legals. In the non-counter-legal case, VN itself is the only maximal-A-consistent subset of VN, so 3.9 gives us the result that "(A 3B)' is true iff "V(A 3 B)' is entailed by VN, i.e., iff rV(A => B)' is physically necessary: (3.28)
O A =>[(A3 B ) = DV(A 3 B)]. P
P
But if "(A 3 B)' is counter-legal, then what is required is not that V ( A => B)" be entailed by VN, but rather that "V(A 2 B)' be entailed by every maximal-A-consistent subset of VN. Thus 'UV(A 3 B)' does P
not entail "(A 3 B)'. However, there is a natural way of thinking of the different maximal-A-consistent subsets of VN. If, contrary to fact, it were physically possible for there to be an A, then N could not be the actual set of basic strong generalizations. Instead, the set of basic strong generalizations would have to be consistent with there being an A. The different maximal-A-consistent subsets of VN represents the the different ways of minimally modifying N in order to make it consistent with there being an A, and as such it is reasonable to think of them as representing the sets of basic strong generalizations in the different worlds that might be actual if it were physically possible for there to be an A. Then to say that every such maximal-A-consistent subset of VN entails 'V(A 3 B)' is the same as saying that "V(A 3 B)' is physically necessary in every world that might be actual if it were physically possible for there to be an A, i.e., (3.29)
(A 3 B) = [OA > DV(A 3 B)]. P
P
The above reasoning is, I think, persuasive, but we cannot assert 3.29 as a theorem because we do not yet have an analysis of subjunctive conditionals. However, when we finally do get a complete analysis of
62
C H A P T E R III
subjunctive conditionals in Chapter VI, we will be able to prove 3.29. Thus this is a correct characterization of strong subjunctive generalizations. Of course, it is not an analysis because it employs the notions of physical possibility, physical necessity, and the simple subjunctive, all three of which are ultimately defined in terms of strong subjunctive generalizations.
Now let us turn to weak generalizations. Their characterization is rather obvious given our treatment of strong generalizations. Once again, we begin with a set W of basic weak generalizations that have been confirmed directly by induction, and then we derive new generalizations from those in W. The set W is constructed from the set W^ of true projectible weak subjunctive generalizations in just the way N was constructed from N + . If we had no strong generalizations to contend with, we could define "(P =^> Q)' in a way completely analogous to principle 3,9:
( P => 0)'is true iff every maximal-P-consistent subset of W entails "V(P 3 Q)'. However, our characterization is made more complicated by the presence of strong generalizations. The difficulty is that it is not sufficient to render V N and VW each consistent with P individually. We must render them jointly consistent with P. For example, if V N contains the single generalization "V(P 2 R)', and V W contains the single generalization "V(Q => -R)', then both V N and V W are consistent with r(P & 0)'. But V(N U W) is not consistent with "(P & Q)'. If we ignored this inconsistency, we would be able to infer both "(P & Q. 3 R)' and "(P & 0 . 3-R)l. This should obviously be prohibited. However, it is not sufficient to just require that V(N U W) be consistent with P. Laws take precedence over weak generalizations. For example, it is a law that any object released in a vacuum at the earth's surface would fall towards the center of the earth. We also have the weak generalization that any helium filled balloon released at the surface of the earth would rise. Given these two generalizations, we can conclude that any helium
SUBJUNCTIVE GENERALIZATIONS
63
filled balloon released in a vacuum at the surface of the earth would fall towards the center of the earth; and we cannot conclude that such a balloon would rise. When a strong generalization conflicts with a weak generalization, we make our inferences on the basis of the strong generalization. To say that laws take precedence over weak generalizations is to say that in rendering V(NU W) P-consistent through deletion, we first render VN P-consistent by making as few deletions as possible, and then we delete whatever we must from VW to render V(NU W) P-consistent. Or, more precisely, we first find a maximal-P-consistent subset Np of VN, and then we find a maximal-P-consistent subset Wp of V(NU W), subject to the restriction that N p s Wp. This yields the following analysis: (4.1)
'(P 3 Q)l is true iff for every maximal-P-consistent subset Np of VN, and every maximal-P-consistent subset Wp of V(N U W) such that Np G Wp, Wp entails 'V(P 3 0)'.
Having explicated '^>', it seems that we can now define 'actual possibility' in a manner completely analogous to the definition of physical possibility: (4.2)
"OPT is true iff P is consistent with V(NU W). a
If P is an open formula: (4.3)
'OPT is true iff 'O(3xI) . . . (3x,,)P1 is true. a
a
'Actual necessity' is defined in the normal way: (4.4)
'UP1 is true iff '-0- P1 is true. a
a
We can easily obtain a number of theorems about weak generalizations, most of them analogous to theorems about strong generalizations: (4.5)
D P = ( Q v - 0 . 3 P). a
64
C H A P T E R 111
We have explained weak subjunctive generalizations in a way precisely analogous to the way in which we explained strong subjunctive generalizations, and we have defined actual possibility and actual necessity in a way precisely analogous to the way in which we defined the better known modalities of physical possibility and physical necessity. Despite all this, one is apt to have the lingering feeling that he does not really understand actual possibility and actual necessity. I think that the lack of understanding here is more of a psychological lack than a logical lack. Given our examples, it becomes completely obvious that there are weak subjunctive generalizations, and in fact that most of the subjunctive generalizations we affirm are weak generalizations rather than strong generalizations. Furthermore, the characterization of weak subjunctive generalizations in terms of the way in which they are confirmed seems to me entirely adequate, and given that, the formal definitions of actual possibility and actual necessity cannot be faulted. What is lacking is not a logical characterization of these concepts, but rather an intuitive feel for how they fit into our ordinary thinking and reasoning and how they are related to other concepts. We can provide a bit of intuitive feel for actual possibility and actual necessity by showing that these concepts really are employed in our ordinary reasoning despite the fact that logicians appear to have completely overlooked them. We often have occasion to assert that
SUBJUNCTIVE GENERALIZATIONS
65
something might be the case, or that something else just wouldn't be the case. For example, in talking about the 9 : 14 train from Boston, I could assert that, on the one hand, it might show up with all the cars painted fluorescent pink (they do strange things like that on this railroad), but on the other hand, it just wouldn't show up on time. The English language puts a bit of a strain on us here. In talking about an actual 9 : 14 train (e.g., the one this morning), it is not entirely appropriate to use the subjunctive 'wouldn't' and say 'It just wouldn't show up on time'. Instead, we say 'It just won't show up on time'. But notice that this is intended to be stronger than a prediction. Part of the purpose of the 'just' in the sentence is to indicate that 'won't' is functioning as a modality rather than in its purely indicative sense. In saying that the train just won't show up on time, we are saying that it is in some sense necessary that it won't show up on time. In these sentences 'might', 'wouldn't', and 'won't' are functioning as singulary modal operators, and not as parts of conditionals. We use these modalities all the time, and they are precisely the operators of actual possibility and actual necessity. Additional feel for weak subjunctive generalizations can be provided by seeing how they are related to other concepts. As in the case of strong subjunctive generalizations, it will turn out that they can be expressed in terms of actual possibility, quantification, and the simple subjunctive:
This principle should seem plausible for the same reasons principle 3.28 did, and we will be able to prove it in Chapter VI. But perhaps 4.17 is not very helpful, because it expresses in terms of actual possibility, which if anything is a less familiar concept itself. A much more helpful characterization of weak subjuncthan '3' tive generalizations can be obtained in another way. Weak subjunctive generalizations are true because of physically contingent facts about the world. The 9 : 14 train from Boston is always late because the railroad has only old broken-down equipment; anyone who drank from Chisholm's bottle would be poisoned because the bottle contains rat poison. In general, a weak generalization "(Ax Bx)' is true only
'+'
66
C H A P T E R III
when there is a true statement P such that r ( A x & P. 3 Bx)' is true. However, it is quite clear that P's merely being true is not enough to ensure the truth of the weak generalization. What other constraints are required? In the case of Chisholm's bottle, P is the statement that the bottle contains rat poison. It is at least required that P would still be true even if someone were to drink from the bottle:
(4.18)
( A x 3 Bx) 2 ( 3 P ) [ ( A x & P. 3 B x ) & PE(3x)AxI.
However, 4.18 cannot be turned into a biconditional. The requirement ' P E ( 3 x ) A x 1 is not yet strong enough to ensure the truth of ' ( A x B x ) ~This . is because if the bottle does contain poison and someone has in fact drunk from it, then ' P E ( 3 x ) A x 1is automatically true. This follows from our analysis of 'even if' in Chapter 11. But this is not enough to make r ( A x Bx)' true. We must require that P would still be true even if someone else drank from the bottle. It seems that the general requirement should be that P would still be true even if there were a new A , i.e., an A which does not now exist:
+
+
+
( A x Bx) = ( 3 P ) [ ( A x& P. 3 B x ) & P would be true even if there were an A which does not now exist]. How can we make the final clause of 4.19 precise? The only way I can see to do this is by talking about sets of objects. There are no generally accepted conventions regarding the behavior of sets across possible worlds, so we are free to make up our own. It seems reasonable to regard a set in one world as the same set as a set in another world just in case they contain the same objects. In other words, we identify sets across possible worlds in terms of their members, just as we identify sets within a world. This will be our convention. It will be discussed in more detail in Chapter VI. Then to talk about there being an A which does not now exist is to talk about there being an A which is not a member of the set of all objects existing in this world: (4.19)
( A x & P. 3 B x ) & P E ( 3 x ) ( A x & x& XI]. This seems to me to capture exactly what we mean by the weak subjunctive generalization. Given our full analysis of subjunctive conditionals in Chapter VI, 4.20 will be equivalent to the simpler characterization:
'
SUBJUNCTIVE GENERALIZATIONS
(x)(Ax 2 Bx)E(3x)(Ax & x&X)]. This also seems to be a very intuitive characterization of the weak subjunctive generalizations. r(Ax 3 Bxy certainly entails r ( ~ ) ( A 2 BX)' ~ and it is plausible to suppose that what more is required for the truth of r(Ax Bx)' is that if there were a new A (one that does not now exist)? it would be a B too. This is just what 4.21 requires. However? if 4.21 is to be acceptable, we must somehow ensure that a certain kind of counter-example cannot occur. Let X be the set of all the objects existing in this world. If the generalization r(Ax 3 x E X)' were true and logically contingent (i.e., r(x)(Ax += x E X y is false)? this would constitute an immediate counter-example to 4.21. Can a subjunctive generalization of this form ever be true? In order for it to be true, our generalizations would somehow have to dictate that there could not be any A's that do not already exist. This would be very odd. Certainly our generalizations might dictate the total number of A's there are in the universe (in particular, a physical law might dictate that there is just one A), and hence they might preclude there being additional A's, but this is not what r(3x)(Ax & x&X)' requires. The latter does not require that there be more A's than there are now, but just that there be a difierent A. In fact, I think it is logically impossible to have a true subjunctive generalization of the form r(Ax 3 x E Y)' for any set Y (except in the trivial case where rAx' entails rx E Y1). This results from the fact that subjunctive generalizations must either be projectible or else entailed by projectible generalizations. In order for any generalization of the form r(Ax 3 X E Y)' to be true, some generalization of that form would have to be projectible. But it is quite obvious that no such generalization could be projectible. If rx E Y1 were a projectible predicate, then we would find ourselves immediately confirming, for every projectible predicate rAx', that nothing could be an A other than what is already an A. Clearly, such conclusions are not automatically confirmed, so rx E Y1 cannot be a projectible predicate. Consequently, no contingent generalization of the form r(Ax 3 x E Y)' can ever be true? and by virtue of principle 4.17, this implies the validity of the following principle:
+
68 (4.22)
CHAPTER I I I
U ( x ) ( A x 2 x E X ) & O(3x)Ax. 2 (x)(Ax + x E X ) . a
a
For the same reason, we should have (4.23)
U(x)(Ax
2
U(x)(Ax
2
a
. x = al v
. . . v x = a,,)&
.x = al v
. . . v x = a,,).
0(3x)Ax.> a
Given these principles (whose validity will be built into the semantics of Chapter VI), it will become possible to prove the correctness of 4.20 and 4.21. Thus the latter principles do constitute a correct characterization of weak subjunctive generalizations, and derivatively of actual possibility and actual necessity. Of course, principles 4.20 and 4.21 cannot be regarded as providing analyses of these concepts, because 4.20 and 4.21 employ subjunctive conditionals which are themselves analyzed in terms of weak subjunctive generalizations. However, the purpose of 4.20 and 4.21 was not to provide an analysis anyway, but to provide more of an intuitive feel for the way weak subjunctive generalizations are related to other concepts. We already have a logically adequate analysis of weak subjunctive generalizations in terms of the way they are confirmed.
In this chapter I have proposed what I believe to be a logically adequate analysis of both strong and weak subjunctive generalizations in terms of non-subjunctive statements. This should free us to use the notion of a subjunctive generalization in the analysis of subjunctive conditionals without feat of circularity. We have, in fact, got a tentative solution to one of the two major problems facing the traditional linguistic theory of subjunctive conditionals. NOTES
' This is suggested by Stalnaker (1968). See Pollock (1974) for a precise definition of projectibility and an argument to the effect that most subjunctive generalizations fail to be projectible. ' A complete defense of the position that an account of the justification conditions of a concept constitutes an analysis of that concept is provided by Pollock (1974).
SUBJUNCTIVE GENERALIZATIONS
69
A fuller discussion of this can be found in Pollock (1974). As will be seen in Chapters IV and VI, this analysis must be made a bit more complicated. However, the present version of the analysis is sufficient for the time being. ' A defense of this can be found in Pollock (1974). rIIrlis the conjunction of r. 'A @ B1 is defined to be '(A 3 B ) & (B 3 A)7. '1 am indebted to Rolf Eberle for this observation.
' '
C H A P T E R IV
T H E BASIC ANALYSIS O F SUBJUNCTIVE CONDITIONALS
Having laid the groundwork, we can now attempt to construct an analysis of subjunctive conditionals. The basic tool for this analysis is provided by Theorem 3.11 of Chapter I. According to that theorem, a subjunctive conditional "(P> 'Q ) is true iff Q is true in every possible world that might be actual if P were true. That is, assuming the Generalized consequence Principle, we have: (1.1) ,
"(P > 0 ) ' is true in the actual world iff for every possible world a , if a M P then Q is true in a ; "QMP1 is true iff for some a such that aMP, Q is true in a
This is not yet a philosophically satisfactory definition of the simple subjunctive, because the relation M was defined in terms of '>', but if we can provide an alternative analysis of M, principle 1.1 will constitute an analysis of '>'. That will be our strategy here. Let us say that a is a P-world when ~ M P . 'Thus our task is to analyze the notion of a P-world. In this chapter we will restrict our attention to subjunctive conditionals whose antecedents and consequents are indicative, and we will identify a possible world with the set of its indicative truths. In the next two chapters we will generalize the analysis to handle non-indicative antecedents and consequents.
The basic idea underlying my analysis of M was introduced in Chapter I. This is that a P-world is one that is obtained from the real world by making minimal changes which suffice to make P true. In constructing P-worlds, we make changes in the real world only insofar as we are
SUBJUNCTIVE CONDITIONALS
71
forced to do so in order to accommodate P's being true. Truths of the actual world that are in the appropriate sense 'irrelevant' to P must also be truths of any P-world. Gratuitous changes are disallowed. The problem is to say just what constitutes a non-gratuitous change. Let us begin with the case in which P is an indicative statement which is consistent with all true subjunctive generalizations. In this case, it seems clear that any P-world must preserve all of those subjunctive generalizations. If P is, for example, 'I dropped this piece of chalk five minutes ago', then if in constructing a world, we alter the law of gravity instead of concluding that the chalk would have fallen to the ground, that would be a gratuitous change. Such a world would not be a P-world. Subjunctive generalizations take precedence over other truths in the construction of P-worlds. Let G be the set of universally quantified material conditionals corresponding to those subjunctive generalizations that are true in the actual world. Then if P is consistent with G and a is a P-world, we must have G c a. Most changes are ruled out as gratuitous. What changes are not gratuitous? The most obvious case occurs when Q is true in the actual world, but G entails '^(P3 -Q)^. Then we must make Q false in any P-world. On our assumption that P is consistent with G (i.e., P is 'actually possible'), it follows that G entails ^(P^R)" iff P weakly implicates R, i.e., iff "(P=>R)^ is true. Let us say that P counterimplicates R when '^P^>-R1 is true. So if P counter-implicates 0 , Q must be false in any P-world. But, this cannot be the only time that a change is allowed. For example, we may have two true propositions 0 and R such that the conjunction " ( 0 & R)' is counter-implicated by P, but neither 0 nor R by itself is counter-implicated. We cannot have both Q and R true in a P-world, so some change is required. For example, let P be 'Bizet and Verdi were compatriots', and let 0 be 'Verdi was Italian' and R be 'Bizet was French'. Then "P+-(Q & R)" is true, but P does not counter-implicate either Q or R. How do we decide whether to change the truth value of 0 or that of R ? There is no way to decide. There is no basis for giving preference to one over the other. The situation is rather that either Q or R might be false - there is a P-world in which Q is false (but R true), and another in which R is false (but Q true) but we cannot conclude of either Q or R that it would be false. If
72
CHAPTER I V
Bizet and Verdi were compatriots, they might both be French, and they might both be Italian, but it is not true that they would both be French, or that they would both be Italian. This suggests that in constructing P-worlds, we simply take the set of actual truths a n d then make minimal changes so as to render it P-consistent. The result of any such set of minimal changes describes a P-world. As there may be more than one way of minimally changing the actual world in order to achieve P-consistency, there may be more than one P-world. Making this precise, if a. is the set of actual truths i.e., no is the real world), and a is a possible world, the proposal is: (2.1)
a M P iff G c a and a D a. is a maximal Pconsistent subset of an.
ANALYSIS I:
In other words, it is required that a preserve as many of the actual truths as possible while still making P true. Although I believe that Analysis I is on the right track, there are immediate difficulties for it. As it stands, it would lead to the result that if P is false then absolutely no truth is preserved in every P-world: that is, given any truth 0 , there would be a P-world in which Q is false. This is because if P is false, then both Q and r ( P 3 -Q)' are true. Thus in making minimal changes to a. so as to accommodate the truth of P, we have a choice between preserving Q and preserving " ( P 3 -Q)', and in the latter case we will be forced to make 0 false. For example, if P is 'My maple tree died' and Q is 'My car is painted white',. Q should be preserved in every P-world. But according to Analysis I, we have a choice between preserving 'If my maple tree died, my car is not painted white' and 'My car is painted white', and in those P-worlds in which we preserve the former, it is not true that my car is painted white. Evidently not all truths are equally good candidates for being preserved in P-worlds. In the above example, 0 is a candidate for preservation, but ^(P=>-Q)' is not a candidate. Let us call those propositions which are candidates for preservation stable propositions. It appears that there is a class S of stable propositions, and in deciding whether a change is minimal or gratuitous, we only look at what happens to the members of S. In making minimal changes, we seek to make minimal changes in the stable propositions, and we ignore what
SUBJUNCTIVE CONDITIONALS
73
happens to other propositions. This leads to:
(2.2)
ANALYSIS 11:
a M P iff
If S is the class of all stable propositions, then
P e a and G <= a and S ("I a ("I an is a maxima1-P-
consistent subset of S
("I an.
We have seen that not all propositions are stable. Which ones are? As we have just seen, conditionals, and hence disjunctions, generally fail to be stable. On the other hand, a conjunction of stable propositions is automatically stable: If Q and R are both stable, they will both be preserved (and hence their conjunction will be preserved) in any P-world unless there is some set A of stable propositions true in the actual world whose conjunction with P is counter-implicated by either 0 or R ; but then the conjunction of A with P would also be counter-implicated by "(0& R)'. Thus a conjunction of stable propositions is automatically preserved unless it comes into conflict with some other true stable propositions, which is to say that the conjunction is stable. The fact that conjunctions of stable propositions are stable provides some insight into just which propositions are stable. Conjunctions of stable propositions are stable, but only derivatively so because their conjuncts are stable. Those conjuncts might themselves be conjunctions which are stable because their conjuncts are stable, and so on. However, this cannot go on indefinitely. Eventually we must reach conjuncts that are stable in their own right. These conjuncts are still (as a matter of logic) equivalent to conjunction^,^ but those conjunctions are conjunctions of disjunctions, and hence there is no reason to expect their conjuncts to be stable. In examining stable propositions, we may in the above manner be able to take them apart into simpler and simpler components from which they inherit their stability, but eventually this can no longer be done and at that point the stability is basic. The stable propositions which are in this way basic are in a certain sense 'simple'. They are those which cannot, except artificially, be regarded as conjunctions, disjunctions, or in general as compounds or logical constructions out of simpler propositions. These simple propositions would seem to be propositions ascribing 'simple states' to objects. These are non-contrived and not explicitly
74
CHAPTER IV
compound states like colors, dispositions, shapes, etc. Other propositions, e.g., those describing events, are not simple. For example, an event always involves a change of state, and hence is compound (saying that the state is one way at one time and another way at another time). I feel that these notions of a simple proposition and a simple state make good intuitive sense. However, any endorsement of simple propositions is bound to meet with suspicion in light of the recent history of philosophy. The logical atomists built outlandish theories around simple propositions, and the notion of a simple proposition thereby acquired a stigma. However, this is guilt by association. Just because logical atomism was a bad theory which made use of the notion of a simple proposition is no reason to think that there is something wrong with simple propositions themselves. Talk about simple propositions is just a matter of taking logical form seriously. We do distinguish, for example, between those propositions which really are conjunctions, and those which are only equivalent to conjunctions. The proposition that my car is white is not a conjunction, although it is, as a matter of logic, equivalent to a conjunction. All of this is going to be repugnant to many recent philosophers who have supposed that logical form really does not make sense as applied to propositions. They have supposed that although sentences have a structure, with the result that one can be contained in another, one can be a conjunction while another is only equivalent to a conjunction, and so forth, propositions have no such structure. This comes from looking only at the properties of propositions which can be constructed out of the purely logical relations of entailment and equivalence. Using only these relations, there is no way to order propositions in terms of logical complexity, and no sense can be made out of a putative difference between, for example, a proposition being equivalent to a conjunction and really being a conjunction. But propositions do have a structure of a different sort - they have an epistemological structure. Certain propositions are epistemologically basic. These basic propositions provide grounds for believing others, which in turn provide grounds for believing still others, and so on. There is a kind of natural epistemological order h e r e . It is on this basis that we can make sense of simple propositions. We must not make the mistake of supposing that they are to be identified with the epistemologically basic ones. Rather, they are
SUBJUNCTIVE CONDITIONALS
75
those which are not logically complex in the sense of being conjunctions, disjunctions, etc. The notion of a proposition being a conjunction or a disjunction is reflected in (and, I would propose, reducible to) what possible grounds one can have for believing it. For example, what distinguishes a conjunction is that the two conjuncts provide a conclusive reason for believing the conjunction, and the denial of either conjunct provides a conclusive reason for rejecting the conjunction. One might suppose that this is also true of a proposition which is merely equivalent to a conjunction, but that would be a mistake. If P is merely equivalent to '^Q & R', then r-Q' does not by itself constitute a reason for rejecting P. One must also have reason to believe that P is equivalent to '^Q& R1.If one does not know that such an equivalence holds, then clearly, knowing things about Q and R gives him no reason for believing or disbelieving P. I think that by appealing to epistemological considerations of this sort, it will prove possible to give a precise definition of the notion of a simple proposition. This will be undertaken in section three. However, given the epistemological way of thinking of logical form, I think it must be admitted that even without a precise definition, the notion of a simple proposition makes sense and that we do talk about the logical forms of propositions despite our not having an adequate theory of what logical form is all about. Thus for now I will make free use of the notion of a simple proposition without attempting further clarification. Stable propositions form a broader class than just the simple propositions. For example, conjunctions of simple propositions are stable. However, this results directly from the notion of stability and does not have to be built into our analysis. Can we then just replace the class of stable propositions in Analysis I1 by the class of simple propositions? Almost, but not quite. Although this does not follow from the definition of stability, it is rather obvious that what we may call the 'internal negations' of simple propositions are also stable. In constructing Pworlds, we do not just try to preserve those simple states that an object has - we attempt to preserve whether or not it has a given simple state. For example, if an object is insoluble, this is something that would be preserved just as much as its solubility were it soluble. An unnecessary change in either direction would be judged gratuitous and disallowed in the construction of P-worlds. The proposition 'x is insoluble' is not
76
CHAPTER I V
what would ordinarily be called the negation of 'x is soluble'. This is because both 'x is soluble' and 'x is insoluble' entail the existence of x. Logicians have traditionally distinguished here between 'internal' and 'external' negation. The external negation is ordinary negation. The internal negation of a proposition of the form ^x is F1can be defined in terms of external negation as rx exists but -Fxl. Thus to preserve whether or not an object has a particular simple state is to preserve the truth value of the corresponding simple proposition and its internal negation. Apparently, then, what we seek to preserve are those truths in the class consisting of all simple propositions and internal negations of simple propositions. Thus we are led to: in: If S is the class of all simple propositions and their internal negations, then a M P iff P E a and G G a , and S D a D an is a maximal-P-consistent subset of S D ao. However, a difficulty arises for Analysis 111. The only changes countenanced by this analysis consist of deleting truths from a (replacing them by their negations). However, a subjunctive hypothesis may force us to add new propositions too. For example, suppose ^(3x)Fx1 is false in ao. Taking this as our subjunctive hypothesis may force us to introduce a new object not existing in %. This may constitute a smaller change than making one of the objects already in a. an F when it is not now an F. Then if aMP, a will contain simple propositions or the internal negations of simple propositions which were not true in 0.0. Thus the total change in going from a. to a may consist of the deletion of some members of a. and the addition of some new propositions. This change has two parts: the deletion of some propositions in ao, represented by ( a o - a ) ; and the addition of some new propositions, represented by ( a -ao). It will be algebraically convenient to represent this change as a set of propositions indexed by 0 and 1. The propositions indexed by 0 will be those deleted, and the propositions indexed by 1 will be those added. To this end, let us define: (2.3)
ANALYSIS
(aAl3) = [ ( a -B)x{oIlU [03 - a ) x { l I l . The algebraic point of defining 'a A p ' in this way is that the inclusion of one change in another is now represented by ordinary classinclusion.
SUBJUNCTIVE CONDITIONALS
77
In minimizing the change from a" to a , what we want to minimize is [(S f l ao)i^(S n a)]. Letting Sa = S f l a , we are led to:
(2.4)
ANALYSIS IV: If S is the class of all simple propositions and their internal negations, then a M P iff P e a and G c a , and there is no world fi such that P e p and G c f i and (Sai^SJc(SaASa,,).
Analysis IV still suffers from some rather grave shortcomings. Thus far I have been pretending that when '^P+ -(Q & R)' is true, but neither r P + -Q1 nor "P^>-R1 is true, then there is no basis for choosing between Q and R, and hence "(-Q)MP1 and r(-R)MP1 are both true. But this is not always the case. For example, suppose for the sake of simplicity that dry matches always light when struck. Then consider a particular match I had five minutes ago which was dry (D) and in normal surroundings. If I had struck it (S), it would have lit (L). But this cannot be accommodated on our present account of subjunctive conditionals. We have "(S & D ) => L' true, and hence ^S ^> -(D & -L)' is true. Thus our is true. But neither "S - D l nor '^S --L1 account would lead us to conclude that the match might not have been dry if it had been struck ("(-D)MS1 is true), which is clearly mistaken. On the contrary, we would actually conclude that the match would still have been dry even if it were struck, and hence that '^S> L1 is true. What is our basis for concluding this? We are looking for a basis for choosing between two propositions Q and R when their conjunction is counter-implicated by P but neither proposition individually is counter-implicated by P. When do we preserve Q in preference to R ? Sometimes the answer is, 'When part of the reason R is now true is that P is false'. To illustrate, consider the match case again. We preserve D ('The match is dry') in preference to "-L1 ('The match did not light'), because part of the reason the match did not light is that it was not struck, but this is not part of the reason it was dry. More precisely, there was a chain of antecedent circumstances implicating that the match was dry, and another chain implicating that it did not light. These chains of circumstances represent the-way in which it actually came about that the match was dry and did not light. But among the circumstances implicating the match's not lighting was the fact that it was not struck. If we remove that from the
+
+
78
CHAPTER I V
antecedent circumstances, the remaining circumstances are no longer sufficient to implicate that the match did not light. Consider another case. Suppose we have a metal cylinder of volume vo filled with an ideal gas (one obeying the Boyle-Charles law) under pressure po and temperature to. If we had heated the cylinder, the pressure of the gas would have increased. We conclude this because we know the Boyle-Charles law (the pressure equals a constant times the ratio of the temperature and volume), and we believe that the volume would remain unchanged even if the cylinder were heated (let us pretend there is no thermal expansion in the metal). Why, in constructing P-worlds (where P = 'The cylinder was heated'), do we preserve the volume of the gas at the expense of the pressure of the gas? Because part of the reason the pressure was po is that the cylinder was not heated, but that is not part of the reason the volume was vo. More precisely, the historically antecedent circumstances implicating that the pressure was po contained in an essential way the fact that the cylinder was not heated, but this is not true of the circumstances implicating that the volume was VO. The above two examples have the characteristic that there are circumstances historically antecedent to R which implicate R, and then others preceding those circumstances and implicating them, and so on. This chain of circumstances represents the way in which it actually came about that R was true. If we trace out this chain of historically antecedent circumstances, we find that they contain '^-P1, so that by adopting the counterfactual hypothesis that P is true, we 'undercut' the reason R is true. Then given a conflict between one proposition R which is undercut and another Q which is not, we preserve Q at the expense of R. In order for P to undercut R, it is not necessary for the historically antecedent circumstances implicating R to actually contain - P 1 ; more generally, P might counter-implicate some part of those circumstances. To illustrate this, consider an oak tree standing in a field. If a tree of this structure were subjected to a 200 mph wind for a period of five minutes, it would break: ( W & S. =?> B). Furthermore, let us suppose that if certain meteorological conditions M were to occur over terrain of this structure, there would be 200 mph winds for a period of at least five minutes: ( M & T.+W). As "W & S. =?> B' is true, we have ^ W 3 -(S & -B)" true. W does not undercut S, but W
SUBJUNCTIVE CONDITIONALS
79
does undercut "-B1: the historically antecedent circumstances implicating "-B1 include that the tree has not been subjected to certain forces, and the circumstances implicating the latter include that there have not been 200 mph winds. Thus we are led to conclude that W > B 1 is true. Furthermore, as "M & T . 3 W is true, we have that "(M & T ) 3 -(S & -B)' is true. " M & T1 does not undercut S, but it does undercut ^-B1 because it implicates W which undercuts "-B1. Thus we should conclude that r M & T.>B1 is true, and sure enough, that is precisely what we would conclude. As a further illustration of my diagnosis of how these conditionals arise out of undercutting, it is worth noting that in the above example we would actually go further and draw the stronger conclusion that M > B1 is true. Intuitively, this is because we are confident that the terrain would retain its structure even if the meteorological conditions were to occur. Formally, this results once more from undercutting. As we have both " W & S. 3 B1 and ^ M & T . W , we have M & T & S. 3 B1, and hence " M 3 -(T & S & -B)'. M does not undercut either T or S: no matter how far we trace out their historical antecedents, we find nothing counter-implicated by M. But M does undercut "-B1, because we have seen that the historical antecedents of "-B1 include that the tree has not been subjected to 200mph winds, and the circumstances implicating the latter include that conditions M have not obtained. So once again, we can explain the truth of the counterfactual in terms of undercutting. Let us contrast these examples with the BizetIVerdi case. If Bizet and Verdi had been compatriots, then Bizet might have been Italian, and Verdi might have been French. Here we do not preserve either 0 ('Bizet was French') or R ('Verdi was Italian') at the expense of the other. This is because their not being cornpatriots does not undercut the reason that either man had the nationality he did. For each of 0 and R, there is a chain of historical antecedents going back indefinitely far in time which at no point is counter-implicated by P (although P does, of course, counter-implicate the combined historical antecedents of 0 and R). I would urge that in this notion of undercutting lies the solution to understanding how counterfactuals work. But how do we capture in a precise way this notion of P undercutting R ? I believe that this can be
80
CHAPTER I V )
done by preserving simple propositions with early dates in preference to those with laterdates. I argued that simple propositions are propositions ascribing 'simple states' to objects. As such, simple propositions are dated. That is, they ascribe a state to an object at a time. For a possible world a and time t, let us define Sa{t) to be the set of simple propositions or internal negations of simple propositions true in a with date no later than t. Then I suggest that we impose the following requirement on a in order to have a M P : (2.5)
REQUIREMENT OF TEMPORAL PRIORITY: If aMP, then for each time t, S a ( t ) A S a o ( t )is minimal, i.e., there is no (3 such that P e (3 and G i= (3 and ( S P( t )A San(t))<= ( S a( t ) A Sa,,(t)).
This requirement has the effect that in deciding what is true in a at a t,ime t, we must first decide what is true for all prior times. We cannot modify a truth with a late date t and then set about modifying propositions with earlier dates just to accommodate it, because that would lead to gratuitous changes in S a ( t ) , the set of truths preceding the one in question. We can only modify S a ( t ) in response to 'internal stresses', and not just to bring about some change later than t. It is not obvious how this requirement relates to the notion of one proposition undercutting another, so let me try to establish the connection. We ordinarily suppose that states have 'historical antecedents', i.e., combinations of earlier states of objects which implicate them. This is, essentially, the traditional assumption that every event has a cause. Regardless of whether this is always true, consider two simple propositions, Q and R which do have historical antecedents, and whose historical antecedents have historical antecedents, and so on indefinitely. Suppose rP => -(Q & R)l is true, but neither " P => -Q1 n'or "P R1 is true. What does the requirement of temporal priority tell us to preserve. We cannot just alter one or the other of Q and R without also altering its historical antecedents, and the requirement of temporal priority tells us that we must take things in temporal order, so we must decide which of the historical antecedents to preserve before we decide which of Q and R to preserve. Two possible cases can arise: First, it may happen that at no point in the past are the historical antecedents of either 0 or R counter-implicated by P. This is like the
+-
SUBJUNCTIVE CONDITIONALS
81
Bizet/Verdi case. In constructing a P-world, we have a choice between including the historical antecedents of Q and those of R. Either choice is possible, because in either case we can construct the set of truths so that for each t, (Sa(t)ASa.,(t)) is minimal and contains the historical antecedents to that point of Q or R (whichever is chosen to be preserved in a ) . Thus there will be P-worlds containing Q and P-worlds containing R, and so we conclude of either that it might be true if P were true, but not of either that it would be true to the exclusion of the other. Second, it may happen that at some point t in time, the historical antecedents of one of Q or R (let us suppose it is R ) are counterimplicated by P. In other words, P undercuts R. As those historical antecedents are counter-implicated, they cannot be included in Sa(t). But given that the historical antecedents of R are precluded from Sa(t), there is nothing in Sa{t) with which the historical antecedents of Q conflict, and hence the historical antecedents of Q must be included in Sa(t) - to omit them would be a gratuitous change and is ruled out by the requirement of minimal change. The historical antecedents of Q implicate Q, so Q must be included in a , and hence R must be precluded. Thus we are led to conclude that Q would be true if P were true, and R would be false. Thus the requirement of temporal precedence seems to correctly capture the idea that if P counter-implicates a conjunction and undercuts one of the conjuncts but not the other, then the conjunct that is undercut is sacrificed and the other preserved. This can be illustrated by returning to the match example. The historical antecedents of the match's being dry include such things as that it has not recently been rained upon or dunked in a bucket of water. These in turn have historical antecedents having to do with the location of the match, local climatic conditions, the actions of nearby agents, etc. ~ h e s ehistorical antecedents themselves have historical antecedents (presumably), but at no point in tracing out this chain of historical antecedents do we encounter anything counter-implicated by the match's being struck. On the other hand, the-historical antecedents of the match's not having lit include such things as its not having been heated to a certain temperature, or exposed to certain chemicals, or struck. This is immediately counter-implicated by the match's being struck. Hence we conclude that the historical antecedents of the
82
CHAPTER I V
match's being dry would still have been true even if the match had been struck (as they conflict with nothing not counter-implicated by the match's being struck); and hence that the match would still have been dry if it had been struck; and thereby we are forced to conclude that the match would have lit if it had been struck. The requirement of temporal priority seems to give the right answer when applied to states all of which have historical antecedents. But it is not obviously a necessary truth that all states have historical antecedents, and it has actually been proposed at various times that certain kinds of states (quantum mechanical states, miracles, etc.) do not have historical antecedents. How does the requirement of temporal priority fare in connection with these states, if indeed there are such states? First, suppose that neither Q nor R have historical antecedents. Suppose further that the date of Q is earlier than that of R. Then the requirement of temporal priority tells us to preserve Q at the expense of R. This seems reasonable: if P were true, then at the time Q occurred, nothing would yet have happened to preclude Q's being true, so to make Q false would be a gratuitous change; but then once Q is true, when the date of R comes up, something has occurred namely Q in conjunction with P - to preclude R's being true. Thus it seems correct to give preference to the earlier state. On the other hand, if Q and R have the same date, there is no reason to give preference to either - either might be true - and this is just what the requirement of temporal priority tells us. Finally, let us suppose R has no historical antecedents, but Q does, and the historical antecedents of Q extend back in time prior to the date of R. Once more, if P were true, then prior to the date of R there would be nothing to preclude the occurrence of the historical antecedents of 0 , and hence they would still occur even if P were true. But those historical antecedents implicate 0 , and hence together with P they implicate -R. Therefore, there is something to preclude R's being true when its date comes up. Once again, the requirement of temporal priority gives what seems to be the correct answer. I believe that in the requirement of temporal priority lies virtually the entire solution to the problem of relating the truth conditions of subjunctive conditionals to the notion of a minimal change. At least in the case in which P is consistent with G, this requirement seems to be sufficient to account for all of the judgments we make regarding
SUBJUNCTIVE CONDITIONALS
subjunctive conditionals. This leads to the analysis: (2.6)
v: a M P iff P Ea and G c a and for every time t, there is no world /3 such that P E/3 and G G /3 and (Sao(t)ASp(0)<= (S<JOAsa(t)).
ANALYSIS
There remains a surprising difficulty for Analysis V. In order to appreciate the nature of this difficulty, it must be recognized that Analysis V incorporates two distinct elements - an account of how minimal changes enter into the truth conditions of subjunctive conditionals, and an analysis of minimal change. Up to this point all of our torturous twistings and turnings have been concerned with the first element and we have just assumed that a quite simple characterization of minimal change is adequate. That assumption must now be questioned. The two elements of our analysis can be separated as follows. Where X is a set of propositions and Y is a set of propositions indexed by 0 and 1, let us define: (2.7)
X + Y = ( X U { Q ; ( Q , l ) e Y})-{Q; ( 0 , O ) e Y}.
Thus X + Y is the result of deleting from X those propositions in Y indexed by 0, and adding those propositions indexed by 1. In constructing P-worlds, we want to make minimal changes which will result in P being true along with all of the members of G. Thus we are interested in changes to worlds which result in certain sets of propositions (in particular, G U{P}) being true. So let us define: (2.8)
If F is a set of propositions, a set X of indexed propositions is a F-change to a. at time t iff there is a world a such that Sa(t)= Saa(t)+X, and r<= a.
For simplicity, let us also say that X is a P-change (relative to G ) iff X is a G U {P} change. Definition 2.8 defines the notion of a F-change, but does not tell us what is required for a F-change to be minimal. However, ignoring this oversight for the moment, we can restate that part of Analysis V which relates subjunctive conditionals to minimal changes as follows:
(2.9)
a M P iff P e a and G c.a and for every time t, (Sao(t)ASa(t))is a minimal P-change (relative to G ) to a. at time t.
84
CHAPTER I V
I believe that this much of Analysis V is correct, at least for the noncounter-legal case in which P is consistent with G. Turning next to the notion of a minimal change, we can define:
(2.10)
X is a strictly minimal F-change to a0 at time t iff there is no F-change Y to a. at time t such that Y <= X.
Analysis V embodies the apparently reasonable identification of minimal F-changes with strictly minimal F-changes. However, this initially reasonable identification is ultimately indefensible. The difficulty arises from the fact that there can be P-changes which do not contain strictly minimal P-changes. For example, suppose there are finitely many F's in ao, and let P be the counterfactual hypothesis that there are infinitely many F's. Any P-change must result in our adding infinitely many F's. But given any such change, there is always a smaller P-change - one adding one fewer F. Thus there are no strictly minimal P-changes in this case. If we identify the minimal P-changes with the strictly minimal P-changes, this leads to the conclusion that all subjunctive conditionals having P as their antecedent are vacuously true. But this conclusion is clearly incorrect. It is not true, for example, that if there were infinitely many F's then there would be finitely many F's. Or to take a concrete example, suppose the universe contains only finitely many stars. It is certainly not true that if there were infinitely many stars in the universe then my car would be painted black. Thus there being no strictly minimal P-changes cannot be adequate to make vacuously true all subjunctive conditionals having P as their antecedent. If a P-change X does not contain a strictly minimal P-change, then there must be an infinite sequence Xi, X z , . . . , X,,, . . . of progressively smaller P-changes such that Xi <= X and for each \ X,+, c Xi.Settheoretically, these P-changes constitute a nest whose lower limit (the intersection of all the P-changes in the nest) is not itself a P-change. In such a case, how do we evaluate a counterfactual whose antecedent is P ? A natural suggestion would be that r P > Q1 is true iff for every such sequence there is an i such that for every j 2 i, the P-change X, makes Q true. This would be analogous to David Lewis' analysis which we discussed in Chapter I . ~Unfortunately, it is subject to similar difficulties. Most important, it would lead us to affirm subjunctive
)
SUBJUNCTIVE CONDITIONALS
85
conditionals which it seems ought not to be affirmed. For example, if there are infinitely many F's in the real world, then for each natural number n, this proposal would make true the conditional 'If there were only finitely many F's, then there would still be more than n F'sl. But this cannot be correct. For at least some natural numbers n it must be true that if there were only finitely many F's, then there might be nF's. This example also shows that the Generalized Consequence Principle would fail on the present proposal, and it still seems that that principle should be true. Thus I think that this cannot be the correct way to analyze the truth conditions of subjunctive conditionals. When we have such an infinite descending sequence of P-changes not containing any strictly minimal P-change, then at least some of the P-changes in the sequence must constitute minimal changes themselves. For example, if P is the proposition that there are finitely many F's, then any change which consists merely of changing the set of F's so as to make it finite and altering other truths minimally to accommodate this change must be considered a minimal change, although as we have seen, such a change will not be a strictly minimal change. In order to avoid the above sorts of difficulties, our conception of a minimal change must be such that every P-change contains a minimal P change. This can be considered a criterion of adequacy for any analysis of the notion of a minimal P-change. How can we construct an analysis of the notion of a minimal P-change which will satisfy our criterion of adequacy? Let us say that a descending sequence (a nest) of P-changes is unbounded when there is no P-change which is contained in every member of the sequence. Then we might be tempted to suppose that every member of any unbounded sequence of P-changes is minimal. This would accommodate our observation that any change which 'just' makes the set of F's finite would be considered a minimal P-change when P is the proposition that there are finitely many F's. Unfortunately, this proposal errs in the direction of being too inclusive. It does not allow us to discriminate between gratuitous and non-gratuitous changes. Given any unbounded sequence of P-changes, we can construct a new unbounded sequence of P-changes by adding a new first member which results from making all sorts of gratuitous changes to the first member of the original sequence. On the present proposal, this new
86
CHAPTER I V
first member would have to be considered minimal, but it clearly should not be. In order to sort this out we must consider how the necessity for unbounded sequences of P-changes arises. Let us say that P itself is unbounded (in a world a at time t ) when there are unbounded sequences of P-changes. Unbounded propositions frequently arise as limits of sequences of progressively stronger or progressively weaker bounded propositions. For example, consider the unbounded proposition "There are infinitely many F'sl. This can be regarded as the limit of the sequence of progressively stronger propositions of the form "There are more than n F'sl. This limit can be thought of as the infinite conjunction of the propositions in the sequence. This can be made precise as follows. I do not really want to commit myself to there being propositions which are infinite conjunctions, but the effect of infinite conjunctions can be captured unobjectionably as follows. Let us define: (2.11)
P *->UP iff for every possible world a, P is true in a iff every member of F is true in a.
Intuitively, 'P <->ZIP means that P is equivalent to theconjunction of all the propositions in F. However, what is being defined is the entire relation '<-> n ' , and we are not employing the expression 'HF' as a term purportedly denoting an infinite conjunction. If P is unbounded in a, then there is often a sequence {Qi; i e w } of bounded propositions such that (i) P <-> H{Qi; i e w } , and (ii) for each i e w, Q i + i+ Qi.In this kind of case, a minimal P-change should be one that 'just' makes all of the 0 , ' s true. Such minimal P-changes can be regarded as the upper limits of sequences of minimal changes making the 0;'s true for progressively larger i. Precisely: (2.12)
If {Qi; i e w } is a sequence of bounded propositions such that ?or each i e w , Qi+i + Qi, and P <-Ã H{Qi; i w } , and {Xi; i e w } is a sequence such that for each i e w , Xi is a strictly minimal 0,-change and Xi G Xi+i, then [I{Xi; i e w } is a minimal P-change.
There is a second way that a proposition can be unbounded by virtue of being the limit of a sequence of bounded propositions. We have
87
SUBJUNCTIVE CONDITIONALS
seen that an unbounded proposition may be the upper limit (conjunction) of a sequence of progressively stronger propositions. It can also be the lower limit (disjunction) of a sequence of progressively weaker propositions. For example, if there are infinitely many F's to begin with, then "There are finitely many F'sl is unbounded and can be regarded as the lower limit of the sequence of propositions of the form "There are no more than n F'sl. Let us define:
(2.13)
P<-> XT iff for every possible world a , P is true in a iff some member of F is true in a.
Then P can be unbounded by virtue of there being a sequence {Qi; i a)} of bounded propositions such that P <-> S{Qi ; i e a)}, and for each i e a), Qi + Qi+i. If P is the limit (i.e., disjunction) of such an infinite sequence of progressively weaker bounded false propositions Qi, then it seems initially that for any i a), a strictly minimal Qi-change should be considered a minimal P-change. However, this will not work. The difficulty is that we could gratuitously strengthen the first member of the sequence by conjoining it with an unrelated proposition R. P would still be the disjunction of the resulting sequence of propositions, but a minimal "(Qo & R)'-change should not be considered a minimal P-change. The solution to this difficulty seems to be to turn things around. If P <-> S{Qi; i e w}, then "-P1- H{"-Qil; i e a)}. Then we can rule out gratuitous changes by saying that a minimal P-change is a P-change which results in a world a which is such that we can get back to our original world q, by making a minimal "-P1-change to a . Precisely: (2.14)
If {Qi; i e a)} is a sequence of false bounded propositions such that for each i e w, Qi + Qi+i, and P S{Qi; i e a)}, then X is a minimal P-change (relative to G) to a. at time t iff there is a world a such that Sa(t)= S a ( t ) + X and G U{P}c a , and there is a Y which is a minimal "-P1change (relative to G ) to a at time t such that Sno(t)= Sa (t) + Y. @
Unbounded propositions can be either upper limits (conjunctions) or lower limits (disjunctions) of sequences of progressively stronger or
88
CHAPTER IV
weaker bounded propositions. However, we cannot stop there. We can also have propositions which are unbounded by virtue of being limits of other unbounded propositions which are themselves limits of bounded propositions. For example, if there are now finitely many F's but infinitely many G's, then the proposition "There are infinitely many F's and finitely many G'sl is unbounded, but it is not a limit of a sequence of bounded propositions. However, it can be regarded as the upper limit of the sequence of propositions "There are at least n F's, and there are finitely many G'sl, and the latter propositions are themselves lower limits of the sequences of propositions of the form T h e r e are at least n F's, and there are no more than m G'sl. This indicates that we must generalize our account and make it recursive. We first define the notion of a minimal "There are at least n F's, and there are finitely many G'sl-change as above, and then by repeating our constructions we define the notion of a minimal T h e r e are infinitely many F's and finitely many G'sl-change. But before making the above precise, we must note a further difficulty. This is that P may be unbounded not because of its own logical character, but because of the limit nature of propositions which follow from the combination of P with G. Thus we must generalize all of the above to talk about the case where the whole set G U { P } is the upper or lower limit of a sequence of progressively stronger or weaker sets of propositions. To this end we first define: (2.15)
If F is a set of propositions and V is a collection of sets of propositions, then: (a) F ++ ST iff for every possible world a, all of the members of F are true in a iff all of the members of some element of V are true in a ; (b) F ++ U^ iff for every possible world a , all of the members of F are true in a iff all the members of every element of V are true in a.
Let us also generalize our notion of entailment to hold between sets of propositions: (2.16)
If F and A are sets of propositions, then F + A iff for every possible world a , if all the members of F are true in a then all the members of A are true in a.
SUBJUNCTIVE COND~TIONALS
89
We can now formulate three principles regarding minimal changes:
(2.17)
If X is a strictly minimal F-change then X is a minimal F-change.
(2.18)
If T is a sequence { T i ; i e w} of sets of propositions, and F <-> IIV, and for each i e w, Ti+1 -+ Ti, and y is a sequence {xi; i e w} such that for each i e w, x, is a minimal ^,-change and xic xi+l, then \J ,y is a minimal F-change.
(2.19)
If T is a sequence { T i ; i OJ} of sets of propositions, and F- S T , and for each ie w, Ti -+Ti+1, and there is a Y such that if Neg F is the set of negations of members of F then Y is a minimal Neg F-change to a at t and Sa,(t) = at time t. Sa(t)+ Y, then X is a minimal F-change to
My conjecture is now that these three principles completely characterize the notion of a minimal change. In other words, proceeding recursively, for each natural number k we define the notion of a minimal); change as follows: (2.20)
X is a minimalk F-change to a. at time t iff either: (a) X is a strictly minimal F-change; or
(b) there is a sequence T such that F <-Ã I l T , and for each i e w, !Pi+, + Ti, and there is a sequence ,y such that for and for some j < k , xi is a minimal, each i e w , xi c T,-change, and X = (J x; or (c) there is a sequence T such that F <-> S^ and for each i e w, V, -+ T i + i ,and there is a world a such that S,,(t) = S a ( t )+ X and FC a, and for some j < k , there is a minimal Neg F-change Y to a at time t such that S%(t)= Sa (t) + Y.
Then my conjecture is: (2.21)
X is a minimal F-change iff for some natural number k , X is a minimalk F-change.
Principle 2.21 constitutes my proposed analysis of the concept of a minimal change. Several remarks must be made about this analysis.
90
CHAPTER IV
Two undefended assumptions are built into principle 2.21 and definition 2.20. First, it is assumed that limit propositions are always limits of (it-sequences rather than limits of sequences of longer transfinite length. Second, it is assumed that a minimal change is always a minimalk change for some finite k. The form of definition 2.20 is such that it could trivially be extended to a definition by transfinite recursion of 'minimalo for all ordinals /3. My reason for adopting these 'finiteness' assumptions is simply that appeal to examples has not convinced me of the necessity of making the analysis more complicated than it already is. However, should we become convinced of that necessity, the required amendment to definition 2.20 is of a trivial nature, and it is quite obvious how to make the amendment. I have defended a criterion of adequacy for any analysis of the notion of a minimal change. This is that every .T-change ought to contain a minimal F-change. Is this criterion satisfied by the above analysis? Unfortunately, I do not know. My only evidence on this point is the negative evidence of not having found any counter-examples to the satisfaction of the criterion of adequacy. I do not know how one could give a general proof that the criterion of adequacy is satisfied. This is because of the extreme generality of the question (it is about all propositions) and uncertainties regarding the notion of a proposition. In Chapter VI, where we restrict our attention to propositions formulable in a certain kind of formal language, the question becomes, at least in principle, more readily answerable, although even there I am unable to supply a proof of the sort desired. Nevertheless, the lack of counter-examples to the satisfaction of the criterion of adequacy leads me to conjecture that it is satisfied. Principles 2.9 and 2.21 constitute what will be called 'Analysis VI'. This is my analysis for non-counter-legal conditionals, and I believe that it is correct at least for the restricted case it is intended to capture. However, before we can claim to have a full fledged analysis of subjunctive conditionals, we must accomplish three more things. First, our analysis turns upon the notion of a simple proposition, so we must examine that notion more carefully and, hopefully, provide an analysis of it. This will be undertaken in the next section. Second, we must extend the analysis of subjunctive conditionals to include the case in which the antecedent is inconsistent with the set of true subjunctive
SUBJUNCTIVE CONDITIONALS
91
generalizations. This is the case of counter-legal conditionals, and it is the topic of section four. Third, we must extend the analysis still further to encompass the case in which the antecedent and consequent of the conditional are not indicative. We want to be able to deal with conditionals whose antecedents and consequents are themselves subjunctive conditionals or subjunctive generalizations. This task will be undertaken in the next two chapters.
Our analysis of subjunctive conditionals turns rather heavily upon the notion of a simple proposition. Simple propositions will also play important roles when we turn to the analysis of causes and probabilities. The important role which simple propositions play in these analyses is unfortunate. Contemporary philosophers do not like simple propositions, and even if we can succeed in giving an adequate analysis of the notion of a simple proposition, their use here is going to be repugnant to many philosophers. Perhaps, then, we.should review how we got into this fix. The basic idea lying behind our analysis of subjunctive conditionals is that they have to do with making minimal changes to accommodate the truth of the counterfactual hypothesis. But upon inspection, it seems that the notion of a minimal change only makes sense given a notion of a simple proposition. In making minimal changes we cannot treat all propositions on an equal footing, because that would have the result that virtually all changes would be minimal. Given any finite set of false proposition which we want to make true, we could always form their conjunction and then regard making that conjunction true as a single change. The only way to rule this out is to somehow rule out conjunctions and disjunctions from the propositions we look at in deciding whether a change is minimal. But supposing that this can be done is just to suppose that there is a class of propositions which are not conjunctions, disjunctions, etc., i.e., it is to suppose that there is a class of simple propositions. Thus it seems that the only way to make sense of minimal changes is in terms of simple propositions. It appears we have no option but to try to make sense of simple propositions.
92
CHAPTER IV
In fact, I think that the notion of a simple proposition makes perfectly good sense when viewed from an epistemological point of view. There is an intuitive distinction between those propositions which one can know 'in one fell swoop' and those propositions which one knows by first knowing pieces of them. More precisely, we want to rule out conjunctions, disjunctions, and in general logical compounds as not being simple. It is clearly not satisfactory to rule out all propositions equivalent to conjunctions, disjunctions, and the like, because that would rule out all propositions. We want to rule out those propositions that are, in some sense, really conjunctions or disjunctions. I suggest that what characterizes such propositions is that the only non-inductive way of coming to know their truth is by ascertaining the truth of their conjuncts or disjuncts and then performing the appropriate logical inference. For example, the only noninductive way of knowing the truth of 'Either the car is white or it has four wheels' is to know the truth of one of the d i s j ~ n c t s But . ~ although 'The car is white' is equivalent to the disjunction 'Either the car is white and has four wheels, or the car is white and does not have four wheels', we do not have to know the truth of either of these disjuncts before we can know that the car is white. The case of conjunctions is analogous to the case of disjunctions. The only non-inductive way of knowing the truth of the conjunction 'The car is white and has four wheels' is to know the truth of both conjuncts. In contrast to the situation for conjunctions and disjunctions, I would propose that a simple proposition is one whose truth can be known non-inductively without first coming to know the truth of some proposition or propositions which entail it. Thus disjunctions and conjunctions are not simple, because to (non-inductively) know a disjunction you must infer it from one of its disjuncts, and to (noninductively) know a conjunction you must infer it from its conjuncts. But a proposition like 'The car is white' would seem to be simple, because it is possible to come to know its truth by proceeding directly in ways determined by its meaning and without inferring it logically from other propositions that entail it.6 This makes our notion of a simple proposition, at base, an epistemological notion. A simple proposition is one whose truth can be known non-inductively by proceeding
SUBJUNCTIVE CONDITIONALS
93
in some direct manner which is dictated by the meaning of the proposition and which does not involve deducing the proposition logically from some simpler (in the order of epistemological complexity) propositions it is possible for us to know first. Thus my proposal is: (3.1)
P is simple iff it is logically possible for one to know the truth of P non-inductively without first knowing the truth of each proposition in some set F which entails P.'
It is unfortunate that the success of this analysis turns upon the truth of certain epistemological theories. Although I have defended those theories elsewhere, a person who does not accept them cannot be expected to accept the above analysis of a simple proposition either. But I do not think that anything can be done about this. The notion of a simple proposition is basically epistemological, and hence its characterization must turn upon one's epistemological theories.
4. COUNTER-LEGAL CONDITIONALS A subjunctive conditional whose antecedent is inconsistent with the set G of all true subjunctive generalizations is called a counter-legal conditional. Thus far we have been avoiding counter-legals. Now the analysis must be extended to include them. We can do this by making use of certain conclusions defended in Chapter 11. We still restrict our attention to subjunctive conditionals having indicative antecedents and consequents. To facilitate our discussion, let us now adopt a more complicated view of possible worlds. Thus far we have been identifying possible worlds with the sets of propositions true in those worlds. It is now more convenient to identify a possible world a with the ordered triple ( T a ,Na, W a )where Ta is the set of indicative propositions true in a , Na is the set of basic strong generalizations true in a, and W a is the set of basic weak generalizations true in a. In Chapters V and VI we will adopt even more complex representations of possible worlds, but for now such ordered triples are sufficient for our purposes. Now let us begin by considering strong subjunctive generalizations. There is a basic class N whose members are confirmed inductively, and
94
CHAPTER IV
then other strong generalizations are derived logically from those in N. Let V N be the set of material generalizations corresponding to the subjunctive generalizations in N. Given two predicates F and G, if the supposition "(3x)Fx1 is consistent with V N , then "(Fx 3 Gx)' is true iff ^(x)(Fx => Gx)' is entailed by the generalizations in V N . But if the supposition ^(3x)Fx1 is inconsistent with V N (so the generalization itself is counter-legal), it was argued that "(Fx 3 Gx)' is true iff ( x ) ( F x => Gx)' is entailed by every maximal-r('3x)Fx'-consistent subset of V N . For present purposes, the best way to think of this is as follows. The different maximal-r(3x)Fx1-consistentsubsets of V N represent the results of making minimal changes to N in order to accommodate the truth of P, which in turn represent the possible choices regarding what might be the set of all true basic strong generalizations if "('3x)Fx1 were true. In other words, these are the sets of basic strong generalizations in different r(Sx)Fxl-worlds. This immediately suggests a general way to deal with counter-legal subjunctive conditionals. If P is inconsistent with V N , then the results of making different minimal P-changes to N represent the sets of true basic strong generalizations in different P-worlds. Thus to accommodate counter-legal conditionals whose antecedents are inconsistent with the set of strong generalizations, we impose the following requirement on
M: (4.1)
CONSERVATION OF STRONG GENERALIZATIONS: If P is inconsistent with V N , and aMP, then Na, the set of basic strong generalizations true in a, must be the result of making a minimal {P}-change to N.
The idea is that P may force us to reject some of our strong generalizations, but we are constrained to reject as few as possible. As there may be different choices regarding which to reject in order to render the resulting set P-consistent, we must look at all of those choices in order to determine what would be true if P were true. In Chapter 11, we identified the results of minimal P-changes to N with the sets of subjunctive generalizations corresponding to the material generalizations in maximal-P-consistent subsets of V N . This, in effect, is to identify minimal P-changes to N with strictly minimal P-changes to N, and in light of the discussion of section two, such an
SUBJUNCTIVE CONDITIONALS
95
identification is at least suspicious. It is plausible to suppose that we should instead proceed in a manner analogous to our analysis of minimal P-changes to S a ( t ) . I have been unable to find persuasive examples indicating that this level of sophistication is necessary at this point, but in the interest of safety, it seems best to follow this course anyway. The P-changes to N that arise from indicative propositions are simply deletions, so it seems that we can analyze this notion as follows: (4.2)
If X and Y are sets of subjunctive generalizations F is a set of propositions, and k is a natural number, then Y is the result of making a minimalk F-change to X iff either: a ) VY is a maximal-F-consistent subset of VX; or (b) there is a sequence A of sets of propositions such that F <-Ã UA, and for each i e w, A -+A,+i, and there is a sequence x such that for each i e QJ, xi+1 c xi and for some j < k, xi is the result of making a minimal, A,change to X, and Y = U x; or (c) there is a sequence A of sets of propositions such that F <-> SA, and for each i w, A, + Y is the result of making a F-change to X, and for some j < k, X is the result of making a minimal, Neg F-change to Y.
(4.3)
Y is the result of making a minimal F-change to the set of X of subjunctive generalizations iff for some natural number k, Y is the result of making a minimalk F-change to X.
Next consider what happens when P conflicts with the class of weak generalizations. Weak generalizations have a structure analogous to that of strong generalizations. There is a class W of basic weak generalizations, and then others are inferred from the basic ones together with the strong generalizations. When P conflicts with the weak generalizations, or more generally with N U W, we first modify N as above to render it P-consistent, and then we delete as few things as possible from W so as to render Wa consistent with VNa U{P}. The basic idea is that strong generalizations take precedence over weak generalizations, so we first conserve as many of the strong generaliza-
96
CHAPTER IV
tions as possible, and then having done that, we proceed to conserve as many of the weak generalizations as possible: (4.4)
CONSERVATION OF WEAK GENERALIZATIONS: If P is inconsistent with VW, then We, the set of weak generalizations true in a , must be the result of making a minimal VNnU{P}change to W.
If we put all of these restrictions together, we obtain what 1 believe is a complete analysis of M for the case of indicative antecedents. Let no be the actual world. Then the analysis of M can be stated as follows: (4.5)
ANALYSIS VII:a
M P iff (i) P e Ta; (ii) Na is the result of making a minimal P-change to Na,,; (iii) Wa is the result of making a minimal VNa U {P}-change to Na.; (iv) for every time t, (SQ,,(t)ASa(t)) is a minimal (VNa U V Wa U{P})-change to Ta,, at time t.
This analysis, together with the definition: (4.6)
DEFINITION: rP>Q1 is true iff, for every possible world a , if aMP then Q Ta.
constitutes my analysis of simple subjunctives for the case in which P and 0 are both indicative statements. Although our analysis is restricted to the case in which the antecedent and consequent of a conditional are indicative, it is still of some interest to see what the analysis entails about the logical properties of these conditionals. First, if the analysis is correct, logical necessity, and hence logical entailment, can be defined in terms of the subjunctive conditional. I have argued that a correct analysis must have the consequence that every P-change contains a minimal P-change, and this has the result that ^ P > -P1 is true iff there are no P-changes. Thus, on the as yet unverified assumption that my analysis does satisfy my stated criterion of adequacy, we have: 'UP1 is true iff r(-P > P)' is true. (4.7) Second, the principles 6.1-6.8 of Chapter I1 are all valid. This means that the axiomatic theory SS formulated in Chapter I1 is sound. Third,
SUBJUNCTIVE CONDITIONALS
97
the principle 7.1 of Chapter 11, which distinguished SS from David Lewis' theory C l , is not valid. Thus Analysis VII is in accord with the conclusions defended intuitively in Chapter 11. Now what remains is to generalize Analysis VII to include the case in which P and Q are not indicative sentences. This is basically a matter of turning the analysis into a recursive definition, and will be the topic of Chapter VI. W e will also want to combine quantifiers with subjunctive conditionals. The groundwork for this will be laid in Chapter V, and then the full theory will be developed in Chapter VI.
Before proceeding with the further analysis of the simple subjunctive, let us turn to a different subjunctive conditional which is really a variant of the simple subjunctive. In Chapter I, in discussing whether subjunctive conditionals are subject to pragmatic ambiguity, we had occasion to discuss pairs of conditionals like: (A) (B)
If that were gold, it would be malleable. If that were gold, some gold things would not be malleable.
A t that point it was suggested that the apparent conflict between these two conditionals could be resolved by resolving an ambiguity in the antecedent. It was claimed that despite appearances, the antecedents of these conditionals d o not have the same meaning. The difference in emphasis changes the meaning so that (B) actually means something like: (B*)
If some gold were like that (i.e., had the properties that has), then some gold things would not be malleable.
I believe that as a resolution of the conflict between the members of this particular pair of conditionals, this is acceptable. However, this pair of conditionals constitutes an example of a more general phenomenon which cannot always b e dealt with so simply. Conditional (A) is a perfectly straightforward simple subjunctive conditional and is to be analyzed in accordance with Analysis VII. But conditional (B) is not a simple subjunctive conditional, although it is a close cousin to the
98
CHAPTER I V
simple subjunctive. Conditional (B) is an example of what I will call a 'preferred subject conditional'. Before considering how preferred subject conditionals are to be analyzed, let us consider some additional examples. An interesting pair of conditionals is: (C)
If I were a member of the Lakers, I would be a good basketball player.
(D)
If I were a member of the Lakers, they would not have a very good team.
In defense of (C) we simply observe that there is a weak subjunctive generalization to the effect that no one could be a member of the Lakers unless he were a good basketball player. Thus the truth of (C) is entirely in accordance with our analysis of the simple subjunctive. But in defense of (D), we observe that I am a lousy basketball player, and hence were I a member of the Lakers they would not have a very good team. The effect of emphasizing 'I' in the antecedent of (D) seems to be to indicate that in evaluating the truth of this conditional we seek to preserve my attributes in preference to those of the Lakers. We might try to paraphrase (D) as we did (B) to get something like: (D*)
If some members of the Lakers were like I am (i.e., had the properties I have), then the Lakers would not have a very good team.
But we can see that neither (D*) nor (B*) is an entirely adequate paraphrase. The difficulty is that, presumably, one of my properties is that of not being a member of the Lakers, but we do not want (D*) to say anything like 'If there were some members of the Lakers who were not members of the Lakers, t h e n . . .'. We are naturally led to amend (D*) to talk about 'non-relational properties', but this is not a very clear notion and it will turn out below that such a move is not adequate to deal with other examples of subject preference anyway. As a final e x a m p ~ e consider ,~ the Shah who has a famous collection of blue-white diamonds, and consider a grubby little industrial diamond. Then we have:
(E)
If that were one of the Shah's diamonds, it would be blue-white.
SUBJUNCTIVE CONDITIONALS
(F)
99
If that were one of the Shah's diamonds, not all of the Shah's diamonds would be blue-white.
The antecedent of (E) or (F) is ambiguous between 'If that diamond belonged to the Shah,. . .' and 'If that were one of a i , . . . , a,, (where a i , . . . ,a,, are the diamonds belonging to the Shah). . .'. On the former reading, this becomes analogous to (C) and (D) above. But let us take it instead in the latter, counter-identical, sense. Then we would agree that both (E) and (F) are true. However, neither (E) nor (F) would be true according to Analysis VII. In order to make the antecedent true, we must modify some simple truths regarding either the industrial diamond or the Shah's diamonds, but we have a choice regarding which to modify. Thus all that the results from Analysis VII is the uninteresting conclusion that if the industrial diamond 'were one of the Shah's diamonds, it might be blue-white, or it might instead be the case that not all of the Shah's diamonds would be blue-white. It seems that the effect of emphasizing 'that' in (F) is to give preference to the properties of the industrial diamond over the properties of the Shah's diamonds in modifying the set of simple truths so as to make it consistent with the supposition that the industrial diamond is one of the Shah's diamonds. The effect of emphasizing 'Shah's diamonds' in (E) is just the opposite - to give preference to the properties of the Shah's diamonds. This seems to be what happens in general in cases of subject preference. In (B) the effect of emphasizing 'that' is to give preference to the properties of the piece of cast iron, and in (D) the effect of emphasizing 'I' is to give preference to my properties. This seems to be the effect in general of preferred subjects. In a conditional "If it were true that P then it would be true that Q1, if there is no preferred subject, then in constructing P-worlds we treat everything on a par with everything else, throwing all simple truths into the pot together. But if there is a preferred subject, then we give preference to the simple truths regarding that subject. That is, we only modify them insofar as they are themselves incompatible with P, and then we modify the other simple truths subject to the constraint that the result must be compatible with the surviving simple truths about the preferred subject. Let us symbolize a preferred subject conditional by subscripting the connective with the individual term for the preferred subject: ^ P 2 Q1.
100
CHAPTER I V
What is required for this to be true is, not for 0 to be true in every P-world, but for Q to be true in every (P, a)-world, where these are the worlds constructed by giving preference to the simple truths about the preferred subject. How, exactly, do we construct (P, a)-worlds? First, let us ignore any effect on the subjunctive generalizations and just look at what happens to the simple truths. Letting Sa be the set of simple propositions about the preferred subject a, we first set about making (So I") Tao) P-consistent and then afterwards we render (S ("I T a ) consistent with both P and our decisions regarding ( S o n Tan). In deciding how to modify (Sari TaJ, it may naturally seem that we must still take undercutting into account. However, let us reserve our opinion on this for the moment, and simply assume that (SanTa) must be P-consistent. Ignoring any changes in the subjunctive generalizations, it seems that we should modify Analysis VII by replacing clause (iv) by the two clauses: (iv) (v)
((SonTao)A(SanTa)) is a minimal (VNaUVWaU{P})change to (Sa ("I T,J (in the sense of definition 2.20); for every time t, (San(t)ASa(t)) is a minimal (VNa U V Wa U (Sa I")Ta) ("I {P})-change to Tau at time t.
So far we are ignoring any effect that the counterfactual hypothesis may have on the subjunctive generalizations, but before looking at that we must observe that a slight generalization of the above is required in order to deal with the example of the Shah's diamonds. We can handle (F) as above, but in the case of (E) our preferred subject is not an individual but rather a set of individuals (the set of all the Shah's diamonds). In evaluating the truth of (E), we consider worlds constructed by giving preference to the simple truths about each of the Shah's diamonds. Thus we must define a more general kind of preferred subject conditional: "^P$Q1 where X is a set of individual terms. We can then define 'P 2 Q1 = ^P> Q1. Defining:
we can modify clauses (iv) and (v) above by replacing ' S a throughout by 'Sx'.
SUBJUNCTIVE CONDITIONALS
101
Next, let us ask what effect subject preference has on the subjunctive generalizations. We might naturally suppose that it has no effect, i.e., that clauses (ii) and (iii) of Analysis VII are left unchanged. However, examples (B) and (D) indicate that this is incorrect. In (D) we are using the simple truths about my athletic prowess to override the weak subjunctive generalization that no one could be a member of the Lakers unless he were a good basketball player. And in (B) we are using the simple truths about the piece of cast iron to override the strong subjunctive generalization that all gold is malleable. Thus in a preferred subject conditional, the simple truths about the preferred subject take absolute precedence over everything else. We first set them consistent with the counterfactual hypothesis, and then we proceed to make our other modifications to the world in accordance with the clauses of Analysis VII. As we deal with the simple truths about the preferred subject before we deal with subjunctive generalizations, we cannot deal with them in terms of their historical antecedents, and hence undercutting cannot be involved. So I propose the following:
(5.2)
a is a (P, X)-world Iff (i) P E Ta; (ii) ((Sx fl Tan)A(Sxn To)) is a minimal P-change to (Sx fl Ta,,); (iii) Na is the result of making a minimal ((Sx fl TJ f l { P } ) change to Nan; (iv) Wa is such that (WaU N a ) is the result of making a minimal ((Sx fl TJ U {P})-change to ( W^U Nao); (v) for every time t, (Sao(t)ASa(t))is a minimal (VNa U V Wa U(Sx fl Ta) U {PI)-change to Tau at time t.
(5.3)
^P>Q1 is true iff 0 is true in every (P, X)-world.
One glaring shortcoming of this analysis is that it proceeds in terms of the notion of a statement being 'about' a particular object. This is a very problematic notion in general, and I am not prepared to give an account of it. However, some solace can be gained from the observation that the application of this notion seems relatively more obvious and transparent in the case of simple propositions than it is for
102
C H A P T E R IV
propositions in general. In particular, for the case of simple propositions 'about' appears to be extensional:
This means that we can think of the subscripts on preferred subject conditionals as simply picking out individuals or sets of individuals rather than thinking of them linguistically as picking out individual terms or sets of terms. It should be emphasized here that preferred subject conditionals constitute a genuinely different kind of subjunctive conditional. They are close cousins of simple subjunctive conditionals, but they are not the same. We have given a different analysis for preferred subject conditionals. Actually, turning things around, we can regard the simple subjunctive as a kind of limiting case of preferred subject conditionals, viz., the case in which the set of preferred subjects is empty:
In Chapter I, in arguing against the purported pragmatic ambiguity of the simple subjunctive, I maintained that many putative counteridenticals are not really counter-identicals. We can now see how that claim should be understood. What is true is that many putative counter-identicals are not counter-identical simple subjunctive conditionals, but many of them do turn out to be counter-identical preferred subject conditionals. For example, consider: (G) (H)
If Richard Nixon were Golda Meir, he would be a woman. If Golda Meir were Richard Nixon, she would be a man.
These can be analyzed as: (G*)
RN
= GMZMRN is
(H*)
G M = RN>GM
a woman.
is a man.
In counter-identicals, the preferred subject is often indicated by the word order. However, this can often be overridden through the use of emphasis. We have seen that some putative counter-identicals really are
SUBJUNCTIVE CONDITIONALS
103
counter-identical preferred subject conditionals. However, there still remain a number of putative counter-identicals which are not really counter-identicals on any reading, viz., (1)
If I were Gerald Ford, I would sign the education bill.
This can only be analyzed in terms of roles as meaning: (I*)
If I were in the role of Gerald Ford, I would sign the education bill.
It seems that preferred subject conditionals are of some importance in understanding a number of our subjunctive locutions. I will not pursue their analysis further as we go on to consider quantification and the case of non-indicative antecedents and consequents, but it should be quite obvious how to extend the analysis to those cases. NOTES
'
My terminology differs from that of David Lewis, who calls any world in which P is true a P-world. For example, Q is equivalent to " ( Qv R ) & ( Qv R)'. This is defended in Pollock (1974). ( X A Y ) is defined to be (XU Y ) - ( X f l Y ) . Notice that we do need the qualification 'non-inductive'. It is quite possible to have an inductive reason for believing a disjunction which is not composed of reasons for believing each disjunct separately. More precisely, in the jargon of Pollock (1974) the justification conditions for the statement 'The car is white' include non-conclusive non-inductive logical reasons, whereas the justification conditions for conjunctions and disjunctions include only conclusive non-inductive logical reasons. This has the effect of excluding subjunctive generalizations more or less arbitrarily. It makes no difference to the analysis of subjunctive conditionals whether we count subjunctive generalizations as simple, because they are preserved anyway in constructing P-worlds. I am indebted to Keith Lehrer for this example, and for getting me to think about subject preference in the first place.
-
'
CHAPTER V
QUANTIFICATION, MODALITIES, AND CONDITIONALS
In this chapter I want to begin the investigation of what happens when we combine quantifiers with subjunctive conditionals. This is a more difficult problem than might at first be supposed. It is not entirely obvious that quantification into subjunctive conditionals makes sense. Part of the problem can be seen as follows. We have seen that if our analysis is correct, then logical necessity can be defined in terms of subjunctive conditionals:
Thus quantification into subjunctive conditionals contains as a special case that of quantification into modal contexts, and many philosophers believe the latter to be illegitimate. The difficulty can be made manifest as follows. Ordinarily, r ( x ) $ l is taken to mean that every object in the universe satisfies the open formula $. Thus if quantification into $ is to be legitimate, it must make sense to talk about an object satisfying $. If $ is referentially opaque, it seems not to make sense to talk about an object satisfying $. For a referring term t, let us write r$tl for the result of replacing the free occurrences of x in $ by t. Then it seems that a necessary condition for an object to satisfy i,b is that if t is any term denoting the object, then r$tl must be true. In other words, if the object itself satisfies $, then it must make no difference how we describe the object. But this requires that $ be referentially transparent. If $ is referentially opaque, then it seems that the most we can say is that the objecl cum description satisfies 4, which is tantamount to saying that it is not objects but rather individual concepts that satisfy $. The received view on modal operato s is that they generate referenabw tially opaque contexts, so in light of the equ~valence-3& it would follow that subjunctive conditionals are also referentially opaque. We can
QUANTIFICATION, MODALITIES, CONDITIONALS
105
apparently verify this by turning to particular examples. For example, Neil Armstrong was the first man to set foot on the moon. It is true that if the first man to set foot on the moon had been only three feet tall, the rungs on the ladder of the lunar lander would have been close together. But it is not true that if Neil Armstrong had been only three feet tall then the rungs on the ladder of the lunar lander would have been close together - on the contrary, had Neil Armstrong been only three feet tall, he would not have been admitted to the astronaut training program and someone else of normal height would have been the first man to set foot on the moon. Are we to conclude then that subjunctive conditionals are referentially opaque and hence that it is illegitimate to quantify into them? The situation is not as clear cut as all that. The difficulty is with the notion of referential opacity itself. The customary way of defining referential transparency is that a formula qx is referentially transparent iff the substitutivity of identity holds for it:
However, this is an unreasonably strong requirement. A term may denote a certain object when taken in isolation, but denote a different object in the context of a certain statement. For example, in isolation the definite description 'the first man to set foot on the moon' denotes Neil Armstrong, but it does not denote Neil Armstrong in the context of the subjunctive conditional 'If the first man to set foot on the moon had been only three feet tall, then the rungs on the ladder of the lunar lander would have been close together'. Let us say that a singular term is used rigidly in a statement if, in the context of that statement, it denotes the same thing it denotes in isolation. Many referentially opaque contexts arise from the fact that terms are not used rigidly in those contexts. But this should not be taken as showing that quantification into those contexts is illegitimate. In order for an object to satisfy a formula rqxl, it is not required that if t is any term denoting the object in isolation then rqt' is true. That is much too strong a requirement. All that should be required is that if t denotes the object in the context rptl then rqtl is true. Thus referential transparency, as traditionally defined, is not required for the legitimacy of quantification.
106
CHAPTER
v
What the above considerations really indicate is that the traditional definition of referential transparency does not capture the notion it is intended to capture. A reasonable course would be to retain the requirement that quantification presupposes referential transparency but redefine referential transparency in terms of individual terms used rigidly: (1.1)
rqxl is referentially transparent iff it is possible to use a term a rigidly in rqal, and for any terms a and b used rigidly in rqal and rqbl, the following is necessarily true: r a = b 3 (qa = qb)'.
Notice that if rqxl is referentially transparent in this weaker sense, then the following principle must hold:
Are modal and subjunctive contexts referentially transparent in this weaker sense? I believe they are. Referential opacity requires the failure of substitutivity of identity7but it requires more than just that it requires such failure even in the case of terms used rigidly. It is easy to find cases of the failure of substitutivity in modal and subjunctive cases, but it is not so easy to find such cases where the terms are used rigidly. The clearest examples of failure of substitutivity involve definite descriptions7 and in those cases the terms are rather obviously not used rigidly. This is illustrated by the example of Neil Armstrong and the first man to set foot on the moon. In contrast to that example, it is quite possible to use a definite description rigidly in a subjunctive context. For example, 'If the man at the end of the bar were outside now, it would be much quieter in here' clearly does not mean 'If it were the case that there is a unique man who is at the end of the bar and he is outside, t h e n . . .'. Rather, it means, 'There is a unique man who is at the end of the bar, and if he were outside now, t h e n . . .'. In this case, the substitutivity of identity goes through without difficulty. For example, if the man at the end of the bar is Edward, we can conclude, 'If Edward were outside now, it would be much quieter in here7. The difference between the rigid and non-rigid use of a definite description in a subjunctive conditional seems to coincide with a
QUANTIFICATION, MODALITIES, CONDITIONALS
107
difference in the scope of the definite description. r[q(7x8x)> $1' is ambiguous between the following two formulas, which represent wide scope and narrow scope respectively: (3!x)Ox & (3x)[(8x & (qx > $11 n
[(3!x)Ox & (3x)(Ox & 9x1. > +] If the conditional is understood in the former way (wide scope), then the description is being used rigidly and we can substitute coreferential terms for it with impunity. Appealing to the scope ambiguity to account for the failure of substitutivity does not automatically resolve the problem in favor of quantification into subjunctive conditionals, because it begs the question by assuming that such quantification makes sense in the first place. But such an appeal does undercut the standard argument against quantification by showing, in effect, that it begs the question against quantification by assuming that we cannot make the distinction between wide scope and narrow scope. To resolve the issue of the legitimacy of such quantification we must appeal to other considerations. Considerable support for the legitimacy of quantification into subjunctive conditionals can be mustered by appealing to examples. First, we have examples of subjunctive conditionals containing definite descriptions which seem clearly to have wide scope. We already discussed the example, 'If the man at the end of the bar were outside now, it would be much quieter in here'. It seems that the definite description in this conditional can only be construed as having wide scope, and if wide scopes are possible in such contexts it follows that quantification into subjunctive conditionals must make sense. We can support this same conclusion more directly by appealing to examples which appear to be quite straightforward cases of quantification into subjunctive conditionals. Two such examples are: Somewhere there is a man who would solve a11 our problems if he were elected president. There is a lion out there who would eat you if he caught you.
108
CHAPTER
v
These seem very obviously to be examples of quantification into subjunctive conditionals, and they seem to make perfectly good sense. I believe that these considerations satisfactorily meet the difficulties involving the putative referential opacity of subjunctive conditionals. If it were not for the difficulties discussed in the flext section, we could leave it at this and assume that quantification into subjunctive conditionals, and hence into modal contexts, is unobjectionable. But as we will see, there are further difficulties for this view.
The semantics for subjunctive conditionals and modal operators require us to look at more than one possible world. '0q1 is true iff q is true in every possible world, and 'q > $' is true iff $ is true in every q-world. By analogy, it would seem that r(3x)E!q1 is true iff there is some object (in the actual world) which satisfies the open formula q in every possible world. But it seems that in order to know whether an object in one world satisfies a formula in another world, we must somehow locate that object in the second world. Thus the legitimacy of quantification into modal contexts seems to presuppose that it makes sense to talk about an object in one world being the same as an object in another world, i.e., it presupposes the meaningfulness of transworld identity. But what could it possibly mean to say that an object in one world is the same object as one in another world? The attempt to answer this question has sometimes led philosophers to talk about essences and to suppose that objects can be reidentified across possible worlds in terms of their essences.' Although this seems at first to be the only way t~ make sense of transworld identity, it also seems totally implausible to suppose that objects really have such essences. This is not to deny that objects might have some essential attributes. For example, it is not implausible to suppose that certain basic sortal properties are essential (Brody 1973, Pollock, 1974, pp. 157-174). But it is not to be supposed that objects have sufficiently many essential attributes to enable us to reidentify them across possible worlds in terms of those attributes. For many years the above considerations led me to despair of making sense of transworld identity and hence to disdain quantification
QUANTIFICATION, MODALITIES, CONDITIONALS
109
(over individuals) into modal and subjunctive contexts. However, just because we reject quantification over individuals, we are not precluded from endorsing some other sort of quantification which is related to individual quantification and in terms of which we can try to make sense of putative examples of quantification into modal and subjunctive contexts. There are two prime candidates for such alternative modes of quantification. A simple and safe way of making sense of quantification into modal contexts is to let the quantifiers range over individual concepts. Formally, an individual concept can be taken to be a function on possible worlds which picks one object out of each world to be the designatum of the concept in that world3 (Kanger, 1957, Kaplan, 1964). I can find nothing objectionable about quantification over individual concepts. The only difficulty is that such a device does not enable us to make sense of some statements that seem to be perfectly sensible. Although it may be reasonable to interpret quantifiers in terms of individual concepts, it is not reasonable to interpret individual terms as expressing individual concepts. For example, I would be saying something true if I were to point to a particular table and say, 'Although that table could be red, it could not be the number two.' There are infinitely many different individual concepts which designate that table in this world. Some of those individual concepts do not designate a red object in any world, and others designate the number two in some worlds. Thus we cannot interpret the above statement as meaning that every individual concept which designates that table in this world is such that it designates something red in some world and designates the number two in no worlds. Apparently we are not talking about every possible individual concept which designates the table in this world. Are we then talking about some unique 'privileged' individual concept chosen from among those that designate the table in this world? I can see no way to pick out such a concept. Such a move would only make sense on the kind of essentialism we have already rejected. Thus I think we must conclude that the appeal to individual concepts does not allow us to make sense of many perfectly sensible modal statements. Something more is required. The difficulty is that we really do seem to make modal and subjunctive assertions which are literally about some particular thing, such as
110
CHAPTER V
the table. The natural way to make sense of such assertions is in terms of transworld identity, but that involves us in immense difficulties. David Lewis [I9681 has suggested an ingenious alternative. This is that although other worlds do not literally contain the same individuals as this world, they may contain 'counterparts' of individuals in this world. Lewis proposes that if a is an object in one world and b is an object in a second world, b is the counterpart of a in the second world just in case b bears at least a minimal similarity to a and is more similar to a than is any other object in that second world. Then the proposal is that to say that something is necessarily true of a is to say that it is true of a in this world and true of all counterparts of a in other worlds. Thus, for example, the table could be red but could not be the number two because there are worlds containing counterparts of the table which are red, but there are no worlds containing counterparts of the table which are also counterparts of the number two. The appeal to counterparts is attractive. It allows us to make sense of de re modal statements and quantification into modal contexts while absolving us of having to make sense of transworld identity. However, as Lewis formulated it, counterpart theory leads to difficulties when we consider subjunctive conditionals. Counterpart theory leads us to say that a conditional "Fa > G a l is true iff G is true of the counterpart of a in every "Fa1-world, where an "Fa1-world is defined in terms of the counterparts of a. But look what happens when we consider a conditional like 'If I had been born in the place of Richard Nixon, with all of his genes, etc., and raised as he was raised, and he in turn had been born and raised in my place, then I would have been a president threatened with impeachment and he would have been an interested bystander'. If we symbolize the antecedent of this conditional as "Fabl, where a designates Richard Nixon and b designates me, then to evaluate its truth we look at all "Fabl-worlds. But according to counterpart theory, there are no "Fabl-worlds. This is because in an "Fabl-world, the counterpart of Richard Nixon would be less like Richard Nixon than would my counterpart be, and vice versa, in which case they could not be Richard Nixon's counterpart and my counterpart. The above example indicates that it is unreasonable to require that the counterpart of an object be more like that object than is anything
QUANTIFICATION, MODALITIES, CONDITIONALS
111
else existing in the same world as the counterpart. It makes perfectly good sense to assert counterfactuals about what would happen if one object were quite different than it is and a second object was just like the first object is now. We must construct the counterpart relation more liberally. But this leads to new difficulties. If we cannot define the counterpart relation in the simple way proposed by David Lewis, then what we need is a criterion for counterparthood. But what could such a criterion be? The difficulties that arise here are precisely the same as those that arose in looking for a criterion for transworld identity. It seems that the only way to make sense of counterparts is in terms of some kind of essentialism, but essentialism is no more plausible now than it was before. It appears that the appeal to counterparts in place of transworld identity solves nothing.
Faced with the above difficulties, it would be very tempting to conclude that de re modalities and quantification into subjunctive conditionals do not make sense. But such a conclusion flies in the face of what appear to be clear examples to the contrary. I do not see how anyone can deny the meaningfulness of either 'That table could be red, but it could not be the number two' or 'There is a lion out there who would eat you if he caught you'. Where do we go from here? The key to the solution to our problem has been provided by Saul Kripke (1971, 1972). Kripke's point is very simple. We are thinking of possible worlds in the wrong way. We are thinking of them as being given to us 'structurally', by giving a domain of objects and saying what properties those objects have, and then leaving it up to us to somehow identify the objects in the domain on the basis of those properties. Why shouldn't we instead have the identity of the objects be part of the specification of the world? Recall once more what the basic intuition is behind our analysis of subjunctive conditionals. We begin with the set of all true statements, and then modify that set as little as possible to accommodate the truth of the counterfactual hypothesis. This intuition is in terms of truths, not worlds. It was asserted that we could capture this intuition by talking
112
CHAPTER V
about how possible worlds compare with one another, but in order to accomplish this the possible worlds must be closely related to sets of propositions. What is important here is that the truths which go into the makeup of the possible worlds are truths about particular objects in this world. Thus if we identify a possible world with the set of propositions true in it, the identity of the objects in that world will not be open to question. Their identity will be given by some of those truths. The truths are truths about particular objects. Here is a place in which the platonistic view of possible worlds is apt to mislead us (although it need not), whereas if we construe possible worlds to be sets of propositions we will not be led astray. As Kripke puts it, at least for present purposes we should think of possible worlds as counterfactual situations. As such, the identity of the objects which exist in any possible world will be given as part of the specification of the world. There is no need for a criterion of transworld identity. The search for such a criterion was doomed from the start because most any object can be most any way, and hence an object described in terms of its properties in some possible world could be identified with most any object in the actual world. There is no question of getting the identification right. We can identity objects however we choose, and that identification will be part of the specification of the world. This follows from the simple fact that we can have non-vacuous counterfactual hypotheses (and hence counterfactual situations, i.e., possible worlds) in which we suppose most anything we like about an object. The above considerations seem to show that transworld identity makes sense after all, in a trivial sort of way. Instead of solving the problem of transworld identity, we have resolved it by showing that there isn't really any problem. But in spite of appearances, all is not yet smooth sailing. Suppose I entertain the counterfactual hypothesis, 'If my house were painted b l u e . . .'. This is a hypothesis that is literally about my house. Does it follow then that possible worlds in which this hypothesis is true contain an object that is literally the same object as my house (in this world)? Building this hypothesis into the construction of a possible world in some sense fixes the identity of an object in that world, but does this really mean that the object in that world is the same object as my house? In other words, is the transworld relation between objects 'of the same identity' literally that of identity?
QUANTIFICATION, MODALITIES, CONDITIONALS
113
It may seem that the answer to this question is trivially 'Yes'. After all, the statement 'My house is painted blue' is literally about my house. But it is not clear what this establishes. One must be wary of putting too much weight on the word 'about'. After all, the negative existential 'My house does not exist' is also about my house, but it certainly does not follow that my house exists in any world in which the negative existential is true. It may be protested that it is unfair to appeal to negative existentials, because everyone knows that they are peculiar, but there is another kind of statement which makes it seem very suspicious to suppose that objects in different worlds are literally identical. This is that kind of counterfactual called a 'counteridentical'. We sometimes assert counterfactuals about what would be the case if two objects which are in fact distinct were one and the same object. For example, I might entertain the counterfactual hypothesis that Richard Nixon and Donald Nixon are one and the same person, this being brought off through the clever use of makeup, with the person in question sneaking away from San Clements occasionally to play the role of Donald Nixon, and then sneaking back to play the role of Richard Nixon. A counterfactual with this antecedent is just as literally about both of the two Nixons as the statement 'My house is painted blue' is about my house. One might feel some temptation to reply that the conditional is really about one of the two Nixons and the other would not exist if the antecedent were true, but this is unsatisfactory because there is no way to say which of the Nixons it is about and which would not exist. I think it must be admitted that the relation between the two Nixons in this world and the single Nixon in the world in which the counterfactual hypothesis is true is precisely the same as the relation between my house in this world and my house in the world in which it is painted blue. But in the former case it is a relation between two distinct objects in one world and a single object in another world. How can this be identity? Let us return to Lewis' terminology and call the relation between objects of the same identity in different worlds the counterpart relation, leaving open for the moment whether the counterpart relation is literally a relation of transworld identity. Let us write ' x C y ' for ' x is a counterpart of y'. Counteridenticals indicate that we can have xCy and x C z but y # z. If C is the identity relation, this amounts to having x = y and x = z, but y # z, which is to deny that identity is transitive. No
114
CHAPTER
v
doubt many philosophers will regard the non-transitivity of identity as manifestly absurd, in which case it follows that the counterpart relation is not a relation of transworld identity. Personally, I am not so sure. I don't really know what to say about putative cases of the nontransitivity of identity. At this point I want merely to counsel caution. We must not be overly hasty in claiming that the counterpart relation is not an identity relation on the grounds that if it were then identity would not be transitive. I think the real problem here is that there is no clear meaning to the question whether the counterpart relation is or is not the relation of identity. Before we can answer this question, we must know what it would mean to have two objects in two different possible worlds literally identical or not identical. This is a notion I can get no clear grasp on. Furthermore, as far as I can see it makes not the slightest difference to anything whether the counterpart relation is an identity relation or not. It seems far better to simply call the relation 'the counterpart relation' and not worry about whether it is identity. But now what becomes of Kripke's observation that there is no problem of transworld identity because the identity of an object is part of the specification of the world in which it is found? We can translate this into an observation about the counterpart relation. Lewis' counterpart theory floundered on the difficulty of finding a criterion for one object being a counterpart of another. We now see that what Kripke's observation amounts to is that the identity of an object is built into the world in which it is found in the sense that the counterpart relation is part of the specification of that world. Starting from the actual world, part of the specification of a possible world consists of saying what is a counterpart of what objects in the actual world. There is no problem of finding a criterion for counterparthood. Counteridenticals show that the counterpart relation can converge two distinct objects in the actual world can have a single counterpart in some possible world. Can the counterpart relation also diverge? That is, can a single object in the real world have two distinct counterparts in some possible worlds? To establish this, we would naturally turn to what we might call 'counter-nonidenticals'. It is easy enough to formulate counterfactual hypotheses which deny identities that hold in the actual world. For example, we can reason about what would be the case if the tallest man on the basketball team were not also the worst
QUANTIFICATION, MODALITIES, CONDITIONALS
115
shot on the team. But this shows nothing about the counterpart relation, because the individual terms involved are not being used rigidly. What we need is a counter-non-identical in which the terms flanking the identity sign are used rigidly to denote the same object. Unfortunately, I cannot find an intelligible way of even formulating such a counter-non-identical. If we try to do it by telling a story as in the Donald and Richard Nixon case, we do not get the desired result. For example, if we hypothesize that the elder Nixons had two twin brothers, and that they have been fooling people by alternately hiding and playing the role of Richard Nixon, this would not be a case in which we could literally say that Richard Nixon is two people. On the contrary, what would really be the case is that there is no such person as Richard Nixon. This seems to be what happens in general when we try to formulate counter-nonidenticals with individual terms used rigidly. To suppose that one thing is really two things is to suppose that that one thing does not really exist. I must admit that these examples are not as compelling as we might wish. To further muddy the waters, it does seem to be possible to formulate counter-non-identicals using proper names, and it has sometimes been maintained that proper names can only be used rigidly. It does not seem possible to maintain the latter thesis with respect to negative existentials involving proper names, and I do not really think it can be maintained with respect to counter-non-identicals either. But the situation here is not clear. For example, what are we to say about a conditional which begins, 'If Hesperus and Phosphorus were distinct heavenly bodies, t h e n . . .'? My own intuitions here alternate between on the one hand really taking seriously the denotation of the terms 'Hesperus' and 'Phosphorus' and finding this conditional unintelligible, and on the other hand taking 'Hesperus' and 'Phosphorus' as short for definite descriptions on the order of 'the first heavenly body regularly seen in the morning' and 'the last heavenly body regularly seen in the evening'. I think the weight of intuition is thus on the side of denying that the counterpart relation can diverge, but these intuitions are not clear and are not really compelling. What we need is an argument based upon other considerations which will settle the matter. Such an argument can be provided. On purely intuitive grounds it seems that quantification into modal and subjunctive contexts makes
116
CHAPTER
v
sense. The appeal to the counterpart relation explains how such quantification makes sense. But it was argued in section one that a necessary condition for quantification into a context r(px' to make sense is that the context be referentially transparent in the sense of 1.1. This in turn implies that principle 1.2 holds:
Applying this to modal contexts, because quantification makes sense in such contexts, the following must hold: (3.1)
(x)(y){x= y ^[U((3z)x
=
z
3
x =x)=
n((3z)x =z
3
x = y)]}.
( 3 z ) x = z1 is a way of saying that x exists. It is clearly true of anything
in this world that in any other world in which it exists, it is selfidentical: (3.2)
(x)U((3z)x = z
3
x
= x).
This immediately implies: (3.3)
(x)(y){x= y ^ U [ ( ~ Z ) = Xz
^x =
y]}.
Principle 3.3 is precisely the statement that the counterpart relation cannot diverge. It says that if x and y are the same object in this world, then they are the same object in any world in which they exist. Thus it seems that the very legitimacy of quantification into modal contexts requires that the counterpart relation cannot diverge. The counterpart relation must be a many-one relation, i.e., a function.' This generates a picture of possible worlds which is rather different from the classical one. In the next section I will attempt to make it precise and investigate what sort of modal logic results from it. Then in the next chapter we will apply these results to an examination of quantified subjunctive conditionals.
Let us begin by constructing a formal language for a first-order modal logic. The logical constants are '(', ')', '-' '&', '=', and '0'.We have a denumerable set Cn of individual constants, a denumerable set Vr of individual variables, and for each n e w , a denumerable set !H,, of
QUANTIFICATION, MODALITIES, CONDITIONALS
117
n-place relation symbols. We define the set Fm of formulas and the set Sn of sentences (closed formulas) in the normal way. We define ' v ' , ' ~ ' ,'=', '3', and '0' in the usual way. Our semantics will proceed in terms of possible worlds. A possible world has two parts - a structure consisting of a domain of objects and a determination of what properties each object has, and an identification of some of the objects in the domain with objects in the real world. The latter is in terms of the counterpart relation. Before considering the details of this, notice that this makes our definition of 'possible world' always relative to the real world, because the counterpart relation must relate objects in the domain of the possible world to objects in the real world. This is rather like those semantics which contain an accessibility relation together with the ruling that not all worlds are possible (or accessible) relative to a given world. Notice further that we can have many different possible worlds having the same structure. Such worlds will differ only in their counterpart relation. Different such worlds correspond to similar counterfactual hypotheses about different objects. It would be impossible to have different worlds with the same structure on the traditional views according to which counterparthood or transworld identity is a function of the structural properties of the objects in the world. A possible world will be an ordered pair (S, C) where S is its structure and C the counterpart relation. In specifying a structure, we must first specify a domain D. Then we must specify the extensions of all relation symbols in this domain. This can be accomplished by a function p which assigns to each n-place relation symbol some subset of D" (the set of ordered n-tuples of members of D). So let us define: (4.1)
A structure is an ordered pair ( D , p ) where D is a set of
objects and p is a function such that for each n e w and R e %,, p ( R ) c D". Now consider the counterpart relation. Let Do be the set of objects in the real world. The domain of C may be a proper subset of Do, because some actual objects may not have counterparts in the possible world. Thus C may be any function from a subset of Do into D. Before giving a precise definition of a possible world, we must decide how to represent the actual world. Is it also a possible world? In
118
CHAPTER
v
fact, we can represent the actual world by its structure alone, because insofar as there is a counterpart relation in the actual world, it is the identity relation. In general, given any structure we can define the notion of a possible world relative to that structure, which is to say what the set of possible worlds would be if that structure represented the actual world:
(4.2)
A possible world relative to a structure (D, p ) is an ordered pair (S, C ) where S is a structure (E, p*), and C maps a subset of D into E.
How should we define validity for our modal language? Informally, we want a sentence to be valid iff it comes out necessarily true no matter how we interpret the non-logical constants of our language.3 From an informal point of view an interpretation of the language does two things. First, it assigns denotations in the real world to the individual constants of the language. We are going to want to talk about models in which the 'real world' is a world that is just a possible world relative to this world, and in such a world some individual constants may fail to denote, so we must make provision for this. The second function of an interpretation of the language is to assign meanings to the relation symbols. Such an assignment has the effect that certain structures are ruled out as being 'really inconsistent' given the meanings of the relation symbols. This second function can be accomplished formally by simply specifying what the actual set of permissible structures is. However, a fact which logicians have generally overlooked is that not just any set of structures can constitute the set of all permissible structures under an assignment of meanings to the relation symbols. For example, if we consider a set of structures in which no structure has a domain of cardinality 3, to take this as our set of permissible structures would be to suppose that the meanings of the relation symbols somehow make it logically impossible for there to exist just three things in the universe. But the meanings of the relation symbols have nothing at all to do with what size universes are logically possible. I am supposing that the residents of our possible worlds are ordinary contingent objects like tables, chairs, electrons, etc., and for such objects it seems that it must be logically possible for the universe to have any finite or denumerable size. I don't know whether it should
Q U A N T I F I C A T I O N , MODALITIES, C O N D I T I O N A L S
119
be possible for the universe to be non-denumerably infinite, but fortunately, this is a question we can leave undecided as long as we are only dealing with first-order logic, because countenancing larger universes makes no difference to the satisfiability of any formula. At any rate, it is clear that the set of permissible structures must contain structures having domains of all finite or denumerable sizes. Furthermore, in light of the Skolem-Lowenheim theorem, we will only consider countable structures, i.e., structures having countable domains. Should we impose any other restrictions on sets of permissible structures? There is no need to do so. The only contingent facts that can be expressed in an uninterpreted first-order language are those having to do with the size of the universe, so any set of structures satisfying our restriction concerning the variety of cardinalities will correspond to some assignment of meanings to relation symbols. To say that a sentence is necessarily true under every interpretation of the language is to say that no matter what structure we start with as the actual world, the sentence comes out true under every interpretation of the language. So let us define: (4.3)
A model is an ordered triple ((D, k), T}, H ) where H i s a set of countable structures, (D, k ) e H, for each a <. w there is a structure S in H whose domain is of cardinality a, and T ) maps a subset of Cn into D.
Then we define: (4.4)
A sentence
is valid iff a> is true in every model.
What remains is to define truth in a model. This definition is relatively straightforward except for the modal case. We first define: (4.5)
A model (S*, T ) * , H*) is an a-variant of a model (S, T),H ) iff a e Cn, S* = S, H * = H, and T)* agrees with T ) except possibly for the value of ?*(a).
Next we must decide how to handle sentences containing nondenoting individual constants. I will adopt what can probably be called 'the standard procedure' of giving every sentence a truth value, ruling that atomic sentences containing non-denoting constants are automatically false. Given this we can define truth in a model by induction on
120
CHAPTER V
the length of a sentence. Letting 9 ( q ) be the domain of q : (4.6)
If M is the model ((D, p ) , q, H), then: (i) if a, b e Cn then "a = b1 is true in M iff a, b e @ ( q ) and q(a)= d b ) ; (ii) if R 9tn and a l , . . . , a n â Cn, then "Ra1,. . . an1 is true in M iff a l , . . . , an @ ( q ) and (q(a,, . . . , q(an)) dR); (iii) "-
/< are both true in M ; (v) V
The only clause of definition 4.6 that needs explanation is clause (v). We want rCl
If t is an individual term, "Etl is "(3x)x = t1
The modal logic resulting from this semantics has a number of peculiar features. Some of these arise out of the fact that our modal operator is basically a de re operator. We can quantify into modal contexts, and appending 'D' to a sentence yields a sentence that is literally about the objects mentioned in the sentence. In effect, the modal operator of logical necessity is a de re operator, and although de dicto modal statements do occur, they are just limiting cases of de re
Q U A N T I F I C A T I O N , MODALITIES, CONDITIONALS
121
statements. Let us define: (4.8)
(p is a de dicto sentence iff (p is a sentence and no subformula of (p having the form "El^ contains occurrences of any free variables or individual constants. (p is de re iff (p is not de dicto.
It is easily verified that the logic of de dicto sentences is completely normal, satisfying exactly the propositional modal laws of S5. Surprisingly enough de re sentences behave differently, satisfying only the laws of 54. It is trivial to verify that sentences in general satisfy the modal laws of S4. The 'characteristic law' of S5 which is not satisfied is :
Given the valid principle
r(p 2 O(pl,
4.9 implies:
Due to the fact that our counterpart relation can converge but not diverge (that is, it is a function but not necessarily one-one), it is simple to construct counter-examples to 4.10. First, notice that because the counterpart relation cannot diverge, both of the following are valid:
(4.12)
(x)(y)[x = y ^O(Ex & Ey. -)I.
That is, individual constants and variables act as 'rigid designators'. By virtue of the validity of 4.11, ^ 0 a = b. => O D ( E a & Eb. 2 a = b)l is valid, and hence r O a = b. => - 0 0 [ E a & Eb & a # b]' is valid. But this implies the invalidity of:
This is a counterexample to 4.10. 4.13 is invalid for the following reason. Suppose 'Ea & E b & a # b' is true in the real world. There are worlds possible relative to the real world in which ^ a = b1 is true, i.e., " 0 a = b1 is true in the real world. But then "-DOfEa & E b & a # bI1 is true. Lest it be thought that the only counterexamples to 4.9 and 4.10 are those involving identity, notice that the following is invalid for exactly
122
CHAPTER V
the same reason 4.13 is invalid:
(4.14)
( E a & Eb & Fa & -Fb)=3DO(Ea & Eb & Fa & -Fb).
Thus our semantics gives us a version of quantified S4. The treatment of identities is interesting. Because the counterpart relation is a function, we get the validity of 4.11 and 4.12. However, because it need not be one-one, neither of the following is valid:
We have a version of quantified S 4 , but as we have seen, it is not quite the same as any of the familiar versions of quantified S4. This results in part from the treatment of identity and the counterpart relation. Another source of differences results from our 'variety of cardinalities' requirement on models. For each n e w , let rEnl be a first-order sentence which says that the universe contains exactly n objects. Then "OEnl is valid on our semantics. No doubt some logicians will find this objectionable. They will reason that our logic should be adequate for dealing with any subject matter, and some subject matters will restrict the size of the universe. For example, we might want to talk about the natural numbers. Thus, it will be urged, we should eliminate the 'variety of cardinalities' requirement. I have no particular objection to building a modal logic on such a basis. However, unlike ordinary first-order logic, I do not find modal logic very useful for talking about anything except contingent objects. As long as we are going to take our domains to consist exclusively of contingent objects, then we should impose the 'variety of cardinalities' requirement on models. Of course, there is nothing to prevent one from having both kinds of modal logic, using one for some purposes and the other for other purposes. Another unusual valid sentence is "Elpa 3 0 ( x ) p 1 (where no individual constants occur in q ) . This theorem results from the fact that in constructing a model, we take the set of structures as basic and then look at all the possible worlds that can be built out of those structures using different counterpart relations. The more conventional procedure is to take the set of possible worlds itself as basic in constructing a
QUANTIFICATION, MODALITIES, CONDITIONALS
123
model. However, this conventional procedure seems to me quite definitely wrong. Insofar as different sets representing the set of all possible worlds are supposed to result simply from different assignments of meanings to the relation symbols, there should be no restrictions on the counterpart relations. If a certain structure is permissible in terms of a certain assignmeht of meanings, then it should be possible for any object in the real world to have the properties of any object in the domain of that structure. Thus given the possibility of one world having a certain structure, any other world having that same structure but a different counterpart relation should also be possible. Hence the validity of '"Ucpa=>D(x)D(x} D(x) x = a)' is invalid. If we eliminate this source of invalidity by requiring that
124
CHAPTER
v
Barcon formula: (3x)(x = x) 2 [(x)Dq => U(x)q] (where a> contains no individual constants) follows immediately from the validity of 'Uqa ='D(x)ql. A related formula that is often considered problematic is 'U('3x)9 2 (3x)Dq1. This formula is valid without restriction on our semantics, but this is because r-D(3x)(D1 is valid. This results from the variety of cardinalities requirement.
It would now be a simple matter to add to our language subjunctive conditionals with indicative antecedents and consequents and extend the semantics to deal with them. This would involve modifying the notion of a structure so as to include N and W, the sets of basic strong and weak subjunctive generalizations, as part 6f the structure, and it would require the introduction of some way to compare the dates of simple propositions. There are a number of little intricasies in this, but they are all rather straightforward. However, rather than delve into these details just for the special case of conditionals with indicative antecedents and consequents, let us go on to the full theory in which we allow conditionals to have modal and subjunctive antecedents and consequents. As we will see, some new problems arise when we undertake this. NOTES See Chisholm (1967) for a critical discussion of this. In trying to convince me of the contrary, philosophers have repeatedly appealed to the fact that both fission and fusion are possible in temporal contexts. That, of course, is quite true, but I do not see what it has to do with the issue at hand. I do not see any way out of the argument given in the above paragraph, so we should conclude that the temporal modalities do not work like the alethic modalities. See Pollock (1967) for a discussion of this notion of validity.
CHAPTER VI
T H E FULL T H E O R Y
The objective of this chapter is to extend the analysis of subjunctive conditionals in such a way as to incorporate both quantifiers and conditionals with non-indicative antecedents and consequents. This is a matter of combining Analysis VII of Chapter IV with the treatment of quantification discussed in Chapter V, and then making the whole thing recursive in order to handle non-indicative antecedents and consequents. We will begin by constructing an artificial language within which to express our conditionals, and then our analysis will take the form of a formal semantics for this language.
Our language must be a temporal language, because in order to express Analysis VII we must be able to talk about the dates of simple propositions. The simplest way to accomplish this is to make our language a two-sorted language with constants and variables for times, and a temporal relation '5'between times. Then the 'date' of an atomic formula will be indicated by a temporal subscript. In order for this to make sense, we must ensure that our atomic sentences always express propositions of the sort that have discrete dates. This is easily accomplished as follows. We.must have some way of picking out the set Sim of simple sentences (those expressing simple propositions). Simple propositions are supposed to be those that are not logical compounds, so it seems reasonable to require that our atomic sentences be the simple ones. This will ensure the appropriateness of our temporal subscripts in atomic formulas. The logical constants of our language will be '(', ')', '-', '&', '=', '>', ' 3 ''=>','D', 'D', and '5'. We must now have a non-denumerable set P
a
Cn of individual constants, a denumerable set Vr of individual variables, a denumerable set CT of temporal constants, a denumerable set
126
CHAPTER VI
+
VT of temporal variables, and for each n w such that n 0 we have a denumerable set '3in of n-place relation symbols. We define the set of formulas in almost, but not quite, the usual manner. The difficulty is that the account of subjunctive generalizations in Chapter 111 only applies to subjunctive generalizations having indicative antecedents and consequents and it is not at all obvious how to extend it to the case of non-indicative antecedents and consequents. Hence we are led to count "((p 9 if/)', "((p 4)^, "UI&', and "D$^ as well-formed only when
+
P
a
and are indicative. This leads, in effect, t o a two-step syntax. W e begin with a two-sorted first-order language with the logical constants ' ')', '-'. '&', '=', and ' 5 ' .W e define the set of atomic formulas as follows: (p
(1.1)
A n expression
(p
is in At iff either:
(i) there are n e w , R e & , x,, . . . , x,,e(VrUCn), t e (VrUCr), such that
=
yl; o r
t
(iii) there are
t,,
t 2 e (VTU CT) such that
(p = "tl
W e define the set Fmo of first-order formulas in the usual way. Then we define our full set of formulas recursively as follows:
(1.2)
DEFINITION:
(i) if ( p e F m o then ( p e F m ; (ii) if (p l Fmo then "Dy' e F m ; P
iii) (iv) (v) (vi) (vii) (viii) (ix) (x)
if (p e Fmo then rDipl F m ; if (p, @ Fmo then "((p 9 i^ e Fm ; if (p, t+b Fmo then "((p => if/)1l F m ; if a> e F m then "--(p'e F m ; if (p e F m then r O ~ l Fe m ; if (p e Fm and a e Vr, then "(a)(pl e F m ; if (p, I/J e Fm then "((p & 4)"' Fm ; if (p, if/ l F m then r((p > if/)^ e Fm.
THE FULL THEORY
127
In defining 'open formula' and 'sentence' ('closed formula'), we proceed in the usual manner except that and correspondingly and '0' and '0' are treated as variable binding operators, binding
'a', '+',
P
a
all of the variables in the formula to which they a t t a c h . Sn is the set of ' O ' , 'O', ' Ã ˆ ' M ' , and 'E' sentences. We define '3\ 'v', '=", '=', '0' P " in the normal ways. W e define ^{ip -+ + ) l to be 'U((p =' +)^ and "(ip <-> +I1 t o be "D(ip = 4)'. A s before, "V(pl is the universal closure of ip, and let ' 3 i p be the existential closure (i.e., r-V-ipl). We also define: If G is a set of subjunctive generalizations, then V G = 3 W; '(ip 3 $1' G o r '(ip ^> GI.
(1.3)
V(ip
V G is the set of material generalizations corresponding to the subjunctive generalizations in G. ip is indicative iff ipeFmo. Znd is the set of indicative sentences.
(1.4)
'+','^>',and '>' the subjunctive operators.
W e call '0' '0' P
(1.5)
a
SD(ip), the subjunctive depth of ip, is the maximum number of nested subjunctive operators in ip.
O u r definition of truth will be recursive on the subjunctive depth of a sentence. In our definition of the relation M we will have to make use of internal negations, so now we must give a precise definition of that notion. It is simple to d o this in the case of atomic sentences: (1.6)
If (p is an atomic sentence and a , , . . . , a,, are the individual constants occurring in ip, then "-ipl=
"{^x)x = a l & . . . & (3x)x = an & -ipl.
It is more difficult to extend this definition t o the case of sentences in general. The definition must be recursive on the length of the sentence. A t this point it is not worth the effort to carry this out, because it will turn out that we only need internal negation for atomic sentences.
128
CHAPTER VI
Our semantics will be based upon the semantics for quantified modal logic developed in Chapter V. As usual, validity will be defined as truth in all models, and a model will consist of an interpretation of the language and a specification of what possible world is the actual world. As in the case of modal logic, an interpretation of the language will determine entailment relations between sentences by specifying the set of allowable structures. Structures are more complicated than they were in Chapter V because our language is now a temporal language. We represent times by real numbers. Letting Rl be the set of real numbers: (2.1)
A structure is an ordered pair ( D , p ) such that D is a countable set of objects and p is a function and: (i) for each t E Rl, p(t) is a set Dt; (ii) D = U D,; teR1
(iii) for each t E Rl, n e w, and R e %,, p ( R , t) s D';. (iv) for each t e Cr, p(t) E R 1 . In a structure ( D , p), D, is that subset of the domain consisting of objects existing at time t. If t e Cr, then p ( t ) is the time denoted by t. Then we define: (2.2)
An interpretation is a set H of structures such that for each n e o and t e Rl, there is a structure ( D , p ) in H for which Dt has cardinality n.
In this definition of an interpretation we have extended our variety of cardinalities requirement to the domains at each instant of time. This is because in our uninterpreted temporal .language we can express contingent propositions that we could not express in the language of Chapter V. These are propositions describing the size of the domain at a particular time. An interpretation of the uninterpreted symbols of our language cannot turn such propositions into necessary truths, so if the notion of an interpretation is to capture this intuitive notion our strengthened variety of cardinalities requirement is required. Given an interpretation of the language, we can define the notion of
THE FULL T H E O R Y
129
a possible world relative to that interpretation. If we were to follow strictly the treatment of possible worlds contained in Chapter V, we would require possible worlds to contain four pieces of information: (1) a structure S; (2) a counterpart relation relating S to the real world; (3) a specification of what basic strong generalizations are true; (4) a specification of what basic weak generalizations are true. However, an inspection of the definition of truth for quantified modal logic reveals that the counterpart relation plays only a heuristic role and does not enter into any of the formal definitions. It is replaced there by the function q which assigns denotations to individual constants. The same thing will be true in our present language. In Chapter V q was taken to be part of the interpretation of the language, but I now propose to include it in the possible world, using it to replace the counterpart relation. Thus a possible world will become an ordered quadruple (S, q,N, W). Normal procedure would be to require that q maps a subset of Cn into the domain D of the structure S. This was the procedure followed in Chapter V. However, in the present context we must require instead that q map a countable subset of Cn onto D. We must require every object to have a name. The reason for this is that in our definition of the relation M we want to talk about the set of all simple truths in a world, and if we are to do this by talking metalinguistically about what simple sentences are true in the world, we must have every simple truth expressed by some simple sentence. This can only be done if every object has a name. This is also the reason we require our language to have non-denumerably many individual constants. Otherwise we could have a world a which uses them all up, in which case there could be no worlds (3 possible relative to a (i.e., 'accessible' in the sense of 2.9) containing all of the objects of a and some new objects in addition. We cannot define a possible world to be just any quadruple (S, q, N, W) of the appropriate sort. A possible world must be 'coherent' in the sense that if a subjunctive generalization is included in N or W, then the corresponding material generalization is true in the world. This seems to put us in the position of having to define truth in a model before we can define the notion of a model, which would quickly become circular. Fortunately, there is no real difficulty here.
130
CHAPTER V I
T h e ordered pair (S, q ) constitutes a first-order model for the indicative fragment of our language, and we can define the set T of true indicative sentences in the normal way in terms of (S, q ) . It was argued in Chapter 111 that a logically contingent generalization of the form r(x)(q =Â . x = a , v . . .v x = a d 1 cannot b e actually I necessary. This constitutes a constraint on the sets N and W. Combining this with the previous observations leads to the following definitions: (2.3)
A possible world relative t o an interpretation H is an ordered quadruple (S, q, N, W ) such that: (i) (ii) (hi) (iv) (v) (vi)
S is a structure (D, a ) ; SE H ; q maps a countable subset of Cn onto D ; N is a set of strong generalizations; W is a set of weak generalizations; If T is the set of indicative truths, then VN c V W G T, and there is n o indicative formula (p, individual constants a } , . . . , a,,, and re Cr such that V W entails 'CI(x)(q => .x '(x)((p 3 .x = a l v . . .v x = a,,)l but
7
1
t
a l v . . . v x = a,,)' is false in ( S , q, N, W ) . I
Clause (vi) looks as though it might be circular because it refers to the truth of a modal sentence in the possible world. However, there is no circularity because the truth value of such a sentence will be independent of the contents of N and W . (2.4)
A model is an ordered pair (a, I ) where I is an interpretation and a is a possible world relative to 1.
W e will define the notion of truth in a model, and then: (2.5)
(p
is valid iff
(p
is true in every model.
Truth will be truth in a model, but for convenience we will also allow ourselves t o talk about a sentence being true in a possible world (relative to an interpretation). For convenience we define the following notation:
THE FULL THEORY
(2.6)
131
Ifa=(s,~),N,W),thenS~=S,q~=~),N~=N,andW~=
w. If a is a possible world, Ta is the set of indicative truths in a. If I is an interpretation, [ [ I ] ]= { a ; a is a possible world (2.8) relative to I } . In defining the notion of a possible world, we have ignored the relation of the possible world to the actual world. The result is that we will have more possible worlds than we really want. A world is not 'really possible' unless there is a counterpart function from the real world into the domain of the possible world such that the denotation function T) 'agrees' with the counterpart function. More precisely, let us define: (2.7)
(2.9)
If I is an interpretation and a, (3 e [ [ I ] ] ,(3 is accessible from a iff for every b, c e 9 ( q a ) , if ~ ( b=)~ ( c and ) b 9(%), then %(b) = 'rls(c).
(2.10)
If a e [ [ I ] ] ,[ [ I l l a = { p ; (3 e [ [ I ] ]and (3 is accessible from a } .
If a is the real world, then only worlds accessible from a are 'really possible'. What remains is to define truth in a model. This can be accomplished by generalizing Analysis VII. We will define truth and the relation M by simultaneous recursion on the subjunctive depth of a sentence. Up to this point we have taken M to be a binary relation between a possible world and a sentence. But for a general semantics, we must construe it to be a ternary relation. If a and (3 are possible worlds relative to some fixed interpretation, '(3Ma
is true in (3.
Next we must consider what is required of No. In Analysis VII we required that No be the result of making a minimal P-change to Na.
132
CHAPTER V I
This is still correct, however our earlier definition of the notion of a minimal P-change is no longer-sufficient for two reasons: First, what we really wanted in Analysis VII was to ensure that No was the result of making minimal changes to Na in order to render it consistent with P. We could ensure that as we did by talking about V N o because we were only considering the case in which ip was indicative. In that case, (p is consistent with No iff it is consistent with VNa. But in the general case, where we can no longer assume that (p is indicative, we cannot get by talking about VNo. For example, (p might be the negation of some member of N',. In that case, (p would still be consistent with VNa. We must instead talk directly about the consistency of N', with P. There is a second difficulty. As long as we were only considering an indicative (p, the only way the supposition that (p is true could require us to alter Na was by being inconsistent with Na, and hence the only kind of change that could be required was a deletion. But if (p can be non-indicative, then the supposition that (p is true may require us to enlarge Na. For example, (p might be of the form r + 3 01, in which case we would have to add generalizations to N', which would be sufficient to allow us to derive (p from the enlarged set. Thus in the general case, there are two kinds of changes that might be required in constructing No. We might have to eliminate some members of Na, and we might also have to add some new generalizations not already in Na. In either case, we want to require that the changes be as small as possible. Just as in the case of changes to simple sentences, we can represent the change to Na by the indexed difference ( N a A N a ) . We can then take definitions 2.20 and 2.21 of Chapter IV to define the notion of a minimal change (replacing sets of propositions now by sets of sentences) if we supply a new definition of the notion of a strictly minimal change which will make the latter notion applicable to Na. This is readily accomplished: (2.1 1)
X is a strictly minimal (p-change to Na (relative to I and a ) iff there is a world p e [[!]la in which (p is true such that No = N', + X, and there is no Y such that Y c X and there is a world 7 e [[!]la in which (p is true such that N-, = N', + Y.
We then require:
133
THE FULL THEORY
(ii) (N. A Na) is a minimal
X is a strictly minimal F-change to Wa (relative to Na) iff there is a world 7 e [[Illa in which all members of F are true and such that N-, = Na and Wy = W. + X; and there is no Y such that Y c X and there is a world 8 e [[I]]. in which all members of F are true and such that N5 = Na and w x = W.+Y.
We then require: (iii) ( W a ) is a minimal (VNa U {(p})-change to W. (relative to No). We seek to preserve the truth values of simple sentences and their internal negations, as in Chapter IV. Simple sentences are now identified with the atomic sentences, so let us define: (2.13)
Sim = {(p;
(p
At f l Sn or (34)(4e At f~ Sn and
Given an atomic formula ip whose temporal subscript is define u("-(p^) = u(
we will
(2.14)
l T.) and a{
@Maw (relative to an interpretation I ) iff @ E DEFINITION: [ML 2nd: (i) (p is true in 0 ; (ii) (Na A No) is a minimal
With this definition, we have completed out account of the relation M.
134
CHAPTER VI
Definition 2.15 makes use of the notion of ip being true in a world y. Now we want to define that notion. For subjunctive ip, the definition will use the relation M. We want these two definitions to constitute a definition by simultaneous recursion on the subjunctive depth of ip. In order for this to work, we must ensure that in defining truth for ip, we only use the relation "@Ma$' for formulas $ of subjunctive depth less than that of ip. It is obvious how to define truth for indicative sentences, and for conjunctions, negations, universal generalizations, modal sentences, and subjunctive conditionals. The first case that is not entirely obvious is that of a formula of the form "Dip1. Intuitively, if ip is a sentence P
then "Dip1 is true in a world a (relative to an interpretation I ) iff
ip
is
P
entailed by Na. This can be captured by requiring that ip be true in every world j8 for which No includes at least Na. Recalling that if ip is an open formula then "Dip1 is supposed to mean the same thing as P
DVipl, we have the general condition: P
"W is true P
in a (relative to 1) if (Vfl E [[l]]J[if
N3 c= No
then "Vqle To]. Truth for "0ip1 is defined analogously. P
A generalization "q 3 i{/' is supposed to be true in a world a iff for each world j8 such that (3Ma3ip, the set VNo entails "V(q => 4)'. This can be captured by the following definition: is true in a (relative to I) iff (Vp, y e [[I]Ia)[if flMa3q and VNa c T then "V(y 3 $)'E T7].
i p$
Weak subjunctive generalizations are handled analogously. Putting these observations together into a definition, we obtain: (2.16)
If re Cr and (a, H), (j8, I ) are models, then (j8, I) is a t-variant of (a, H ) iff ( a , H) and (fl, 1) are identical except possibly for the value of q(t).
(2.17)
DEFINITION:
Truth in a relative to I: (i) If b,c l Cn and t e Cr then "b 7c1 is true iff b, c E 3 i ( ~ ) aand ) ~ l a ( b=) qa(c) and qa(b)e Da,,,(o;
T H E FULL T H E O R Y
135
(ii) if tl, t2l CT then 'ti q1 is true iff q(tl) 5 q(t2); (iii) if R e 8,,, t e CT, and al, . . . , an e Cn, then R , a 1 . . ., an1 is true iff a l , . . ., an63)(qa) and (qa(a1), . . . , q'x(an)>?u(R,t); is true iff
x ) rCl +)' is true in a iff (V(3e [[I]]J [if (3Maq then 4 is true in (31. Definitions 2.15 and 2.17 constitute definitions by simultaneous recursion on the subjunctive depth of a formula.
It is customary to define logical entailment both as an object language relation between sentences and as a metalinguistic relation between
136
CHAPTER VI
sets of sentences and sentences. W e define y - + (^ to mean that ^(p => 4' is necessarily true. However, where F is an infinite set of sentences, we cannot similarly define T + $', because there is no way t o talk about the set F in the object language. Thus we retreat into the metalanguage and define it t o mean that it is necessarily true that if all the sentences in F are true then \li is true. In the next chapter, where we construct an analysis of causal statements, we will find ourselves in a similar position with respect to subjunctive conditionals. W e will find it necessary to employ subjunctive conditionals whose antecedents and consequents are infinite conjunctions o r disjunctions. When F is a finite set, then the disjunction ' Z F ' and the conjunction 'HF.' of the members of F make perfectly good sense, and we can express statements like ^ S F > \lil o r ^IiT>v in the object language. However, if F is infinite, this can no longer be done. Our language does not contain infinite conjunctions o r disjunctions, so we must retreat into the metalanguage once more. W e can straightforwardly define the metalinguistic relation '>' between single a sentences as follows: is true in the world a. y > J!,I iff r(p > (3.1) a
Then if F is finite, we can write things like ' I I F > 4' and ' S F > ^\ a
a
But we want to extend our definition of this metalinguistic relation to the case in which F is infinite. W e want, ultimately, to be able to write all of the following:
THE F U L L THEORY
and other similar metalinguistic statements. Unfortunately, it does not appear to be possible to give a definition of a single relation ' > ' which c,
will then give us all of the above as special cases. The difficulty is that, e.g., (b) does not express a relation between two objects J F and iff. '.ST' is not an object. F is a set, but as we don't really have infinite disjunctions, 'XT' isn't anything. We must instead treat the whole expression ' 2 .. . > . . .' as expressing a metalinguistic relation between c,
r and if). Each of (a)-(i) expresses a different metalinguistic relation and must be defined separately. This could be overcome by formalizing our metalanguage and then giving a recursive definition in the metametalanguage, but that is not worth the trouble. Once we have seen how to define one of the relations (a)-(& it will be quite obvious how to define any of the others. Thus once we have gone through the definition of one of these, I will feel free to use other similar relations without actually defining them. The key to constructing definitions for relations like (a)-(i) is to extend the definition of M so that we can write things like ( 3 ~ and /3MJlP. It is trivial to do this. We simply go back to definition 2.15 and wherever that definition requires that the sentence (p be true in some possible world, we instead require that there is a member of F true in that world (in the case of SF) or that every member of F is true in that world (in the case of 77F). In the case of each of the antecedents of (a)-(i) there is a similarly obvious condition to employ in the definition of M. Then we can define our relation (a)-(i) quite straightforwardly. For example, we define (a) as: for every world
6, if (3Ma/IT then iff is true in 6.
Analogously, we define (h) as: for every world (3, if (3Ma((p &
ft JG),then some t < ~
member of A is true in (3.
~
2
~
138
CHAPTER VI
Using these metalinguistic relations we can write anything we could write if our language actually contained infinite conjunctions and disjunctions, the only difference being that what we write are expressions of the metalanguage rather than the object language. In addition there is the advantage that whereas one could raise philosophical difficulties about infinite conjunctions and disjunctions in the object language, similar difficulties do not arise for our metalinguistic procedures. We will also want to use infinitary operators in connection with necessitation conditionals, writing things like i^^Ã4'. We can define m
this quite straightforwardly in terms of our use of infinitary operators with the simple subjunctive. The object language necessitation conditional ru>>> 4' is defined to be: ($)
&
[(-w
-$)>(
Thus we would like to define 'JIT Ã $' to mean something like: m
(nr>> i/f)
& [ ( - n r & -$) > ( n r > $11. a
a
We cannot write quite this, because the final occurrence of '>' is not subscripted. But it is obvious what we want to say here. We want to say that '/7r> $' holds for every ( - n r & -$)-world (3. Thus our 0
definition should be:
(nr>4 ) & (V(3)[if p M m ( - n r & -$) e
then
nr > $1. 6
Given our infinitary simple subjunctives, we can define our infinitary necessitation conditionals completely generally on the above model: (3.1)
(
I>>[
then (
] i f f ( ) > [ I a n d ( V ( 3 ) { i f ~ M ~ (I ~ & (- [
I>[
I)
11.
One of the most important tasks of this chapter will be to prove theorems 4.20 and 4.21 of Chapter 111. However, these principles cannot yet be formulated in our language. To do this we must augment
139
THE FULL THEORY
our language with some set-theoretic notation and extend our semantics to handle this additional richness. This is not difficult to do. We add a new non-denumerable class Vs of set variables, and a nondenumerable class Cs of set constants, together with the new logical constant ' G '. We then turn our language into a three-sorted language. We add the new atomic formulas r x e X1 and "X=Y1 where x e 1
1
(Cn U VR), t (CTU VT), and X, Y e (Cs U Vs). Our other syntactical definitions remain as before, with the exception that we do not allow our set notation to be used in the formulation of subjunctive generalizations. Nor do we allow our set notation to be used in constructing simple sentences. The set Sim remains as before. In formulating the semantics for our extended language, we define 'structure' and 'interpretation' just as before. We must modify the definition of 'possible world' by requiring that T) interpret the set constants in addition to the individual constants. Thus we add the condition: (iii*) T) maps a countable subset of Cs into the power set of D. We require that T) only interpret countably many set constants in order to ensure that there are always plenty of set constants left over to denote new sets occurring in worlds possible relative to a. This requirement has the effect that if D is denumerable, then most sets will not receive names. Thus in defining truth for sentences involving quantification over sets, we must proceed conventionally in terms of X-variants. We define truth for our new atomic sentences as follows:
Our definition of accessibility must be made more complicated to accommodate our set notion. We want the transworld identity of a set to be determined by the transworld identity of its members. Thus if fS is a world accessible from a, and A e Cs, then the set denoted by A in fS must consist of the counterparts of the members of the set denoted
140
CHAPTER VI
by A in a. Thus our definition becomes:
(2.9*)
If I is an interpretation and a, (3 e [ [ I ] ] (3 , is accessible from a iff: (i) for every b, c e Cn, if b, c e 9 ( q a ) , qa(b)= ~ ( c and ) b e 9 ( q a ) then % ( b ) = qp(c); and (ii) for every A e Cs, if A e 9 ( ~ then ) A e 2i(qp) iff { c ; c e C n & q a ( c ) e q a ( A ) } c 9 ( q p ) ,and if then qp(A)= {qa(c); c e (Cn 9(rla)) and qa(c)e qa(A)}.
The rest of our semantics can be left unchanged. One noteworthy theorem which results immediately from our interpretation of set terms is the following:
(4.2)
A
=B t
3
Q((3X)X ; A
3
A
;B ) .
That is, set terms are rigid designators.
Definitions 2.15 and 2.17 constitute both an analysis and a formal semantics for subjunctive conditionals. A number of interesting theorems result from this analysis. The turnstile is defined as usual:
(5.1)
I-
Our logic of subjunctive conditionals contains the quantified modal logic of Chapter V. Once again, the propositional fragment of our modal logic is S4. As noted before, a necessary condition for the correctness of the above analysis is that it have the consequence that every P-change contains a minimal P-change. In the present context, this should become a provable metatheorem. However, the extreme complexity of our language makes it very difficult to prove anything about it, and I can do no more than conjecture that this metatheorem holds. Assuming that it does, the necessity operator can be defined in terms of '>':
THE FULLTHEORY
The axioms of SS are all valid:
At the end of Chapter I, it was suggested that there is a closer affinity between the linguistic theory of conditionals and the possible worlds theory than might be supposed. This can now be demonstrated. According to the linguistic theory, '(p > il/)l is true just in case <{/ is entailed by a combination of p, some laws, and some truths satisfying an as yet unanalyzed condition of 'contenability with p'. I suggested in Chapter I that the proper analysis of "0 is contenable with p1 is 8 E p 1 . Of course, this does not give an analysis of subjunctive conditionals until we have an analysis of 'even if', but that it is right as far as it goes is indicated by the following theorem: (5.11)
r ( p > if/)' is true in a model iff for some sentence 0, "{[(p & 0) 9 $1 & (OEip)r is true in the model.
Furthermore, Analysis VII proceeds by telling us what must be preserved in a p-world, and so is in effect an analysis of 'even if'. Thus in the end, the linguistic and possible worlds approaches to the analysis of subjunctive conditionals converge. Our logic is also a logic of law statements and physical necessity. All of the principles 3.15-3.27 and 4.5-4.16 of Chapter I11 are valid. A few other simple principles are:
142
CHAPTER VI
. In Chapter I11 we discussed the possibility of analyzing
^((P
3 $)^ as
a universally quantified conditional of some sort, but were unable to do so because of the difficulty concerning what the quantifier would range over in the case of a counter-legal. It was suggested instead that the strong subjunctive generalization was analyzable as "OS P
DV(
4)'. We can now prove that this proposed characterization is P
correct. According to our semantics, in deciding whether "(
Thus we have both of the following equivalences:
Analogously: (5.24)
^> 4 )= [3 UV(
143
THE FULL THEORY
An immediate consequence of 5.24 is: (5.26) l-( $) =[3
((p 3 $11. Now we are in a position to prove theorems 4.20 and 4.21 of Chapter 111: (5.27) If ip contains no set variables or set constants, and the free individual variables of
$) = (=){(x)(x E A = x = x) & [V(
Proof: Let A = {x; x = x}. Suppose "3(
V(
$)' is true in a. Suppose & X I , .. . , xn^A)'. Then in constructing B, We have added a new object or sequence of objects to the domain and made it satisfy
4)' will be true in every such /3 only if Wa entails 'V(
mv
a
xl, . . . , xn&A ) >V($)I. Hence by 5.20, it also a
entails '03(
UV(q 3 $)'. a
a
XI,. . . , xn&A)'
By 5.18, "OS(
is equivalent to "OSy1. Consequently, "03 a
nV(
f l i s true in a, i.e., a Conversely, suppose '(
=> $)'
"(
=> $)'
a
is true in a.
is true in a. Then 'UV( $I1 is true a
in a. By 5.18 and 5.12, "DV(
$) & 03
a
X I , .. . , xn&A)>V(q 3 $)I1. Thus ' 0 3 ~ U V ( < +)' p a
[3(
V(
$)I1,
a
&
entails " 0 3 ~ a
which is equivalent to '03(
X I ,. . . , xn&A) > [3( V(q 3 $)I1, which by 5.19 is equivalent to '3(
V(
(5.28)
If xi, . . . , xn are the variables free in
$) = (3P)(3A){(x)(xe A = x = x) & (
[PE3(
144
CHAPTER VI
Consequently, the conjectured characterizations of weak subjunctive generalizations turn out to be correct. Our language contains five kinds of conditionals. They form a sort of heirarchy as follows:
NOTE
If
ip
is an open formula, the interpretation of ' D i p is to be "DVipl. P
P
CHAPTER VII
CAUSES
The concept of a cause pervades the entire framework of concepts in terms of which we think of the world. As Ducasse points out, this "is made evident by the very large number of verbs of causation in the language; e.g., to push, to bend, to corrode, to cut, to make, to ignite, to transport, to convince, to compel, to remind, to irritate, to influence, to create, to motivate, to stimulate, to incite, to mislead, to induce, to offend, to effect, to prevent, to facilitate, to produce, etc." (Ducasse, 1966, p. 141). However, despite intensive philosophical labors, the concept of a cause has stubbornly resisted analysis. The major difficulty has been that talk of causes seems to involve us in a mysterious metaphysical kind of contingent necessity, and hence an account of causation would seem to call for the kind of metaphysical theory which is in disfavor in contemporary philosophy. However, philosophers have been equally suspicious about the metaphysical underpinnings of subjunctive conditionals, and as we have seen, it is possible to give a straightforward non-metaphysical account of them. Furthermore, it seems initially plausible to suspect that causes are intimately bound up with laws and subjunctive conditionals, and as such contain a subjunctive element. Perhaps there is hope that we can analyze the notion of a cause with the help of our newfound understanding ot subjunctive conditionals. Such an attempt will be made in this chapter.
In recent years, Donald Davidson, Jaegwon Kim, Zeno Vendler, and others have urged repeatedly that we should get clear on the ontology of causes before we undertake an analysis of the notion of a cause. The traditional supposition has been that causation is a relation between
146
CHAPTER VII
events, and so the ontology of causes has been supposed to be the ontology of events. Kim's (1971) review of Mackie (1965) makes it clear how much trouble one can unwittingly get into by ignoring questions about the nature and individuation of events. For example, in analyzing the concept of a cause philosophers have repeatedly found themselves talking about conjunctions and disjunctions of events. But the relations of conjunction and disjunction are relations between sentences or propositions. If these relations are to be used in some derivative sense in connection with events, this derivative sense must be explained, and such an explanation will very likely presuppose facts about the individuation of events. Thus Kim and Davidson have been led to propose analyses of the concept of an event. However, I think there is a fundamental confusion here. Why do we believe that causation is a relation between events? Most philosophers take this as just obvious, but I do not think that it is at all obvious. Let us consider this question objectively. To begin with, most philosophers are quick to point out that they are extending the use of 'event' somewhat when they assert that causation is a relation between events. Our ordinary understanding of events (it is asserted) is that they involve changes, but a cause can sometimes be the absence of a change. For example: 'The bell's not ringing caused us to remain seated'. But, it is supposed, with this minor extension of the concept we can make it true that causes are events. However, the extension involved is not really quite so minor as philosophers have been apt to suppose. The difficulty is'that the dictum 'Causes are events' makes one prone to overlook the great variety of antecedents that are possible in causal statements. We can have conjunctive causes: 'That switch A was closed and switch B was opened caused the light to go on'; we can have negative causes: 'The switch's not being closed caused the light to remain off'; and (hence) we can have disjunctive causes: 'That either switch A was open or switch B was open caused the light to remain on'. We can also have existential causes: 'That someone entered the room caused the buzzer to sound'. In general, we can use logical operators to construct causal antecedents of arbitrary complexity. We must not understand 'event' so narrowly that we rule out these logically complex causes. Given the required extension of the concept of an event, what sorts
CAUSES
147
of things are events? Davidson (1971, p. 217) gives a strangely mixed list in answer to this question: Sally's third birthday party, the eruption of Vesuvius in 1906 A.D., my eating breakfast this morning, the first performance of Lulu in Chicago. What is odd about this list is that the terms it contains are not all of the same grammatical category. 'My eating breakfast this morning' is a nominalized gerund. This is typical of the first term in causal statements, which tend to be expressions like 'The bell's not ringing', or 'The eight ball's striking the five ball'. But 'Sally's third birthday party' and 'the first performance of Lulu in Chicago' are not of this form. Furthermore, taking all of these terms at face value as really denoting things, they must denote different sorts of things. For example, there is no way to replace 'Sally's third birthday party' with a nominalized gerund which denotes the same thing. We cannot say, e.g., that Sally's third birthday party was the same thing as Sally's having her third birthday party. This identity sentence is either logically false or grammatically illformed. This can be made clearer by noting that the birthday party may have attributes inconsistent with attributes possessed by Sally's having the party. Perhaps the party was horrible - no one enjoyed it, everyone felt self-conscious, but they all came and pretended to have fun because Sally had just recovered from a serious illness. O n the other hand, Sally's having the party was wonderful because this signified her recovery from the illness that everyone expected to be fatal. If the party was horrible, but Sally's having the party was wonderful, then the party cannot be the same thing as Sally's having the party. Or to take another attribute, the party may have been long and drawn out, but this is not something that can even be meaningfully said of Sally's having the party. Switching examples, the first performance of Lulu in Chicago is not the same sort of thing as the play's being performed for the first time in Chicago. Performances of plays cannot be identified with the plays' being performed. For example, the performance might have been brilliant, but the play's being performed was not brilliant - it was stupid, it caused a race riot. It seems then that there are two logically different sorts of things that philosophers have called 'events'. On the one hand there are the referents of these nominalized gerunds (supposing them to really have referents), and on the other hand there are things like birthday parties
148
CHAPTER VII
and performances of plays. This disparity becomes particularly important when we realize two things. First, only items of the second grammatical category (birthday parties, performances of plays, baseball games, etc.) would ordinarily be called events. Second, only items of the first grammatical category enter directly into causal statements. Our most fundamental causal statements have forms like 'My eating breakfast this morning caused me to fall asleep while lecturing'. This second point requires elaboration. My claim is that the most fundamental kind of causal statement has the form (2.1)
ger(P) caused inf(Q).
where "ger(P)l is the appropriate nominalized gerund constructed from P, and "inf(Q)' is the appropriate infinitive construction on 0. I will now explain what I mean here by 'most fundamental'. We sometimes speak as if physical objects were causes. We say things like 'The tree caused the accident (by being too close to the road)'. However, it is rather obvious that an object can cause something only by being a certain way or having a certain property. In other words. (2.2)
x caused it to be the case that Q.
is analyzable as: (2.3)
(3F)[x's being F caused it to be the case that 01.
This is a rather obvious point which, I think, will be granted by everyone. Furthermore, causal statements of the form of 2.1 are more fundamental than causal statements of the form 2.2 in the sense that the latter can be analyzed in terms of the former, but not conversely. The former cannot be analyzed in terms of statements of the form of 2.2 because the former contain more information. They tell us how x caused it to be the case that Q, and that information is not contained in statements of the form of 2.2. We also speak of events as being causes; 'Sally's third birthday party caused her sister to be jealous', 'The baseball game caused a traffic jam'. But it is my contention that these causal statements are parallel to those of form 2.2 and are to be analyzed analogously. That is, an event can only cause something by virtue of having a certain property,
CAUSES
149
and then it is true that the event's having that property was the cause. For example, consider: (2.4)
The duel caused women to faint.
This might be true simply because the duel's occurring caused women to faint. But on the other hand, duels might be everyday events which women take in their stride, and it was not the occurrence of the duel that caused women to faint, but rather it's being so bloody. This would equally make 2.4 true. This indicates that 2.4 is to be analyzed as: (2.5)
(3F)[the duel's being F caused women to faint].
In general, where ' E ' is a term denoting an event, E caused it to be the case that 0 . (2.6) is analyzable as: (2.7)
(3F)[E's being F caused it to be the case that 01.
Thus causal statements of the form of 2.6 are analyzable in terms of those of the form 2.1. And once again, causal statements of form 2.1 do not seem to be analyzable in terms of those of the form 2.6, because the former (if they mention an event) tell us how the event caused what it caused, but the latter do not. In this sense, causal statements of form 2.1 are more basic or fundamental than those of form 2.6. What this all indicates is that although we do talk about events being causes, events are not causes in the most fundamental sense of 'cause'. The sense in which events can be causes is precisely the same as the sense in which physical objects can be causes. Events play no more basic or interesting a role in causal statements than do physical objects. If we are to understand causal statements, we must look directly at those of the form 'ger(P) caused inf(Q)", and these do not appear to say anything directly about events. In order to have a simple way of referring to them, let us call statements of the form "ger(P) caused inf(Q)^ basic causal statements. We can express basic causal statements in a kind of canonical form as I t s being the case that P caused it to be the case that Q1. I have argued that basic causal statements are not to be construed as expressing a relation between events. One is inclined to ask, then, what sorts
150
CHAPTER VII
of entities are related by basic causal statements. But this question is premature. It is not obvious that basic causal statements are relational statements at all. Their surface grammar is not that of a relational statement. The difficulty is that ^inf(Q)^is not a denoting expression. Expressions like 'women to faint' or 'us to remain seated' do not even purport to denote anything. In light of this surface grammar, it is a little bit mysterious that philosophers have been so quick to take basic causal statements as relational statements. But perhaps this is too fast. We do talk about causing events: 'Throwing the switch caused the explosion'; 'Seeding the clouds caused the rainstorm'. These causal statements do not have form of basic causal statements. However, they are readily seen to be equivalent to basic causal statements, viz.: 'Throwing the switch caused the explosion to occur'; 'Seeding the clouds caused the rainstorm to occur'. In general, talk about causing events is equivalent to talk about causing the events to occur. Conversely, can basic causal statements always be reformulated as statements about the causes of events? If so, then it seems we could, after all, regard the second term of a causal relation as being an event. But there are difficulties for this proposal. First, there are infinitive cons,tructions for which there do not seem to correspond , does not seem to be any 'event term' E such that events. E . ~ . there 'The oven's being turned on caused the pie to bake' is equivalent to 'The oven's being turned on caused event E to occur'. One is apt to propose that 'the baking of the pie' is such an event term, but this is a nominalized gerund, and as we have seen, neminalized gerunds do not designate events. But perhaps all that this shows is that there is no term in our language which designates the event and not that there is no such event. I do not see how to resolve that question. However, there is a more fundamental difficulty. Even when there are natural event terms corresponding to infinitive constructions, we do not get the simple equivalence we might expect. Consider, for example: (2.8)
Seeding the clouds caused it to rain today.
Corresponding to 'it to rain today' is an event - today's rain. Thus it might seem that 2.8 is equivalent to:
(2.9)
Seeding the clouds caused today's rain.
CAUSES
151
However, 2.8 and 2.9 are not equivalent. 2.8 entails 2.9, but not conversely. For example, suppose the clouds were seeded at 3 : 00 PM, and this caused it to rain at 4 :00 PM. However, meteorological conditions were such that if it hadn't rained at 4:00 because of the cloud seeding, it would have rained naturally at 5 : 00 PM. Then it is not true that seeding the clouds caused it to rain today - it would have rained anyway. But seeding the clouds did cause today's rain - that is, it caused that very rain we actually had today, because if the clouds had not been seeded we would have had a different rain (one at 5 :00 rather than at 4 :00, involving different clouds, different raindrops, etc.). Thus 2.8'and 2.9 are not equivalent. It does not seem that basic causal statements can be construed as expressing a relation the second term of which is an event. As I remarked above, the surface grammar of basic causal statements is not that of relational statements at all, because the consequents are not substantival expressions. Still, surface grammar does not tell the whole story, and it may be possible ultimately to analyze basic causal statements in terms of some relation between entities of some sort. In fact, there seems to be one rather simple way of doing this. It seems we can always construe basic causal statements as expressing a relation between propositions. That is, we can think of "Its being the case that P caused it to be the case that Q1 as expressing a relation between the proposition-that-P and the proposition-that-0. This suggests that we symbolize basic causal statements using a sentential connective, P C Q l , and then interpret this sentential connective as expressing a relation between propositions. It must be pointed out that the symbolization of basic causal statements using a sentential connective does not commit us to interpreting them in terms of a relation between propositions. The use of the sentential connective is nothing more than ashorthand device for expressing basic causal statements. On the other hand, the assumption that this sentential connective can be interpreted as a relation between propositions is tantamount to assuming the validity of the following two principles: (2.10)
If the proposition-that-P PCQ iff RCQ.
= the
proposition-that-R, then
152 (2.11)
CHAPTER VII
If the proposition-that-Q=the PCQ iff PCR.
proposition-that-R, then
Unfortunately, the individuation of propositions is almost as much of a problem as is the individuation of events. Philosophers often pretend that logical equivalence is a necessary and sufficient condition for propositional identity, but it is pretty obvious that this is really too weak a criterion. However, it is at least clear that logical equivalence is a necessary condition for event identity, and hence 2.10 and 2.11 are implied by the following two principles: (2.12)
I f P < - > Â ¥ R , t h e n P C Q RCQ. if
(2.13)
If Q
<->
R, then PCQ iff PCR.
If we can agree that 2.12 and 2.13 are true, then we can safely interpret basic causal statements as expressing a relation between propositions. That 2.12 and 2.13 are true seems almost obvious to me. However, Davidson (1967) has given an argument which purports to establish their invalidity. By adapting an argument of Frege, Davidson shows that principles 2.12 and 2.13 together with the following extensionality principle: (2.14)
If ti and t2 are individual terms and t i = t2, then (Ftl)CQ iff (Ft2)CQ, and PC(Ft,) iff PC(Fr2).
lead to the absurd result that the connective 'C' is truth-functional. In particular, they lead to the result that if any basic causal statements are true, then whenever P and Q are true, so is ^PCQ1. The argument is as follows. Suppose we have some true causal statement ^RCS1. Then R and S are true. But R and S are logically equivalent, respectively, to the statements r{x; x e w & R} = w1 and ^{x; x E w & S} = w1 (where w is the set of all natural numbers). Thus by 2.12 and 2.13, ({x; x w & R} = w)C({x; x w & S} = w). If P and Q are true, then {x; x e w & P}={x; x e w & R } , a n d { x ; x ~ & w Q}={x; x e w & S}. Hence by 2.14, ({x; x w & P} = w)C({x; x w & Q}= w). But the antecedent and consequent of this basic causal statement are equivalent to P and Q respectively, so by 2.12 and 2.13, ^PCQ1 is true. Davidson takes this absurd conclusion as establishing the incorrectness of 2.12 and
CAUSES
153
2.13, because he thinks that 2.14 is obviously correct. But strictly speaking, all the argument shows is that we must reject either 2.12 and 2.13, or principle 2.14, and in the abstract I do not think that either choice is obviously the correct one. Davidson thinks that 2.14 is obviously correct because he accepts the dogma that basic causal statements express a relation between events, and he thinks that events are extensional in the sense of 2.14. Personally, I do not think it is all that obvious that events are extensional in the appropriate sense, but whether they are or not is irrelevant to the validity or invalidity of 2.14 because, as we have seen, basic causal statements do not express relations between events. The assumption that they do has been responsible for a great deal of confusion regarding the logical properties of basic causal statements. In particular, this assumption has led a number of authors to endorse 2.14 and on that basis to reject 2.12 and 2.13. The decision whether to reject 2.12 and 2.13 or to reject 2.14 must rest upon an appeal to actual examples. At first glance, the appeal to actual examples appears to support 2.14, but as we will see on closer examination, this appearance is illusory. Suppose we have a group of men in a room, just one of whom is red-headed, and the redhead is also the most powerful man in the room. Suppose the following is true: (2.15)
That the most powerful man in the room gave the order caused it to be obeyed.
Does the following follow from 2.15? (2.16)
That the only redhead in the room gave the order caused it to be obeyed.
It seems that it does. But we must be careful. 2.15 and 2.16 involve definite descriptions, and definite descriptions are subject to scope ambiguities. In general, '^(FixGx)CQ1 is ambiguous between the narrow scope reading r[Fx](ixGx)CQ1 and the wide scope reading r[FxCQ](ixGx)l. In general, the scope notation is defined by stipulating that: (2.17)
"[Fx](ixGx)^ is equivalent to "(3!x)Gx & (3x)(Fx & Gx)'
Applying this to our causal statements, the narrow scope reading
154
C H A P T E RV I I
r [ F x ] ( ~ x G x ) C Qis1 defined as:
(idiomatically: "that there is a unique G and it is F caused it to be the case that Q1). The wide scope reading r [ ~ ~ C ~ ] ( ~is ~defined G x ) as: l
(idiomatically: '"there is a unique G, and its being F caused it to be the case that Q1). Corresponding to the scope ambiguity, where ti and t2 are definite descriptions, 2.14 is ambiguous between two principles:
(2.14*) If tl = t2, then [ F x ] ( t l ) C Q iff [Fx](t2)CQ, and P C [ F X ] ( ~ ~ ) iff PC[Fx](tZ). (2.14**) If t i = t2, then [ F x C Q ] ( t I ) iff [FxCQl(t2), and [PCFxI(td iff [ P CFx](t2). As examination of Davidson's argument shows that in order to make the substitution steps which enable him to obtain his disastrous conclusions he must have the narrow-scope version of 2.14, viz., 2.14*. This is because the term '^{x;
(2.15*) That there was a unique most powerful man in the room and the person who gave the order was a most powerful man in the room, caused the order to be obeyed. (2.16*) That there was a unique redhead in the room and the person who gave the order was a redhead in the room, caused the order to be obeyed. (2.15**) There was a unique most powerful man in the room, and there was someone such that he was the most powerful man in the room and his giving the order caused it to be obeyed.
CAUSES
155
(2.16**) There was a unique redhead in the room, and there was someone such that he was a redhead in the room and his giving the order caused it to be obeyed. Given this disambiguation,' it seems to me that 2.16* is unequivocally false and 2.16** is unequivocally true. Thus the example supports only 2.14** and actually constitutes a counter-example to 2.14*. This diagnosis indicates that Davidson's argument does not, after all, establish the invalidity to 2.12 and 2.13, and hence we are free to interpret 'C' as expressing a relation between propositions. However, such an interpretation does not preclude our also interpreting 'C' in terms of a relation between some other entities having less stringent criteria for individuation. I have argued that there is no reason to think causation to be a relation between events, but given the distinctions I have made this is probably not what most recent philosophers have wanted to maintain anyway. They have really wanted to say that causation is a relation between the entities that are denoted by nominalized gerunds, and they mistakenly thought that those entities are what are commonly called 'events'. An initial obstacle to this position would be that the consequents of basic causal statements are not nominalized gerunds, but notice that the basic causal statement "Its being the case that P caused it to be the case that 0' is convertible into the equivalent statement "Its being the case that 0 was caused by its being the case that P1,and the latter does seem to express a relation between the referents of two nominalized gerunds. There are going to be philosophers who question whether nominalized gerunds denote anything at all, but this is not a difficulty I am inclined to take seriously. If nothing else, we could always artificially construct suitable entities somewhat on the order of Kim's ordered triples (Kim (1970)). I think we might reasonably say that nominalized gerunds denote 'states of affairs', taking the latter as an explicitly technical term introduced for the sole purpose of having a name for these entities. Then it seems quite reasonable to say that basic causal statements express a relation between states of affairs. However, even if we grant this, it is not at all obvious how states of affairs are individuated. In particular, states of affairs are much more 'logically strict' entities than are events, and it is not obvious that states of affairs have coarser criteria for individuation than propositions do.
,
156
CHAPTER VII
I think this whole attempt to resolve the ontological problems about causation before attempting to give an analysis of causal statements is essentially backwards. Given a kind E of entity, the question whether basic causal statements can be regarded as expressing a relation between entities of kind E is equivalent to the question whether, whenever P and R 'correspond to' the same entity of kind E and Q is true iff and S 'correspond to', the same entity of kind E, R C S ' is true. The criteria for individuating entities of kind E can be regarded as telling us when two propositions 'correspond to' the same entity of kind E. Consequently, the question of what kinds of entities can be regarded as the relata of basic causal statements is just the question of what relations between propositions make them interchangeable in the context of basic causal statements. In other words, what relations St make the following true: (2.20)
If PWQ t h e n P C R i f f QCR,and RCPiff RCQ.
Putting the question in this way indicates that to attempt to answer the ontological question before analyzing basic causal statements is to put the cart before the horse. There are really two distinguishable problems regarding the concept of a cause. First, there is the ontological question, which we have been considering. This is the question what sorts of entities are related by causal relations. Distinct from the ontological question is the second, and more traditional, problem regarding the concept of a cause. This is to give an analysis of statements like 'John's striking the match caused there to be an explosion' in terms of other notions we already understand. There is no reason why we should have to solve the ontological problem before we can solve this latter problem. And as we have just seen, the ontological question is equivalent to a question about the logical properties of basic causal statements (i.e., the question what relations 9i satisfy 2.20), so any adequate answer to the ontological question may actually presuppose a prior analysis of basic causal statements. In fact, I have been unable to find any interesting relations 9t (other than logical equivalence) which satisfy condition 2.20. This suggests that basic causal statements cannot be analyzed as expressing a relation between entities having coarser criteria for individuation than logical equivalence, and hence that basic causal statements can only be
CAUSES
157
regarded as expressing a relation between propositions or something with equally stringent criteria for individuation. I am not sure that this is the case, but for lack of any positive results to the contrary I do not propose to return to the ontological question. The remainder of this chapter will be concerned with an analysis of basic causal statements in terms of a relation between propositions.
I have stated our problem as being that of analyzing the causal relation 'its being the case that. . . caused it to be the case that. . .'. However, this is not the only causal relation, and there is reason to believe that it is not even the most fundamental or most important causal relation. To begin with, this relation only holds between true propositions, but we often have occasion to say that something which did not happen would have caused something else. This subjunctive notion of cause is interdefinable with our indicative notion of cause: (3.1)
Its being the case that P would cause it to be the case that Q, iff, [P>(PCQ)]& O P . ~
(3.2)
PCQ iff, its being the case that P would cause it to be the case that Q, and it is true that P . ~ .
I will frequently abbreviate the long construction "Its being the case that P would cause it to be the case that Q1 as "P would cause 0'. Similarly, I will abbreviate "Its being the case that P caused it to be the case that Q1 as "P caused Q1. These shortened forms are convenient, but they are grammatically illformed and must be understood as elliptical for the more complicated constructions. There is another causal relation different from either of these two relations. We might call this the relation of causal sufficiency. Frequently, its being the case that P may fail to cause it to be the case that Q only because there is some third proposition R such that its being the case that R has already caused it to be the case that Q. For example, suppose we shoot a man twice. Shooting him the first time causes his death. Then shooting him the second time may only fail to cause his death because something else has already caused it. When
158
CHAPTER V I I
this happens, we say that its being the case that P would be causally sufficient for it to be the case that Q even though it would not or did not cause it to be the case that 0. This notion of causal sufficiency seems, in some sense, to be the most fundamental causal concept. The relation between one proposition and a second proposition which it would cause is that of causal sufficiency. Whether the first proposition also causes the second depends not just on the relation between the two propositions, but also on what other true propositions there are that would be causally sufficient for the second proposition. The notion of causal sufficiency is also interdefinable with the notion of a cause. To say that P would be causally sufficient for 0 (symbolized "PCSQ1) is to say, roughly that P would cause Q if Q weren't already true for some other reason. This suggests: (3.3) PCSQiff[-Q>(P>PCQ)]. However, this does not adequately handle the case in which P actually does cause 0. In such a case, if 0 were false, then P might no longer be sufficient to cause 0. For example, suppose we have a light operated by two switches in series. If both switches are closed, the light is on. Switch A was closed initially, and then switch B was closed. Switch B's being closed caused the light to be on. But if the light weren't now on, then one of the two switches would be open, but there is nothing that determines which switch would be open. In particular, switch A might be open. But in that case, switch B's being closed would not cause the light to be on. So it is not true that if the light weren't on, then switch B's being closed would cause the light to be on. The difficulty here is that if P actually causes 0 , it characteristically does so in conjunction with certain other collateral truths, so if Q were false, we can only conclude that either P would be false or one of those collateral truths would be false. If one of the collateral truths were false, then P would no longer cause 0 . We can take care of this just as we did in the analysis of the necessitation conditional by saying not just that 0 is false, but also specifying that it is the failure of P which is involved in Q's being false rather than the failure of one of the collateral truths. In other words, our analysis should be:
We include the final conjuncts to exclude the trivial cases.
CAUSES
It is equally simple to define 'cause' in terms of causal sufficiency. The most common case in which we want to say that P was causally sufficient for Q without causing Q is where there is more than one true proposition causally sufficient (in different ways) for Q. The classical example is that of two members of a firing squad who fire their weapons simultaneously with the result that their bullets hit the victim at the same instant and are each causally sufficient to cause death at the same instant. We will not happily say that either man's firing his weapon individually caused the death of the victim (although, of course, the death was caused by the fact that at least one of them fired his weapon). This seems to be because, had either man not fired his weapon, the victim would have died anyway. This suggests that we can analyze 'cause' as:
(3.5)
PCQiff[P&PCSQ&(-P>-Q)].
However, counterexamples have been proposed to the condition (-P>-Q)l. Scriven (1964) gives the example of a man whose state of excitation at ~ : O O P M is such that he would suffer a stroke at 4:55 PM which would cause his death at 5 :00PM were it not that a heart attack intervenes at 4 : 50 PM which causes his death at 5 :00PM. The heart attack prevents the stroke, and hence is the cause of death. But it is not true that if he had not suffered the heart attack he would not have died at 5 :00PM. On the contrary, he would have died of the stroke. Thus it seems that the condition r(-P > - 0 ) ' is too strong. I d o not think that this is a genuine counterexample. It involves the same confusion as was involved earlier in overlooking the distinction between causing it to rain today and causing today's rain. To make it easier to see this, let us first consider a modified example in which the stroke would have caused death at 5 : 10 PM rather than at 5 :00PM. Then it is true both that the heart attack caused the man's death and that the heart attack caused the man to die at 5 :00 PM. But for both of these, the condition r(-P> -Q)' is satisfied. First, if he had not had the heart attack, the man would still have died, but it would have been a different death. It would have been a death occurring at a different time and involving quite different processes. Second, if the man had not had the heart attack he would not have died at 5 :00PM. Contrast these two true causal statements with the false causal statement that
160
CHAPTER V I I
the heart attack caused the man to die today, i.e., caused it to be the cause that there was a time today when he died.4 This statement is false precisely because he would have died today anyway. We have made the example easier to deal with by supposing that the stroke would have caused death at a different time than the heart attack. Let us return to the original example in which the stroke would also cause death at 5 :00 P M . We must still distinguish between causing the man's death (that very death that actually occurred) and causing the man to die at 5 :00. It is still true that the heart attack caused the man's death, and that the death would not have occurred had he not had the heart attack. The man would still have died, but it would have been a different death - one involving quite different processes. O n the other hand, it is no longer true that the heart attack caused the man to die at 5 :00 - and this is because he would have died at 5 : 00 anyway. Thus, rather than constituting a counterexample to 3.5, 1 think that 3.5 explains the fine structure of the judgments that we make regarding this example. Of course, my resolution of this putative counterexample presupposes certain judgments about the individuation of events, but I think that those judgments are correct. Consequently, I believe that 3.5 constitutes a correct analysis of 'causes' in terms of causal sufficiency. Because of the interdefinability of 'cause', 'would cause', and 'causally sufficient', if we can provide an analysis of one of these notions, analyses of the others will follow. My strategy below will be to give an analysis of causal sufficiency.
We turn now to the analysis of causal sufficiency. We will build up to our final analysis by considering sequentially some plausible suggestions that have been made in the literature. 4.1. Nomic Subsumption The simplest approach to the analysis of causal sufficiency is the nomic subsumption model. This arises out of the traditional regularity theory
CAUSES
161
or the constant conjunction theory adumbrated by Hume. On the nomic subsumption model, P is causally sufficient for 0 just in case 0 'can be obtained from' P together with some physical laws. Kim (1973) has shown how difficult it is to make clear this notion of 'can be obtained from'. But this is a problem we have already discussed in talking about how laws are derivable from one another. We can formulate the nomic subsumption model quite simply as follows:
(4.1)
Nomic Subsumption Model: P is causally sufficient for Q iff P+ 0.
There is a standard objection to the nomic subsumption model. This is that it only deals with what John Stuart Mill called 'total causes'. For example, consider a house fire that results from faulty insulation. We would say that the faulty insulation is the cause of the fire, but there is no physical law to the effect that faulty insulation is always followed by a fire. In formulating the physical law that is operative here, we must mention the presence of oxygen, the temperature of the air, the current in the wires, etc. According to the nomic subsumption model, all of these together would be causally sufficient for the fire, but the faulty insulation would not be. But this is incorrect if taken as a remark about the concept of 'cause' that we have set out to analyze. The notion of the 'total cause' is probably a useful notion (although, as we will see, the nomic subsumption model does not capture it correctly), but it is not our ordinary concept of 'cause'. The ordinary concept of a cause has much in common with subjunctive conditionals. What makes subjunctive conditionals useful, and also difficult to analyze, is that under certain circumstances we are allowed to delete true conjuncts from the antecedent. The whole problem of analyzing subjunctive conditionals was to say when that can be done. Similarly, in stating causes, we are not required to state the total cause. Under certain circumstances we can delete mention of most of the causal factors, reporting simply that the remaining causal factors caused the effect. An important part of the problem of analyzing causal sufficiency is to say when such causal factors need not be mentioned in stating the cause. I have been urging that the nomic subsumption model is at best an analysis of 'total cause', and not an analysis of causal sufficiency. But
162
CHAPTER VII
there is a more fundamental difficulty with the nomic subsumption model which shows that it does not even capture the notion of 'total cause'. This is the difficulty, which we will encounter again, of 'epiphenomena'. When two propositions share a common cause, without either being causally sufficient for the other, we say that they are epiphenomena. We might well have a law statement "P 3 0' which is true not because P is causally sufficient for 0, but rather because there are physical laws which ensure that anything which would be causally sufficient for P would also be causally sufficient for 0. For example, P and Q might report the occurrence of astronomical events on a star, and we might have cosmological laws which tell us that there is only one way that P could be true, and that would also result in Q's being true. Under these circumstances, P and Q are epiphenomena instead of P being the total cause of Q. Apparently the notion of a total cause cannot be defined so simply as proposed by the nomic subsumption model. Just because " P 3 Q1is true, it does not follow that P is the total cause of 0. What is required in addition is that P cause Q. This suggests that we must define 'total cause' as follows: (4.2)
P is a total cause of Q iff P C 0 and P 3 Q.
The difference between a cause and a total cause is that the latter is a cause which is connected to its effect directly by a physical law. Of course, if we define 'total cause' as in 4.2, it will not be of much help in analyzing causal sufficiency. In particular, if we try to define causal sufficiency directly in terms of 'total cause', our definitions will be circular. It might be possible to avoid this circle by defining 'total cause' in some other way, but, as we will see, it is possible to break the circle anyway at the other end by providing an analysis of causal sufficiency which does not use the notion of a total cause. 4.2. Contingently Sufficient Conditions Scriven (1964) and Mackie (1965) have proposed basically similar analyses which explicitly recognize that in stating the cause of something, we do not have to state the total cause. Scriven formulates the analysis succinctly by saying that causes are "contingently sufficient
163
CAUSES
conditions". More precisely: (4.3)
P is causally sufficient. for Q iff there is a set of true propositions R l , . . . , R,, such that P together with R,, . . . , R,, is sufficient for Q (i.e., ( P & R , & . . . & R n ) 3 Q), but R l , . . . , Rn alone are not sufficient for Q . e . , ( R i & . . . & R n ) Q).
By taking the conjunction of the R's, we can simplify this proposal to: (4.4)
P is causally sufficient for Q iff there is a true proposition R such that ( P & R ) 3 Q, but R Q.
+
This analysis is clearly inadequate as it stands. The simplest way of demonstrating the inadequacy is to note that it entails that if 0 is true, then any proposition at all is causally sufficient for 0 . To get this result, we simply let R by " ( P z Q)'. For the same reason, a false proposition would be causally sufficient for anything at all. This difficulty seems to be related to the analogous difficulty that arose in connection with subjunctive conditionals. There the difficulty was to explain under what circumstances we can detach a true proposition R from a conditional '"[(P & R ) 3 Q]' to obtain a true simple subjunctive '"(P>Q)'. In that case, the condition required for detachment was REP1. This condition also seems to be required in the case of causal sufficiency. When a condition R is already true which, when conjoined with P would be causally sufficient for 0 , in order for us to delete mention of R and say simply that P itself would be causally sufficient for Q it is at least necessary that R would still be true if P were true. If this condition is not satisfied, then P's being true would be no guarantee of the truth of 0, and hence P-is not causally sufficient for 0 . For example, it is true that natural gas has been flowing from the orifice of the pilot light on my furnace for the past hour. That, taken in conjunction with the flame on the pilot light's having blown out an hour ago and my now lighting a match in the vicinity of the water heater, would be causally sufficient for an explosion. But we cannot conclude, in accordance with 4.4, that the flame on my water heater having gone out an hour ago and my now lighting a match in its vicinity, would be causally sufficient for an explosion. This is because it is not true that gas would have been flowing from the pilot light for the
164
CHAPTERVII
last hour had it blown out. There is a safety mechanism designed to prevent that. This indicates that we must at least add one further condition to 4.4: P is causally sufficient for Q iff there is a true proposition R (4.5) such that REP and [ ( P & R ) 3 Q ] and R $ Q. But now something remarkable has happened. The analysans of 4.5 is equivalent to the condition that the simple subjunctive ' ( P > Q)' be true. This leads naturally into the attempt to analyze causal sufficiency in terms of subjunctive conditionals.
4.3. Causal Sufficiency and Subjunctive Conditionals That ' ( P > Q)' be true is clearly a necessary condition for P to be causally sufficient for Q, but I think it is equally clear that it is not a sufficient condition. For example, '"(P& Q)' entails ' ( P > Q)', but we certainly d o not want t o conclude that any two true propositions are causally sufficient for one another. W e must add additional requirements if we are t o obtain a satisfactory analysis of causal sufficiency. T o begin with, it seems clear that causal sufficiency is a special case of necessitation:
(4.6)
If P is causally sufficient for 0, then ( P >> 0).
If P is causally sufficient for 0,then the truth of P must be sufficient to 'bring it about' that Q is true, in the sense that ( P >> 0 ) .Although this is still not a sufficient condition for causal sufficiency, I think we are now on the right track. From this point the analysis will proceed by accumulating additional necessary conditions until finally we have a list which is also sufficient. T h e basic connection between cause and effect seems to be that of necessitation. But not all necessitations give rise t o causal relations. If P Ñ Q, then P >> Q, but it is not usually true in such a case that P is causally sufficient for Q. Kim (1973)gives the following two examples: (i) Yesterday's being Monday does not cause today to be Tuesday. (ii) George's being born in 1950 and still being alive in 197 1 does not cause him to have reached the age of 2 1 in 1971.
CAUSES
165
It seems reasonable to suppose that causation is a species of contingent necessitation, and hence the cause can never entail the effect. However, there are what appear to be counterexamples to this supposition: (1)
If you heat the metal to a sufficiently high temperature, that will cause it to melt.
(2)
His bending the rod over his knee until it broke caused the rod to break. His being fatally stabbed caused him to die.
(3)
These examples seem a little odd, but I think the oddness is due to the triviality of the connection between the antecedent and the consequent rather than it being the oddness of logical impossibility. It is quite possible for statements like ( I ) , (2), and (3) to be true, and they seem t o be examples of the cause entailing the effect. However, upon closer examination this becomes less obvious. The only examples I have been able to find in which the cause seems to entail the effect are of the same general sort as ( I ) , ( 2 ) , and (3), and it seems that these statements can be paraphrased as follows:
(I*)
There is a temperature such that the metal's being heated t o that temperature is causally sufficient for it t o melt, and if you heat it t o that temperature now, that will cause it to melt.
(2*)
He bent the rod to a degree which was causally sufficient for it to break, and his bending it to that degree caused it to break.
(3*)
H e was stabbed in a way which was causally sufficient for him to die, and his being stabbed in that way caused him to die.
Thus ( I ) , (2), and (3) can be regarded as not having the form 'PCQ', but instead a more complicated form involving quantification into causal contexts. So construed, they d o not constitute counter-examples t o the requirement that causation involves contingent necessitation. Consequently, I think that we can build this requirement into the analysis of causal sufficiency:
(4.7)
If P is causally sufficient for Q, then P does not entail 0.
166
CHAPTER VII
We have two necessary conditions for causal sufficiency, but these conditions are not also sufficient. So far we have overlooked a well known problem. This is the problem of epiphenomena. Frequently we will have two propositions P and 0 such that P contingently necessitates 0, but does so because P's being true necessitates the truth of a third proposition R which would cause Q without the causal chain passing through P. In such a case, we say that Q is an 'epiphenomenal effect' of P. To take a simple example, suppose we have a black box with a switch and two lights. The box is wired in such a way if the switch is thrown, light A comes on and is followed two seconds later by light B. In an actual situation in which both lights are off, we might well know that if light A were to come on this would be because the switch was throwo. Letting P be 'Light A comes on' and 0 be 'Light B comes on', we see that P contingently necessitates 0. But it is clear that P is not causally sufficient for 0. The connection between P and Q is rather that what would cause P to be true would also cause Q to be true. The case of epiphenomena is to be distinguished from the case in which P and Q have a common cause R, but R causes Q by causing P. This kind of case gives rise to talk of 'causal chains'. If in the previous example, light B came on not as a direct result of throwing the switch but rather in response to a photoelectric cell sensing the light output of light A, then although throwing the switch would cause both lights to come on, we would not have a case of epiphenomena. This is because light A's Coming on would itself cause light B to come on. The difference between these two cases is that in the epiphenomenal case light A's coming on without the switch being thrown would not necessitate light B's coming on, whereas in the non-epiphenomena1 case light A's coming on would necessitate light B's coming on regardless of whether the switch is thrown. This suggests a way of characterizing the epiphenomenal case. As a first approximation we might try: (4.8)
Q is an epiphenomenal effect of P iff there is a proposition R such that ( P >> R ) and (R >> Q) and -[(P & - - R ) >> 01.
For example, in the epiphenomenal case of the black box, R is 'The switch is thrown'.
CAUSES
167
4.8 embodies the right idea, but it is not entirely adequate as it stands. The difficulties are those of 'underdetermination' and 'overdetermination'. First underdetermination: rather than knowing of a particular R which would necessitate 0, we may know of a set F such that one of the members of F would be true if P were true, but it is not determined which member of F it would be, and we may know that no matter which member of F would be true, it would also necessitate 0 . This gives us: Q is an epiphenomenal effect of P iff there is a set F of propositions such that (P Ã .SF) and -[(P & -.ST) >> 01, and for each R in F, R Ã 0 . There remains the problem of overdetermination. We might know that if P were true then some member of F would be true, but we might also know that if P were true and no member of F were true, then some member of a second set F* would be true, and each member of F* necessitates Q. In this case we might have [(P & -SF) Ã Q] true in spite of the fact that Q is an epiphenomenal effect of P. In this case, the reason 0 is an epiphenomenal effect of P is that [(P & -^T) Ã Sr*] and -[(P & -2r & -SF*) >> 01, and for each R in F*, R Ã 0 . Generalizing this, we may have a whole sequence Fo, . . . , FQ, . . . (p < a ) of sets of propositions such that if P were true then some member of Fo would be true, but if P were true and no member of Fnwere true then some member of F, would be true, and so on. This suggests the following definition:
0 is an epiphenomenal effect of P iff there is a sequence rg(B < a ) of sets of propositions such that for each /3 < a , [(P & -%) Ã JFp], and for each R in Fa, (R Ã Q), and -[(P & & < a --SFy) Ã 01. We now have three requirements for causal sufficiency. Let us use them to define a notion of 'almost causal sufficiency': (4.9)
(4.10)
P is almost causally sufficient for Q iff (P Ã 0) and P does not entail 0 , and 0 is not an epiphenomenal effect of P, and 0(-P & -0) & OP.
What, if anything, must be added to almost causal sufficiency to obtain causal sufficiency?
168
CHAPTER V I I
I believe that the one outstanding problem for the relation of almost causal sufficiency is that of the direction of causation - in many cases the relation of almost causal sufficiency does not discriminate between cause and effect. For example, consider a simple electrical circuit with a switch which operates a light. Let P be 'The switch is thrown' and Q be 'The light comes on'. Clearly, P is almost causally sufficient for Q. Unfortunately, this goes the other way too: Q is almost causally sufficient for P. That is, if the light were to go on then the switch would have been thrown, there is no entailment, and P is not an epiphenomenal effect of 0. It is considerations like this that have led philosophers to build temporal relations into the analysis of causation. The idea is that we can distinguish cause from effect by seeing which comes first. Basically, I think that this idea is correct, but there a r e numerous details to be worked out. The first difficulty is that it is often unclear what it means to talk about temporal relations between a particular causal antecedent and causal consequent. Talk of temporal order would seem to presuppose a notion of the date of a proposition, but this is a very problematic notion. If the content of a proposition is that some object is in a simple state at a particular time, then it is clear what to count as the date of the proposition. In this case, the date is the time mentioned. But it is not so clear what to count as the date of more complex propositions. One might suppose that if a proposition asserts that something is true at a particular time, then that time should be taken as the date of the proposition. But this prima facie reasonable supposition leads to the embarrassing conclusion that logically equivalent propositions can have different dates. Consider, for example: A t 3 :00 PM,x became
At 2 : 00 PM,it was one hour before x was to become
In general, I doubt that it makes much sense to talk about the date of a proposition even when it is in some sense a temporal proposition. For example, what is the date of the proposition that John gave Joe a black eye last week? Is the date all of last week, or is it the time when he gave him the black eye, or what? Or consider the proposition that one of my ancestors was a horse thief. What on earth could count as the date of this proposition?
CAUSES
169
If we attempt to date propositions in terms of what dates they mention, we will get into all of the above difficulties. But such a notion of the date of a proposition will not help us anyway. Consider, for example: (4.11)
The clouds being seeded today caused it to rain today.
The mentioned dates of this causal antecedent and consequent would seem to be all of today. But so construed, this cause and effect have the same date and so cannot be discriminated on that basis. However, reflection indicates that the dates that are relevant to this example are not the dates mentioned by the causal antecedent and consequent, but are rather the precise time when the clouds were seeded and the precise time when it rained. The clouds were indeed seeded before it began to rain. The notion of the date of a proposition that is relevant here is, roughly, the date of 'what makes the proposition true'. For example, what makes it true that the clouds were seeded today is that potassium iodide crystals were released into the clouds at certain times today, and it is that collection of times which constitutes the date relevant to assessing the causal relation. In the case of a causal statement like 4.11, it seems that what is required is that 'what makes the antecedent true' caused 'what makes the consequent true', and that a temporal relation is built into this latter causal relation which thereby discriminates between cause and effect. For example, in the case of 4.11, the clouds being seeded when they were caused it to rain when it did. To turn this into a general account, we have to get clear on 'what makes the antecedent and consequent true'. If, as in 4.1 1, a statement is an existential generalization, then what makes. it true is whatever makes some of its instances true. Similarly, if a statement is a disjunction, what makes it true is whatever makes one of its disjuncts true. If a statement is a conjunction, what makes it true is whatever makes both conjuncts true. And so on for each kind of logical compound until we get down to propositions 'which are not logical compounds, i.e., propositions which are simple in the sense of Chapter IV. What makes a simple proposition true is simply it itself. And simple propositions have precise dates built into them. This suggests that we proceed as follows. First, we define a notion of 'direct causal sufficiency' which
170
CHAPTER VII
holds between simple propositions and which, by making reference to time, allows us to discriminate between cause and effect. Then we define causal sufficiency something like: (4.12)
P is causally sufficient for Q iff P is almost' causally sufficient for 0, and ,if P were true then there would be sets A and B of simple propositions such that A would be what makes P true, and B would be what makes Q true, and the propositions in A would be directly causally sufficient for the propositions in B.
As it is stated, 4.12 has a great many deficiencies. Let us take them one at a time. First, what does it mean to say that A would be what makes P true? As a first approximation, it seems reasonable to take this as meaning that A is a minimal set of true simple propositions adequate to entail P. More precisely:
A makes P true iff A is a set of true simple propositions, A entails P, and no proper subset of A entails P. However, the requirement that A entail P is too strong. The difficulty arises in the case where P is a universal generalization. Consider: That all of the buttons on the console are pushed is causally sufficient for the light to go on. In this case, what makes it true that all of the buttons on the console are pushed is the collection of propositions reporting that each button individually is pushed: {'Button 1 is pushed', 'Button 2 is pushed', . . . , 'Button n is pushed'}. But this collection does not entail that all of the buttons are pushed without the addition of a premise telling us that these are all the buttons there are on the console. Let us call a proposition of the latter form an 'enumerative proposition': (4.13)
An enumerative proposition is a proposition of the form ( x ) [ F x ~ . x = a , v . ..v x = a n ] .
Then our amended definition of 'makes true' is: (4.14)
A makes P true in a world p iff A is a set of simple propositions true in 8, and there is a set of B of enumerative propositions true in 6 such that A U B entails P, and
CAUSES
171
there is no proper subset A. of A for which there exists a set Bo of enumerative propositions true in Q such that (AoU Bo) entails P. Notice that in plugging 4.14 into 4.12, we do not get the result from 4.12 that the propositions in A and B must be true in the real world. All that is required by 4.12 is that some such sets A and B would contain true propositions if P were true. We can make this quite precise by reformulating 4.12 as follows: (4.15)
P is causally sufficient for 0 iff P is almost causally sufficient for 0, and for each world Q such that BMP, there are sets A and B of simple propositions such that: (i) A makes P true in Q ; (ii) B makes Q true in Q ; (iii) in Q, the propositions in A are directly causally sufficient for the propositions in B.
However, 4.15 runs into difficulty with a kind of 'causal overkill'. Consider: (4.16) That someone was in the room was causally sufficient for the warning buzzer to sound. Let us suppose in fact that both Brown and Robinson were in the room. Then there are two distinct minimal sets A making it true that someone was in the room, viz., {'Jones was in the room'} and {'Robinson was in the room'}. All that 4.15 requires is that at least one of these sets be directly causally sufficient for the warning buzzer to sound. Suppose, then, that Jones' presence is causally sufficient to set off the buzzer, but Robinson's presence is not. In such a case, we would not agree that 4.16 is true. What is required for causal sufficiency is not just that some minimal set A be directly causally sufficient for the causal consequent, but that every minimal set A be directly causally sufficient. On the other hand, if we turn our attention to the causal consequent, we only want to require that A be directly causally sufficient for the propositions in some minimal set B. If A is directly causally sufficient for the propositions in some minimal set B which makes 0 true, then A is causally sufficient to make Q true in some way or other, and that
172
CHAPTER VII
is all we want. For example, suppose we have a room lit by three lights numbered 1, 2, 3. Suppose the lights are operated by correspondingly numbered switches, with switches 1 and 2 being in the room and switch 3 being at some remote place outside the room. Suppose lights 1 and 3 are on. Then the following is true: That one of the switches in the room is on is causally sufficient for there to be a light on in the room. What makes it true that there is a switch on in the room is the unit set A : {'Switch 1 in on'}. There are two minimal sets which make it true that one of the lights is on, viz., {'Light 1 is on'} and {'Light 3 is on'}. A is only directly causally sufficient for the first of these sets B, but clearly that is all that is required. This indicates that we must modify 4.15 to read: (4.17)
P is causally sufficient for Q iff P is almost causally sufficient for 0 , and for each world /3 such that PMP: (i) there is a set A of simple propositions such that A makes P true in p ; (ii) for every set A of simple propositions such that A makes P true in p, there is a set B of simple propositions such that B makes Q true in p ; and the propositions in A are directly causally sufficient in 6 for the propositions in B.
Now what remains is to give an analysis of direct causal sufficiency. This is to be a relation of causal sufficiency between sets of simple propositions and simple propositions, and is supposed to provide the means for discriminating between cause and effect by appealing to temporal relations. An obvious first attempt at defining direct causal sufficiency would be the following: (4.18)
If A is a set of simple propositions and for each simple proposition 0 , u(Q) is the date of 0 , then A is directly causally sufficient for the simple proposition P iff A is almost causally sufficient for P and (Q)(Q e A ~ ( 0<)d ~ ) ) . ~
CAUSES
Definition 4.18 embodies the traditional view that a cause must precede its effect. This traditional view is connected with the almost universally held view that causal relations are asymmetric - if P caused 0, then 0 could not cause P. The only obvious way to ensure that causal relations are asymmetric is to build into them the requirement that the cause precede the effect. However, it takes little reflection to see that this requirement is unreasonable. At the very least, cause and effect can occur simultaneously. For example, the five ball's striking the eight ball caused the eight ball to accelerate, and these two events occurred simultaneously. Frequently causes do precede their effects, but not invariably. Apparently, the most we can require is that the effect not precede the cause. But is even this reasonable? We do say things like, 'The baseball game's being held this afternoon caused the soccer game to be cancelled this morning'. However, it seems wrong to say that it is literally the occurrence of the baseball game that caused the cancellation of the soccer game. Rather, it was the prior decision to hold the baseball game that caused the cancellation. Other putative examples of effects preceding causes can generally be handled in this same way. The very idea of an effect preceding its cause seems mindboggling. The only argument I have for this is that it seems required in order to establish the direction of causation. Without this assumption, it does not seem possible to distinguish between cause and effect in many causal contexts. Thus I propose to build this into the analysis of direct causal sufficiency. But we cannot require that the cause precede the effect. But what does this do to the supposed asymmetry of causal sufficiency? It will indeed have the effect, at least on the analysis given here, that causal sufficiency is not asymmetric. But I am not convinced that it should be asymmetric. It seems to me that there are 'feedback examples' in which each of two propositions causes the other. For example, consider two boards arranged in an inverted ' V so that their tops are leaning against one another. Each board is holding the other one up. What this comes to is that either board's not starting to fall at time t prevents the other board from beginning to fall at time t, and hence we have a case of symmetric causation. Consequently, the fact that asymmetry will not result from my analysis of causal sufficiency does not bother me.
174
CHAPTER V I I
If we are agreed that cases of symmetric causation are possible, can we simply modify 4.18 by requiring that the effect not precede the cause, and then accept temporally symmetric cases as cases of symmetric causation? Unfortunately, things are not that simple. Consider the switch and light again, and suppose for the sake of the example that closing the switch instantaneously causes the light to come on. Letting P be 'The switch is closed' and Q be 'The light comes on', we have, as before, that P and Q are each almost causally sufficient for the other, and we have that u(P) = o-(Q) and hence that cause and effect cannot be distinguished on temporal grounds. But we most assuredly do not want to say that the light's coming on causes the switch to be closed. In the case in which u(P) < o-(Q),the temporal ordering does enable us to distinguish between cause and effect, but as we have seen, it is possible for the cause and effect to be simultaneous, in which case the temporal ordering by itself is not sufficient to make the distinction. However, it will turn out that the distinction in the case of simultaneous cause and effect can be made by making it parasitic on the distinction in the non-simultaneous case. First, let us introduce a technical term to refer to the non-simultaneous case which we already know how to handle:
(4.19) .
If A is a set of simple propositions and P is a simple proposition, then ASCSP (A is strictly causally sufficient for P) iff A is almost causally sufficient for P and (Q)(Qe A =) o-(0) < o-(P)).
Now we consider the case of simultaneous cause and effect. On what basis do we discriminate between the cause P and effect Q? I think we do this on the basis of the role P and 0 are able to play in temporally extended causal chains. Roughly, in order for P to be causally sufficient for 0, it must be possible to cause Q by causing P, where these latter causal relations involve non-simultaneous causeleffect pairs. For example, it is possible to cause the light to go on by causing the switch to be closed, but it is not possible to cause the switch to be closed by causing the light to go on. This is the basis upon which we say that the closing of the switch causes the light to go on but not vice versa. But what exactly does it mean to say that it is possible to cause Q by causing P. To begin with, the 'possible' in this formula is an existential
CAUSES
175
quantifier. To say that it is possible to cause 0 by causing P is to say that there is something which would be causally sufficient for Q by being causally sufficient for P. As we are working on the level of simple propositions, the 'something' must be a set of simple propositions, and the causal sufficiency in question should be strict causal sufficiency. But still, what does it mean to say that a set of simple propositions is strictly causally sufficient for Q by being strictly causally sufficient for P. To simplify things initially, let us suppose that the set of simple propositions has a single member R . Then we want to explain what it means to say that R is strictly causally sufficient for 0 by being strictly causally sufficient for P. Clearly this requires at least that R is strictly causally sufficient for P and R is strictly causally sufficient for 0. In addition it would seem to require that we do not have ' ( R & - P ) > Q1. That is, although R is sufficient to bring about 0, if ' ( R & -P)' were true, then the way in which R would bring about Q has been . undermined. The causal chain from R to 0 passes through P, so if we have ' ( R & -P)', then the causal chain has been broken and there is no reason to expect 0 to be true. Can we perhaps take this as our analysis and say that R is strictly causally sufficient for Q by being strictly causally sufficient for P iff R S C S P & R S C S Q & - [ ( R & - P ) > Q]? Unfortunately, this is still too weak to discriminate between P and 0. This is because if R is S C S for 0 by being S C S for P, then we also have ^ [ ( R & -0) > PI1. This is because if we had ( R & -Q)', this would tell us that the causal chain between R and Q had broken down somewhere, but we would not know whether it had broken down between R and P or between P and 0. Consequently, if we had ' ( R & -Q)'. then P might be false. For example, in the case of the switch and the light, suppose R reports the setting in motion of a mechanical contrivance which closes the switch by the activation of certain motors and levers. If the mechanical contrivance were set in motion but the light did not come on, then something would have gone wrong, but the malfunction might have been either in the mechanical contrivance or in the wiring between the switch and the light, so we cannot conclude that the switch would have been closed. Evidently a stronger condition is required to capture what we mean by saying that R is S C S for 0 by being S C S for P. The nature of this
176
CHAPTER VII
stronger condition can be seen by realizing that it is always possible to choose R such that we have not just ^ - [ ( R & - P ) > QI1, but ^(R & - P ) > -QI1. Again, if R reports the setting in motion of the mechanical contrivance, then if R is true but the switch is not closed then the light will not go on. But we do not have that if R is true but the light does not go on then the switch is not closed. On the contrary, if R is true but the light does not go on, this might be because something was wrong with the circuit rather than with the mechanical contrivance. In general, if P and Q are simultaneous and P is causally sufficient for 0, then it seems to be the case that we can construct a set A of simple propositions which is S C S for both P and Q and such that [ ( H A & - P ) > Q ] . To have the latter condition satisfied, we must include in A both propositions sufficient to bring about P and propositions sufficient to rule out any other ways of bringing about P. For example, suppose our light is connected to two switches 1 and 2, and closing either is causally sufficient for the light to be on. Suppose further that switch 2 is closed. Then closing switch 1 is causally sufficient for the light to be on, although it would not cause the light to be on. In this case, in constructing our set A we must include a mechanism for closing switch 1, and we must also include propositions precluding switch 2's being closed. Thus my proposal becomes:
-
(4.20)
If P and Q are simple propositions and u ( P )= o-(Q), then P is directly causally sufficient for Q iff P is almost causally sufficient for Q and there is a set A of simple propositions such that A S C S P & A S C S Q & [ ( H A & -P)> Q ] .
Notice that in cases of genuinely symmetric causation, like that of the two boards leaning against one another, this gives the result that each is causally sufficient for the other. Let us call the boards 'board 1' and 'board 2'. P reports that board 1 does not begin to fall, and 0 reports that board 2 does not begin to fall. Then we want to say that P is causally sufficient for 0,and also Q is causally sufficient for P. To see that it results from 4.20 that P is causally sufficient for Q, we note that we can construct an A describing, e.g., building certain braces around board 1, such that A is causally sufficient for P and thereby for 0,but A is such that if we did build the braces but board 1 began to fall anyway, then board 2 would begin to fall. To see that we also have 0 causally sufficient for P, we note that we can construct another set A *
177
CAUSES
which is analogous to A but involves bracing board 2 rather than board 1, and then we have [ ( H A * & Q ) > -PI. We must generalize 4.20 to include the case in which the causal antecedent is a set of simple propositions rather than a single proposition P, This involves a few new difficulties. First, when do we apply the test? The general case not covered by 4.19 is that in which ( Q ) ( Q e A ^ u ( Q ) s ; u ( P ) )and ( 3 Q ) ( Q e A & u ( Q ) = u ( P ) ) . But it might seem that as long as Q contains a member having a date earlier than P, this is sufficient to distinguish cause from effect and hence we need not apply our complex test. However, this is an illusion. For example, A might contain a list of conditions under which one member Q of A would be causally sufficient for P, where u ( Q )= u ( P ) . We might have that under those conditions, Q would also be almost causally sufficient for P (although not really causally sufficient for P ) . Then letting A* be the result of replacing Q by P in A, we would have that A * is almost causally sufficient for Q. But we would not want to conclude that A* is causally sufficient for 0, even though A * contains members which predate 0. Thus our test must be applied in general if A contains any members having the same date as P. But what precisely does our test amount to when the causal antecedent is a set rather than a single proposition? We might suppose that it requires there to be a set B of simple propositions such that:
-
i ) (Q)[Q A 3 BSCSQI; (ii) B S C S P ; (iii) [ ( U B & -J7A)>-PI. The difficulty is with requirement (i). That requirement implies that ( Q ) ( R ) [ ( Qe B & R A ) = > u ( Q )
(4.21)
If B is a set of simple propositions and B,={Q; Q e B & u(Q)
t
is a time, then
178
CHAPTER V I I
Then it seems we should replace (i) by:
Putting all of this together, we obtain our final analysis of direct causal sufficiency: (4.22)
^
If A is a set of s' pie propositions and P is a simple proposition, then A CSP iff A is almost causally sufficient for P and either; (i) (Q)(Q e A 3 u(Q) < o-(P)) (i.e., ASCSP); or i i ) (Q)(Q e A 3 40)5 o-(P)) and (3Q)(Q e A ) & u(Q) = u(P)),and there is a set B of simple propositions such that:
Finally then, we want to incorporate our analysis of direct causal sufficiency into 4.17 to give us an analysis of causal sufficiency. In order to do this, we must first become aware of the inadequacy of the final clause of 4.17, which requires that if A and B are the sets of simple propositions making P and Q true, then (in order for P to be causally sufficient for Q) we must have the propositions in A directly causally sufficient for the propositions in B. The first difficulty with this requirement is that the dates of members of A may overlap with the dates of members of B. In analogy to 4.22, all we really want to require is that the members of A having dates no later than that of any particular proposition R in B be directly causally sufficient for R. That is, defining: (4.23)
If C is a set of simple propositions and t is a time, then Cw={Q; Q e C & o-(Q)
be directly causally sufficient for we require of each R in B that R. However, this is still too strong a requirement. Consider, for example: (4.24)
Its raining for forty days and forty nights would cause it to be the case that all the unicorns die.
CAUSES
179
Notice that this does not require that the rain would cause the death of each unicorn - only that the rain would cause the death of each unicorn who would not die anyway. Suppose there are just three unicorns - Mary, Charley, and Vasilly. Let us suppose that the rain could not cause Vasilly to die because he is on top of a mountain, but he will die anyway of old age. Applying our analysis of causal sufficiency to 4.24, what would be required is that if A is the set of simple propositions describing the rain, then A be directly causally sufficient for each member of the set B ={'Mary dies', 'Charley dies', 'Vasilly dies'}. But this is too strong a requirement. A need not be (and is not) directly causally sufficient for 'Vasilly dies' because this is already true and would still be true even if it did rain for forty days and forty nights. Thus it seems that in order for P to be causally sufficient for 0, A need not be directly causally sufficient for every member R of B. There may be propositions R which are already true and would still be true even if P were true. Of course, this condition is automatically true if P, Q, and hence R, are already true, e.g., if we transform 4.24 into: Its raining for forty days and forty nights caused it to be the case that all the unicorns died. So, in analogy to our analysis of necessitation, what we actually need is the requirement that, first, REP, and second, this would still be true even if (-P & -Q); i.e., (REP)E(-P & -Q). Thus our final analysis of causal sufficiency becomes: (4.25)
P is causally sufficient for 0 iff P is almost causally sufficient for 0 , and for each world p such that PMP: (i) there is a set A of simple propositions such that A makes P true in p ; (ii) for every set A of simple propositions such that A makes P true in j8, there are sets Bl and B2 of simple propositions such that ( B i U B2) makes 0 true in j3, and: a ) Bi # 0 and (R)[R e B I 3 (AwH DCSR in p)]; (b) (R)(R e B2 3 [(REP)E(-P & - 0 ) in PI.
Our analysis has become rather complicated, but the basic idea is really quite simple. Causal relations between simple propositions are
180
CHAPTER VII
just contingent non-epiphenomena1 necessitations with the appropriate temporal order. The causal relations between logically complex propositions are just contingent non-epiphenomena1 necessitations underlain by causal relations between the simple propositions which make the complex propositions true.
I have developed an analysis of causal concepts in terms of subjunctive conditionals. This analysis may be regarded as a kind of regularity theory, although it is more complicated than a simple Humean-type theory. Its claim to being a regularity theory comes from the way subjunctive conditionals arise out of subjunctive generalizations. Because of its additional complexity, the present theory is not subject to the traditional difficulties regarding regularity theories. For example, consider two popular counterexamples. Many simple regularity theories entail that night is the cause of day. Its being night at ti does indeed necessitate its being day at t2, but this is not a causal necessitation because these are epiphenomena. They both are the causal effects of certain facts about the earth and the sun, and its being night at t i without those facts being true would not necessitate its being day at t2. Many regularity theories also entail that birth is the cause of death. The reason this does not result from the present theory is more complicated than in the case of the previous example. In this case the necessitation fails. A person's being born does necessitate that he will sometime die, but this is not the necessitation that is required for birth to cause death. By principle 4.25, a person's being born in the way he is at a certain time ti (i.e., the conjunction of the simple propositions describing his birth) must necessitate his dying at the time he does. But there is no such necessitation. A person's being born at a certain time necessitates that he will sometime die, but it does not necessitate that he will die at any particular time. Thus birth does not cause death. The literature is full of sophisticated counterexamples to earlier analysis (e.g. Scriven, 1966; Kim, 1971, 1973, 1973a). However, the present analysis has been designed explicitly to handle those counterexamples and I believe that it does so successfully.
CAUSES
181
It has frequently been maintained that causal concepts embody a contextual element, and that this must be included in any adequate analysis of these concepts. Scriven (1974) supposes that 20°/ of the people exposed to heavy doses of ultraviolet radiation develop skin cancer, with certain hereditary factors determining which people will develop cancer when exposed to the radiation. Given a man who has developed cancer upon exposure to the radiation, if we ask what caused him to develop cancer, the answer might be either 'He was exposed to heavy doses of radiation' or 'He had the hereditary factors making him susceptible to cancer when exposed to radiation'. Which answer is appropriate depends upon what we are interested in. The difference is that between the two questions, 'Why did he develop cancer now when he did not do so before?', and, 'Why did this man develop cancer when others who were exposed to the radiation did not?' Scriven and Mackie (1965) suppose this to show that a statement of the form "X caused Y1 is really elliptical for a more complicated statement of the form "X caused Y relative to the contrast class C1. If we want to know why the man developed cancer now, the contrast class consists of moments of his history, whereas if we want to know why he developed cancer when others who were exposed to the radiation did not, then the contrast class consists of men exposed to the radiation. I think that Scriven and Mackie are onto something, but we must be careful to distinguish between statements of the form "X caused Y1 and statements of the form "X was the cause of Y1. The latter locution is one I have carefully avoided up to this point. First consider statements of the form "X caused Y1. If we ask, 'What caused this man to develop skin cancer?' the context may make it quite inappropriate to reply, 'His being exposed to radiation caused him to develop skin cancer'. But the inappropriateness of a reply does not establish its falsehood. Frequently, a reply is inappropriate simply because it is not helpful. For example, if the question is asked in the context of a discussion of people who were exposed to a heavy dose of radiation in an industrial accident and some of whom developed skin cancer while others did not, then it would be quite unhelpful to reply that the radiation caused one of these people to develop skin cancer. But the reason it is unhelpful is that we already know that the radiation caused
182
CHAPTER V I I
the cancer. The radiation did cause cancer in some of the people, and did not cause cancer in others. But what we are looking for is another factor which explains this selectivity, i.e., which caused cancer in all and only those people (in the group) in whom the factor was present. I think that any contextual element in statements of the form "X caused Y1 is not part of the meaning of the statement, but rather is part of the pragmatics of language in general. However, this changes when we turn to the peculiar statement form '"X was the cause of Y1. The reason this is peculiar is that there can simultaneously be more than one thing which is the cause of something. For example, it may be true both that the cause of the explosion was the room's being filled with gas, and that the cause of the explosion was Jones' throwing the switch. This makes the use of the definite article rather strange. The only way I can see to make sense of this is to suppose that statements about 'the cause' of something really are elliptical for more complicated statements involving a contrast class. However, I will not pursue this further at this time.
If the present analysis of causal relations is acceptable, it settles a number of questions about the logical properties of these relations. To begin with, the analysis leads immediately to interchange principles for logical equivalence: (6.1)
PÇ-ÈQ&PCR.=JQ
(6.2)
P
Ç-
Q & RCP. 3 R C Q
(6.3)
P
Ç-
0 & PCSR. 3 Q C S R
(6.4)
P
Ç-
Q & RCSP. ^RCSQ
These interchange principles, in turn, imply that substitutivity of identity fails for causal contexts. That is, the following two principles fail:
where if ti or
t2
is a definite description, it is understood to have
CAUSES
183
narrow scope. The failure of 6.5 and 6.6 can be seen as follows. Suppose "Ftil is causally sufficient for 0 , where rFtll and 0 are both false. Then let t2 be the definite description "?x(x = t2 & -0)'. Then t i = tyl is true. Hence, assuming 6.5, r(Ft^}CSQ1 is true. But ^Ft: is equivalent to "Fti & -Q1. Hence by 6.3, (Ft, & -0)CSQ. By our analysis, this implies "(Ft1 & -Q)> Q1, which within SS implies that F t i l logically entails Q. This result is absurd, so principle 6.5 must fail. Principle 6.6 fails for analogous reasons. The following two principles are equivalent to one another, and it may seem that they should hold: (6.7)
PC(Q & R ) =>[PC0 & PCR].
(6.8)
[ P C 0 & Q -^ R ]
PCR.
However, neither of these principles holds. For example, it might be true that John's hitting Joe caused Joe to have a black eye. That Joe has a black eye entails that Joe has an eye. But John's hitting Joe did not cause Joe to have an eye. Thus principle 6.8, and hence the equivalent 6.7, are both false. Slight modifications of this example yieldcounterexamples to the analogous principles regarding 'would cause' and causal sufficiency. I think that the failure of 6.7 seems particularly surprising because in English, if we say, e.g., 'Closing the switch caused both lights to be on', this is ambiguous between 'Closing the switch caused it to be the case that both lights are on' and 'Closing the switch caused each light to be on'. The reason 6.7 fails is that P can cause a conjunction to be true simply by causing the truth of that part of it which isn't already true. E.g., if one of the lights is already on, and the switch is wired to the other light, then closing the switch will cause it to be the case that both lights are on (where before only one was on). There are some other distribution principles which fail for much the same reason. The following both fail: (6.9)
PC(x)(Fx =i Gx) 2 (x)[Fx 2 PCGx].
(6.10)
(3x)[Fx & PCGx] 2 PC(3x)(Fx & Gx).
We have already seen a counterexample to 6.9 in the distinction between 'Its raining for forty days and forty nights caused all the unicorns to die' and 'Its raining for forty days and forty nights caused it
184
CHAPTER V I I
to be the case that all the unicorns died'. The former requires that each unicorn was killed by the rain, but the latter only requires that the rain killed all the unicorns that wouldn't have died anyway. To generate a counterexample to 6.10, suppose we have a room containing two lights, one of which is on and the other of which is off and controlled by a switch. We throw the switch, thereby causing the second light to come on. Then there is a light such that throwing the switch caused it to come on, but it is not true that throwing the switch caused there to be a light that was on (because one of the lights would have been on anyway). It has generally been supposed that causal relations are transitive. For example, David Lewis (1973b) explicitly builds this into his analysis of 'would cause'. But transitivity does not result from my analysis. Nor should it. None of the following three principles is correct: (6.11)
[PCSQ & QCSR1-3PCSR.
(6.12)
[ P would cause 0 & Q would cause R ] =1 P would cause R.
(6.13)
[PCQ & Q C R ] ^ P C R .
It is really quite simple to get counterexamples to 6.11 and 6.12. These principles fail for the same reason that transitivity fails for subjunctive conditionals. For example, let us suppose that it is a chemical process C which causes a match in the presence of oxygen to light upon being struck. The chemical process C (we can'suppose) can occur with or without there being oxygen present, but it is only when oxygen is present that the occurrence of the process causes the match to ignite. Now consider a match which is in fact in the presence of oxygen and which would still be in the presence of oxygen even if chemical process C were to occur. Then the following three statements are all true: (1) (2) (3)
The occurrence of chemical process C would cause this match to ignite. This match being struck in the absence of oxygen would cause chemical process C to occur. This match being struck in the absence of oxygen would not cause this match to ignite.
Consequently, 'would cause' is not transitive. And this example is also
CAUSES
185
a counterexample to the transitivity of causal sufficiency. Transitivity fails for these two causal relations for precisely the same reason it fails for subjunctive conditionals. This is because changing causal antecedents involves 'world hopping7. That is, evaluating the truth values of causal statements with different antecedents involves evaluating the truth values of subjunctive conditionals with different antecedents, and that in turn involves our looking at different possible worlds. Although it is quite obvious that transitivity fails for both causal sufficiency and 'would cause', it may not seem so obvious that it fails for 'caused'. This is because in the case of 'caused' we are dealing with true causal antecedents and consequents, and hence one is apt to suppose that there is no world hopping involved. However, this is a mistake. rPCQ' entails r(-,P & Q) > ( P > Q)', and so in spite of appearances, world hopping is involved. This can be substantiated by seeing how our analysis actually leads us to intuitive counterexamples to 6.13. Characteristically, Q only causes R because some third proposition S is true. In such a case, it is required of S that [(-Q & R ) > (Q > S)]. However, even though P caused Q, if it is not true that [(-P & -R)>(P>S)], then there is no reason to expect P to have caused R. Let me give two counterexamples that take this form: (1) Suppose there are two kinds of gasoline. One kind gives an engine extra power, but causes it to overheat if run for a long time. The other kind gives less power, but allows sustained running of the engine. Suppose there is an engine which was running, and I filled the almost empty fuel tank with high-power gasoline. That I filled the fuel tank caused the engine to run for five more hours. And the engine's running for five more hours caused it to overheat (because it was burning high-power gasoline). But, that 1-filled the fuel tank did not cause the engine to overheat. Rather, that I filled the fuel tank with high-power gasoline caused the engine to overheat. Thus, 'causes' is not transitive. This counterexample arises because it is not true that if I had not filled the tank and the engine had not overheated, then had I filled the tank, the engine would have been burning high-power gasoline. (2) Consider a 'vacuum oven'. This is a chamber which can be evacuated and then heated. Suppose this oven is operated by two buttons. Pushing button A results in any gas in the oven being pumped
-
-
186
CHAPTER V I I
out, and then five minutes after the button is pushed the heating element comes on and heats the contents of the oven to a very high temperature. Pushing button B results in the gas being left in the oven, so that nothing happens for five minutes, and then the heating elements come on. Suppose now that the oven is filled with pure oxygen and contains a piece of paper. I push button B, after five minutes the heating element comes on, and because of the oxygen environment this causes the paper to burst into flame. In this case we have the following: (1) That the heating element came on, caused the paper to ignite; (2) that there was a control button that was pushed, caused the heating element to come on; but, that there was a control button that was pushed, did not cause the paper to ignite. Rather, that button B was pushed caused the paper to ignite. Thus, once again we have a failure of transitivity. The reason that there being a button that was pushed did not cause the paper to ignite is that it was not true that if no button had been pushed and the paper had not ignited, then if some button had been pushed then the oven would have been filled with oxygen. On the contrary, if it were true that no button was pushed, then had some button been pushed it might have been button A that was pushed, in which case the oven would not have been filled with oxygen. Apparently none of our causal relations are transitive. This is certainly surprising in light of the almost universally held belief that they are transitive, but the conclusion seems inescapable. This has important implications for a great many of our favorite beliefs about causes. For example, it makes nonsense of our ordinary way to thinking about causal chains. The 'received view' of causal chains is that they arise when we have, between two propositions P and Q, a sequence P I , . . . , Pn of propositions such that PCPl, PlCP2,, . . , PnCQ, and 'hence', PCQ. But, as we have just seen, the 'hence' doesn't follow. Does this mean that all talk of causal chains is nonsense? It does not seem that it should. We can often talk about the 'way' in which one proposition P causes another proposition Q to be true. This 'way' involves tracing out a sequence of intermediate causal links. To take a straightforward example, consider a row of dominoes standing on end. Pushing over the first domino causes the final domino to fall over, and it does so by causing each intermediate domino to fall
CAUSES
187
in turn. How are we to understand this in light of the failure of transitivity for causes? I suggest that we can make sense of causal chains by concentrating on the notion of P causing Q by causing R . I propose that what this means is that P caused Q, P caused R , and that P without R would not have been causally sufficient for Q :
(6.14)
P caused Q by causing R iff PCQ & PCR & -[(P & R)=Ql.
-
Then in a causal chain, what happens is that the first link causes each successive link by causing the preceding links:
(6.15)
A sequence ( P I ,. . . , P,,) of propositions is a causal chain iff PI caused P2, and for each k such that 2 < k 5 n, PI caused Pk by causing (P2 & . . . & Pkp1).
In summary, philosophers have held a remarkable number of false beliefs about the logical properties of causal relations. However, if we are careful we are now in a position to sort out what is true and what is false. NOTES
'
In point of fact, I do not think that the narrow-scope readings 2.15* and 2.16* are plausible interpretations of the English sentences 2.15 and 2.16, but that does not affect the logical point being made. 'The condition rOP7 is included so that we are not forced to say that a contradiction would cause everything. 'It is worth noting that 3.2 is entailed by 3.1 together with the principle that rPCQ7 entails P. Thus it need not be defended separately from 3.1. The statement that the heart attack caused the man to die today is ambiguous between the true statement that there was a time today such that the heart attack caused him to die at that time, and the false statement that the heart attack caused there to be a time today when he died. Strictly speaking, definition 4. I8 does not make sense because almost causal sufficiency has been defined as a relation between individual propositions and not as a relation between sets of propositions and propositions. However, this is trivially rectified by simply replacing 'P' by 'HAin definition 4.10.
CHAPTER VIII
PROBABILITIES
At this point I must be candid and admit that everything that has come before may be just so much science-fiction. This is because there may be no true, exceptionless, laws. It may be that all putative laws are really just approximations to general probability statements. For example, this is the way much of quantum mechanics has gone. Should it come to pass that there really are no true general laws, then the counterfactual conditionals, necessitation conditionals, and causal statements that we base upon putative general laws will all be false. All of the logical framework developed above will become of only theoretical interest. In this eventuality, our law statements and counterfactuals must be replaced by various kinds of probability statements. Many philosophers are prone to suppose that the situation I have just described in which there are no genera1 laws is almost certainly the situation we are really in. They defend this view by pointing in an off-hand way to quantum mechanics and observing that the probabilistically governed behavior of elementary particles must underly everything else. However, this involves at least a partial misconception of the nature of quantum mechanics. l t is certainly true that some of the fundamental laws of quantum mechanics are probabilistic, but it is often overlooked that quantum mechanics embodies lots of other laws that are not probabilistic. Among these are conservation laws (the conservation of energy, momentum, angular momentum, spin, strangeness, etc.), laws regarding relativistic transformations, laws regarding the charges, masses, spins, etc., of elementary particles, laws governing weak, strong, and Coulomb forces, and so on. There are actually more non-probabilistic laws in quantum mechanics then there are probabilistic laws, and the prospects of this changing seem remote. It would alter the entire character of quantum mechanics if it were decided, e.g., that the charge on an electron is not fixed but is instead represented by some sort of probabilistic distribution centered around what is presently believed to be the charge of an electron.
PROBABILITIES
189
The situation I have just described in quantum mechanics, in which we have a mixture of probabilistic and non-probabilistic laws, seems rather likely to be representative of the true state of the world (although I am not totally convinced that the world is not governed instead by strictly deterministic laws). It will turn out that even in this case some changes are required in our account of subjunctive conditionals. In Chapters IV and VI I implicitly made the simplistic assumption that there were no basic probabilistic laws. In this chapter we will investigate what happens if we reject that assumption. Probability enters into our account of subjunctive conditionals in two ways. First, countenancing basic probabilistic laws will force us to modify our definition of the relation '/3Mmq9.We must get clear on the logical nature of basic probabilistic laws and see how they are involved in the relation M. Second, it will turn out that many subjunctive conditionals that we are apt to assert are not really true. What are true instead are a variety of probabilistic statements like rIf it were true that P, then it would probably be true that Q1, rIf it were true that P, it would almost certainly be true that Q1, etc. We must get clear on the nature of these probability statements and how they are related to subjunctive conditionals. These will be the two tasks undertaken in this chapter.
Basic probabilistic laws report 'indefinite probabilities'. An indefinite probability statement has the form rThe probability of an A being a B is rl. Indefinite probability statements do not report the probability of a proposition, but rather concern predicates or open formulas. The probability of an A being a B might reasonably be symbolized as rprob(Bx/Ax)l.
2.1. Relative Frequencies The traditional view on indefinite probabilities is that they are relative frequencies or limits of relative frequencies. If there are just n A's, and m of them are B's, then prob(Bx1Ax) = mln. In this case, the indefinite
190
CHAPTER V I I I
probability is identified with the relative frequency of B's in A's. If there are infinitely many A's, and infinitely many of them are B's7 then the relative frequency of B's in A's is no longer well-defined? so instead the traditional view identifies prob(Bx1Ax) with the limit of the relative frequency of B's in larger and larger finite subsets of the set of all A's. In this case7 prob(Bx1Ax) is said to be the limit of the relative frequencies. I suspect that most people find the above view unobjectionable in the case in which there are only finitely many A's. But there is a well-known problem for the case in which there are infinitely many A7s. If there are infinitely many A's, then we are supposed to consider a sequence Ao7A l , . . . , A,,, . . . of finite subsets of A such that (1) for each i7 j, if i < j then At G A,? and (2) UIEW Al = A, and then letting Car(X) be the cardinal of a set X, we define: p r o b ( ~ x / ~=x lim ) i+a
Car(Ai fl B) Car(Ai) '
The problem is that the limit of relative frequencies that we get in this way depends upon the particular sequence {Ai; i E w } that we choose. It is quite possible to choose another sequence {AT; i E w } of subsets which gives a different limit for the relative frequency. For example? consider a very durable coin which lasts forever and? over all time, is tossed infinitely many times. Letting H be the set of tosses of this coin which comes up heads, and letting Ai be the set of the first i tosses of this coin? it is quite plausible to suppose that
But now, suppose we define A ? to be the set of the first i tosses resulting in heads and the first 2i tosses resulting in tails. Then we still have A ? = A? but now
uiEW lim
i-+a
Car(A ? n H) = 113. Car(A ?)
Thus we cannot define prob(Bx1Ax) in terms of just any sequence of subsets {Ai; i E m } . Perhaps it is generally supposed that the solution to this p-roblem is
PROBABILITIES
191
that certain sequences of subsets of A are 'natural', and others are 'unnatural'. For example, it does seem that the sequence {A,; i E m } above was natural7 and the sequence {A:; i E c i ~ }was unnatural. Thus it may be felt that if we can somehow characterize which sequences are natural, we can solve the problem of how to define prob(Bx1Ax) in terms of limits of relative frequencies. I think, however7 that there are at least three reasons why this will not work. First, in the infinite case again, if the infinitude of A's is of a temporal origin, there being only finitely many A's at any one time but infinitely many A's over all time, then just as in the case of the coin there is a natural sequence of subsets defined by considering all the A's existing in progressively broader temporal intervals. But suppose instead that there are infinitely many A's at a single instant. For example. astronomers once believed that there were infinitely many stars, from which it would also follow that there are infinitely many elementary particles, probably infinitely many atoms and molecules, perhaps infinitely many rocks, trees. clouds, etc. In this case, choosing any particular sequence of subsets over all others seems quite arbitrary. Of course, certain sequences of subsets artificially contrived for the sole purpose of getting strange limits of relative frequencies do seem unnatural, but this rules out very few sequences of subsets. Among those sequences of subsets which are not so artificially contrived, we may not be able to prove that they lead to different limits of relative frequencies, but there is no a priori reason to think that they will not. The difficulty in defining the limit of the relative frequency in the infinite case is, I suspect? insurmountable. However. surprisingly enough7 there is an even more obvious difficulty for the finite case. Consider a particular coin which is subjected to very careful physical examination and ascertained to be a completely fair coin. That is, we may conclude on the basis of our knowledge of physics that the probability of a flip of this coin resulting in heads is 112. But suppose the coin is never flipped. It is melted down shortly after the examination. Then the relative frequency does not exist, but we would still regard the probability statement as true. Or suppose the coin is flipped just once, resulting in a head7 and then melted down. Then the relative frequency is l 7but we would still insist that the probability is 112. It is
192
CHAPTER VIII
just not true that in the finite case, the probability of an A being a B is the same thing as the relative frequency of B's in A's. What is happening here is that, just as in the case of non-probabilistic laws, these indefinite probability statements are subjunctive in character. They are not just about actual A's, but also about physically possible A's. To say that the probability that a flip of our coin will result in heads is 112 is to make a statement about all possible flips of the coin, and not just about those few flips that actually take place. If the coin is only flipped a few times, then the relative frequency will almost certainly not be the same as the probability. Furthermore, the probability statement supports conclusions about physically possible flips of the coin that do not actually occur. E.g., we can conclude from the probability statement that if, instead of melting the coin down, we had flipped it one hundred more times, out of those one hundred flips we would probably have gotten in the vicinity of fifty heads. 2.2. Subjunctiue Indefinite Probabilities What the relative frequency theory overlooks is that our indefinite probability statements are subjunctive in the same way subjunctive generalizations are. As such, they cannot be identified with summations of actual indicative states of the world, any more than subjunctive generalizations can be identified with material generalizations. In fact, as we will see below, subjunctive generalizations can be regarded as limiting cases of indefinite probability statements. Just as there is a distinction between strong and weak subjunctive generalizations, there is a distinction between two kinds of indefinite probability statement. Consider our infinitely durable coin which is flipped infinitely many times. Let us suppose that physical examination shows it to be perfectly balanced and shaped, etc., so that the probability of a flip yielding a head is 112. Let D be a description of all of the relevant physical characteristics of the coin. Thus what we have is that the probability of a flip of a coin of description D yielding heads is 112. Now let us suppose in addition that there is just the one coin of description D. Furthermore, suppose that over the years a secret society has grown up which worships this particular coin. It is part of their mystical beliefs that only 113 of the flips of this coin should yield
PROBABILITIES
193
heads, and so to ensure this they enclose the coin in an infinitely durable coin-flipping machine which, through the use of magnetic fields, biases the flips in such a way that the relative frequency of heads always hovers right around 113. What are we to say now about the probability of a flip of a coin of description D yielding a head? O n the one hand it is reasonable to say, on the basis of the physical description D, that the probability is 112. After all, as I have argued, the probability statement is as much about physically possible tosses of coins of description D as it is about actual tosses of such coins, and it is only an accident that there is only one coin of description D and it finds itself in such peculiar circumstances. On the other hand, it also seems reasonable to say that, because there is only one such coin and it is enclosed in the sacred machine, the probability is only 113. This is still a subjunctive statement because even if the coin had been flipped more times that it actually was, it would still have been in the machine and hence the relative frequency would still have remained around 113. I think that what is happening here is that we must make a distinction between two kinds of indefinite probabilities - strong ones and weak ones. The distinction is parallel to the distinction between strong and weak subjunctive generalizations. The strong subjunctive generalization "(Fx 3 Gx)l is about all physically possible circumstances in which we might encounter an F. On the other hand, the weak generalization "(Fx ^> Gx)' is about a more narrowly circumscribed set of circumstances - the 'actually possible' circumstances in which we might encounter an F. As we have seen, weak generalizations are parasitic upon strong generalizations. "(Fx ^> Gx)l is true because of certain physically contingent facts P about the world, and what makes "(Fx ^> Gx)' true is that "[(Fx & P ) 3 Gx)' is true. For example, given Chisholm's bottle of rat poison, what makes it true that anyone who drank from this bottle would be poisoned, is that the bottle does contain rat poison and that the strong generalization 'Anyone who drank from a bottle containing rat poison would be poisoned' is true. Analogously, there are two kinds of subjunctive indefinite probability statements. On the one hand there are strong indefinite probability statements, henceforth symbolized "probs(Gx/Fx) = rl, which take into account all physically possible circumstances in which there could be
194
CHAPTER VIII
an F. These probability statements are physically necessary - as they take into account all physically possible circumstances, they will remain true in all worlds having the same physical laws. On the other hand, there are weak indefinite probability statements, symbolized probw(GxlFx) = r', which rely for their truth on physically contingent facts about the world. Just as in the case of weak subjunctive generalizations, weak indefinite probability statements are parasitic upon strong indefinite probability statements. In general, if P is the physically contingent proposition the truth of which makes the weak probability what it is, then probw(Gx/Fx) =probs(Gx/(Fx & P)). For example, in the circumstances described above, the weak probability of a flip of a coin of description D yielding a head is the strong probability of a flip of a coin of description D yielding a head under the circumstances described. Thus in the above circumstances, the strong probability of a flip of a coin of description D yielding heads is 112, but the weak probability is 113. It must be pointed out that there is a third kind of indefinite probability statement. For example, consider an urn containing seven white balls and three black balls. We want to say that the probability of a ball in this urn being black is 3/10. However, this probability is not subjunctive. It is about a fixed set of objects (the objects actually in the urn), and as such it implies nothing about the probability of other balls being black if they were in the urn. This is a material indefinite probability statement. Let us symbolize these as rprobM(Gx/Fx) = rl. It is worthwhile to compare our three kinds of indefinite probability statements with the three kinds of generalizations we have encountered. There are strong subjunctive generalizations, weak subjunctive generalizations, and material generalizations, and correspondingly there are strong indefinite probability statements, weak indefinite probability statements, and material indefinite probability statements. The generalizations are, in the following sense, limiting cases of the corresponding probability statements:
(2.2)
(Fx => Gx) -* probw(Gx/Fx) = 1.
(2.3)
(x)(Fx 3 Gx) -^ probM(Gx/Fx)= 1.
PROBABILITIES
195
For familiar reasons, these entailments do not go in the other direction. For example, if h is a real-valued function of F's, and the values of h are evenly "distributed over some finite interval, then for each particular r in that interval, probw(h(x) = r/Fx) = 0, but this does not imply that (Fx => h(x) # r). To suppose otherwise would lead to the absurd result that ( y)(Fx ^> h(x) # y). Some unrepentant frequentists, having noted material indefinite probabilities, may assert that those are the probabilities they have been talking about all along. But they would be wrong. For example, consider the urn with seven white balls and three black balls. We can talk about the material probability of a ball in the urn being black, but that is not the probability concerning that urn which philosophers have generally wanted to talk about. It has been far more common for them to talk about the probability that drawing a ball from that urn will produce a black ball. This is an altogether different probability. To begin with, it is subjunctive. It is not just about all actual draws that have been performed, but also about what would happen if there were more draws. Furthermore, this probability need not be the same as the material probability of a ball in the urn being black. For example, if the black balls are a different size than the white balls, this may affect the probability of a black ball being drawn, but it does not affect the probability of a ball in the urn being black. Although material probabilities make sense (at least in the finite case - the infinite case is problematic), I think it must be concluded that they are not of much use and that they are not the probabilities in which we are generally interested. 2.3. Strong Indefinite Probabilities As we will see in the next section, weak indefinite probabilities can be defined in terms of strong indefinite probabilities. So now let us examine strong indefinite probabilities. ^probs(Gx/Fx)-' will not be defined for all predicates F and G. probs(Gx/Fx) is intended to be an indefinite probability, but there are clear examples of predicates F and G for which probs(Gx/Fx), if it were defined, would be a definite probability rather than an indefinite probability. For example, probs(Ha/x = a ) , if it were defined, would be the probability of the
196
CHAPTER V I I I
proposition Ha. Of course, definite probabilities make perfectly good sense, and will be discussed in due course, but the point is that they are not what the function probs gives us. Probs is an indefinite probability. Accordingly, there must be some restrictions placed on the arguments of probs. Unfortunately, it is not at all obvious what the requisite restrictions are, and I will not attempt to elicit them here. But one thing at least seems clear. If there is an E such that probs(Ex/Fx) exists, and if there is an H such that probs(Gx/Hx) exists, then both probs(Gx/Fx) and probs(Fx/Gx) exist. This is to say that there is a-set 77 of predicates such that probs(+/
+,
D
physically possible F's that are G's. This would naturally lead us to suspect that probs(Gx/Fx) does not exist if -0(3x)Fx. But that would P
be a mistake. For example, 'black holes' are supposed to be astronomical phenomena resulting from the gravitational collapse of stars and consisting of regions of space from which nothing (not even light) can escape. Thus it is supposed to be physically impossible to retrieve something from a black hole. Nevertheless, it makes perfectly good sense to talk about the probability of heads resulting from a flip of a coin of physical description D which was retrieved from a black hole. If D ensures that it is a fair coin, then the probability is 112, Even in the case in which -O(3x)Fx, probs(Gx/Fx) exists, although this is perhaps best viewed as a convention. At least ordinarily, we want to say that the logical entailment of Gx by Fx ensures that probs(Gx/Fx) = 1. If -0(3x)Fx, then (x)(Fx -> Gx), and so for any G at all in IT, we should have probs(Gx/Fx) = 1. Our indefinite probabilities are conditional probabilities. They are probabilities of certain things given other things. The traditional approach to conditional probabilities was to define them in terms of non-conditional or absolute probabilities. The absolute probability prob(Fx) is supposed to be the probability of anything at all being an F. Absolute probabilities can be defined easily enough in terms of conditional probabilities: prob(Fx) = probs(Fx/Gx v
- Gx).
PROBABILITIES
197
The traditional move was to somehow start with absolute probabilities and then define conditional probabilities as follows: probs(Gx/Fx)
=
prob(Gx & Fx) prob(Fx) '
The recognized difficulty with this approach is that it does not work in the case in which prob(Fx) = 0. In that case, probs(Gx/Fx) would be left undefined. Hence this approach cannot handle the case in which F is logically impossible, and presumably cannot handle the case in which F is physically impossible. But I think it is generally supposed that this traditional approach will work for any more normal case. In fact, however, I think that absolute probabilities really make very little sense. If we define them as above (and really mean what we say so that they are not conditional on some additional implicit and nontautological condition like that of being a physical object), then the absolute probability for any normal predicate will be zero. For example, the probability of a thing being red given that it is anything at all (including sets, real numbers, transfinite ordinals, etc.), is surely zero. Thus I think it is best if we just forget about absolute probabilities and rest content with conditional probabilities. Let us look more closely at what it means to say that probs(Gx/Fx) = r. As a first approximation, we want to read this as saying that the proportion of physically possible F's that are G's is r. But what does this mean? To talk about physically possible F's is, presumably, to talk about the physically possible ways of being an F. We can regard a way of being an F as a maximal consistent set of predicates which contains F. Let us define: (2.4)
fl = {A ; (D)[D is a predicate 3 ( D e A vr-D1e A)] & 0 ( 3 x ) ( D ) ( De A
3
Dx)}.
P
Then the set of all physically possible ways of being an F is:
Then it is natural to suppose that for each F in J7, there is an additive measure function u, defined on (certain) subsets of [F] such that probs(Gx/Fx) = p([Fl n[GI).
198
CHAPTER VIII
The above approach seems to work as long as 0(3x)Fx, but if P
-0(3x)Fx it would lead to probs(Gx/Fx) being undefined (because [F]= 0 ) . If F is physically possible, then instead of talking about physically possible ways of being an F (there aren't any), it seems natural to talk counterfactually about what would be physically possible ways of being an F if F were physically possible. This is on the right track, but is still not quite right. The difficulty is that there might be different ways of making F physically possible. For example, if F is a disjunction " F i x v F2x1 where 'Fix1 contradicts one basic law and F 2 x 1 contradicts another basic law, then we could make F physically possible by rejecting either of these laws. But which of these two laws we reject may lead to quite different maximal consistent sets of predicates being physically possible, and there may be no such sets containing F which would be physically possible regardless of which law we reject. Thus we do not want to talk about what maximal consistent sets of predicates would be physically possible ways of being an F if F were physically possible; instead we want to talk about what maximal consistent sets of predicates might be physically possible ways of* being an F if F were physically possible. In other words, in computing probs(Gx/Fx), we take into account all of the physically possible way of being an F that would result from each of the ways of making F physically possible. Thus we define: (2.6)
OF = {A ; (D)[D is a predicate 0(3x)(D)(D 6 A
3
3
( D 6 A v "-Dl6 A)] &
Dx) &
-[O(3x)Fx > -0(3x)(D)(D <= A 3 Dx)]}. P
P
(2.7)
[GIF ={A; A
â
OF & G 6 A}.
Then probs(Gx/Fx) is supposed to be a measure of the proportion of things in [FIF that are also in [GIF. This means in particular that for each F in J7, there is an additive measure function up such that PF([F]F) = 1, and for each G in J7 probs(Gx/Fx) = /.&p([G]pfl [FIF). This works for all cases except that where [FIF=0. It is easily demonstrated that IFIF= 0 iff -0(3)Fx, so in this case it seems best to simply stipulate that, since (x)(Fx + Gx), probs(Fx + Gx) = 1.
PROBABILITIES
199
We cannot get by with a single measure function p and define
as has been the traditional assumption when dealing with probability measure-theoretically. This is because such a function p would have to give a finite value to every subset of {I in its domain, and so in particular, p ( [ G v -G]rGv_o-) = @({I) must be finite. But this will have the result that for many logically contingent F's, u([F]F) = 0, and hence the above definition will not work. Instead, we must have a whole class 8 of measiire functions. For each X c {I, if X # 0 and X lies in the domain of some member of 8, then there should be a unique p in 8 such that X is in the domain of p and p ( X ) # 0. Let px be this unique p. Then we define:
Finally then, we can define:
This measure-theoretic characterization of probs looks at least superficially like Carnap's 'logical' interpretation of probability (although Carnap was working with definite probabilities rather than indefinite probabilities). However, there is a very important difference here. This is that the class 8 of measure functions is not a priori definable. This can be seen as follows. In general, there will be two distinct elements that contribute to the value of probs(Gx/Fx). O n the one hand, there are strong subjunctive generalizations ('deterministic laws'), and they contribute by determining what physically possible ways there are of being an F, and hence determine membership in [GIF and [FIF. Different deterministic laws will leave the class 8 of measures unchanged, and will affect probabilities only by altering what sets are being measured. But on the other hand, there may also be some probabilistic laws. These cannot alter membership in [GIF or [FIF because they do not render anything physically impossible - only more or less improbable. Consequently, probabilistic laws must alter probabilities by altering the measure functions themselves. They result in
200
C H A P T E R VIII
the same physical possibility receiving different weights. For example, let us suppose there are no deterministic laws, so that everything that is logically possible is physically possible. Then different probabilistic laws will all operate on the same sets of possible ways of being an F or a G, ([GIF = [GI and [ F I p= [F]), and hence can only result in different values for probs(Gx/Fx) by assigning different measures to [GI and [F]. This means that the contents of the set 8 are contingent, depending upon what probabilistic laws there are, and hence cannot be defined a priori. O n the other hand, if we suppose that all laws are deterministic, that there are no probabilistic laws, then 8 becomes fixed, and differences in probability can only be generated by differences in membership of [GIp and [FIF. In this case, it is not totally implausible to suppose that 8 really is a priori definable, although I haven't the faintest idea how one might set about trying to define it. Let us call this set of measure functions the set gL of logical measure functions. These measure functions are intimately connected with Carnap's 'logical conception of probability'. Many of the difficulties to which Carnap7s approach led can be traced to his supposition that we could define our probabilities in terms of a single logical measure rather than employing a whole class of measure functions. For example, if X is finite, then it seems reasonable that the logical measure should simply count the members of X. However, if X is infinite, then we need a different measure function, and there is no obvious connection between this measure function and the measure functions for smaller sets. We need a whole hierarchy of logical measure functions, and they seem to be largely independent of one another. The problem of how to define the logical measure functions seems extremely difficult. Unless we somehow know that there are no probabilistic laws, we cannot employ the logical measure functions directly in computing probabilities. Probabilistic laws force us to adopt a new set of measure functions - the 'empirical' measures. To this extent, the traditional 'logical conception of probability' is bankrupt. Nevertheless, when we turn to the discussion of probabilistic laws, we will find that the logical measure functions do play a central role even when our measure functions are not those in %=. Our measure-theoretic characterization of probs immediately im-
20 1
PROBABILITIES
plies that it satisfies all of the following conditions, which will henceforth be called 'the measure-theoretic axioms':
. P
(2.10)
0 5 probs(Gx/Fx) 5 1.
(2.1 1)
(x)(Fx -+ Gx) v o b s ( G x l F x ) = 1.
(2.12)
(x)(Fx
(2.13)
O(3x)Fx 3 probs(-Gx/Fx) = l -probs(Gx/Fx).
(2.14)
(x)(Fx + Gx) = ~ ~ ~ o b s ( F x /
-
,&
Gx) -3 probs(Hx/Fx) = probs(HxlGx).
It is generally supposed that conditional probability satisfies an additional condition, which can be formulated by any of the following axioms: (2.15)
If (x)(Fx + Gx) and (x)(Gx -^ Hx), and probs(Gx/Hx) # probs(Fx/Hx) 0, then probs(Fx/Gx) = probs (GxIHx) '
(2.16)
If (x)(Gx -> -Hx), then probs(Fx/(Gx v Hx)) = probs(Fx/Gx)-probs(Gx/(Gxv Hx))
+probs(Fx/Hx)~probs(Hx/(Gxv Hx)).
These 'product axioms' are interderivable given the measure-theoretic axioms, and together they would give us all of the normal principles of the probability calculus. In particular, they imply the axioms of Popper (1955) and (1959). Unfortunately, the product axioms are all false. For example, consider 2.15. By- our definitions, given the antecedent of 2.15, we have:
These are only the same if [FIG =[FJHand [GIo = [GIH, i.e., if the same worlds might be physically possible if r(3x)Gx1 were physically
y
202
CHAPTER VIII
possible and if r(3x)Hx1 were physically possible. This is the same as requiring that fla = &. This will be true if r(3x)Gx1 and "(3x)Hx1 are already physically possible, or more generally if
with this qualification, 2.15 becomes true: (2.15*) If (x)(Fx + Gx) and (x)(Gx + Hx) and (O(3x)Gx> P
O(3x)Hx) and (O(3x)Hx > 0 ( 3 x ) G x ) and probs(Gx/Hx) # P
P
0, then
Similar qualifications lead to true versions of 2.16 and 2.17. In particular, the product axioms are all true if we suppose that r(3x)Fx1, (3x)Gx1, and r(3x)Hx1 are all physically possible. It has been remarked that strong subjunctive generalizations can be regarded as a kind of limiting case of strong indefinite probabilities. Part of the meaning of this claim is provided by principle 2.1, according to which (Fx Gx) + probs(Gx/Fx) = 1. However, as we remarked earlier, this entailment does not go the other way. The precise sense in which generalizations are limiting cases of probabilities is provided by the following theorem:
+
(Fx & Hx)) = l]}' Proof: From left to right follows immediately from theorem 5.20 of Chapter VI according to which if "(Fx 3 Gx)' is true then O(3x)Fx > [0(3x)(Fx & Hx) 3 [(Fx & Hx) 3 Gxll P
P
together with principle 2.1. Conversely, suppose the right side holds. Suppose (2.19)
(H)[0(3x)(Fx & Hx) =>probs(Gx/((Fx& Hx)) = 11. P
203
PROBABILITIES
Suppose 'O(Sx)(Fx & P
- Gx)'.
probs(Gx/(Fx &
contradicts 2.19. Thus 2.19 entails 'U(x)(Fx
='
- Gx)) = 0 # 1, which Gx)', and hence the
P
right side of 2.18 entails r[0(3x)Fx >\Z\{x)(Fx2 Gx)]'. By theorem P !' 5.21 of Chapter VI, this is equivalent to '(Fx 3 Gx)'. The above remarks have been intended to elucidate the intuitive meaning of strong indefinite probability, and to elicit the logical properties of probs, but they cannot be regarded as constituting an analysis. We gave a measure-theoretic characterization of probs, but that is not an analysis because the measure pp is not constructed a priori. In fact, the only obvious way to define p~ is in terms of probs itself. If we cannot analyze probs measure-theoretically, how are we to analyze this concept? This is not a problem which I am now prepared to solve, although I can make some remarks which may, hopefully, point in the direction of a solution. The traditional attempts to define indefinite probability in terms of the limit of relative frequencies fail because of the subjunctive nature of non-material indefinite probabilities. Strong indefinite probabilities are intimately connected with strong subjunctive generalizations, the latter being a limiting case of the former. When it came time to analyze subjunctive generalizations, it was almost obviously hopeless to try to give a reductive analyses which would state their truth conditions in terms of other simpler concepts. Instead we sought an analysis in terms of their justification conditions. I argued at great length in Pollock (1974) that those intransigent concepts which resist truth-condition analysis can instead be analyzed by giving an account of how we operate with them, or more precisely, by giving an account of how we become justified in holding beliefs involving those concepts. In the case of subjunctive generalizations this came down to giving an account, first, of how basic subjunctive generalizations are confirmed directly by induction, and second, of how other subjunctive generalizations are derived from the basic ones. I believe that the same sort of account will work for strong indefinite probability. The justification conditions for this concept are of basically the same structure as those for its limiting case, the strong subjunctive generalization. There are two kinds of strong indefinite
204
CHAPTER V I I I
probability statements. On the one hand there are those basic indefinite probability statements that can be confirmed directly by induction, and on the other hand there are non-basic indefinite probability statements that are derived from the basic ones. Let us begin by looking at the basic indefinite probability statements. Herein lies the initial plausibility of the frequentist position. It is, I think, inescapable that subjunctive indefinite probabilities cannot be defined in terms of relative frequencies, but relative frequencies must be connected with probabilities somehow. And I think it is rather obvious what the connection is: observed relative frequencies provide the inductive grounds upon which we base our estimates of probabilities. As in the case of subjunctive generalizations, basic indefinite probability statements must involve pairs of projectible predicates. And, as is always the case with inductive grounds, the confirmation is defeasible. However, the defeaters involved in the confirmation of indefinite probability statements seem to be considerably more complicated than those involved in the confirmation of subjunctive generalizations. I will not attempt to spell them out here. Turning next to the question of how we obtain non-projectible indefinite probability statements from the projectible ones, we come to an extraordinarily difficult problem. Sometimes we can simply use the laws of the probability calculus and calculate the probabilities for non-projectible predicates directly, but this is only a very small part of the story. It will become apparent in section three that concealed in this question are many of the most important and difficult problems of probability theory. Although I am unable to give an account of the kind required, I do want to claim that when such an account is given, we will then have given an analysis of the meaning of strong indefinite probability statements. This will be no less an analysis than a truth-condition analysis would be. However, I am confident that a truth-condition analysis is impossible.
2.4. Weak Indefinite Probability Statements Between strong and weak indefinite probability statements, the weak ones are commonly of more interest to us. We are generally more
PROBABILITIES
205
interested in how likely an F is to be a G given the way the world actually is than we are in how likely an F is to be a G in all physically possible worlds. We can give a measure-theoretic characterization of probw(Gx/Fx) that is completely analogous to our measure-theoretic characterization of probs(Gx/Fx). Just as probs(Gx/Fx) is supposed to be the proportion of physically possible ways of being an F that are also ways of being a G, so probw(Gx/Fx) is supposed to be the proportion of actually possible ways of being an F that are also ways of being a G. Thus we can define: (2.20)
& fl%={A; (D)[D is a p r e d i c a t e a ( D e A v r - D ' e A ) ] 0 ( 3 x ) ( D ) ( D A 3 Dx) & -[0(3x)Fx > - 0 ( 3 x ) ( D ) ( D e a a A =3 Dx)]}.
(2.21)
[GI:=
{A; A E 0: & G e A}
Then there is a set fJ*of measures such that for each X, if X # 0 and X is in the domain of some member of fJ*,then there is a unique p in B* such that X is in the domain of p and p ( X ) 7'0. Let pg be this unique p. In Section 3.2 below, I will argue that in general, $ x - PX, because these measures reflect the same probabilistic laws. It will be maintained that there are not two kinds of probabilistic laws, one involved in strong probabilities and the other in weak probabilities. On this assumption, we have:
As before, this gives us most of the normal principles of the probability calculus for probw. The measure-theoretic axioms are true, and restrictions analogous to those involved in probs give us qualified versions of the product axioms. There is also another way of characterizing probw which is in many ways more interesting. Weak indefinite probability statements and weak subjunctive generalizations are very much alike, as indeed they should be given that the latter is really a kind of limiting case of the former. In each case they are made true by some physically contingent facts about the world. We found that weak subjunctive generalizations
206
CHAPTER V I I I
can be defined in terms of strong subjunctive generalizations: (2.24)
(Fx => Gx) = (3P)(3A){(x)(xe A = x = x) & (Fx & P. 3 Gx) & PE(3x)(Fx & xs^A)}.
Can we, in some analogous way, define weak indefinite probability statements in terms of strong indefinite probability statements? It seems we can. Let P be that physically contingent proposition the truth of which is responsible for the weak probability being what it is. Then, analogous to the condition in 2.24 that '[(Fx & P ) 3 Gx]' be true, we have the condition that probw(Gx/Fx) = probs(Gx/(Fx & P)). We also need a restriction on P analogous to that contained in 2.24. To see what this restriction should be, consider the case of a coin whose physical characteristics are such as to make it a fair coin. Then we want to say that the probability of a flip of this coin at this time yielding a head is 112. Let P be the statement describing the physical characteristics which make the coin a fair coin. Clearly it is required of P that it would be true even if the coin were flipped, i.e., ^PE(3x)Fx1 is true. However, the condition ^PE(3x)Fx1 is clearly not sufficient by itself. If the coin is in fact being flipped at this time, then the condition PE(3x)Fx1 is automatically true because both P and ^(3x)Fx1 are true. We must require of P that it would still be true even if a different flip of the coin occurred. But this just means that ^PE(gx)(Fx & xs^A)' should be true, where A is the set of actually existing things. Thus our restriction becomes the same as in 2.24. This suggests that we should have the following: r & (x)(x e A =x= x) & PE(3x)(Fx & xs^ A)]. Principle 2.25 is precisely analogous to 2.24. However, because we are now dealing with probabilities rather than strict generalizations, a further difficulty remains for 2.25 that did not occur in connection with 2.24. This is that the function probw is not well defined because there could be two propositions P and P * satisfying the constraints of 2.25 but such that probs(Gxl(Fx & P)) ?'probs(Gx/(Fx & P*)). In such a case we would say that at least one of the two propositions P and P* has not taken into account all of the relevant information. Intuitively, we want to do something like taking the conjunction of all of the
PROBABILITIES
207
propositions P satisfying the constraints of 2.25, and then look at the strong probability of an F being a G given that that conjunction is true. However, taking such an infinite conjunction is only a heuristic approximation to the right idea. One might question whether such infinite conjunctions really make sense. It seems that what we really want to require of P is that it be strong enough to take into account all of the relevant information, in the sense that if there is another proposition P * also satisfying the constraints of 2.25 but yielding a different probability, then P + P* and hence takes account of its information. This condition is still a bit too strong, because given a P* of this sort and an arbitrary irrelevant proposition Q such that rQE(3x)(Fx & xef A)' is true, we would have probs(Gx/(Fx & P*))= probs(Gx/(Fx & P* & Q)), and hence the above condition would require that P + Q. This would lead to the result that P would have to entail every irrelevant proposition Q satisfying the constraint that "QE(3x)(Fx & xef A)' is true. This is much too strong a requirement. We can avoid requiring P to entail such irrelevant propositions by only requiring P to entail P * if there is no R such that P* entails R but R does not entail P * and ^RE(3x)(Fx & xef A)' is true and probs(Gx/(Fx & R)) =probs(Gx/(Fx & P*)). If there is such an R, then P * is an irrelevant strengthening of it. So our analysis becomes: (2.26)
probw(Gx/Fx) = r = (3P)(3A)[(x)(x A = x = x) & probs(Gx/(Fx & P)) = r & (P*){[P*E(3x)(Fx & xe?A ) & probs(Gx/(Fx & P*)) # probs(Gx/(Fx & P)) & PE(3x)(Fx & xef A ) & -(3R)((P* + R ) & (R 4 P*) & probs(Gx/(Fx & R ) ) = probs(Gx/Fx & P*)))}3 ( P -Ã P*)].
But there remains one final difficulty. There seems to be no logical guarantee that there is such a maximally strong P. It seems logically conceivable for there to be an infinite progression of stronger and stronger P's without limit each giving a different probability.3 In such a case, it would be unreasonable to identify probw(Gx/Fx) with the probability given by any of the P's. In such an eventuality, there would be no P by virtue of which there would be a weak indefinite probability different from the strong indefinite probability, and so we should
208
CHAPTER VIII
have probw(Gx/Fx) = probs(Gx/Fx). Thus our final analysis becomes:
(2.27)
probw(Gx/Fx) = r iff either (i) (3P)(3A)[(x)(x A = x = x) & probs(Gx/(Fx & P ) ) = r & PE(3x)(Fx & x e A ) & (P*){[P*E(3x)(Fx & xs^ A ) & probs(Gx/(Fx & P*)) # probs(Gx/(Fx & P)) & -(3R)((P* + R ) & (R M *probs(Gx/(Fx ) & R)) = probs(Gx/(Fx
& P*))} 3 ( P + P*)]; or
(ii) -(3P)(3A)[(x)(x e A = x = x) & PE(3x)(Fx & x&A ) & (P*){[P*E(3x)(Fx & xs^A ) & probs(Gx/(Fx & P*)) # probs(Gx/(Fx & P))
&-(3R)((P*
-+
R ) & ( R + P*)
& probs(Gx/(Fx & R)) = probs(Gx/(Fx & P*))} 3
( P + P*)] & probs(Gx/Fx) = r.
One of the major reasons for discussing subjunctive indefinite probabilities at such length is that it seems they should be relevant to the definition of the relation M which is involved in the analysis of subjunctive conditionals. We have entertained the hypothesis that there are fundamental laws of nature which are essentially probabilistic. For example, we may have a law which tells us that if an electron is in state S , at time t, then the probability is 113 that it is in state S2 at time t At. Given such a probabilistic law, it would seem to follow that if an electron is in state Si at time t, then it might be in state S2 at time t At. Non-zero probabilities arising from fundamental probabilistic laws create 'might be's'. This is a genuinely new source of 'might be's'. Without probabilistic laws, the only way a 'might be7 can arise is when P counter-implicates a conjunction of true propositions without counter-implicating either conjunct.
+
+
PROBABILITIES
209
Let us see how we can make all of this precise. We want to say, roughly, that if Q is false but prob(Q/P) 7' 0, and this probability arises from fundamental probabilistic laws, then QMP, and hence there must be a world fl such that flMP and Q is true in fl. The first thing to notice here is that the probability prob(Q/P) to which we are appealing is not an indefinite probability. It is the probability of a proposition Q given a proposition P. This is a definite probability, and is something new which we have not yet defined. We must begin by constructing a definition of this probability function. 3.1. Strong and Weak Definite Probability
The definite probabilities in which we are interested are probabilities of propositions and arise directly out of the indefinite probabilities we have already discussed. These definite probabilities are based solely and exclusively on the indefinite probabilities. We obtain two kinds of definite probabilities depending upon whether they are based on strong indefinite probabilities or weak indefinite probabilities. We will call these 'strong-' and 'weak definite probability' respectively. We will symbolize them using the same symbols we used for the indefinite probabilities: 'probs(Q/P)' and 'probw(Q/P)^. However, there is no danger of confusing the definite probabilities with the indefinite probabilities, because the arguments for the former must be propositions whereas the arguments for the latter must be predicates. We want ' p r ~ b ~ ( Q / Pto) ~be a definite probability based solely on what strong indefinite probabilities there are. How is this t o be done? In a purely formal way, I think that this is quite simple. First define: (3.1)
T(P) = the set of all possible worlds in which the proposition P is true.
Then as a first approximation, we can define quite simply:
In other words, the strong probability of Q given P is simply the strong probability of a world in which P is true being a world in which Q is true. Thus we reduce the definite probabilities to the indefinite probabilities in a very simple way.
210
CHAPTER VIII
It may seem that we are somehow cheating in this definition. I think we are cheating, but the cheating does not occur at this point but earlier on when we eschewed the attempt to give an account of how non-projectible indefinite probabilities are computed on the basis of projectible ones. Just what is concealed in that problem now becomes apparent. Hidden there is the whole problem of how one computes definite probabilities on the basis of inductively confirmed indefinite probabilities. The probability probs(x T(Q)/x e T(P)) is itself a perfectly reasonable indefinite probability. However, it is clearly not an indefinite probability whose value is subject to direct inductive confirmation. Thus its value must be indirectly determined by those basic strong indefinite probabilities which can be confirmed inductively. I will not try to say how this is done, because I do not know how it is done. This involves all the traditional problems of randomness, etc. Without a solution to this problem, we cannot claim to have given an analysis of probability. However, that is not my present purpose. My objective in this chapter is not the analysis of probability per se, but rather the exhibition of the intimate connections between probability and subfunctive conditionals, and we can accomplish the latter without a complete analysis of probability. Returning now to 3.2, we can see that it does not provide quite the definition we want. Intuitively, we want probs(Q/P) to be the proportion of physically possible worlds making P true which also make Q true, but our definition fails to give us this result. This can be seen by looking at our measure-theoretic analysis of probs. T(Q)']rXeTfpr consists of certain maximal sets of predicates including r x T(QV ~ which are such that it is logically possible for something to satisfy them all simultaneously. The only things that can satisfy r x T(Q)' ~ are possible worlds, so these are maximal consistent set of predicates of possible worlds. A predicate of possible worlds picks out a set of possible worlds. It is frequently maintained that any set of possible worlds determines a proposition. This might be denied on the grounds that propositions are 'intensional', and hence are only picked out by 'definable' sets of possible worlds. However, it is at least clear that any set of possible worlds picked out by a predicate does determine a proposition. Conversely, any proposition R determines a predicate of possible worlds, viz., rx e T(R)'. Consequently, the members of r x E
rx
21 1
PROBABILITIES
T(Q)l]rXeT(p)-' correspond one-one to maximal consistent sets of propositions, which in turn correspond to possible worlds. Thus we can regard r x â T(Q)l]rx<ET(pr as picking out a set of possible worlds in which Q is true. By definition 2.25, a member A of this set of possible worlds must satisfy the additional constraint that
We want this constraint to say that if P were physically possible, then A would be one of those worlds that might be physically possible. At first it looks like it does say this, but it doesn't. The difficulty is that "(3x)x T(PY simply says that there is a possible world in which P is true, which is to say "OP". Similarly, "('3x)(R) (rxe T(R)"e A 3 x e T(R))' say that there is a world making all of the propositions corresponding to A true at the same time, which is just to say that A determines a possible world. But we already know this. Consequently, our constraint is vacuous. But this means that ["x E T(Q)']rxfET(p)7 picks out all logically possible worlds in which Q is true, and not just the physically possible ones. Consequently, p r ~ b ~ ( ~ x T ( Q ) l / r x eT(P)') is the proportion of all logically possible worlds making P true which also make Q true, rather than being the proportion of all physically possible worlds making P true which also make Q true. Formally, our difficulty arises from the fact that 'O(3x)x e T(Q)' is P
equivalent to "OOQ1,and hence to "OQ1,whereas we would like it to P
be equivalent to "OQ1. The source of our difficulty lies in a strange P
equivocation of which philosophers are frequently guilty in connection with possible worlds. First consider possible men. Corresponding to each man is his 'diagram' - the set of all the predicates he satisfies. Talk about possible men is talk about possible ways men could be, which is to talk about all those diagrams which logically could be the diagrams of men. No one would confuse a man with his diagram. Non-actual men do not exist, but their diagrams do. Hence there is a clear distinction between saying that it is physically possible for there to be a man of a certain description and saying that it is physically possible for there to be a man-diagram containing that description.
212
CHAPTER V I I I
The latter is equivalent to saying that it is physically possible for it to be logically possible for there to be a man of a certain description (i.e., it is logically possible for there to be a man of that description) because (unlike men) man-diagrams exist just so long as they are logically consistent. Philosophers tend not to make the analogous distinction between worlds and world-diagrams. Instead, they generally identify a world with its world diagram. This is the customary procedure, and it has been our procedure throughout this book. But now that customary procedure has landed us in difficulty. We must distinguish between saying "It is physically possible for there to be a world in which Q is true1 and "It is physically possible for there to be a world-diagram containing Q1. As we have seen, the latter is equivalent to "It is logically possible that Q1, and given our identification of worlds with their diagrams, it is this rather than the former which is captured by It is now clear how to solve our problem. We must replace the predicate ^x T(Q)' by another predicate "Q(x)' such that ^(3x)Q(x)' says that Q is true (in the actual world) rather than that Q is true in some world or other. We can accomplish this without changing our formal definition of 'possible world' if we introduce the predicate "x is actual1 to mean "x is a (the) actual world1. Then we can define: (3.3)
probs(Q/P) = probs(x e T(Q)/(x e T(P) & x is actual)).
This has the desired result that probs(Q/P) is the proportion of physically possible worlds making P true which make Q true.4 Analogously, we can define:
This gives us the proportion of actually possible worlds making P true which make Q true. 3.2. Probabilistic Laws I have talked glibly about probabilistic laws as being probabilistic analogues of deterministic laws (subjunctive ge'neralizations). But what
PROBABILITIES
213
does this mean? We might naturally suppose that any strong indefinite probability statement expresses a probabilistic law, but it takes little reflection to see that such a view is mistaken. As we have seen, there are two contributing factors involved in generating strong probabilities - deterministic laws, which alter the contents of the sets [GIF by determining what worlds are physically possible, and probabilistic laws which dictate different weightings to the physically possible worlds. Even if there were no probabilistic laws (so our measures would be g L , the logical measure functions), there would still be strong probabilities. Thus strong probability statements do not automatically express probabilistic laws. To say that there is a probabilistic law operating in a certain case is just to say that our measure functions are not those in &. To make this precise, let us define a kind of hybrid probability function which uses the logical measures on the physically possible worlds. For each X, if X # 0 and X is in the domain of some member of BL, let pL,Xbe the unique p in BL such that X is in the domain of p and p ( X ) # 0. Then we define:
Analogously,
Then to say that there is a probabilistic law operating is just to say that x )such . a case, let us say that "Fxl is probs(Gx/Fx) # p r ~ b ~ , ~ ( G x / FIn strongly relevant to "Gxl. Analogously, we can define weak relevance, and positive and negative relevance:
(3.7)
"Fxl is strongly relevant to "Gxl iff probs(Gx/Fx) # probLs(Gx/Fx).
(3.8)
"FX1 is weakly relevant to "Gxl iff probw(Gx/Fx) # p r ~ b ~ , ~ ( G x / F x ) .
(3.9)
rFx' is strongly positively relevant to '^Gxl iff
214 (3.10)
CHAPTER V I I I
"Fxl is weakly positively relevant to "Gxl iff p r ~ b ~ , ~ ( G x /
(3.11)
"Fxl is strongly negatively relevant to "Gxl iff orobs(Gx/Fx) < p r ~ b ~ , ~ ( G x / F x ) .
(3.12)
"Fxl is weakly negatively relevant to ^Gxl iff probw(Gx/Fx) < p r ~ b ~ , ~ ( G x / F x ) .
We can extend these definitions to strong and weak definite probabilities in the obvious way. Then to say that there is a probabilistic law operating in probs(Gx/Fx) is just to say that "Fxl is strongly relevant to "Gxl. Similarly, a probabilistic law is operating in probw(Gx/Fx) iff "Fx1 is weakly relevant to "Gxl. We know what it is for there to be a probabilistic law operating in a particular case, but what is the nature of the beast itself? What is a probabilistic law? What a probabilistic law does is shift our measures from the logical measures gLto a new set of empirical measures 8, so it seems reasonable to simply identify the probabilistic laws with the set of empirical measures. That is, a probabilistic law is any set of probability measures other than gL. This is not entirely in accord with our ordinary way of thinking of probabilistic laws according to which, e.g., the quantum mechanical equations expressing probability distributions for various quantities represent probabilistic laws. However, this can be reconciled with the view that a probabilistic law is a set of empirical probability measures. In order for the quantum mechanical equations to represent probabilistic laws, the distributions they dictate must differ from the 'chance' distributions generated by the logical measure functions. Insofar as they do differ, in order for the quantum mechanical equations to be true the set of empirical measures must differ from gL. The quantum mechanical laws do not completely determine the set (5 of empirical measures, but they do partially locate (5 as being a member of that class of sets of measures which would yield the probability distributions expressed by the quantum mechanical laws. It seems reasonable to call such a partial characterization of (5 a 'probabilistic law'. A complete specification of (5 is just a maximally strong probabilistic law. The weaker probabilistic laws are equivalent
PROBABILITIES
215
to the indefinite probability statements which generate them, so we can regard the latter as probabilistic laws too if we wish. We have two kinds of subjunctive probability - strong and weak. Are there correspondingly two kinds of probabilistic laws? The answer to this seems to be, 'No'. A probabilistic law simply selects a set of empirical measure functions. There is nothing about such a set of measure functions to relate it to one kind of probability rather than the other. The difference between strong and weak probability would seem to be, not in the measure functions employed, but in the sets measured. Strong probability-results from measuring sets of physically possible combinations. Weak probability only diverges from strong probability by virtue of measuring smaller sets, i.e., by measuring sets of actually possible combinations.
3.3. The Analysis of M In our original analysis of M, which overlooked probabilities, the only way "QMP1 could be true, where 0 is false in the actual world, is for there to be some false proposition R such that P + ( Q v R), but P^>R. But if the only true subjunctive generalizations were entailments, other putative subjunctive generalizations being replaced by subjunctive probability statements, this would not in fact diminish the set of 'might be' statements. On the contrary, whenever P is positively relevant to 0 , we would conclude that QMP. Thus it seems that there is a second way that 'might be' statements can arise, viz., from the positive relevance of P to Q. However, we have two kinds of subjunctive probabilities - strong and weak. Which kind is involved in generating 'might be's'? It is not difficult to see that it is the weak probabilities that are involved, the reason being that they take into account more about what is actually the case in the world. For example, consider Chisholm's bottle of rat poison again, but suppose now that all laws governing the effect of poison on the human system are probabilistic rather than deterministic. It is clear that the p r ~ b ~ , ~ ( S x / of D xthe ) survival Sx of a person who drank Dx from the bottle is non-zero - there are numerous physically possible circumstances under which a person could do so and survive, e.g., by washing the bottle out first and then filling it with clean water. Let us suppose further that due to some strange psychological and
2 16
CHAPTER VIII
genetic features of human beings, people who drink from bottles are apt to live longer than people who do not drink from bottles. Then pr~b~(Sx/Dx)>prob~,~(Sx/Dx), i.e., drinking from the bottle is (slightly) strongly positively relevant to surviving. But we can suppose that because the bottle is filled with very dilute poison, this is just enough to overcome that strong positive relevance that drinking from the bottle has to survival, with the result that drinking from the bottle is not weakly relevant to surviving. Under these circumstances, if we consider a person who died of old age, we would not say that he might have lived longer had he drunk from the bottle. If it weren't that the bottle contains the poison, the strong probability and the weak probability would coincide, there being nothing by virtue of which the weak probability would differ from the strong probability, and hence drinking from the bottle would be weakly positively relevant to survival. In that case we would say that the person might have lived longer had he drunk from the bottle; but because the bottle does contain the poison, it is not true that the person might have lived longer if he had drunk from it. Thus the same circumstances (the bottle's containing poison) which make the weak probability diverge from the strong probability, and hence prevent drinking from the bottle from being weakly positively relevant to survival, also determine that -(QMP). This indicates that it is weak subjunctive probability and weak positive relevance that are involved in the analysis of M and allow additional 'might be' statements to be true. How, precisely, is weak positive relevance involved in the analysis of M? We will start from analysis 2.15 of Chapter VI and see how it must be modified. The first thing we must do is modify the definition of a possible world so that it contains information on weak positive relevance. Rather than taking a possible world to be a quadruple ( 5 , T), N , W) as in Chapter VI we will now take it to be a quintuple (5, T), N, W, P) where P is the set of all ordered pairs (
217
PROBABILITIES
In this case, noting that we have no contingent subjunctive generalizations with which to contend, it should be that case that if ((p, + ) ePa, then +Ma, and hence there is some p such that pMaq and is true in p. We must liberalize the definition of M to allow such a p. This can be done very simply by disjoining the following to the above requirement:
+
(3.13)
+
There is a such that a &T(+) and (
Next, what happens when a contains true contingent subjunctive generalizations? Let us still suppose that (p is indicative, and suppose that (p is not counter-legal, i.e.,
218
CHAPTER V I I I
slightly positively relevant to dying of heart failure. Suppose Smith dies from pneumonia, with the actual causal mechanism consisting of the pneumonia bringing about heart failure. We would still agree that he might not have died when he did had he gotten plenty of vitamin C, even though his getting plenty of vitamin C is not negatively relevant to his suffering heart failure. The reason we would agree that he might not have died had he gotten plenty of vitamin C is that if we trace the historically antecedent circumstances implicating his dying back to a certain point in time (namely, the time at which he contracted pneumonia), we find that from that time back his getting plenty of vitamin C is negatively relevant to his being in those circumstances. It seems then that the basic condition under which considerations of relevance can bring it about that (-Q)MP is the following: P is false and 0 is true, P is weakly negatively relevant to 0 , and there is a time r such that P is weakly negatively relevant to all historical antecedents of Q predating t. However, once we have generated some 'might be' statements in this way, we automatically get some more. For example, suppose Q is a conjunction ""0,& Q> Then we must have either (-Q,)MP or (-QJMP. There is an obvious constraint here. If P is negatively relevant to Q because it is negatively relevant to Ql and not relevant to Q2, then we should have (-Ql)MP, but we should not have ( - O m . More generally, if -Q =^> (-R v-S), we have to have either (-R)MP or (-S)MP. If P is negatively relevant to R but not to S, then we are constrained from having (-SIMP. When there is a choice between two propositions -R and -S, and P is negatively relevant to one of them, then that is the one we should choose as being one that might be true if P were true. We can capture all of this by ruling:
(3.14)
ftMaip
r(
(1)
if ip is false in a, and there is a -4)' is true in (3, and
true in a such that
ip is weakly negatively relevant to if/,and there is a t e R l such that for all t * < t and finite X c SJt*), if J7X is the conjunction of X and I7X => 4 then ip is weakly negatively relevant to X ; and
219
PROBABILITIES
(2)
for all t e Rl. (a) (Sa(r)ASo(r))is a minimal (VNa U V Wa U ^
-
(b) there is no indicative 0 and ye[[I]]a such that & -^})(Sa(f)ASJt)) is a minimal (VN,, U VW,, U change to Ta at t, and N,, = No and W,, = Wo, and 0 e (To-Ta) but 0i^ T,, and
r
Finally, consider what happens when we remove the restriction that be indicative and not counter-legal. Removing this restriction does not seem to make any appreciable change to the analysis other than those changes already embodied in definition 2.15 of Chapter VI. The new definition becomes the same as the old 2.15 except that clause (vi) is replaced by the disjunction of the old clause (vi) and the new condition 3.14 above; and finally a new seventh clause must be added:
(VII)
Pa = Pa.
In defense of this final clause, notice that there is no way we can formulate within our language any sentence incompatible with the statements of positive relevance reflected in the set Pa. To force changes in Pa we would at the very least have to be able to formulate probability statements in our language. I believe that this new definition of M takes full account of the possibility of there being probabilistic laws. The definition of truth given in Chapter VI and based upon the definition of M can still be used without change. This completes the first of the two basic tasks of this chapter, which was to see what changes must be made to the definition of M to accommodate probabilistic laws.
The second major task of this chapter is to examine those probability statements like 'If it were true that P, then it would probably be true that Q' and '"If it were true that P, then it would almost certainly be true that Q' which we are often inclined to make in lieu of asserting a
220
CHAPTER VIII
simple subjunctive. The suggestion was made above that in the absence of any true subjunctive generalizations (that is, if we had only probabilistic laws), these probability statements would be all that we would be warranted in asserting. Statements like "If it were true that P, then it would probably be true that Q1 look like simple subjunctive conditionals whose consequents are probability statements. But I think that this grammatical form is misleading. Taken at face value, it would give these probability statements the form "(P >prob(Q) 2 1- e)'. The first difficulty with this is that it requires a notion of unconditional probability which proves very hard to come by. But supposing for the moment that we h a v e such a probability concept, it would still not be reasonable to regard "If it were true that P, then it would probably be true that 0"'as having the form ^(P> prob(Q) 2 1- el1. The latter statement would require "prob(Q)? 1 - e l to be true in every P-world, but that is an unreasonably strong requirement. For example, suppose that p r o b ( Q ) 2 1 - e/2' is true in every P-world with the exception of one, 0. Suppose further that if P were true, it would be much less probable that ft would be the actual world than it would be that any other P-world would be the actual world. Surely, in this case, we would agree that if it were true that P then it would probably be true that 0, although "(P > prob(Q) 2 1- e)"' would be false. These considerations indicate that "If it were true that P then it would probably be true that Q1 is not really a conditional. Given that it is not a conditional, it seems rather obvious that it must be a conditional probability statement. In this section I will define the kind of conditional probability which I believe to be involved here, and show how it is related to simple subjunctive conditionals. 4.1. The Variety of Probabilities
Philosophers.sometimes make the mistake of supporting that there is just one reasonable concept of probability, and then criticize one another's pronouncements on probability from that point of view even though they are in fact talking about different concepts. We must avoid making that mistake. The term 'probability' has been used to talk about a number of
PROBABILITIES
22 1
different concepts. Three such concepts are 'degree of confirmation', 'degree .of belief', and 'degree of rational belief'. It has never been obvious to me that degree of confirmation has the structure of a probability concept. Degree of belief (taken literally, and not idealized) almost certainly does not have such a structure, simply because people can hold irrational combinations of beliefs. However, degree of rational belief pretty clearly does have the structure of a probability concept, and it is a very interesting kind of probability to investigate. I mention degree of rational belief primarily to contrast it with the kind of probability which will be investigated here. Let us symbolize degree of rational belief as 'probR'. ProbR is primarily of use in deciding what actions to take in cases of partial ignorance. If P represents everything we know, then probR(Q/P) represents how likely it is that Q is true given everything we know. As our state of knowledge changes, probR changes (by changing P). This is all in strong contrast to the kind of subjunctive probability involved in "If it were true that P, then it would probably be true that Q1. I will symbolize the latter kind of probability simply as 'prob'. 'probR(Q/P) = r1 is an indicative statement - it is about the probability that Q is true given P. O n the other hand, 'prob(Q/P)=rl is a subjunctive statement - it is about the probability that Q would be true if, contrary to fact, P were true. Prob(Q/P) is really only of interest in the case where P is now false (this is also true of simple subjunctive conditionals). In determining the value of prob(Q/P), we make use of everything that is true in the actual world, just as in evaluating the truth value of a subjunctive conditional we make use of everything that is true in the actual world. Prob(Q/P) looks at all those worlds that might be actual if P were true, i.e., at all P-worlds, and gives us the proportion of them that would make Q true. This is a weighted proportion, the different P-worlds being weighted according to their relative likelihood of being actual if P were true.
4.2. Simple Subjunctive Probability The probability concept that will now be defined will be called 'simple subjunctive probability'. It is so-called because the simple subjunctive conditional is a limiting case of this probability concept. Prob(Q/P) is
222
CHAPTER VIII
about what would be the case if P were true, thus it is a measure of the proportion of worlds making 0 true, not among all possible worlds making P true, but among all worlds that might be actual if P were true. We can define it in a way completely analogous to our definitions of strong and weak definite probabilities. First, define:
Then: (4.2)
prob(Q/P) = probw(x T(Q)/(x M(P) & x is actual)).
We can also define 'prob' directly in terms of weak definite probabilities. First we define: T ( P > 0 ) 2 Q is true].
(4.3)
map = (Q)[a
(4.4)
if ay is the actual world, mP = ma0P.
rmP1 is a proposition which is true in a world just in case that world is a P-world. Then we can define: (4.5)
prob(Q/P) = probw(Q/mP).
Our simple subjunctive probability is a conditional probability. It is of course possible to define a non-conditional probability in the normal way:
However, this turns out to be the same thing as the truth value of P. This results from the following two theorems: (4.7)
If P is true, then prob(Q/P) =
1 if Q is true 0 otherwise
Proof: if P is true, then M(P) = {ao} (where a. is the actual world). If
0 is true then 0 is true in all members of M(P), and so prob(Q/P) = 1. If 0 is false, then 0 is true in no member of M(P), so prob(Q/P) = 0. (4,8)
prob(P) =
1 if P is true 0 if P is false
Philosophers have often supposed that the connection between subjunctive conditionals and conditional probability should be that
PROBABILITIES
223
prob(P> Q) = prob(Q/P). Theorem 4.8 shows that this is not possible, at least for our notion of simple subjunctive probability. We will always have either prob(P > Q) = 1 or prob(P> Q) = 0, depending upon whether r P > Q1 is true or false, but it will not generally be the case that either prob(Q/P) = 1 or prob(Q/P) = 0. What then is the connection between simple subjunctive probabilities and simple subjunctive conditionals? Just as in the case of subjunctive generalizations, the connection should be that the subjunctive conditional is a limiting case of subjunctive probability. Part of what this means is contained in the following trivial theorem:
If the situation here were completely analogous to the situation with subjunctive generalizations, then the full content of saying that the conditional is a limiting case of the probability statement would be contained in the following principle: (4.10)
( P > Q) = ( R ) prob(Q/(P & R)) = 1.
Unfortunately, 4.10 is false. From right to left fails because we can trivially prove (substituting ^-Q1 for R): (4.1 1)
( R ) prob(Q/(P & R ) )= 1-3(P+ Q).
The other half of 4.10 fails for the same reason the principle
fails. The source of both of these difficulties is that when we change subjunctive hypotheses, we change the set of possible worlds under consideration (from M(P) to M(P & R)), and there may be no simple connection between the two sets of possible worlds. In particular, we do not generally have that M ( P & R ) = M(P) nM(R). For this same reason, it turns out that simple subjunctive probability does not satisfy all of the normal axioms for conditional probabilities. The measuretheoretic principles automatically hold:
224
CHAPTER VIII
However, the following 'product axioms' (which are interderivable given 4.12-4.16) all fail: (4.17)
If P -^ Q and Q
-> R
then prob(P/Q) = (4.18)
If Q
+ -R,
and prob(Q/R) # 0,
prob(P/R) prob(Q/R) '
then
prob(P/Q v R ) = prob(P/Q)-prob(Q/Q v R ) +prob(P/R)-prob(R/Qv R).
As 4.17-4.19 are interderivable, it suffices to give a counter-example to just one of them. 4.19 is particularly interesting, because it can be viewed as a probabilistic analogue of principle 7.1 of Chapter 111:
This is the principle which, when added to SS yielded Lewis' C l , and which expressed the semantical principle that the ordering of possible worlds which generates subjunctive conditionals is connected. We can generate counter-examples to 4.19 in precisely the same way we generated counter-examples to 7.1. All we need is a case in which prob(P/R)#0 (so PMR); and P and Q are independent so that prob((P & Q)/R) = prob(P/R)-prob(Q/R); and (R > 0 ) so that prob(Q/R) = 1, but prob(Q/(P & R)) # 1. Given such a case, prob(P/R)-prob(Q/(P & R))<prob(P/R) = prob((P & Q)/R). Such a case can be constructed using the same example which was a counterexample to 7.1. Let S, T, and U be the three unrelated false statements 'My car is painted black', 'My garbage can blew over', and 'My maple tree died'. Then let P= ^ ( U v T)', R = '(S v T)', and Q =
-
u:
Because of the failure of 4.17-4.19, we cannot express the sense in which the simple subjunctive conditional is a limiting case of simple subjunctive probability statements by asserting principle 4.10. We
225
PROBABILITIES
must resort to more complicated techniques, which will be the topic of the next section. Theorem 4.7, according to which If P is true, then prob(Q/P) =
1 if Q is true 0 if 0 is false
expresses what is perhaps an unfortunate characteristic of simple subjunctive probability. By virtue of this theorem, these probabilities are really only useful in talking counterfactually about what would probably be the case if P (which is now false) were true. But sometimes we employ a kind of probability statement which seems to circumvent this difficulty. For example, suppose a car goes out of control in heavy traffic and after careening about for a while collides with another car. We might observe that under the circumstances, since the car went out of control it was quite probable that there would be a collision. This notion of Q's being probable since P was true is an important notion that is worth investigating. It seems to be a probabilistic analogue of the necessitation conditional, which can be expressed (when it has a true antecedent) as "It was true that Q since it was true that P1. The more general form of a necessitation conditional is I f it were true that P, then it would be true that 0 since it would be true that P1,and analogously the general form of our probability statement is ^If it were true that P, then it would be probable that 0 since it would be true that P1. Presumably, necessitation conditionals are the limiting case of these probability statements, and hence the analysis of necessitation conditionals can guide us in the analysis of the probability statements. The necessitation conditional is analyzable as:
Correspondingly, I would suggest that "If it were true that P, then it would be probable that Q since it would be true that P1 is analyzable as: Q would probably be true if P were true, and if " ( - P & -Q)' were true, it would still be the case that 0 would probably be true if P were true. Presumably, to say that 0 would probably be true if P were true is to
226
CHAPTER VIII
say that prob(Q/P) 3- r, for some particular r. Thus the above comes to:
A reasonable reading of this would be "P's being true would dispose Q to be true with at least a degree rl. Let us introduce a short symbol for this:
We clearly have:
It is worth noting in passing that conditionals like "(-P & -Q)> prob(Q/P)s r1 illustrate that we can have some true counterfactuals even if we have only probabilistic laws. In 4.20 we introduced the notion of the degree to which P disposes Q to be true. It might be supposed that this is another kind of probability. It is easy enough to construct measures of this degree. An obvious definition would be:
Actually, I think that a more interesting measure would be one taking a weighted average of the values prob(Q/P) might have if "(-P & Q)l were true:
-
However, all of this is really rather beside the point, because it is easy to see that no matter how we define 'disp', it will not be a probability, i.e., it will not satisfy the measure-theoretic axioms of the probability calculus. This results from the fact, noted in Chapter 111, that the following is not valid:
and hence we can disp((Q v R)/P) < 1.
have
cases in
which
disp(Q/P)=l,
but
227
PROBABILITIES
4.3. A Probability Algebra There is reason to believe that the relation "If it were true that P, then it would be more probable that Q than that R1 should have a fine structure not reflected in the values of prob(Q/P) and prob(R/P). Let us symbolize this relation as " Q K ~ R ' . We can construct cases in which we seem to have Q K ~ Rand , yet prob(Q/P) = prob(R/P). The simplest such cases arise when prob(Q/P) = prob(R/P) = 0. For example, let P be 'Joe is taller than six feet'. Let us suppose that if P were true, then for any finite interval 5 between 6'2" and 6'4", the probability that Joe's height lies in that interval is directly proportional to the length of the interval. Then it follows that if Joe were taller than six feet, then the probability of his height being any particular value between 6'2" and 6'4" is zero. If this were not the case, by adding a single point to an interval we could discontinuously increase the probability of Joe's height being in that interval. For each x in the interval between 6'2" and 6'4", let Qx be the proposition that Joe's height is x. Then for each x, prob(Qx/P) = 0. Nevertheless, it seems clear that ( P & -P)
-
However, as we will see, there are some difficulties for this principle.
228
CHAPTER V I I I
What are we to make of this relation '-+' which exhibits a finer structure than our probability function? It turns out that it .is definable in terms of our probabilities. It is not definable simply as R-+Q
iff prob(Q/P) < prob(R/P)
but it is definable in a more complicated way. First consider the case of propositions of zero probability. We may have prob(Q/P)= 0 even though Q might be true if P were true, because M(P) is so much bigger than M(P) fl T(Q) that any measure in S sufficiently coarse to give M(P) a finite measure must give M(P) fl T(Q) a zero measure. However, for the purpose of comparing 0 and R (on the hypothesis P), we need not employ a measure which gives a finite value for M(P). It is not M(P) that we are interested in, but rather its two subsets M(P) fl T(Q) and M(P) f l T(R). We want to compare the sizes of these sets, and any measure in 2 which gives them both finite values and makes at least one non-zero will do the job. This suggests the following definition: (4.23)
Q R iff probw(Q/[mP & (Q v R)])
(4.24)
Q ^P iff probw(Q/[mP & ( Q v R)]) = probw(R/[mP & ( Q v R)]).
What this definition does is move us from a finite measure for M(P) to a finite measure for the smaller set M(P) f l T(Q v R), and this measure must give a non-zero measure to at least the larger of M(P)fl T(Q) and M(P) fl T(R). Definition 4.23 works for the case of propositions of probability zero, but it does not work for the case of propositions of probability R. However, we can one. If prob(Q/P) = prob(R/P) = 1, then Q rectify this situation easily enough as follows: (4.25)
Q < ~ Riff Q < y R or -R
(4.26)
Q
R iff neither Q
Q.
The way in which this second ordering relation works can be seen
PROBABILITIES
easily from the following theorems: (4.27)
Q<$)R iff either: (i) prob(Q/P) = prob(R/P) = 0 and Q<:) (ii) prob(Q/P) = prob(R/P) = 1 and
R ; or
-R
(iii) prob(Q/P) and prob(R/P) are neither both zero nor both one, and prob(Q/ P ) < prob(R/P). (4.28)
Q ^p'R iff either: (i) prob(Q/P) = prob(R/P) = 0 and Q ^R ; or (ii) prob(Q/P) = prob(R/P) = 1, and -Q
=V)-R;
or
(iii) prob(Q/P) and prob(R/P) are neither both zero nor both one, and prob(Q/P) = prob(R/P). From this it follows easily that '=g" is an equivalence relation, and '<$^ is a linear ordering relative to the equivalence r e l a t i ~ n . ~ However, clause (iii) of 4.27 shows that '
'
The idea behind this definition is that rather than compare the entire sets M ( P ) n T(Q) and M(P)fl T(R), we only compare those parts which they do not have in common: M ( P ) f l T ( Q & - R ) = (M(P)f l T(Q))- (M(P)fl T(R)); and M(P) f l T(R & 0 )= (M(P) n T(R))- ((M(P)n T(Q)). This new relation does seem to yield all of the fine structure we want:
-
(4.30)
If prob(Q/P) = prob(R/P) = 0, then Q-@ R iff Q R.
(4.31)
If prob(Q/P) = prob(R/P) = 1, then Q
(4.32)
If T(Q) fl M(P) c T ( R ) fl M(P) then Q R.
iff -R<$'-Q.
So far, '< F" looks like a fine relation - just the one we want. But there
230
CHAPTER VIII
is a difficulty. The difficulty concerns how we are to define equiprobability. As far as I can see, the only plausible definition is: (4.33)
Q *R
iff neither Q < ~ Rnor R Q.
Unfortunately, so defined, '=?' is not an equivalence relation. This can be seen by employing the following theorem: (4.34)
If P > -(Q & R ) is true (i.e., M(P) D T(Q) is disjoint from M ( P ) n T(R)), then Q + R iff prob(Q/P) = prob(R/P).
Given this theorem, we can immediately see that it is possible to have Q < ~ R but , S = g ) Q and S ='p'R. For example, we simply choose 0 , R, and S so that (1) S is disjoint from both Q and R (i.e., '-P>-(S & 0)"' and "P > - ( S & R)' are true), and prob(S/P) = prob(Q/P), and (2) Q + R, M(P) n T(R & 0 ) # 0 , but prob((R & Q)/P = 0. This result makes ' ^/^ a rather peculiar relation. It may be quite useful for some purposes, but it is not as useful for our present purposes as is ' < g)', which is better behaved but agrees with '< g)' at the extremities of probability zero or probability one. Thus I shall define:
-
(4.35)
Q+R
(4.36)
Q z p R iff Q = ' ~ ' R .
-
iff Q-$)R.
We have a well-behaved probability algebra which makes discriminations beyond those made by 'prob', and in terms of which we can explain in precisely what sense simple subjunctive conditionals are a limiting case of simple subjunctive probabilities. To be maximally probable is to be equiprobable with a tautology. We now have the simple theorem: (4.37)
" ( P > Q ) l is true iff Q = p ( Q v - Q ) .
Proof: "(P > Q)l is true iff M(P) <= T(Q), iff M(P) f l T(- 0 ) # 0 = M(P) n T(-(Q v -Q)), iff Q +Q v - 0 ) . Thus for the conditional to be true is just for Q to be maximally probable given P. Similarly, for "QMP1 to be true is for Q not to be minimally probable given P: (4.38)
"QMP1 is true iff Q + p ( Q & - 0 ) .
PROBABILITIES
23 1
4.4. Simple Subjunctive Probability and Simple Subjunctive Conditionals Our probability algebra demonstrates the intimate connection between simple subjunctive probability and simple subjunctive conditionals. The nature of this connection allows us to understand many features of subjunctive conditionals which previously seemed puzzling. For example, people often feel some misgivings about the analysis of "QMP1 as " - ( P > -0)'. The first misgiving is that "QMP1 is expressed in English as a conditional: ""If it were true that P, then it might be true that P. Why, then, is it analyzed as the negation of a conditional? Principle 4.38 provides the answer by showing that "QMP1 is a kind of conditional probability statement. It amounts to saying that if P were true, then Q would not be totally improbable. The second misgiving about the analysis of "QMP1 is that when we assert it to be the negation of "If it were true that P, then it would be false that Q1 we feel a bit of strain, and are inclined to want to contrast it instead with "If it were true that P, then it would definitely be false that Q1. The explanation for this hinges upon what I regard as an extremely important fact about subjunctive conditionals. This is that we frequently assert "(P> Q)' when it is not really true - when all that is literally true is something like "If it were true that P, then it would almost certainly be true that Q1. It is this careless assertion of ( P > -Q)' we are resisting when we insert 'definitely' into the conditional. The observation that we often assert a simple subjunctive when all that is really true is a probability statement is, I think, very important, partly because it is such a pervasive tendency. Consider an example. Suppose we have a cylinder of gas, one end of the cylinder being formed by a movable piston, and suppose there is a temperature mechanism which holds the temperature of the gas constant as we move the piston in and out. We would be very much inclined to say that if the pressure of the gas increased, the piston would have been depressed. But this subjunctive conditional is not really true. If the pressure were to increase, this might be because the temperature mechanism went awry. This is so much less probable an alternative that we are inclined to ignore it, but all that is strictly warranted here is
232
CHAPTER VIII
the statement that if the pressure were to increase then very probably, or almost certainly, the piston would have been depressed. Lest someone think that perhaps all subjunctive conditionals are literally false, and all that is ever really true is a subjunctive probability statement, it is worth pointing out that the converse of the conditional in the above example is quite literally true. That is, if the piston had been depressed, then the pressure would have increased. This is because the piston's being depressed would undercut the pressure's being what it was, but would not undercut the proper functioning of the temperature mechanism. The continuum of subjunctive probability statements with subjunctive conditionals as the upper bound makes it much easier to understand what is happening in many cases in which, if we are restricted to conditionals and not allowed to employ probability statements, we do not quite know what to say. For example, consider a person, Joe, who is 5'11" tall. If Joe were over six feet tall, how tall might he be? Clearly, he might be 6'1", and he might be 6'2", and he might be 6'3", and so on for a while. But it seems there must be an upper bound. Isn't it true that even if he were over six feet tall, he would not be one thousand miles tall? If so, then there must be some point h in between such that he might be any height less than h, but he would definitely not be any height greater than h. The difficulty is that it is not at all clear what the value of h might be. Unless there is some physical law which dictates that a man cannot have a height greater than a certain magnitude h (and I know of no such law), then it is hard to see how any height h could come to constitute a sharp dividing line between those heights Joe might be if he were over six feet tall and those height that he would not be if he were over six feet tall. Something here seems very mysterious. I think that the solution to our difficulty lies in re-examining our initial supposition that if Joe were over six feet tall, he would not be one thousand miles tall. It is of course extraordinarily unlikely that he would be one thousand miles tall, so unlikely that for all practical purposes we can ignore the possibility altogether, but I doubt that there is anything which rules this out beyond all possibility. Because it is so extraordinarily unlikely that Joe would be one thousand miles tall, we also tend to ignore this possibility in our judgments regarding what
PROBABILITIES
233
would be the case if he were over six feet tall, and so we assert that if Joe were over six feet tall, he would not be one thousand miles tall. But this conditional is literally false. There is just the slimmest possibility that he would be one thousand miles tall, and so all that is literally true is the subjunctive probability statement that if Joe were over six feet tall, he would almost certainly not be one thousand miles tall. I suspect that a similar analysis can be given for many cases which might initially appear to be counter-examples to our analysis of subjunctive conditionals in terms of undercutting. For this reason an understanding of subjunctive probability is as important for understanding subjunctive conditionals as it is in its own right.
4.5. Simple Indefinite Probabilities We have defined two kinds of indefinite probabilities corresponding to the two kinds of subjunctive generalizations. We can define a third kind of indefinite probability which corresponds to simple subjunctive definite probability. This will be called 'simple indefinite probability'. It will be of some importance in the next chapter where we undertake to analyze disposition statements. We want prob(Gx/Fx) to be the probability of an F being a G, given the way the world actually is. It seems natural to suggest that this indefinite probability can be defined as being prob((3x)Gx/(3x)Fx). This is initially plausible, but that it will not work follows from Theorem 4.7 according to which if r(3x)Fx1 is true, prob((3x)Gx/(3x)Fx) is either one or zero. .However, I think that this general idea can be made to work. Let us begin by asking what is the probability of an F being a G given that there is just one F ? That is, taking r(31x)Fx1 to symbolize t h e r e is exactly one F1, we want to know the value of prob(Gx/(Fx & (gix)Fx)). It seems to me that this should be:
What then is the probability of an F being a G given that there are exactly two F's? It seems that we should be able to calculate this like
234
CHAPTER V I I I
an expectation value: (6.2)
prob(Gx/(Fx & (3g)Fx)) = pr0b((3~x)(Fx& G ~ ) / ( 3 ~ x ) F x ) +prob(Gx/(Fx & (3ix)(Fx & Gx) & (3zx)Fx)) .pr0b((3~x)(Fx& G ~ ) / ( 3 ~ x ) F x ) .
In general:
prob(Gx/(Fx & (akx)(Fx & Gx) & (3,,x)Fx)) is the probability of an F being a G given that there are exactly n F's and k of them are G's. This probability would seem to be k/n, so principle 6.3 can be simplified to read: (6.4)
prob(Gx/(Fx & (3,,x)Fx))
Symbolizing the statement that there are infinitely many F's as '(3-x)Fx, how should we define prob(Gx/(Fx & ( 3 -x)Fx))? the only plausible way I can see to define this is as the limit of the probabilities for larger and larger finite sets of F's: (6.5)
prob(Gx/(Fx & (3-x)Fx)) = lim prob(Gx/(Fx & (3,,x)Fx)). n--
Letting r('^^;n~)Fxl symbolize the statement that there are no more than n F's, we can next define: (6.6)
prob(Gx/(Fx & ( 3 a n ~ ) F x ) ) prob(Gx/(Fx & (gkx)Fx)).~ ~ O ~ ( ( ~ ~ X ) F X / ( ~ ~ ~ ) F X ) ) .
= Ocksn
Then it seems reasonable to define prob(Gx/(Fx & -(3-x)Fx)), the probability of an F being a G given that there are finitely many F's, to be the limit of prob(Gx/(Fx & (3sn~)Fx))as n becomes indefinitely
PROBABILITIES
large, and so finally to define prob(Gx/Fx) as: (6.7)
prob(Gx1Fx) = prob(Gx/(Fx & (3-x)Fx)) -pr0b((3~x)Fx/(3x)Fx) + pr0b(-(3~x)Fx/(3x)Fx)
- lim prob(Gx/(Fx & (3sn~)Fx)). n--
Principle 6.7 seems to constitute a reasonable definition of simple indefinite probability on the basis of simple subjunctive definite probability. However, there are some surprises forthcoming from this definition. By virtue of Theorem 4.7 we obtain: (6.8)
If '(3,,x)Fx1 and "(gicx)(Fx& Gx)' are true, then prob(Gx1Fx) = kin. That is, if there are finitely many F's, then prob(Gx1Fx) is just the material probability probM(Gx/Fx). For many purposes, this is an entirely reasonable result. We want to know the probability of an F being a G given the way the world actually is, so if there are some F's in the world, this probability should be the actual proportion of F's that are G's. However, for some purposes we would also like a different kind of indefinite probability which still reflects the way the world actually is. This would reflect not simply the actual proportion of F's that are G's, but the likelihood of a new F being a G. For example, if we are flipping a coin, we would like to know not just what proportion of flips already made have resulted in heads, but how likely it is that a new flip would result in heads. This probability can be defined very easily using a now-familiar construction:
& prob(Gx/(Fx & x ei A)) = r ] .
Prob+(Gx/Fx) is the simple indefinite probability of a new or different F being a G. This will turn out to be a very important probability concept. In particular, it will be involved in the analysis of dispositions. NOTES Note that ' ( H ~= pQ' is an equivalence relation. It will hold iff probs(Gx/(Fx v Gx)) # 0 and probS(Fx/(Fx v Gx)) # 0.
236
CHAPTER V I I I
It is worth remarking that the analogous theorem holds for weak indefinite probabilities, weak subjunctive generalizations, and actual possibility. If, as is often maintained, given any set of possible worlds there is a proposition true in just those worlds, then it follows that infinite conjunctions of propositions exist, and hence that there is always a maximal P. More precisely, this is the proportion of worlds making Q true out of those worlds which might be physically possible if P were physically possible and which make P true. * U p to this point I have been intentionally vague about whether 'prob(Q/P)^ is a metalinguistic relation between sentences or an object-language term-forming operation. It is clear that for our present purposes we must opt for the latter alternative. T h i s means that the ordering of the equivalence classes imposed by the ordering '<^" of their members is a linear ordering.
CHAPTER I X
DISPOSITIONS
The final kind of subjunctive statement to be considered in this book is that of a statement ascribing a disposition to an object. Traditional examples of such statements would be: (1)
That liquid is flammable.
(2)
This vase is fragile.
(3)
That chalk is friable.
(4)
Joe is foolhardy.
(5)
Mary is inquisitive.
The problem is how to analyze such statements. Remarkably little has been written about dispositions in the last decade. I suspect that this is due largely to philosophers feeling that this problem has been solved, or at least successfully reduced to the problem of analyzing subjunctive conditionals. For example, it seems initially plausible to suppose that (1) is analyzable as:
(I*)
If that liquid were heated, it would burn.
Thus the feeling is that the only problem remaining is that of analyzing subjunctive conditionals, and there is really no point in further discussion of dispositions per se. We have presented an analysis of subjunctive conditionals, but unfortunately we cannot rest content that we have thereby solved the problem of dispositions. The traditional view according to which disposition statements are analyzable on the model of (I*) is completely and unalterably wrong. A first glimmering of difficulty for the traditional view was noted by Goodman (1955) who pointed out that
C H A P T E R IX
(1) does not entail (I*). There are numerous circumstances under which (1)would be true but (I*)false. For example, there might be no oxygen present. Thus, as Goodman observes, if we are to defend something like the traditional view, we must retreat to:
(I**) If conditions were propitious and that liquid were heated, then it would burn. But of course, without an account of what it is for conditions to be propitious, this is not an analysis. As we will see, following this line of thought to its logical conclusion leads ultimately to a completely different kind of account of dispositions. However, before continuing the discussion of the above difficulty, it behooves us to note an entirely different sort of difficulty. Our list of disposition statements contains statements about two entirely different sorts of dispositions. Ignoring the difficulties about propitious conditions, (1)-(3) seem to be analyzable roughly on the model of (I**). These disposition statements at least seem to entail conditionals to the effect that if something were the case then something else would be the case. But (4)and (5) are markedly different. They do not entail any such conditionals. (4)and (5) report tendencies. They tell us how Joe and Mary tend to behave under certain sorts of circumstances. Unlike (1)-(3),they do not tell us what definitely would happen under some circumstances, but only what would be likely to happen. We have a generally overlooked distinction here between what may be termed 'absolute dispositions' and 'probabilistic dispositions'. I do not know of a single discussion of the difference between these kinds of dispositions, and yet an account of the sort of (I**)is obviously inappropriate for probabilistic dispositions. This is a rather remarkable oversight because, historically, those dispositions which have most interested philosophers have tended to be mental or psychological dispositions, and these are perhaps without exception of the probabilistic variety. It will turn out that there are definite similarities between the absolute and the probabilistic dispositions. These arise from the connections noted in the last chapter between subjunctive conditionals and subjunctive probabilities. Once we have seen how to analyze statements ascribing absolute dispositions, it will become much clearer how to analyze statements ascribing probabilistic dispositions.
DISPOSITIONS
239
It is surprisingly difficult to find examples of absolute dispositions. Almost all of the dispositions which philosophers customarily discuss are of the probabilistic variety. However, the following, picked at random from the dictionary, all seem to be of the absolute variety: 'absorbant', 'addictive', 'adhesive', 'deflatable', 'fissionable', 'flammable', 'flexible', 'fluorescent', 'fragile', 'friable', 'magnetized', 'magnetizable', 'soluble'. According to the (modified) traditional view, 'x is q-able' is analyzable as 'If conditions were propitious and x were i/'-ed, then x would q', where i/' is the appropriate antecedent for
240
CHAPTER I X
as capabilities. To be deflatable is to be capable of being deflated; to be fissionable is to be capable of undergoing fission; to be friable is to be capable of being made to crumble. What about our other dispositions for which there are obvious antecedents? Consider 'absorbant'. The natural antecedent here is 'x is placed in contact with the liquid'. But this is only natural because we know something about the circumstances in which absorbant materials absorb liquids. It is still only a contingent fact that these are the appropriate circumstances. It might have been the case instead that absorbant materials only absorb liquids when there is a slight air space between them, the air somehow facilitating the movement of molecules. Turning to 'flammable', there is already something peculiar about the antecedent 'x is heated sufficiently'. What does 'sufficiently' mean here? And I should think once more that it is only a contingent fact that heating flammable objects makes them burn. There are other ways to ignite some flammable objects, e.g., pouring certain chemicals on them, and I see no reason in general why it couldn't have been the case that flammable objects are made to burn by cooling them rather than heating them. Heating an object just happens to be the most common and generally the simplest way to make it burn. But if there is any way to make an object burn, then it is flammable. To be flammable is to be capable of being made to burn. Similarly, to be absorbant is to be capable of absorbing liquids. There are no antecedents built into absolute dispositions. The extent to which we are able to supply such antecedents reflects on the one hand our knowledge of how these dispositions work, and on the other hand the uniformity of the way they work (e.g., there are too many different ways of being deflatable for us to be able to supply a single antecedent). The ability to supply the antecedents does not arise simply from an understanding of the concepts involved. In general, absolute dispositions are best regarded as capabilities. Philosophers have misappropriated the term 'disposition' in talking about absolute dispositions. The term 'disposition' is really only appropriate for probabilistic dispositions, which are truly 'tendencies'. The observation that absolute dispositions are really capabilities explains the difficulty regarding propitious conditions. The reason we had to include the propitiousness of the conditions in the antecedent of our conditionals is that the antecedents supplied by our common
DISPOSITIONS
241
understanding of the functioning of an absolute disposition do not generally state a 'total cause' for the actuation of the capability. But this becomes irrelevant if the antecedent is not part of the meaning of the concept anyway. Once it is realized that absolute dispositions are really capabilities, it is rather easy to see how they are to be analyzed. For example, to say that a certain liquid is flammable is to say that it is capable of burning, which seems to be to say that it is physically possible for it to burn. Similarly, to say that a piece of chalk is friable appears to be equivalent to saying that it is physically possible for it to crumble. In general, (2.1)
x is (p-able iff 0 ( x is
(p).
P
I believe that 2.1 is basically correct, but more must be said about how it relates to particular disposition statements. Consider fragility. If we try to fit fragility into the format of 2.1, we find that it cannot be done. The nearest we can come is something like:
x is fragile iff it is physically possible for x to break. But this is incorrect. This would at best be a definition of 'breakable' rather than 'fragile'. To be fragile is not just to be breakable, but to be 'quite breakable'. We distinguish between degrees of breakability, flammability, friability, etc., and many of our 'disposition words' refer to particular degrees of having capabilities. For example, a piece of 'non-flammable' plastic might be made to burn by heating it to a very high temperature in a pure oxygen environment. Thus the plastic is flammable (i.e., has the capability of burning), but just barely so. Because it is so hard to make the plastic burn, it has the capability to such a minimal degree that we say it is non-flammable. Our ordinary use of the word 'non-flammable' is such that non-flammable objects need not be totally non-flammable. This indicates that 2.1 should not be taken as the scheme for defining disposition words, but rather as the scheme for defining what we might call 'basic capabilities'. Then most disposition words refer to varying degrees of these basic capabilities. For example, we might define the basic capability 'x is breakable' as 'It is physically possible for x to break'. The degrees of this basic capability constitute a continuum stretching from total non-breakability through extreme
242
CHAPTER IX
fragility, and different disposition words may pick out different regions of this continuum. In particular, 'breakable' refers to a rather large region including anything which is 'reasonably breakable'. Thus the term 'breakable' can be used both to refer to the basic capability and to refer to a region in the continuum of degrees of that basic capability. This seems to be true of most disposition words which can also be regarded as naming basic capabilities. They do double duty. Principle 2.1 constitutes an analysis of basic capabilities, but it does not yet constitute an analysis of most disposition statements, because such statements generally refer to specific degrees of basic capabilities. We must explore this notion of the degree to which something has a capability. To say that x is more flammable than y is to say that it is 'easier' to get x to burn than it is to get y to burn. Similarly, to say that x is more breakable than y is to say that it is easier to break x than it is to break y. In general, the degree to which an object has a capability is a measure of the ease with which that capability can be realized in that object. But 'ease' in what sense? We might naturally suppose that this has something to do with human abilities and how easy it is or hard it is for humans to cause the capability to be realized. But this cannot be right, because we can talk about a capability whose realization is simply beyond human ability. For example, an object which will ignite at a temperature of 1 0 ' ~ degrees '~ centigrade is more flammable than ~ ~ centi~ ' one which will not ignite until it is heated to 1 0 ~degrees grade, but it is beyond human ability to heat an object to either temperature. A natural second suggestion would be that talk about degrees of capabilities is to be analyzed in terms of the amount of energy that must be expended to realize the capability. But this cannot be right either. The concept of energy only plays the role it does in the realization of capabilities because of contingent physical laws, and as such it cannot be involved in the very concept of the degree to which something has a capability. Furthermore, the amount of energy that must be expended is not really a correct measure of the degree to which something has a capability anyway. Consider two substances, one of which can be ignited by heating it to any temperature above 100 degrees centigrade in the presence of any gas that would normally be considered 'breathable air'. Consider a second substance which can only be ignited by heating it to a temperature between 91.333 333 333
DISPOSITIONS
243
degrees centigrade and 91.333 333 334 degrees centigrade, holding it at that temperature for 37.375 seconds, and then abruptly changing the temperature to between 92.334 788 8 degrees centigrade and 92.334 788 9 degrees centigrade and holding it there for 10 seconds, all in the presence of pure oxygen. It takes less energy to ignite the second substance than to ignite the first substance, but surely we would consider the first substance more flammable than the second substance. Items made of the second substance would almost never burn because of the critical sequence of steps required to ignite them. If an object is flammable, there will generally be infinitely many ways to make it burn, e.g., heating it to 100°heating it to 101° heating it to 102O, . . . , dipping it in sulphuric acid and then placing it in a pure oxygen environment, etc. To say that one object is more flammable than another is to compare how many ways there are to get them to burn. Or better, it is to employ a measure function which measures the physically possible ways to get the objects to burn. A physically possible way to get an object to burn can be thought of as picking out a set of physically possible worlds in which the object does burn. Thus to say that x is more flammable than y is to employ a measure comparing the set of all physically possible worlds in which x burns and the set of all physically possible worlds in which y burns. For example, if the only relevant difference between x and y is that y has a higher kindling point than x, then any way of making y burn is also a way of making x burn, so the measure of the set of physically possible worlds in which x burns must be greater than the measure of the set of physically possible worlds in which y burns. But what is the nature of this measure? It is surely inadequate simply to say that there is such a measure. Fortunately, the measure involved is a familiar one. All it does is measure the size of a set of physically possible worlds. Furthermore, if there are probabilistic laws which dictate that some possible worlds are less probable than others, the measure should reflect this. If all possible worlds in which x would burn are highly improbable in light of some basic probabilistic laws, then it is correspondingly more difficult to get x to burn, and so the measure of physically possible worlds in which x does burn should be correspondingly smaller. Thus the measure involved here does precisely the same thing as the measures involved in strong definite
244
CHAPTER I X
probabilities. To say that the measure of the set of physically possible worlds in which x burns is smaller than the measure of the set of physically possible worlds in which y burns is to compare the strong probability of x's burning with the strong probability of y's burning. Strong probabilities are conditional probabilities, so we cannot talk about the probability of x's burning simpliciter. A natural suggestion would be that as we are merely interested in comparing the set of worlds in which x burns with the set of worlds in which y burns, we should have that x is more flammable than y iff: probs(y burns/(x burns v y burns))
< probs(x burns/(x burns v y burns)). But this cannot be quite right. The difficulty is that if it were less likely for x to exist than for y to exist, this would correspondingly diminish the measure of the set of worlds in which x burns, but this should not be relevant to the flammability of x. This suggests that the conditional probabilities we should be looking at are instead the probabilities of objects burning given that they exist. On this proposal: (2.2)
x is more flammable than y iff probs( y burnsly exists) < probs(x burnslx exists)
Principle 2.2 is close to being correct, but it runs into a surprising difficulty. A piece of asbestos is much less flammable than a piece of sodium. So far, principle 2.2 seems to give the correct answer. But now consider a brick made of asbestos and a brick made of sodium. We want to say that the brick made of asbestos is less flammable than the brick made of sodium, but this result is not forthcoming from 2.2. The difficulty is that it is within the realm of physical possibility to transmute the elements in the asbestos into sodium atoms, thereby transforming the brick made of asbestos into a brick made of sodium. It is still the same brick, but it is no longer made of asbestos. Analogously, it is physically possible for the atoms of sodium in the second brick to be transmuted into the elements that make up asbestos thereby transforming the second brick into a brick made of asbestos. This has the consequence that for every physically possible world in which x, the brick originally made of asbestos, exists, there is a physically possible
DISPOSITIONS
245
world in which y, the brick originally made of sodium exists and has the same attributes as x which are relevant to their burning. Consequently, despite our judgment that x is less flammable than y, probs(x burns/x exists) = probs(y burnsly exists). The difficulty arises here because our probability measure simply looks at the size of the set of all physically possible worlds in which x burns, and does not take into account how likely it is for different members of the set to exist. It might seem that we could avoid these difficulties by changing the antecedent of our probability from rx exists1 to something like rx has the nature it presently has1. The latter is obviously a problematic notion, but the intention would be to judge x's flammability in terms of its present physical makeup. However, insofar as this would rule out our looking at those worlds in which transmutation of elements occurs, this is now too strong an antecedent. We do not want to rule out transmutation altogether; rather we want to take into account the fact that it is an unlikely occurrence. If instead transmutation were very easy to bring off, we might well find ourselves regarding bricks of asbestos as fuel and burning them in furnaces which first transmute their elements. In such a case, we would regard them as highly flammable. Thus we are not interested simply in the probability that x burns given that it has the nature it does. Instead, we are interested in the probability of x coming to burn given that it has the nature it does. This would allow for the possibility that the nature of x changes before it begins to burn. But what are we talking about when we talk about x having a certain nature? The intention is, among other things, to talk about x's physical composition. Thus, for example, x's being composed of asbestos is relevant to the computation of the probability measuring x's flammability. However, that x is now submerged in water is not relevant to its flammability, although including this fact in our antecedent would certainly alter the probability of x's coming to burn. Thus some facts about x should be included in its 'nature', but not all facts. Can we make sense of this? In fact, I think we can. The facts about x which are included in x's nature are just those simple truths which are about x. That x is composed of asbestos is a simple truth about x, but that x is
246
CHAPTER IX
immersed in water is not a simple truth. Rather, that x is immersed in y is a simple truth, and that y is water is a simple truth, but the conjunction of these two simple truths is not a simple truth. Thus it appears that the probability which measures x's degree of flammability is probs(x will come to burn/x has the set of simple attributes it in fact has). To make this precise, notice that this probability will be the same for any object at all having the same simple attributes as x. What is relevant to this probability is not the identity of the object x, but rather the set of simple attributes of x. Thus we can recast this as an indefinite probability. Let us define: (2.3)
Ua,t(x)=x has all of the simple attributes possessed by a at t.
Then we have: (2.4)
a is more flammable than b at time t iff probs(x will come to burn/Ut,,,(x))< probs(x will come to burn/Ua,,(x)).
Notice that the reference to the time t is essential here. We can make an object more flammable at one time than it was at another time by altering its nature, e.g., by transmuting its elements. Principle 2.4 meets our previous difficulties, but now we encounter a new source of complexity in the notion of the degree to which an object has a capability. Consider deflatability. To say that one object is more deflatable than another would ordinarily be taken as meaning that the first object can be deflated more completely, not that it can be deflated more easily. Similarly, to say that one object is more fluorescent than another is to say that it fluoresces more, not that it fluoresces more easily. What is happening here is that the notion of the degree to which an object has a capability is a two-dimensional notion. Consider flammability again. Suppose that when x is heated to 100Âcentigrade it bursts into flame, but when y is heated to 90' centigrade it just begins to smoulder and no matter how high the temperature of y is raised it never does more than smoulder. Assuming that we count smouldering as the lower limit of burning, x and y are both flammable. There is a sense in which y is more flammable than x - it can be made to burn more easily. But there is also a clear sense in which x is more flammable than y - y just barely burns whereas x burns brilliantly.
DISPOSITIONS
247
For most capab,ilities we can talk about the extent to which an object realizes that capability on a particular occasion, e.g., the extent to which something burns, fluoresces, dissolves, crumbles, breaks, deflates; etc. How this notion is defined will depend upon what capability we are talking about. Let us suppose that we have a real valued function E+,(x) which measures the extent to which an object is ffi-ing. Then in general, 'how ffi-able an object is' cannot be given by any single measure. The degree of (p-ability of an object is best represented by the probability distribution probs(it will be the case that E a ( x ) 2 r/K,,(x)). Given such a probability distribution we can generally make sense of statements comparing the extent to which two objects have a particular capability. For example, to say that one object a is more fragile than another object b generally means that it is easier to bring about a certain large degree of breaking in a than in b, i.e., for some particular r, probs(it will be that case that Ebreaks(x)2r/JTb,,(x))< probs(it will be the case that Ebreaks(x) 2 r/na,,(x)). O n the other hand, to say that one object a is more fluorescent than a second object b is generally to say that a certain minimal 'amount of effort' will result in a greater degree of fluorescing in a than in b, i.e., for some particular p, 1.u.b. {r; probs(it will be the case that Efiuoresces(~)=* r/JTb,,(x))2 p} < 1.u.b. {r; probs(it will be the case that Efluoresces(x) 2 r/fla,t(x))2 In general, it seems that 'comparative capability statements' will be analyzable in one of these two ways, the correct analysis being determined by the context. For certain capabilities (e.g., fragility, fluorescence, deflatability) it will be more common for one of these analyses to be correct than for the other one to be correct, but in general either analysis could be correct and the context will determine which is appropriate. It is generally claimed that absolute dispositions are dispositions to have certain 'manifest properties'. In light of our conclusions, this
248
CHAPTER I X
should be rephrased by saying that capabilities are capabilities of having certain 'manifest properties'. This is probably all right if we do not bear down too heavily on 'manifest'. The properties involved in capabilities may be extremely complex, and may even be other capabilities. For example, the 'manifest property' involved in magnetizability is the property of being magnetic, but the property of being magnetic would itself seem to be a capability. It is also worth noting how 'non-manifest' the 'manifest properties' of fluorescing and dissolving are. We do not say that an object is fluorescing simply because it is glowing in the dark. E.g., a burning light bulb is glowing in the dark. Nor do we say that something is dissolving just because it is benignly disappearing in a liquid. For example, a submerged radioactive substance which is rapidly and spontaneously breaking up into elementary particles is not thereby dissolving. Fluorescing and dissolving are complex processes which only occur if certain microphysical processes occur. Thus one must not rely very heavily on this notion of a manifest property.
Now let us turn to probabilistic dispositions. Except for their probabilistic aspect, they actually come closer to fitting the traditional analysis than do absolute dispositions. It is much easier to find examples of probabilistic dispositions than it is to find examples of capabilities. A random glance at the dictionary yielded the following list: 'abstemious', 'acquisitive', 'acrimonious', 'adamant', 'adaptable', 'affable', 'aggressive', 'courageous', 'cowardly', 'creative', 'credulous', 'critical', 'curious', 'inquisitive', 'deceitful', 'foolhardy', 'forgetful', 'forgiving'. Most of these terms are ambiguous between referring to enduring characteristics and (relatively) momentary states. For example, we can say either 'Joe is abstemious', or 'Joe is being abstemious'. The latter means simply that Joe is eating and drinking sparsely, and does not report a disposition. However, 'Joe is abstemious' reports a tendency in Joe to be abstemious. This tendency is what is being called 'a probabilistic disposition'. We make a similar distinction between 'Joe is acquisitive' and 'Joe is being acquisitive'; between 'Joe is acrimonious'
DISPOSITIONS
249
and 'Joe is being acrimonious'; between 'Joe is aggressive' and 'Joe is being aggressive'; etc. In each case, the former sentence reports a tendency to be in the state reported by the latter. It is the 'tendency' senses of these words which express probabilistic dispositions. It is worth noting that in some cases the momentary states reported by these words as used in their non-dispositional senses are themselves characterized dispositionally. For example, to say that Joe is pleasant is to say that Joe has a tendency to behave in a pleasant manner, and to say (non-dispositionally) that Joe is behaving in a pleasant manner on a particular occasion is to say that his behavior has the disposition to please people. Similar observations apply to 'abusive' and 'adamant'. Statements of probabilistic dispositions report tendencies. In the common case of character traits these are tendencies to do certain things or behave in certain ways. For example, to say that a person is foolhardy is to say that he tends to act without sufficient attention to his personal safety. This is a probability statement. It amounts to saying that the probability of an action of this person being taken without sufficient attention to his personal safety is rather high. The probabilities in these statements are subjunctive probabilities, and hence are conditional probabilities. For example, for S to be abstemious is for the probability of S's eating and drinking sparsely on an eating and drinking occasion to be high; for S to be acquisitive is for the probability of S's trying to acquire a thing when it appeals to him to be high; for S to be acrimonious is for the probability of S's manner being bitter or harsh to be high; for S to be courageous is for the probability of S's actions to be taken without undue attention to personal safety to be high; for S to be credulous is for the probability of S's believing something he is told to be high; and so on. As conditional probability statements, these disposition statements have antecedents. Thus statements attributing probabilistic dispositions are much more like the traditional model of disposition statements than are statements attributing absolute dispositions or capabilities. The former have built-in antecedents whereas the latter do not. We can talk about the degree to which a person (or object) has a probabilistic disposition, and just as in the case of capabilities, these degrees are two-dimensional. For example, in the case of foolhardiness we can talk both about how apt a person is to act in a foolhardy
250
CHAPTER I X
manner, and about how foolhardy he is apt to act. In general, the extent to which a person is foolhardy is best measured by the distribution of the probabilities of his acting foolhardy to differing degrees. That is, if degfooihardy(x) is the degree to which a particular act x is foolhardy, then the measure of the extent to which a person S has the disposition of foolhardiness is the distribution of the probabilities of an act x of S being such that degfo0ihardy(x) 2 r. What kind of probability is involved here? It is a kind of subjunctive indefinite probability, but we have isolated several varieties of subjunctive indefinite probabilities. In judging whether a person is foolhardy, what we want to know is the probability of a new action of his being taken without sufficient attention to his personal safety. In computing this probability, we should take into account everything that is true about the person's actual situation. In other words, it is the simple subjunctive indefinite probability probe which is in question here. The degree to which a person is foolhardy is measured by the probability distribution: prob+(degfooihardy(x) 2 r/x is an action of S). It is worth pointing out that the use of subjunctive probabilities here is absolutely essential. It would be nonsensical to try to analyze probabilistic dispositions in terms of degree of rational belief or some other kind of indicative probability. What is at issue is not how likely (on the basis of incomplete knowledge) it is that something is now true of S7sactual actions, but rather how likely (on the basis of everything about S) it is that something would be true of non-actual actions of S. The tendency of philosophers to connate different kinds of probability, no doubt fostered by their fear of subjunctive statements, has hopelessly muddled some earlier discussions of probabilistic tendencies. This completes the list of subjunctive concepts to be discussed in this book. Analyses have been proposed for subjunctive generalizations, subjunctive conditionals, causal statements, statements of subjunctive probability, and disposition statements. If these analyses ultimately prove successful, this will free philosophers to use subjunctive notions in the analysis of other philosophically problematic concepts. In particular, I suspect that subjunctive concepts will prove of paramount importance in ethics. Judgments about moral responsibility and the
DISPOSITIONS
25 1
rightness and wrongness of actions involve in essential ways judgments about what would have resulted had persons acted in ways other than they did. I suspect that here and elsewhere, subjunctive concepts will prove to be indispensable tools for philosophical analysis. NOTES
'
This implies, contrary to what many people have supposed, that the brick is not identical with the asbestos of which it is made. This is a conclusion I have defended at length in Pollock (1974), pp. 157-174. The brick is composed of the asbestos, but it is not identical with the asbestos.
BIBLIOGRAPHY
Brody, Baruch, 1973, 'Why Settle for Anything Less than Good Old-Fashioned AI totelian Essentialism?' Nous 7, 35 1-365. Carnap, Rudolph, 1950, Logical Foundations of Probability, University of Chicago Pres Carnap, Rudolph, 1952, The Continuum of Inductive Methods, University of Chicag Press. Carnap, Rudolph and Jeffrey, Richard C.,1971, Studies in Inductive Logic and Probabil ity, vol. I , University of California Press. Chisholm, Roderick, 1946, 'The Contrary-to-Fact Conditional', Mind 55, 289-307. Chisholm, Roderick, 1955, 'Law Statements and Counterfactual Inference', Analysis 15. 97-105. Chisholm, Roderick, 1967, 'Identity through Possible Worlds: Some Questions', Nous 1, 1-8. Davidson, Donald, 1967, 'Causal Relations', Journal of Philosophy 64, 691-703. Davidson, Donald, 1970, 'The Individuation of Events', in Essays in Honor of Carl Hempel, (ed. by Nicholas Rescher), Reidel, Dordrecht. Ducasse, C. J., 1966, 'Critique of Hume's Conception of Causality', Journal of Philosophy 63, 141-148. Goodman, Nelson, 1955, Fact, Fiction, and Forecast, Harvard. Jackson, Frank and Pargetter, Robert, 1973, 'Indefinite Probability Statements', Synthese 26, 205-217. Jeffrey, Richard and Carnap, Rudolph, 1971, Studies in Inductive Logic and Probability, vol. I, University of California Press. Kaplan, David, 1964, Foundations of Intensional Logic (dissertation, UCLA 1964). Kanger, Stig, 1957, 'The Morning Star Paradox', Theoria 23, 1-11. Kim, Jaegwon, 1970, 'Events and their Descriptions: Some Considerations', in Essays in Honor of Carl Hempel, (ed. by Nicholas Rescher), Reidel, Dordrecht. Kim, Jaegwon, 1971, 'Causes and Events: Mackie on Causation', Journal of Philosophy 68, 426-441. Kim, Jaegwon, 1973, 'Causation, Nomic Subsumption, and the Concept of an Event', Journal of Philosophy 70, 217-236. Kim, Jaegwon, 1973a, 'Causes and Counterfactuals', Journal of Philosophy 70,570-572. Kripke, Saul, 1971, 'Identity and Necessity', in Identity and Individuation, (ed. by Milton Munitz), New York University Press. Kripke, Saul, 1971, 'Naming and Necessity', in Semantics of Natural Language, (ed. by Donald Davidson and Gilvert Harman), Reidel, Dordrecht. Kyburg, Henry E., Jr., 1974, The Logical Foundations of Statistical Inference, Reidel, Dordrecht. Lewis, David, 1968, 'Counterpart Theory and Quantified Modal Logic', Journal $ Philosophy 65, 113-126. Lewis, David, 1972, 'Completeness and Decidability of Three Logics of Counterfactu Conditionals', Theoria 37, 74-85.
BIBLIOGRAPHY
253
Lewis, David, 1973, Counterfactuals, Harvard. Lewis, David, 1973a, 'Counterfactuals and Comparative Possibility', Journal of Philosophical Logic 2, 418-446. Lewis, David, 1973b, 'Causation', Journal of Philosophy 90, 556-567. Mackie, J. L., 1965, 'Causes and Conditions', American Philosophical Quarterly 2, 245-264. Pargetter, Robert and Jackson, Frank, 1973, 'Indefinite Probability Statements', Synthese 26, 205-217. Pollock, John L., 1967, 'Logical Validity in Modal Logic', The Monist, 51, 128-135. Pollock, John L., 1972, The Logic of Projectibility', Philosophy of Science 39,302-314. Pollock, John L., 1973, 'Laying the Raven to Rest', Journal of Philosophy 70, 747-754. Pollock, John L., 1974, Knowledge and Justification, Princeton. Pollock, John L., 1974a, 'Subjunctive Generalizations', Synthese 28, 199-2 14. Pollock, John L., 1975, 'Four Kinds of Conditionals', American Philosophical Quarterly 12, 51-60. Popper, Karl R., 1955, 'Two Autonomous Axiom Systems for the Calculus of Probabilities', British Journal for the Philosophy of Science 5, 51-57. Popper, Karl R., 1959, The Logic of Scientific Discovery, Basic Books. Scriven, Michael, 1964, 'The Structure of Science', Review of Metaphysics 17, 403-424. Stalnaker, Robert C., 1968, 'A Theory of Conditionals', American Philosophical Quarterly, monograph series 2, 98-112. Stalnaker, Robert C., 1972, 'Pragmatics', Semantics of Natural Language (ed. by Donald Davidson and Gilbert Harman), Reidel, Dordrecht. Stalnaker, Robert C. and Thomason, Richmond H., 1970, 'A Semantic Analysis of Conditional Logic'. Theoria 36, 23-42. Thomason, Richard H. and Stalnaker, Robert C., 1970, 'A Semantic Analysis of Conditional Logic', Theoria 36, 23-42. Vendler, Zeno, 1967, 'Causal Relations', Journal of Philosophy 64, 704-713. Vendler, Zeno, 1967a. Linguistics in Philosophy, Cornell, Ithaca.
INDEX
actual necessity 63 actual possibility 5 1, 63 basic causai statements 149 Brody, Baruch 108
strong 193, 195ff subjunctive 192 weak 193, 204ff internal negation 76 Kanger, Stig 109 Kaplan, David 109 Kim, Jaegwon 145, 155, 180 Kripke, Saul 111
C l 43 causal chains 186 causal sufficiency 157, 160ff Chisholm, Roderick 4, 108 comparative similarity 18ff Consequence Principle, Generalized 19 contingently sufficient conditions 162 cotenability lOff counter-identicals 6 counter-implicate 7 1 counter-legal conditionals 93ff counter-legal generalizations 48 counterparts 110 CQ
Lehrer, Keith 98 Lewis, David 3, 14, 17, 30, 43, 110, 184 limit assumption 18 linguistic approach 4, 141
Davidson, Donald 145, 152, 161 definite probabilities 209 strong 209ff weak 209ff dispositions 237ff absolute 238, 239ff probabilistic 238, 248ff Ducasse, C. J. 145
necessitation 26 necessitation conditionals 33ff
Mackie, J. L. 145, 162, 181 material generalizations 46 maximal-P-consistent subset 57 measure functions empirical 200 logical 200 'might be' conditionals 10, 28, 31ff
Goodman, Nelson 9, 48, 49, 237
P-world 70 physical laws 12, 46ff, 141 physical necessity 47, 141 physical possibility 47 possible worlds approach 13 pragmatic ambiguity 5ff, 15 probabilistic laws 189, 199, 212ff
implicate 49 indefinite probabilities 189ff material 194 simple 233
referential opacity 104 referential transparency 106 relative frequencies 189 rigid use of a term 105
epiphenomena 166 'even if conditionals 26, 29ff events 146ff
INDEX
scope of a definite description 107 Scriven, Michael 162, 180 simple propositions 73, 91ff simple states 73 simple subjunctive conditional 25,38ff simple subjunctive probability 219ff SS 42 stable propositions 72 Stalnaker, Robert 3, 5, 13, 43 subjunctive generalizations 13, 46ff, 142
strong 51, 54ff weak 51, 62ff total causes 161 transitivity of causal relations 184 transworld identity 108ff unbounded proposition 86 unbounded sequence of P-changes 85 undercutting 78 Vendler, Zeno
145
PHILOSOPHICAL STUDIES SERIES I N PHILOSOPHY Editors: WILFRID SELLARS, Univ. of Pittsburgh and KEITHLEHRER, Univ. of Arizona
Board of Consulting Editors: Jonathan Bennett, Alan Gibbard, Robert Stalnaker, and Robert G. Turnbull
1. JAYF. ROSENBERG, Linguistic Representation. 1974, xii+ 159 pp. Essays in Philosophy and Its History. 1974, xiii+462 pp. 2. WILFRIDSELLARS, 3. DICKINSON S. MILLER,Philosophical Analysis and Human Welfare. Selected Essays and Chapters from Six Decades. Edited with an Introduction by Loyd D. Easton. 1975, x+333 pp. 4. KEITHLEHRER(ed.), Analysis and Metaphysics. Essays in Honor of R. M. Chisholm. 1975, x+317 pp.
5. CARLGINET,Knowledge, Perception, and Memory. 1975, viii+212 pp.
H. MADDEN,Causing, Perceiving and Believing. An 6. PETERH. HAREand EDWARD Examination of the Philosophy of C. J. Ducasse. 1975, vii+211 pp. Thinking and Doing. The Philosophical Foundations 7. HECTOR-NERICASTANEDA, of Institutions. 1975, xviii+366 pp.