E. Scheibe: Between Rationalism and Empiricism
Springer New York Berlin Heidelberg Barcelona Hong Kong London Milan Paris Singapore Tokyo
Erhard Scheibe
Between Rationalism and Empiricism Selected Papers in the Philosophy of Physics Edited by Brigitte Falkenburg
Prof. Dr. Erhard Scheibe Moorbirkenkamp 2A 22391 Hamburg, Germany
Prof. Dr. Dr. Brigitte Falkenburg Institut für Philosophie Universität Dortmund Emil-Figge-Strasse 50 44227 Dortmund, Germany
ISBN 0-387-98520-4 Springer-Verlag New York Berlin Heidelberg Library of Congress Cataloging-in-Publication Data. Scheibe, Erhard. Between rationalism and empiricism: selected papers in the philosophy of physics / Erhard Scheibe; edited by Brigitte Falkenburg. p. cm. Includes bibliographical references and index. ISBN 0-387-98520-4 (alk. paper) 1. Physics-Philosophy. I. Falkenburg, Brigitte, 1953- II. Title. QC6.2.S34 2001 530'.01-dc21
2001020199
© 2001 Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone. Springer-Verlag New York Berlin Heidelberg, a member of BertelsmannSpringer Science+Business Media GmbH. Data conversion: LE-TeX, Leipzig. Cover design: H. Kirchner, Heidelberg. Printed in Germany. Printed on acid-free paper. SPIN: 10676544 55/3141/ba - 5 4 3 2 1 0
Preface
For a number of decades, Erhard Scheibe has been Germany's preeminent philosopher of physics. His work took starting points that were different from those of the empiricist philosophy of science, founded by the Vienna Circle, which subsequently moved to the United States with Carnap and Reichenbach, and was finally reimported by Stegmüller into the German-speaking countries after World War II. Scheibe's œuvre, by contrast, continues the rationalist tradition of philosophy in which modern physics has been rooted since Descartes. At its center are the conceptual breaks in the foundations of modern physics which shaped the 20th-century philosophy of science.

Following his studies in Göttingen, where he earned a Ph.D. in mathematics, Erhard Scheibe in 1957 became an assistant of Carl Friedrich von Weizsäcker in Hamburg. There, in 1963, he achieved his "Habilitation" with a study of the philosophical problems of quantum mechanics (Die kontingenten Aussagen in der Physik, Athenäum: Frankfurt am Main 1964). As professor of philosophy in Göttingen, he subsequently concentrated his investigations on the structure of physical theories and wrote the book The Logical Analysis of Quantum Mechanics (Pergamon Press: Oxford 1973). This study begins with a comprehensive presentation of the philosophy of Niels Bohr and ends with a formal analysis of the thought experiment of Einstein, Podolsky and Rosen. Memberships in several academies of science, fellowships at the "Center for Philosophy of Science" in Pittsburgh and at the "Institute for Advanced Study" in Berlin, as well as a guest professorship at the University of California, Irvine, attest to the international recognition gained on account of this work.

In 1983, Erhard Scheibe accepted a call to the University of Heidelberg, where he taught until his retirement in 1992.
Recently, he published a two-volume work which, beginning with the unification attempts by physicists, proceeds to develop, step by step, a new non-deductive theory of reduction in physics (Die Reduktion physikalischer Theorien. Part I: Grundlagen und elementare Theorie; Part II: Inkommensurabilität und Grenzfallreduktion. Springer: Berlin Heidelberg 1997, 1999). Apart from these monographs, numerous papers on the philosophy of physics appeared in diverse publications, a representative selection of which is compiled in the present volume. These essays trace the path of physics between rationalism and empiricism. On this path, most physicists continue
to hold fast to the old ideal of the unity of physics - in spite of quantum theory, in spite of Kuhn and Feyerabend's thesis of incommensurability and in spite of the historicist and postmodern tendencies of the postempiricist philosophy of science. In Erhard Scheibe's view, they are right to do so. For, in the end, striving for unity is legitimized by the rationalist principles of cognition to which modern physics owes its great achievements. After all, in spite of all the problems of reduction, the theories of Newton, Maxwell and Einstein and the more recent unified quantum field theories are the most successful products of the Cartesian program "mathesis universalis".

The present volume would not have been possible without the help of Wolf Beiglböck and the support of Springer-Verlag New York, as well as financial grants from InterNationes Bonn and the Göttingen Academy of Sciences. Hans-Jakob Wilhelm translated all of the papers hitherto available only in German, as well as this preface. Marcus Schulte looked after the edition and, with the support of Matthias Gillissen, prepared the script for press. Birgit Hase, Dorothee Heinen, Sonja Kiitker, Sabine Ihm, Anja Rosenkranz, Matthias Scholl and Patrick Hausmeier assisted with the typesetting and proofreading of the documents. Beyond this, my special thanks go out to Erhard Scheibe himself, who agreed to this edition and supported it in every respect.

Dortmund, January 2001
Brigitte Falkenburg
This collection of papers has been divided by the editor into eight chapters according to their contents. Each chapter comprises from four to six articles in chronological order. I have written a brief introduction to every chapter. These introductions do not follow any principle but, rather, spontaneous inspiration. Some are really introductory, others are more of an afterword. It is to be hoped nonetheless that they are useful for the reader. At the end of the volume there is a bibliography comprising all references to the literature given in the text.

Hamburg, November 2000
Erhard Scheibe
Contents

I. Between Rationalism and Empiricism
I.1 Remarks on the Concept of Cause (1969)
I.2 Aspects of Wholeness in Science and Philosophy (1987)
I.3 Kant's Apriorism and Some Modern Positions (1988)
I.4 C. F. von Weizsäcker and the Unity of Physics (1993)
I.5 Between Rationalism and Empiricism: The Path of Physics (1994)

II. The Philosophy of the Physicists
II.6 The Physicists' Conception of Progress (1988)
II.7 Erwin Schrödinger and the Philosophy of the Physicists (1991)
II.8 Albert Einstein: Theory, Experience, Reality (1992)
II.9 Heisenberg's Concept of a Closed Theory (1993)
II.10 The Origin of Scientific Realism: Boltzmann, Planck, Einstein (1995)

III. Reconstruction
III.11 On the Structure of Physical Theories (1979)
III.12 A Comparison of Two Recent Views on Theories (1982)
III.13 Towards a Rehabilitation of Reconstructionism (1984)
III.14 Paul Feyerabend and Rational Reconstructions (1988)

IV. Laws of Nature
IV.15 Coherence and Contingency. Two Neglected Aspects of Theory Succession (1989)
IV.16 Predication and Physical Law (1991)
IV.17 Substances, Physical Systems, and Quantum Mechanics (1991)
IV.18 General Laws of Nature and the Uniqueness of the Universe (1991)
IV.19 On Limitations of Physical Knowledge (1998)

V. Reduction
V.20 The Explanation of Kepler's Laws (1973)
V.21 Are There Explanations of Theories? (1976)
V.22 A Case Study Concerning the Limiting Case Relation in Quantum Mechanics (1981)
V.23 A New Theory of Reduction in Physics (1993)
V.24 The Rationality of Reductionism (1995)

VI. Foundations of Quantum Mechanics
VI.25 Quantum Logic and Some Aspects of Logic in General (1985)
VI.26 What Kind of Hidden Variables Are Excluded by Bell's Inequality? (1986)
VI.27 The Copenhagen School and Its Opponents (1990)
VI.28 J. von Neumann's and J. S. Bell's Theorem. A Comparison (1991)
VI.29 EPR-Situation and Bell's Inequality (1991)
VI.30 Three Remarks Concerning Bell's Inequality (1993)

VII. Spacetime, Invariance, Covariance
VII.31 Invariance and Covariance (1982)
VII.32 Hermann Weyl and the Nature of Spacetime (1988)
VII.33 Covariance and the Non-Preference of Coordinate Systems (1991)
VII.34 A Most General Principle of Invariance (1994)

VIII. Mathematics and Physics
VIII.35 Kant's Philosophy of Mathematics (1977)
VIII.36 Mathematics and Physical Axiomatization (1986)
VIII.37 Calculemus! The Problem of the Application of Logic and Mathematics (1988)
VIII.38 The Mathematical Overdetermination of Physics (1997)

Acknowledgements
Literature
Index
I. Between Rationalism and Empiricism
The title of Ch.I is taken from article [5] and so is indeed the title of the whole collection. The reason for this emphasis is easily explained: The phrase in question expresses the basic embarrassment of the physicist who feels himself unable to subject his discipline to one of the epistemological positions known from the history of philosophy but rather finds himself somewhere in between the philosophical extremes. Einstein has gone so far as to call the physicist an "unscrupulous opportunist" who, depending on the circumstances, appears as a realist or an idealist or a positivist or even a Platonist (cf. [8], no.48). But this wavering attitude is by no means the product of opportunism. Rather it should be viewed as protest against exaggerated positions of philosophers which bring their inventors repute and even fame but lead them astray from the road to truth. Why should we obtain all our knowledge by means of reason? Why by the senses and not by reason? If one tries to apply such an extreme and one-sided doctrine to a concrete science like physics one cannot manage it. But whoever sets out in this situation to look for a workable synthesis is not an opportunist. He is not even an eclecticist in the pejorative sense of the word which we have in mind whenever we take the extreme positions to be the genuine ones. This is not justified if every attempt at their application to a further developed science fails. We have to turn the tables and view our sciences as the empirical material for an appropriate theory of scientific knowledge. A thoroughgoing analysis has to be started before one comes to generalizations. The characterization of general results obtained in this way as 'interim positions' is then nothing but a façon de parler using familiar historical categories.
The three modern positions described in [3] can very well be claimed to have been obtained in this way, and the same holds already for their great ideal: Kant's attempt at a synthesis of rationalism and empiricism.

At the latest since Kant's famous criticism of Hume's radical empiricism, the concepts of cause and effect have been among the philosophers' seeds of discord. Paper [1] enters into work done by Josef König, who criticizes the idea of a cause as being an object or event to be met with in space and time - an idea that became familiar in the time after Hume and was taken over even by Kant. According to König a cause is not an object or event in space and time. Rather it is "a typical thought connection with the help of which man
re-establishes a fundamental intellectual contact with his environment which, originally given, has been broken by an unexpected event" (cf. [1], no.2). In [1] this idea is brought together with related ideas of other authors who likewise emphasize the factor of surprise. They point out that just because of this feature and contrary to a widespread belief, causality does not belong to the basic categories of physics and indeed has disappeared from it in favor of functional relations between physical quantities. Accordingly, König's remark does not seem to have physical relevance at all. However, in [1] attention is drawn to a structural relationship between König's interpretation of a cause and a peculiarity of theoretical explanations in physics.1

The remarkable thing in König's analysis evidently is that the typical situation in which we ask for a cause is not the situation in which we wait for confirmation of a known causal connection by a new instance of it. Rather the situation in question is that of a violation of that regularity by an unexpected counterinstance. The typical causal question is a question about the explanation or, for that matter, the cause of a violation of a normally expected regularity. Now this situation - mutatis mutandis - is essentially just a particular case of the following situation in which we wonder about the explanation of a general physical theory T (in contrast to the explanation of a singular event): As long as T is unchallenged in its field of application there seems to be no need to ask for an explanation of it. This question occurs only when T, although well confirmed already by many tests of relevant instances, finally turns out to be false. Unlike the explanations of singular events, where the explanation becomes unnecessary if the event does not occur, in the explanation of a theory it is precisely its falsity - its deviation from the 'path of truth' - which makes us ask for an explanation of it.
Accordingly, the explanation, given by a theory T' better than T, does not only explain the validity of T in a certain domain but also its failure under certain other conditions. (For explanations of theories see also Ch.V.)

Besides cause and explanation, unity and wholeness are concepts treated in Ch.I ([4] as well as [2] and [5]). They re-appear also in later chapters, mainly in II and IV-VI. The two concepts are not independent of each other, and a systematic investigation of their physically relevant connections would certainly be useful. It will not be undertaken in this collection. Of the two concepts, only that of wholeness will be analysed further in [2], the other one being used uncritically. There is, however, a close connection between unity and reduction in science. Some physicists would even take it to be a goal of physics to reduce the plurality of physical theories by reducing the number of those being irreducible to each other until one reaches, if possible, one fundamental theory - a theory of everything - to which all others can be reduced. In this sense the idea of the unity of physics stands behind the investigations in the papers of Ch.V.2

1 See also the sequel to this investigation in Scheibe 1970 and 1971.
2 The most recent contributions to this subject are Scheibe 1997b and 1999.
The concept of unity becomes fairly explicit in the philosophical program of v. Weizsäcker, dealt with in [4]. It contains a theory of the development of physics which is reductionistic in spirit and goes so far as to postulate a final theory in the sense indicated. The peculiar idea of v. Weizsäcker, over and above the general idea of the co-operation between unity and reduction shared with other physicists,2 is the idea that the final theory formulates and discusses nothing more than the conditions of the possibility of experience, i.e. that experience which has made possible the development of physics in question. Thus far, v. Weizsäcker follows in Kant's footsteps. On second thoughts, however, his approach is only semi-transcendental because in his view the conditions under discussion cannot be known a priori. They become known only in the course of the actual progress made by physics. The major difficulty for the realization of this program lies in the question of how a theory of the conditions of possible experience is related to a final theory in the conventional sense, i.e., an ordinary physical theory of elementary interactions.

[2] is a defence of the natural sciences, more precisely, of physics, against the reproach of having neglected the aspect of wholeness in their research. The reproach comes from disciplines like psychology, education, medicine, even biology - at any rate from disciplines mainly dealing with complex systems. To this it can be objected that nature, as far as it is described in physics, as well as the theories by means of which this description is done can be viewed under a variety of different aspects of wholeness and have been viewed in this way. A list of these aspects in arbitrary order would look like this:

1. wholeness (of a piece of art) in the sense of Aristotle
2. wholeness as rigidity of a theory in the sense of Einstein (v. Weizsäcker, Weinberg)
3. wholeness as closedness of a theory in the sense of Heisenberg
4. wholeness as a complete coherent system in the sense of idealism
5. wholeness of Newton's theory of gravitation as well as of a system of bodies moving according to this theory
6. Duhem/Quine holism
7. anthropic principles
8. ontic monism
9. individuality of quantum phenomena (inseparability)
The aspects in this list are presented briefly in [2]. Some are resumed in [5] and discussed in greater detail. A theory of the physically relevant aspects of wholeness showing them also in their total complexity and interrelatedness is an urgent task for the philosophy of science.
I.1 Remarks on the Concept of Cause*

In the year 1949 Josef König published an extensive study entitled "Bemerkungen über den Begriff der Ursache" (Remarks on the Concept of Cause).1 Addressed directly to Hume's and Kant's treatment of the question, taking little stock of subsequent - especially the most recent - developments, and having, for this or for some other reason, been passed over in silence by the pertinent literature since 1949, König's study may be said to occupy an isolated position among the more recent discussions of questions pertaining to the area of causality. Those who ignored that study did it - or rather, they did its author - a grave injustice that wants redress. I do not propose to take the present opportunity to make up for this entirely, but only to select a single thought, albeit a central one, from König's profoundly complex and widely ramified argumentations, and to make it useful for my own purposes. In restating this one idea, I have found it advantageous for my own intentions, while compatible - I believe - with König's, to take certain liberties, occasionally to follow an alternate sequence, or perhaps even to place the accents somewhat differently.

* First published as Scheibe 1969, translated by D.J. Marshall jr.
1 König 1949

König arrives at his idea by way of a two-fold criticism: It is directed not only against a very striking - and for that reason well known - result of the analysis to which Hume subjected the concept of cause, but against a much less striking presupposition as well which not only Hume, but also Kant, allowed to suffuse, as it were, his reflections on the concept.

Hume's result, to which reference is made, is unfolded in a chain of assertions which, as mentioned, are well known: that we, as humans, have no a priori insight into a conjunction between A and a (numerically distinct) B which, when we assert A to be the cause of B, we allege to be necessary and at the same time objective; that we depend rather on experience alone for the ascertainment of a causal connection; that the experience in question consists in the repeated perception that something of the nature of B follows temporally upon something of the nature of A; and consequently that not only our knowledge of one causal connection or another but the very concept of causality itself arises from this type of experience alone. The tacit presupposition, however, which Hume and, following him, Kant make with respect to the concept of cause is that whatever stands in a causal relation to something, and which we accordingly call its cause, is also, that is, apart from its designation as cause, something like an object or an event, in any case something that can be localized in space and time.

Against Hume's doctrine and the prejudice common to Hume and Kant, both of which have been recalled, König proposes his fundamental thesis, which he formulates as follows:
The so-called cause is simply not a thing or a being; it is not something given at a certain place and time, but a certain typical thought connection with the help of which man re-establishes a fundamental intellectual contact with his environment, which, originally given, has been broken by an unexpected event.2

Let us concentrate on the part of this thesis - that is, the second part - with which König opposes Hume. In a different context we read the following:

Hume teaches that the notion of something like a cause forms in us because and insofar as we experience that like always follows upon like. On this basis, the more a particular observation presents a sequence of like upon like, the more we should expect our awareness of causation to become lively and emphatically clear. Quite the contrary, it appears obvious to me that things are actually the other way around. We ask 'Why?' precisely when what occurs is different from what we expected, precisely, therefore, when what we observe, contrary to a tacit expectation, is not a sequence of like upon like.3

Let us determine, first of all, what constitutes the peculiarity of an approach to an explanation of the concept of causality that begins with such a remark. Obviously it consists in the fact that (1) to answer the question as to the nature of cause, the author asks what the typical situation is, in which the knowledge of a cause is sought, in which the need to inquire after a cause arises, or in which the question as to the "Why?" of a thing simply comes, as if of its own accord. This typical situation is then (2) identified as one in which an occurrence is experienced, not merely insofar as something has occurred as it did, but insofar as something, in occurring the way it did, occurred differently from what was to be expected. But the really decisive step in the determination of the concept of cause is (3) this, that the concept incorporates within itself the conceptual description of the situation indicated.
And this is accomplished in the only way possible for such an approach, by means of a definition of the concept of effect.4 For what is given in the situation typical for the inquiry after a cause is precisely the effect of the cause sought. As a consequence of (2) that which is given does not consist in the fact that something has occurred in such and such a way, but rather - in a brief, if somewhat misleading manner of speaking - that it occurred in this, but not in that way. Only when both of these components are united can the given be termed an effect, and, viewed in this way, such an effect is not at all a change in the ordinary sense, that is, a change occurring in time. It is not, for example, a temporal change in the place or in the velocity of a body moving in space; it does not consist in the fact that a body is now here and later somewhere else. Rather, when it is said in reference to an effect, that something, occurring the way it did, occurred differently from what was to be expected, the word 'differently' refers to the distinction between an entire temporal process (which actually took place) and a different entire temporal process (which though possible and even expected, did not actually take place); hence it refers to something which obviously cannot itself be understood as having taken place in time.

2 Ibid., p. 29.
3 Ibid., p. 28.
4 Ibid., pp. 73ff.

On this basis it no longer appears strange - and herewith we are concerned with the other, that is, the first, part of König's thesis, directed against the prejudice, common to Hume and Kant, that a cause has the character of an event - that the notion 'cause' also refers, not to an object, or an event, or to anything at all that is found in space and time, but rather to the above cited "typical thought connection", for the further characterization of which I should once again like to let König himself have the word. In view of the danger that König views as having befallen Hume and Kant - the danger, that is, of taking the notion of causality as an extensional concept referring to spatio-temporal objects - he suggests to the reader that he keep himself from the very beginning wide open to the natural, close at hand, and yet very essential view that the cause of a thing has been indicated when the spontaneous question "Why?" has been satisfactorily answered. Hence the concept of cause is first and foremost the concept of a satisfactory answer to the question "Why?". The so-called cause is ... what makes us understand why....
It is the "That's why" answering the "Why?".5

The characterization of cause given earlier as a typical thought connection, "with the help of which man re-establishes a fundamental intellectual contact with his environment, which, originally given, has been broken by an unexpected event", finds resonance in this new characterization only insofar as the question "Why?", with which we seek a cause, is called spontaneous and is clearly presented as spontaneously arising precisely from what is wrong in the situation described. Concerning the thought connection itself, however, as which the cause is identified, we now learn that it must make us really understand something, so that a cause is in every case an explanation. At this point we get a clearer view of König establishing his central thesis, that the radical separation of the logical relation of premise to consequence from the allegedly real, translogical relation of cause to effect, which came to be accepted largely in the period following Leibniz, was not a step forward and essentially must come to be rejected.6

5 Ibid., p. 40.
6 Ibid., p. 27.

The subsequent development of König's presentations shows that his answer to the question as to the nature of cause turns out not to be unlike
the deductive-nomological model that at about the same time Hempel and Oppenheim proposed as their answer to the direct question (that is, to the question that does not go by way of the concept of cause) as to the nature of scientific explanation.7 I speak of no more than a certain similarity because I wish to designate the one point in which it consists. Just as it is important for Hempel and Oppenheim, among others, to exhibit in an explanation that which is explained as a logical consequence of that which does the explaining, so it is essential to König to identify the necessity of a causal connection as a logical necessity and on precisely this basis as something intelligible. If, for example, to his question why this ball, which has just been at rest, is now in motion, someone is given the answer, because it was pushed, this answer will be a satisfactory one, that is, one that enables him to understand what he previously could not, only providing he is in possession of the general empirical principle that a ball which had been at rest and has been given a push is a ball in motion. Hence the only genuinely satisfactory answer is one which includes, besides the singular proposition indicating the cause in the commonly received sense, a general empirical principle as well, and from which, since it contains both, it is possible to deduce logically the proposition describing the situation that had provoked the question. König's treatment of this subject cannot be reduced to a formal model, as can Hempel's and Oppenheim's. Further, he does not deal with the concept of explanation, but of cause; nor, as a consequence, does he deal with the more modest assertion that an explanation is a satisfactory answer to a question asking "Why?", but with the assertion that a cause is such an answer.
Aside from all of this, however, the point with which I, certainly, am most deeply concerned - although König perhaps is not - is one in which his reflections differ essentially from those of Hempel and Oppenheim. The particular situation in which or as a result of which the question "Why?" is asked - not merely in a devil-may-care manner but for a reason - and which provides, as it were, the fertile ground without which no understanding can thrive, is something that Hempel, in the subsequent development of his theory of scientific explanation,8 is concerned to separate, as a purely pragmatic aspect, from the explanation itself. König's idea, however, that the indicated characterization of the typical situation in which we are led to ask "Why?" ought to be incorporated either in the concept of explanation or even - as König himself holds - in that of the cause, strikes me as a very happy one indeed.

7 Hempel/Oppenheim 1948.
8 Hempel 1965, pp. 425 ff.

Before I proceed on the basis of these remarks of König to questions pertaining to physics, it will be well to examine more closely the peculiar approach he takes to the clarification of the concept of cause - the bringing to consciousness and the correct evaluation of the particular situation in which the question of a cause arises spontaneously. For this purpose I should like to draw attention to a writing published ten years later by Hart and Honoré,
entitled "Causation in the Law".9 As the title indicates, the purpose of these authors is to examine the concept of cause as it relates to the science of jurisprudence, or even more to the actual practice of the courts. Fundamentally their discussion is based on a criticism of the pertinent views of Hume; they outline the improvements due to J. S. Mill, but criticize them also. The discussion attaches a good deal of importance to the question of the criteria we use when, out of a large number of conditions necessary for the occurrence of an event, we select, as we always do, a single one which we treat as the cause, downgrading the others to the level of mere concomitant circumstances. The answer given by Hart and Honoré to this question is centered around the notion that we always make causal judgments within a context that defines more or less clearly which circumstances are to be viewed as normal and which as abnormal. What is called "ordinary life" is, of course, the broadest context that can occur. On this context Hart and Honoré say the following:

In ordinary life the particular causal question is most often inspired by the wish for an explanation of a particular contingency the occurrence of which is puzzling because it is a departure from the normal, ordinary, and reasonably expected course of events: some accident, catastrophe, disaster or other deviation from the usual course of events.10

Obviously this is the very same emphasis as is placed by König on the peculiar individual situation in which something occurs differently from what a given context would have led us to expect.
And even though Hart and Honore do not draw the same conceptually rigorous consequences as Konig, yet it is no less important to them to differentiate their theory from that of Hume and Mill by indicating the context in which the notion of cause finds its application: The notion that a cause is essentially something which interferes with the course of events which would normally take place is central to the common sense concept of cause, and at least as essential as the notion of invariable or constant sequence so much stressed by Mill and Hume. 11 But in pursuing their more proximate goal, the issue of how to single out a cause from merely circumstantial conditions, Hart and Honore are now in a position to place these secondary conditions, which are not identified as causal, precisely in the kind of context the absence of which makes it meaningless to speak of causality in the first place. Thus, for example, the oxygen in the air and the presence of adequately desiccated combustible material are conditions apart from which a house cannot burn down. Still they cannot be
9 Hart/Honore 1959. 10 Ibid., p. 31. 11 Ibid., p. 27.
identified as the cause of the conflagration since they are, or can be, present whether a fire breaks out or not. Hence they do not constitute the characteristic difference that gives rise to the kind of situation in which something develops differently from the way it would under normal circumstances. Since my present concern, which I have already indicated but which can emerge clearly only after gradual preparation, is directed to a proposal which, if not opposed to, is yet parallel to the explanation model of Hempel and Oppenheim - a proposal oriented to a situation in which an explanation becomes actually urgent and imperative only as a result of an occurrence differing from the one anticipated - I shall again refer to the text of a recent analysis given by Hart and Honore. Considering the case of a man who was shot by another, the authors, who, here as above, are concerned to differentiate a cause from concomitant circumstances, argue as follows: Here we shall treat the shooting, not the later deprivation of his blood cells of oxygen, as the explanation and the cause of his death, although it is perfectly true that we could predict the man's death from knowledge of the earlier part of the process. 12 Here we find the observation that the cause, insofar as it is to explain the present case, is not to be sought in those conditions which would make it possible to predict the course of events with the greatest possible certainty. Hence there emerges a kind of contest between a popular and a more scientific way of envisaging things; in fact, in a preliminary way Hart and Honore justify the identification of the cause in the present case as follows: One ... important motive for rejecting the later conditions as the causal explanation in such common sense inquiries is the fact that in such cases we are not looking for the cause of "death", but for the cause of death under circumstances which call for an explanation.
We want to know why Smith died when he did; we do not want to be told what is always the case whenever death occurs. The former is typical of the common sense interest in causal questions ('Why did this happen when normally it would not?'), the second is typical of the experimental sciences ('What are the general conditions of death?') ... 13 Opposing the cause of "death" to the cause of death under circumstances that call for an explanation, and drawing a parallel opposition between the scientific and common sense interest in the causal question, this passage suggests that in science, if the word 'explanation' is taken in a very fundamental sense, actually nothing is explained - because the very impetus leading to scientific inquiry is not the requirement for explanation in this sense. That this is the view of Hart and Honore is revealed by a second justification of the identification of the cause with a very early, but not with a later, part of the process under consideration.
12 Ibid., p. 36 (my italics). 13 Ibid., p. 37.
These later phases come to light only after we have identified through common experience abnormal occurrences ... of certain broadly described kinds ("shooting", "blows", etc.) which bring about disturbances of the normal course of things .... To cite these later phases of the process as the cause would be pointless in any explanatory inquiry for we know of them only as the usual or necessary accompaniments of the abnormal occurrence ... , which has been already recognized as "making the difference" between the normal course of events and what has in fact occurred, and so as explaining the latter. The details of the process have in themselves no explanatory force. 14
It has already been mentioned that such a use of the notion of explanation seems to indicate that in the physical sciences, which are concerned with the questions of detail, things are explained, if at all, only in a quite different sense, and that what happens when - as in the case of Hempel and Oppenheim - this model, in which the procedure typifying the physical sciences has been captured, comes under the rubric of 'explanatory model' is at the least extremely odd, even if it is specified that 'explanatory' is meant here only in a scientific sense. Indeed one must wonder whether the description of what occurs in the sciences does not itself require a notion of explanation derived as a legitimate descendant from the one which, characterized here in a preliminary fashion on the basis of considerations taken from Konig and from Hart and Honore, stems from the everyday concept of cause. This, as ought to be openly admitted from the start, is a delicate and a difficult question. To answer it, it may be appropriate to realize first of all with the greatest possible clarity something that may seem to point in exactly the opposite direction: In a certain sense, the evolution of physics has tended towards the elimination of every causal conceptuality in the ordinary sense of the word. In view of the current confusion of ideas in the area of causal concepts, for which the success of quantum theory was not least responsible, it is not only appropriate, but necessary, to encumber this clarification further with a kind of guerilla war, even though - measured by the goal for which we are striving - the battle must be won, so to speak, with our backs to the enemy. Of the partisans that must be driven from their hide-outs, some are dug in behind what may be called the popular knowledge of what has taken place and what is still going on in physics, and some will be found behind what has been called "the spare-time philosophical works of scientists".
What is going on in these circles is in part an innocent plea, and in part an all-out fight for the rather unsophisticated opinion that physics is concerned with causality. Let us for the moment take stock only of the popular knowledge of physics. It teaches us that physics is concerned to establish laws of nature that may just as well be called causal laws. Furthermore, the popular mind is sufficiently enlightened on the subject of the so-called principle of causality to understand it as the assertion that nature is governed by causality and
14 Ibid., p. 38.
every event has a cause. It has, no doubt, gotten abroad that this principle of causality has begun to totter in the more recent physics. But even this circumstance seems to indicate that it is still not completely out of place to take the view that the conceptualizations of physics are somehow - if only at a very general level and in some cases even in a negative sense - tied to that of causality. Now, it is entirely within the scope of the present study to view the problem of causality in physics as not clearly settled. The trouble is that the discussions I have in mind, for the sake of which guerilla operations must be carried out, are lacking in those distinctions which, in view of the developed state of physics and of the philosophical tradition surrounding the problem of causality, must be made if the question is to be in any way advanced. These circumstances seem to indicate that we call to mind the above mentioned tendency of physics to develop in the direction of an elimination of the notion of causality. In this connection I shall invoke authors who have grasped the situation with the greatest imaginable clarity, and have expressed themselves as unmistakably as possible. Unlike the popular mind, the scientific world of today should no longer be surprised to learn that the attempt to assign a legitimate place in physics to the concept of cause - even if it is taken in a completely immanent sense - is not without difficulty. For example, it ought to be known that even in his day Mach regarded the notions of cause and effect as unsuited to the genuinely scientific exploration of natural phenomena and attempted to have them completely superseded by that of functional dependence. It is perhaps less known, though in the present context significant, that when he comes to discuss questions pertaining to causality Mach is capable of making remarks similar to those excerpted from the analysis of Konig.
The following is from his "Prinzipien der Wärmelehre" (Principles of the Theory of Heat). In general we feel the need to inquire after causes only when an (unusual) change has occurred, because on the one hand only such a case draws our attention and gives rise to questions, and on the other only the occurrence of different cases (changes) lends meaning to the question as to what conditions determine the one or the other. 15 I think it is important to take a moment to examine this argument a bit, and perhaps also to rectify it, because something can be learned from it that goes beyond the analysis carried out from a somewhat different perspective by Konig. Mach claims that in general we feel the need to inquire after causes only when an unusual change has occurred; to support this claim he gives two reasons: Obviously the first is connected with what is unusual, taken - as is made clear by a further quote given immediately below - in the sense of what is unexpected in a change and is alone capable of giving rise to the need. This reason may in fact be viewed as indicating the grounds for our need to inquire after causes, the grounds, it may be said, for what Hempel refers to
15 Mach 1896, p. 340.
as the pragmatic part of that situation in which humans customarily inquire after causes. 16 Mach's second reason is not at all so much a reason for the need under discussion except, perhaps, in the weakened sense that humans in general should not have the need to ask meaningless questions. The assertion that only the occurrence of different cases lends meaning to the question of what conditions determined the one or the other supplies therefore a reason, not for the fact that the need to inquire after causes is present when it is, but for the circumstances under which alone such an inquiry becomes meaningful. And for this is adduced, not, as in the other case, what is unexpected in a change, but precisely and exclusively the bare fact that a change - or rather a difference - is present. Hence the second reason points, not to a pragmatic, but to a semantic aspect of the kind of situation in which a cause is sought. In "Erkenntnis und Irrtum" (Knowledge and Error) Mach returns to the theme of the passage taken above from his "Prinzipien der Wärmelehre".
If everything were to proceed on a completely regular basis without the slightest disturbance, just as the night follows the day, we would adapt to the sequence of events without ever thinking. Only a departure from the rule or the absence of a rule constrains us to ask why these events occur one time, those another. Not much further we read the following: Once the presupposition of the constant conjunction of the elements has become embedded in our thought, we seek at once the cause of every change that occurs unexpectedly and for the first time .... Every change appears as a disturbance of the previous stability, as a dissolution of what formerly held together. It undoes the accustomed sequence ... and impels us to seek another, to cast about for a cause. 17 In terms of the distinction made just above, the aspect dominating these lines is clearly the pragmatic one; even so, in reading them one gets the feeling that he is watching someone touch, as it were, the very nerve of the problem of causality. The inference seems to lie within our grasp: The customary, regular and undisturbed course of events - or, on a higher conceptual level, the regular or lawlike connection hypothesized within the framework of a physical theory - is precisely not that for the description of which the concepts of cause and effect can, or even must, serve. Rather it is the case that such a course of events and such a connection must be presupposed in connection with a disruptive phenomenon, and this not merely to make inquiry after causes intelligible but also as a condition of the cause being meaningfully identified. When Mach speaks of cause in the last sentence of the above quoted passage, he refers to what, in a particular case, caused a new connection, and not the customary one, to take shape. There is no visible way for the
16 Hempel 1965, pp. 425 ff. 17 Mach 1920, p. 277.
identification of such a cause to proceed without making reference to both connections. Furthermore it would have seemed but one step away to produce these very considerations as the reason for rejecting the use of causal notions in physics, to say that the theories of physics are constructed in such a way as to eliminate the element of surprise because they involve the formulation of general connections that under specified initial and boundary conditions determine clearly what is certainly or in all probability to be expected, and that this is precisely the reason why the concept of cause is of no use to them. In spite of this we see Mach slipping, in the immediate context of the passages quoted above, back into the wake of Hume. Following is the continuation of the first passage: Which things are invariably connected, which are mere chance concomitants? By means of this distinction we arrive at the notions of cause and effect. We call one event a cause, with which another (the effect) is invariably connected. 18 From this point on he accepts entirely Hume's characteristic - but fateful - interpretation of the concept of cause as that which occurs in accordance with a physical law. It is not as though in saying "We call one event a cause ... " Mach was attempting to formulate a rigorous definition. This is obviously nothing more than a reminder of a current view. For Mach himself is concerned precisely to bar the notion of causality from physics. But in the final analysis the reasons he gives are connected with this current view of causality, which he considers too vague and superficial, and not with the remarks quoted above, of which the present study had to take stock.
For they might have revealed to Mach its central thesis, that while the concept of cause is indeed not suited for the formulation of physical theories, it may yet occupy a legitimate intermediate position, as it were, between two theories applied to the same case or between two cases to which the same theory is applied, because the discovery of a cause permits us to understand why one, and not the other, of these two possibilities must be adopted. Having discussed Mach, I shall turn now to Campbell's careful analysis of the role in physics of the concepts connected with cause. 19 Campbell bases his analysis on the following concept of a causal relation. It is a dyadic relation between events, of which cause and effect are the relata, and which is (1) lawlike, (2) temporal, and (3) asymmetric. That it is lawlike means that the occurrence of an event of type B is invariably attended by the occurrence of an event of type A. The requirement of temporality means that if two classes of events, A and B, are in a causal relation to each other, then the temporal sequence of the occurrence of corresponding events is unambiguously determined. Lastly, the asymmetry of the causal relation consists in the fact that an event of type A, insofar as it is the cause of an event of type B, cannot be
18 Ibid. 19 Campbell 1920, Ch. III.
at once the effect of such an event as well. Having defined in this manner the concept of a causal relation, Campbell proceeds to envisage certain relations which physicists have deemed worthy to be called physical laws, and then addresses the question whether these relations are of a causal nature in the defined sense. His finding turns out to be completely negative and contains two points that are relevant to the ideas under discussion. The first is this. Campbell lays it down as a part of his general position that the laws of physics must be open to experimental testing, and he observes that for this, at least with respect to all experiments carried out in laboratories, the arbitrary creation or modification of particular conditions is characteristic. For example, in testing Ohm's law several different values are given arbitrarily either to the current or to the voltage in a wire in order to measure the corresponding values that result either for the voltage or the current. Now, while in the case of a state of equilibrium it is meaningless to say that the current flowing in a wire is either the cause or the effect of the voltage it is carrying, it has a certain meaning to say that a change in the current is the cause of a corresponding change in the voltage. But the meaning is not that it has now become possible to treat the law linking changes in current to changes in voltage - unlike Ohm's law, from which it is derived - as if it had the structure of a causal relationship. This would be merely to fall back into the old error. What is meant when a change in current is spoken of as the cause of a change in voltage is rather the arbitrarily performed act of a human being, by means of which a current flowing in a wire is given a value it did not have before. That this is indeed what is meant appears with particular clarity in the case of Ohm's law because in this case the arbitrary change may affect the voltage as well as the current.
If the relation in question were understood merely in terms of the changes in quantity, it would not insure the asymmetry essential to the causal relation. For both changes the sequence of cause and effect would remain completely unspecified. To say that the change in current is the cause of the change in voltage as opposed to the reverse is simply to say that in the case before us someone elected to change the current and not the voltage. A parallel reflection shows the sense in which the causal relation, taken in this way, has a temporal nature. To say that the cause always precedes its effect in time is to assert merely - in the case of the experiment under discussion - that the current, for example, must be changed first before it can be determined what change in voltage has been brought about. This, then, is one of the points that Campbell's analysis takes as central: The use of the notion of causality has a place in physics at least when it is introduced to help describe the possibility and actuality of modifications brought about arbitrarily by a physicist in the course of an experiment. Campbell's subsequent analysis is concerned with the question whether the use in physics of the notion of causality can be defended in any sense other than the one indicated. In this connection he makes a second important
remark. It is that contemporary physics has come to take the notion of process as one of its fundamental concepts, using it in the quite uncomplicated sense of a mere temporal change in state, the simple course of an occurrence having a sequence in time in which the only thing that happens - as Toulmin once put it, borrowing an expression from Aldous Huxley - is "one damn thing after another". Physics, no doubt, is concerned to ascertain conditions under which processes not only follow their course, but follow it in the manner which the conditions uniquely prescribe - hence it seeks determining conditions, the knowledge of which makes it possible to predict the course a process will follow. This program led first to the development of classical mechanics. A moving body not acted upon by forces, a body falling in free fall, the bodies of the solar system - they all follow a certain path; hence they change their places as well as other properties, and certain combinations of properties determine already at one given moment every subsequent occurrence at every time thereafter. But thereby hangs the entire tale that classical mechanics can tell, and it has no room for causes and effects. Campbell uses the passing of a spark through a mixture of hydrogen and oxygen followed by an explosion as an example of a process in nature which cannot as yet be understood in terms of the schema outlined above, and which because of, or at least in accordance with, this is given a causal description: the spark is said to be the cause of the explosion. This type of case occasions him to remark as follows: We are forced at present, owing to a deficiency of knowledge, to state the relation between the spark and the explosion as causal; but we feel that if we knew more about it, we might be able to state it in the form that a process starts in the gas when the spark passes, and after continuing some time, becomes (not causes) an explosion. 
If we could state the law in that form it would be more satisfactory; the use of the causal relation in a law is a confession of incomplete knowledge .... So little is it our object to order our external judgments in terms of cause and effect that our efforts are consistently directed to ridding ourselves of the necessity for employing cause and effect at all. 20 We are now in possession of Campbell's second point. To speak of causes in cases that do not involve human actions is, in the light of the deterministic theories of classical mechanics, tantamount to the implicit admission that the processes in question have been insufficiently analysed; for if the analysis were pushed sufficiently it would lead automatically to the elimination of the concept of causality. Campbell's first idea, that the concept of causality is legitimately used in physics only to describe the carrying out of experiments, seems to relate antithetically to Konig's idea - with which the present study began - that it is possible to speak of causes and effects only in connection with unusual
20 Ibid., pp. 66-67.
situations. A series of experiments which is conceived by a human being and carried out according to plan and in which causes are introduced arbitrarily, seems a situation entirely different from one in which the cause of a single event is sought because the event is perceived as something unexpected or unusual. But the psychological aspect of what is unexpected in an event has already been shown to be secondary; it should not be permitted to lead us astray. To designate an occurrence as unexpected or surprising is only one way to characterize its relation to a sequence of events from which it departs, even if it is an important and - especially in everyday life - commonly used way. What is of primary concern if the concept of cause is to be used correctly is that it not be applied to a process in isolation but only insofar as it departs from another process. But this is precisely the situation that arises whenever there is human interference - especially in the case of an experiment. Every purposefully conducted experiment the course of which may be said to involve intended causes is an interference by means of which something is made to follow a course different from either the one it would have followed in some natural way if it had been left to itself, or the one followed by another experiment the result of which is already known. This is precisely the reason for the subjective value of an experiment: that it can teach us something. Other authors as well have recently stressed the fact that it is precisely the departures from a course of events presupposed as normal, that, resulting from human action, point essentially to the identification of causes as human actions. In the work of Hart and Honore referred to above we read the following: Human action in the simple cases, where we produce some desired effect by the manipulation of an object in our environment, is an interference in the natural course of events which makes a difference in the way these develop. 
21 The really decisive component of a human action, insofar as it is designated as the cause of something, is brought into still clearer focus by Toulmin. 22 In his study of the concept of cause he too starts with the invitation to "consider first the sorts of everyday situations in which we have occasion to ask questions about causes". Then come examples ("A wireless set, instead of giving out a Haydn symphony, howls dismally... " etc.) showing that the next step remains quite in the same line. Then follows, briefly, the aspect of human interference: In the case of a failure, our interest in the arising of the cause is to find out what ought to have been done differently in order to prevent the catastrophe. Finally he says, It is not essential that the search for causes should be anthropocentric, but that it should be diagnostic, that is, focused on the antecedents in some specific situations of some particular event, is essential.
21 p. 27. 22 Toulmin 1953, pp. 119ff.
In accordance herewith, Toulmin arrives, with respect to physics, at the result that the concept of cause has a role to play in the applications of physical theories, but not in the theories themselves. But insofar as human actions are adduced to legitimatize the concept of cause in applied physics, it is ultimately essential, if such actions are to be causes, that they be linked to situations characterized by the fact that one sequence of events stands sharply distinguished from another which provides, as it were, the logical occasion for recognizing the one as a cause. The fact that the cause can appear on the one hand as that which is spontaneously sought for, and on the other - as in the experiment - as that which is arbitrarily brought about, is of significance for the question as to how the notion of causality arises, but no longer for the question of its logical structure. In the former respect the experiment is obviously on a higher and later stage of development than the enquiry after a cause to which the mere perception of an unexpected event gives rise. Let us now turn our attention to Campbell's second idea, that the formulation of a physical law by reference to a causal relation always amounts to the admission of incomplete knowledge, and that for this reason the history of physics has shown a tendency to eliminate the use of such references. In order to locate this position immediately and as clearly as possible, it will be well to recall for a moment the previously mentioned carelessness with reference to causal concepts that has been cropping up widely of late. While Campbell points to incomplete knowledge as a reason for taking a process as causal, we read today quite commonly that our incomplete knowledge is responsible for the acausal character attributed by quantum mechanics to elementary processes.
Whatever may be the concept of causality that figures in the background of the debate on causality in quantum mechanics, it must now be our task to determine which understanding of this conceptuality can justify Campbell's thesis, and for this it will not be enough to refer to his own explanation as given above. It will suffice, however, to have recourse to that causal notion which contains as an essential element the description of what Toulmin refers to as a diagnostic situation. In order to understand this, we must take stock of the fact that in human experience the world is not given as a whole. If it were - let this be laid down as a premise to set the course of the following considerations - the concept of cause would be unknown to us. As it is however, both ordinary experience and scientific experimentation and theorizing occur generally on the basis of certain very limited zones - to say the least - of the reality of nature. Hence the areas to which the experience both of everyday life and of science give us access are characterized by a border - more sharply defined in the one case than in the other - separating the things belonging to a particular area from the rest. These borders involve spatial and temporal limitations by which, however, they are not exhaustively defined; in some more general
sense they have a content of their own. Now it is important to realize that the situation of these borders, while perhaps not essentially subject to conditions, is yet determined by de facto conditions relating not only to man's situation in the world, but also to the goals he pursues in the sciences, and finally to the structure of that area of physical reality insofar as these goals are adapted to it and it is reciprocally adapted to them. As a consequence of these conditions, all of these areas of experience occur in groups. The areas belonging to a single one of these groups show a broad similarity that gives rise to generalizing assertions about the single area. Furthermore, each area taken in itself is sufficiently closed off from its surroundings to make events within it predictable to a degree commensurate with the needs of everyday life or with the higher requirements of science, whichever the case may be. The creation of the concept of such a closed system, as it is called technically, together with the fact that such systems are readily found in physical reality, was fundamental for the establishment of modern physics. Physical laws are concerned with what occurs in them, and what these laws bind together are not events or things which relate as causes and effects, but quantities, properties, and states of the particular systems. However, it often happens that the closed character of a system is significantly violated from without, that is, from beyond the boundary separating it from its surroundings, and that what occurs within it more or less suddenly starts to go awry in a manner that can no longer be made intelligible in terms of the conditions that have gone into the concept of the system itself. 
These are the cases in which the occurrence of an event can no longer be understood or even formulated if the closedness of the system is maintained and for which, relative to the system - which had previously been considered closed - the notion of cause is introduced. In saying this I have only stated in a way that is new and somewhat more determined in terms of physics what has concerned us for a long time. But now comes the issue under discussion: How are we to go on speaking in the new language when asked how a cause is in fact to be indicated, described or characterized? The following is a partial answer to this question. As long as that which is being characterized is to be understood as a cause of something, its characterization, regardless of the form it may take in a given case, will never attain - in most cases not even remotely - the kind of completeness achieved in the description of the states of the closed system in which the disturbance was produced by what is now to be characterized. This, unless I am mistaken, is the circumstance which Campbell envisages when he says that a causal description is characterized by a situation in which there is not as much knowledge as there could be; this statement can be understood in terms of a concept of cause satisfying at least the one condition that a cause only causes something to happen that could not have if the system the states of which have been described were really closed as the description assumes. The reason why the characterization of the cause, and hence of the effect -
taken in the ordinary sense as that which actually occurred - is necessarily incomplete is that a complete characterization would be possible only within the framework of a description of the states of a suitably extended system. In this framework however, it would be as meaningless to speak of a cause as it was in the original system. A typical example taken from physics, that can serve to clarify this situation, is the disturbance of a planetary orbit due to the presence of another planet. It is well known that the discovery of the planet Neptune was a consequence of disturbances recorded in the orbit of Uranus. This discovery led immediately to a typical situation in which the question of a cause was in the forefront, and in which the kind of answer given was to decide whether or not an entire theory would survive: The orbit of Uranus was different from what it should have been on the basis of the Newtonian theory of gravitation and of the factual data on all the then known planets. The most natural assumption, if an otherwise well established theory was still to be maintained, was that the deviations must be caused by a heretofore undiscovered planet. In this case it was even possible to derive from the disturbances such an exact quantitative determination of the cause, that on the basis of the calculations it was possible to locate it in the sky. But it must be made clear that even in a case like this the quantitative - hence informative - indication of a cause is subject to fundamental limitations. This would be most easily seen if a small planet of the size of the Earth were given as the cause of disturbances in the orbit of a large planet of the size of Jupiter. Everything will be all right as long as nothing further is asserted with respect to the system composed of the Sun and Jupiter than that the Earth is the cause of irregularities in Jupiter's orbit. 
Things start getting difficult if greater precision is attempted, for example if the Earth, subject to such and such conditions, is assigned as the cause of such and such irregularities in Jupiter's orbit. For since the planetary influences are mutual, and since that of Jupiter on the Earth is markedly greater than its inverse, the disturbance produced by the Earth on Jupiter's orbit depends primarily on how Jupiter itself may happen to continue in its motion. Hence the precisely determined discrepancy between the disturbed and the undisturbed orbit of Jupiter cannot be entirely identified as a definite function of the Earth. In a case like this the only thing that will prove useful is to replace the system made up of the Sun and Jupiter with the extended system made up of the Sun, Jupiter, and the Earth in such a way as to subsume this entire system under the equations pertaining to gravitation. Once this is done, however, it is no longer meaningful to speak of a cause, but only of a manner in which the state of this entire system changes in the course of time. What has been described heretofore could be called briefly the dialectic of the causal and the functional or deterministic concept of nature in classical physics - the word 'deterministic' placing its peculiar emphasis on the temporality of things. In a brief recapitulation, this dialectic can be described as
follows. It is possible to abstract from the common sense notion of cause a central characteristic which must be taken as a minimal condition required of every causal concept, if that concept is to retain an essential similarity with the common sense one. According to this condition it is not meaningful to speak of the cause of an isolated occurrence, but only of the cause of the fact that one occurrence deviates from another. In many cases this other occurrence is the one that, within the framework of a particular set of parameters containing both empirical and rational elements, is expected or anticipated; in any case it is an occurrence that can be formulated, hence understood, in terms of such parameters. None of this is the cause for the deviating occurrence. What this set of parameters involves may vary from the experiential horizon of an Australian bushman to a highly complicated theory of modern physics supported by ingenious experiments. Such differences are not essential. What is essential is that a horizon of expectations is explicitly mentioned so that a deviating occurrence will stand out against its background, making it possible to inquire meaningfully after the cause of the discrepancy. Conceived in this way, the notion of cause has no immediate connection with the concept of a physical law. Rather, physical laws are formulated in terms of some set of parameters or another that must always be presupposed together with everything that belongs to it - including the physical laws - if it is to be possible meaningfully to seek and to identify causes. The laws themselves formulate purely functional connections in terms of the concept of the state of a system. Hume imposed qualifications on the idea that a causal connection is to be understood as a law, but both he and many of his followers persisted in taking the causal connection as the very prototype of physical law. 
This, however, is a very infelicitous way of associating things, because it relegates physical laws to a purely qualitative status, preventing them from achieving the degree of informativeness provided by quantitatively stated laws of a functional nature. And this deficiency can be made all the more plain if the concept of cause is further subjected to the above mentioned minimal requirement, by which it assumes, in the case of concrete application, a role mediating between two functionally stated laws. However, the non-quantitative character of the notion of cause points to the fact that in dealing with this notion we are dealing with a model for a concept of explanation and that as such it has a positive role to play in physics. Taking this as the result of the previous analyses, I shall return in closing to their beginning; it is to be recalled that I presented there the ideas of König in such a way as to bring out the possibility that the two parts of his basic thesis are connected. Once, as was pointed out, it has been postulated that a meaningful notion of cause must satisfy the minimal requirement to which a second reference has just been made, it becomes, not necessary perhaps, but at least extremely natural to go a step further and assert that cause and effect are things which are not given in space and time, but are rather connected with the understanding of the occurrences that are given in space
and time. The reason why this development appeared natural is that the description of the kind of situation in which typically the inquiry after causes arises involves necessarily the description of an event that never took place. Hence this very situation contains a component of a purely mental sort, and since, furthermore, its description ought to be included in the definition of the cause, it follows that a cause cannot very well have the character of an occurrence. Instead, as König suggested, the concept of cause ought to be taken as the concept of a satisfactory answer to the question "Why?". This puts the notion of cause in close proximity to that of explanation. For to answer the question as to what an explanation is by saying that it is a satisfactory answer to the question "Why?" is at all events to point in a definite direction. Whether a cause as well is - as König claims - of the same nature is a question on which the previous considerations have thrown some light and which I do not now wish to pursue any further; I should like to emphasize, however, that in view of all that has been said, König's position on this issue seems to me the natural one, provided that the concept of cause is acquired by way of the concept of the situation in which causes are usually inquired after. Still I should like to adopt König's position - taking the beginning and the end together - insofar as it casts light on the concept of explanation. Another thing that was pointed out towards the beginning of this study is that with his view of a satisfactory answer to the question "Why?" König comes close to the deductive-nomological explanation model of Hempel and Oppenheim while yet departing from them significantly. The agreement consists, as was indicated, in the requirement common to both that that which is to be explained follows logically from certain premises.
On the other hand the difference, that must now be made clear, consists in the fact that while König takes the explanandum to be the kind of situation in which the course of events is different from the one expected, Hempel and Oppenheim view it merely as something that occurs in such and such a way, any condition, any event, or any state of a system. Hence they permit the question "Why?" to be asked with regard to anything and everything of this kind, and there can be no doubt as to the paradigm the authors had in mind when they constructed their explanation model. Hempel writes:
The best examples of explanation conforming to the D-N model are based on physical theories of deterministic character. ... The theory provides a set of laws ... which, given the positions and momenta of the elements of such a system at any one time, mathematically determine their position and momenta at any other time. In particular, these laws make it possible to offer a D-N explanation of the system's being in a certain state at a given time, by specifying ... the state of the system at some earlier time.23
23 Hempel 1965, p. 351.
At another point we find the very general observation that in a case of concrete application, the D-N model shows that given the particular circumstances and the laws in question the occurrence of the phenomenon was to be expected; and it is in this sense that the explanation enables us to understand why the phenomenon occurred.24
I am now in a position to state why I have made it my purpose to draw a line of demarcation between the functionalistic program that has constantly proven more and more successful in the construction of physical theories, and a certain program of a concept of causality which is taken from the everyday sphere, which is viewed in certain circles as "the ancestor, dead and buried, of the functionalistic program" and in others as this program itself, but which in reality may take a completely proper and systematic place even in physics and even today. For first of all, it can be readily understood from what has been said how the D-N model - a proposal in some respects, no doubt, helpful, in others however extremely bizarre - came to be put forth as an explication of the concept of explanation. This - to repeat in all brevity, even in illicit brevity - came about as the result of an identification, legitimate within certain limitations, of the structure of physical and deterministic laws, of the illegitimate identification of a deterministic and a causal scheme of thought, and finally of the legitimate connection of the notion of causality with the concept of explanation. The illicit middle step was the work of Hume and might have figured as the subject of a clever science fiction story even then.
Secondly however, the present considerations may serve as a kind of propaedeutic for the development of a concept of explanation which is oriented, not to the scheme of functional laws and predictability, but to the kind of situation in which a part of physics, not permitting the explanation of a phenomenon, has become questionable, and in which the cause of this failure is inquired after. The question of how to elaborate positively such a concept of explanation goes beyond the limits of the present study.
24 Ibid., p. 337.
1.2 Aspects of Wholeness in Science and Philosophy*
Introduction
In daily life we are well acquainted with the meaning of the English adjective "whole" and its grammatical relatives. This holds in particular for those cases where we distinguish a whole from its parts. All of you will understand me if I now say that my talk will consist of several parts and that parts of it, if not the whole, are doubtful. If we turn from daily life to the sciences we still find quite innocent usages of the terms in question. If, for instance, we say that the solar system is part of our galaxy we are likely to mean that the set of bodies making up the solar system is a subset of the bodies making up our galaxy, and if we mean this the meaning is as clear as anything can be. However, the situation changes if we come to those disciplines whose major concern is organisms and their functions. From ancient times attempts were made to conceive the peculiarities of living organisms as opposed to inorganic matter by using the categories of part and whole in a particular manner. One widely known commonplace originating in this field is the saying that here, if anywhere, a whole is more than the sum of its parts. In modern science such notions first stood in the background but in the course of time again played an increasing role. In our century the aspect of wholeness has been a guiding principle and hallmark in creating a variety of mostly unorthodox versions of disciplines like pedagogy, psychology, medicine and biology. In German, these versions were even given names by prefixing the German equivalent of wholeness to the traditional name, as if we were to speak of wholeness psychology in English. "Wholeness psychology" is then not used in the sense of "social psychology". It is not the question of a special branch of psychology but rather of all of psychology treated under the principle of wholeness.
Already the name is meant to signalize the re-installment of a manner of thinking allegedly lost in modern science. "For a long time" - said, for instance, Max Wertheimer - "it has been characteristic for European science that science in general could be done only in the following way. ... Basic is the reduction to elements, ... to piecewise ... relations between elements, the analysis, and on the other hand the synthesis by composition of the ... elements to obtain larger complexes. (By contrast) the Gestalttheorie knows: There are connections where what happens to the whole cannot be derived from single pieces and their composition. On the contrary, what happens to some part of a whole is determined by internal structural laws of that whole."1 Now in view of such formulations two questions arise. First: What on earth is meant by them? For, unlike our ordinary talk of part and whole, such phrases as that a part is determined by its whole and that a whole has internal structural laws are badly in need of explication. Second: Given that a clear meaning has been found, distinct from the one with which it is confronted in the previous quotation, is it really the case that modern European science was not also built on it but rather on the reductionist rival alone? Has science really been oblivious of deeper aspects of wholeness in the past, and is it at present?
* Originally published as Scheibe 1987b
1 Wertheimer 1925, pp. 6f
This has been doubted or even denied by living scientists. Hermann Haken, the founder of synergetics, readily admits that scientists analyse and synthesize nature. However, he compares them with the little boy who easily takes apart his toy car but runs into serious trouble when it comes to reassembling the parts. In this way - says Haken - the scientist "early learns the importance of the saying that a whole is more than the sum of its parts".2 Similarly, Peter and Jean Medawar, in their amusing "Aristotle to Zoos",3 introduce biological holism as teaching "that any whole, and especially a whole organism, ... enjoys an integrity ... by reason of the functional interrelations and interdependences of its several parts". The arch enemy of the holist is "the reductive analytical-summative mechanist who believes that an organism is a mere additive sum of its constituent parts". However, the Medawars then simply deny that such a mechanist ever existed. Rather "he is a sort of lay devil invented by feebler natural-philosophers to give themselves an opportunity to enjoy the rites of exorcism." The holist's belief "that mechanists study an organism in isolation ... reveals (their) naivete ... and their inexperience in practical biological matters, for the feat of studying an organ 'in isolation' is in reality an impossible one." Thus we have these two questions: What is the meaning of part-whole phrases and related ones if they are used in the intricate manner illustrated? And secondly: Is the claim justified that western science does not actually put to use the deeper aspects of wholeness? My answer to the second question is a forceful: No.
In line with the quotations given a moment ago I shall show that already in physics we can find non-trivial aspects of wholeness pertaining either to reality as it is conceived in physical theories or to the theories themselves. The question of meaning is involved, and I shall not try to give a general analysis according to the rules of the art. It is rather by giving the examples to refute the claim of the holists that I shall try to convince you that we can get the conceptions in question as clear as we wish if we try hard enough.
Philosophical Rationalism and Logical Atomism
In order to approach my subject I start out from an aspect of wholeness that can already be found in Aristotle.4 It does not concern science but poetry. Aristotle's idea is that "the unity of a plot does not consist ... in its having one man as its subject but rather one action. For it is an action, not a man, that can be conceived as being 'a complete whole, with its several incidents so closely connected that the transposal or withdrawal of any one of them will disjoin and dislocate the whole'. For - continues Aristotle - that which makes no perceptible difference by its presence or absence is no real part of the whole."
2 Haken 1984, p. 17
3 Medawar/Medawar 1985, pp. 144f
4 Aristotle, De Poetica, Ch. 8
Aristotle's idea that not the minutest detail of a given whole may be modified without spoiling the whole has founded a tradition in philosophical aesthetics. It is used to define quite generally the perfection of a work of art. However, by referring to art, it seems, we have not yet arrived at science. Indeed Goethe, a critic of the science of his time, could imagine the crossing only as done by violence. "We must - says he - think of science as being art if we expect any kind of wholeness in it."5 Nevertheless philosophy has also elaborated one version of the Aristotelian idea that refers to the cognisance of reality and to reality itself. Physical reality, like a work of art, has this sensitive perfection, "the parts of the universe - as Pascal said6 - all are connected with each other in such a way that I think it to be impossible to understand any one without the whole." And this thought of a global connection of all things, of knowledge as well as of the known, has been influential in philosophy until the very present. Brand Blanshard, a declared idealist of our time, describes the connection as a system "in which every judgement entailed, and was entailed by, the rest of the system ... The integration would be so complete that no part could be seen for what it was without seeing its relation to the whole, and the whole itself would be understood only through the contribution of every part."7 Can natural science offer a system fulfilling such an ambitious claim? Yes and no. The somewhat ambiguous situation may be elucidated by confronting the view of philosophical rationalism as just described with its opposite. The opposite is logical atomism. Logical atomism is an ontology, copied from the language form of modern logic.
Wittgenstein has couched it in the pithy statements:8
1.2 The world divides into facts
1.21 Each item can be the case or not the case while everything else remains the same
6.3 The exploration of logic means the exploration of everything that is subject to law and outside logic everything is accidental.
Obviously, this is a position of total contingency of the world. You will not be surprised to hear that Blanshard has called logical atomism "the most formidable attack ever made on reason as an independent source of knowledge."9 And even if the matter is seen from the viewpoint of modern science this other extreme could not be accepted either. You cannot meet the efforts of physics, all this struggle for law and order, by simply pointing out that before the throne of logic all are equal - from Newton's laws of mechanics down to the most trivial statements on my present sense impressions.
5 Quoted from Grimm's Wörterbuch, item "Ganzheit".
6 Pascal, Pensées no. 72
7 Blanshard 1939, vol. II, pp. 264ff
8 Wittgenstein, Tractatus Logico-Philosophicus
9 Blanshard 1961, p. 92
Thus we have these two extreme positions of philosophical rationalism and logical atomism, and I think it is fair to say that the method actually pursued by physical science is a reasonable compromise between them. On the one hand the tremendous success of modern science is essentially grounded on the insight that we can investigate parts of the universe without considering everything. The conception of the universe as the unrestricted totality of everything existing may be an interesting conception from a philosophical point of view. In physics it would be of no use whatsoever. There a drastic selection takes place under various viewpoints: we idealize, we neglect, we isolate, we simplify, we abstract. In every case this means that we pass from a large whole that really is a piece of nature to some fraction of it, and it is only this fraction that we are going to deal with. Moreover, the selected portion is viewed as if it were a world of its own - a complete substitute for the actual universe. The object of electrodynamics as defined by Maxwell's equations is a field and charged matter and nothing else. Quantum mechanics of the hydrogen atom has as its object one hydrogen atom (or an ensemble of such) and nothing else, and so on. In each of these cases we act as if the object of our theory were the total universe although we know that this is not the case and sometimes mitigate the situation by introducing more complex systems. The principle of taking the part for the whole - Galileo's principle as it might be called after its inventor - is applied throughout, and the fact that it works is a highly non-trivial fact about our universe. Insofar as the natural sciences apply Galileo's principle they certainly do not satisfy the demand of philosophical rationalism to respect the universe as one integrated system.
Moreover, insofar as this method is successful, reality certainly possesses features as they are ascribed to it by logical atomism. However, it is a far cry from this to conclude that reality as far as it is considered in a physical theory does not show features of internal coherence as philosophical rationalism would have them. On the contrary, the most compelling, precise and simple illustrations for the coherent behaviour in question come from science. A famous example is Newton's theory of universal gravity and the step from Kepler's theory of the solar system to Newton's theory. According to Kepler's theory any planet moves independently of any other. The statement of how all planets move is the bare conjunction of the statements concerning the movements of each individual planet. By contrast, the theory of universal gravity, introducing an interaction also between any two planets, is an indecomposable theory representing a considerable gain in coherence as compared with Kepler's theory. An outstanding example has been the discovery of the planet Neptune. It was grounded on a forecast from data pertaining exclusively to two other planets. Such a forecast is impossible on the assumption of Kepler's theory. In general, the coherence of Newton's theory verifies and even makes intelligible many sayings of philosophical coherence theorists. What that theory has to say about one body as being a gravitating body cannot be said other than by relating it to every other body in the universe. Moreover, if we were to find a system of bodies moving exactly according to Newton's theory this very same theory would permit us to conclude that system to be all-inclusive. In other words, the part can only be understood by referring to the whole, and a completely coherent system must be the whole.
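The contrast between a decomposable and an indecomposable theory can be sketched in formulas. The notation is not taken from the text and is assumed here for illustration only: r_i and m_i denote the position and mass of the i-th planet, M the mass of the Sun (taken, as a simplification, to rest at the origin), and G the gravitational constant. Kepler's theory is represented, again as an assumption, by motion in the Sun's field alone, which reproduces the independent Keplerian orbits.

```latex
% Kepler-type description: one equation per planet, each involving
% only that planet's own position (a bare conjunction of n statements)
\ddot{\mathbf{r}}_i \;=\; -\,G M \,\frac{\mathbf{r}_i}{|\mathbf{r}_i|^{3}},
\qquad i = 1,\dots,n

% Newtonian universal gravitation: every planet also attracts every other,
% so the equation for planet i contains the positions of all planets j
\ddot{\mathbf{r}}_i \;=\; -\,G M \,\frac{\mathbf{r}_i}{|\mathbf{r}_i|^{3}}
\;-\; \sum_{j \neq i} G\, m_j \,
\frac{\mathbf{r}_i - \mathbf{r}_j}{|\mathbf{r}_i - \mathbf{r}_j|^{3}},
\qquad i = 1,\dots,n
```

In the first system each equation can be solved without reference to the others. In the second the sum over j couples all the equations, so that observed deviations in the motion of one planet carry information about the others, which is what a forecast of the Neptune kind relies on.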
Holism in Physics
The step from Kepler's to Newton's law as it was just used to illustrate the superiority of the latter over the former can also be taken as a paradigm for the development of physics insofar as it, too, shows non-trivial features of wholeness. In the 19th century the development of science was still viewed as an essentially cumulative process. To give but one example I quote from John Stuart Mill's "System of Logic" the following argument:10 "The separation of a complicated phenomenon into its component parts is not like a connected and interdependent chain of proof. If one link of an argument breaks, the whole drops to the ground; but one step towards an analysis holds good and has an independent value, though we should never be able to make a second." Typically Mill's paradigm of such a concept of analysis comes from analytical chemistry. Having been taught by this science that "all things are at any rate compounded of ... elements" Mill concludes: "whether the elements themselves admit of decomposition, is an important inquiry, but does not affect the certainty of the science up to that point." Thus Mill's idea obviously is that science explores the universe as somebody visits a gallery: one hall after another until he has seen everything, but with the attitude that the visit could be stopped at any time without any loss with respect to the pictures already seen. However, if science proceeds as was assumed previously, i.e. if idealizations, neglects, simplifications etc. are accepted in principle, then it is hardly to be expected that growth of knowledge is cumulative. Rather it will be a process of repeated self-correction in the course of which those idealizations, neglects, simplifications etc. are stepwise revoked. A corresponding model of the development of their discipline was known to physicists like Boltzmann and Nernst already at the end of the 19th century.11 In a later paper12 of 1922 Nernst explains, with Einstein's and Newton's gravitational theories in mind, that "the modifications that have to be made with respect to the earlier theory are so small that according to the present state of research they can be neglected except for the computation of the orbit of Mercury. But as a matter of principle every computation that astronomers have performed so far must be changed." Generalizing, Nernst then goes on: "One might think that ... the laws of nature ... are valid with absolute precision in certain domains and that the matter could be settled very easily by pointing out the limits within which they remain valid. For all practical applications this is true enough ... From a strictly logical point of view, however, the matter appears much more disastrous. If a general law of nature becomes significantly inaccurate beyond certain limits, then the curse of this imprecision comes to roost on every application of the law even within these limits ...."
10 Mill 1847, Introd. no. 7
11 Scheibe 1988b (this vol. ch. 11.6)
12 Nernst 1922, pp. 491f
What Nernst means by saying this can, I think, most aptly be paraphrased by saying that a physical theory, well confirmed to a certain extent but finally falsified, reacts as a whole against the falsification: Strictly speaking, its successor contradicts it everywhere, and only in this way can the mistake that had been made be corrected. The reason, however, that we do not get off more lightly if progress is our aim lies in those features of wholeness that were already shown to be present in some of our physical theories. It is similar in a second case of wholeness related to the development of physics. One may ask whether improvements on theories will always be made in the way just indicated or whether we will come across theories that cannot be corrected in this way at all, these theories being wholes of a novel kind. That such theories exist, closed theories as he called them, was a conviction of Werner Heisenberg.13 For him Newton's mechanics was the first closed theory of modern science, and Einstein's special Theory of Relativity was another. Accordingly, Heisenberg criticized the reaction of most physicists who came to the conclusion that Newtonian mechanics had been disproved by special relativity. "From the point of view which we have finally reached in quantum theory - says Heisenberg - such a statement would appear as a very poor description of the actual situation." What was the actual situation?
It was that in this case one closed theory had succeeded another one. According to Heisenberg a theory is closed if its basic concepts already uniquely determine its basic laws, or, more precisely, it is a theory such that to the extent to which a phenomenon can be described by the basic concepts of the theory the laws of the theory hold good for that phenomenon. An equivalent definition, due to C. F. v. Weizsäcker, clearly shows the aspect of wholeness connected with our concept. According to it a theory is closed if it cannot be improved by small and partial modifications. You will realize that on this definition the closedness of a scientific theory comes very near to the perfection of a work of art according to the Aristotelian tradition. The equivalence to Heisenberg's original definition is secured by the assumption that a big modification of a theory consists in a radical change of its conceptual basis as we certainly have it if we pass from classical to quantum mechanics. By contrast a small change will leave the concepts unchanged, and only the laws will undergo some corrections. Thus one may even think of closed theories as being quasi-analytical, assimilating them to such stable constructions as we find in mathematics and logic. Indeed we can see Quine speaking of their laws as being "true by virtue of the conceptual scheme" or, which he thinks to be equivalent, "by virtue of meanings". "Because these laws are so central - continues Quine - any revision of them is felt to be the adoption of a new conceptual scheme, the imposition of new meanings on old words."14
13 Heisenberg 1959a, pp. 80ff; Heisenberg 1969, pp. 132ff
It is only in passing that I would like to mention a kind of holism that is usually connected with the name of Duhem and is held at present in a particularly radical version by Quine. Duhem's holism is a holism faute de mieux. The natural attitude of a holist is that he recommends his position and quite positively tries to point out its advantages. By contrast, in the case of Duhem and Quine the argument always seems to be: There is this damned wholeness, and we have to put up with it. Duhem's holism15 denies, faute de mieux, what ordinary empiricism would like to have true. It denies that, given any two logically independent consequences of a physical theory, it is always possible to decide by an experiment about their respective truth values. "It is never an isolated hypothesis but always a whole group of such that a physicist can submit to experimental control." Quine is even more radical: "... total science is like a field of force whose boundary conditions are experience ... it is misleading to speak (as the empiricists do) of the empirical content of an individual statement ... Any statement can be held true, come what may, if we make drastic enough adjustments elsewhere in the system ... Conversely ... no statement is immune to revision ... The unit of empirical significance is the whole of science."16 Obviously, the Duhem-Quine holism is a methodological holism setting more or less narrow boundaries to the empirical analysis of scientific theories.
14 Quine 1950, p. XIV
15 Duhem 1962, Part II, Ch. VI
16 Quine 1953, pp. 41ff
Apart from its general relevance to my subject I mention it in this section because Duhem seems to discuss essentially the same matter as we saw Nernst to have done, only under a different aspect: the methodological instead of the developmental. Moreover, Heisenberg's closed theories may even be better examples of what Duhem wants to bring home than were his own. I mention the matter also for another reason. Duhem likes to paraphrase his major claim by saying things like: "Physics ... is an organism in which one part cannot be made to function except when the parts that are most remote from it are called into play, some more so than others, but all to some degree." He even compares the physicist, confronted with an anomaly in his system, with the physician who cannot dissect a patient to find the diagnosis but rather has to look for those irregularities that show themselves in the body as a whole. Now this organismic analogy certainly is not out of place here. Yet you can see the weakness of attempts to clarify the situation by this analogy. For, though Duhem's holism has its own problems, it is a fairly specific claim
about a fairly well understood discipline. Now assume that Duhem had told us nothing but his analogy and you will realize that we would be more than entitled to rejoin: What did you really want to tell us about physics?
The Anthropic Principles
A field that has always been surrounded by an aura of aspects of wholeness is cosmology. Already its preliminary definition seems to demand the relevant terminology: One usually says that cosmology deals with the universe as a whole. This does not mean that a cosmological theory may not omit any one existing object. It only means that the theory is restricted to the description of some global features of the universe, in particular with respect to spacetime, and that it at least attempts to view its object as something unique. The following is a fairly appropriate formulation of the present situation in cosmology with respect to the aspects of wholeness related to it: 17 "All parts of the universe are, at least indirectly, causally related. Thus if we really understood these causal interactions, we might be able to prove that the universe could not self-consistently be made any other way than it is. In that case, we could aim eventually to understand all the universe by examining in complete detail any one particular part of it; perhaps, for example, we could understand that there must be immense numbers of distant galaxies, because of the nature of everyday life. This sort of deduction would indeed be possible in principle if there is ultimately only one self-consistent structure of a universe in which we can live." This statement, which the author hastens to make out to be an unproven conjecture, is remarkable in several respects. Leaving the more abstract parts of his vision for the conclusion of this paper, let me now draw your attention to the fact that it includes man: The thesis of uniqueness is restricted to a universe in which there is life. I will not dwell on this facet insofar as it re-introduces man into physical cosmology from which he had been expelled since Copernicus. Let us rather try to understand what is going on by first asking for the reason for this re-appearance.
17 Ellis 1979, p. 533
The laws of physics do not tell us anything about what the actual material content of the universe is. By assuming the real possibility of certain objects and processes they only tell us: If some of those objects and processes happen to have such and such properties then some (possibly other) objects and processes have such and such properties. Now, as long as a physical law is applied only innerworldly one has no reason to complain about this feature. The openness of the initial and boundary conditions then is precisely what provides for a wealth of applications of the law and in this way guarantees its empirical content. This openness, therefore, is nothing that we have to put up with faute de mieux. Rather it is the very feature that we want a law to have.
However, the situation that there is no need to understand contingent conditions because they are different in different systems is changed drastically as soon as those conditions refer to the universe as a whole. The universe as a whole is given to us only once and the data in question are by no means adapted to the peculiarity of this situation. The age of the universe, the number of its galaxies, the mean density of its neutrinos - these are data like my age, the number of persons present in this room, the mean density of population in India. For such data we are prepared to give relative explanations according to the implicative structure of physical laws mentioned a moment ago. Using the laws we can shift the information contained in cosmological quantities to some other place, e.g. to the beginning of the universe. But wherever the contingent data are located, the uniqueness of the situation forces us either to accept a brute fact or to provide an absolute explanation. In this somewhat desperate situation it has recently been remembered that the least arbitrary contingent data in our universe might be those that concern ourselves. And the question has been asked: Could we not explain those cosmological data by them? As a general answer to this question one or the other so-called anthropic principle has been formulated. 18 The variations of these formulations as well as their epistemological status and their role as "principles" have remained unclear so far. A fairly sound description of what is going on seems to be the following. We try to find an inference schema of the sort that has become famous through Descartes' instance cogito, ergo sum (I think, therefore I exist).
The anthropic inference schema starts from Descartes' conclusion and tries to infer from this that the universe has such and such features, i.e. we have 19 sumus, ergo mundus talis est (we exist, therefore the world is such and such)
18 Popular presentation in Gale 1981. Advanced treatment in Carr/Rees 1979; Davies 1982; Barrow/Tipler 1986
19 Carter 1974
In explaining this schema, the first thing to say is that the known natural laws stand behind it as tacit premises. Second, the explicit premise concerning our existence usually remains rather unspecified from the physical point of view. For it we can take any assumption characteristic of the only form of life known to us, e.g. the existence of chemical elements like carbon, oxygen, nitrogen. Finally, the conclusion concerning the universe can be any cosmological datum. The question then is: For which conclusions is the inference correct? This is not the occasion to give the answer in any detail. Suffice it to say that a universe that can be known by human beings must look as the universe that we know de facto looks. In particular the universe must be
as old and as big as it is in order to make life possible - an expenditure that we might find particularly appealing. Even the value of the gravitational constant seems to be determined by the assumption of our existence. Fantastic as such inferences are if premise and conclusion lie wide apart, they become more likely if we realize that we in fact have long chains of inferences before us and if we learn about some intermediate links. Such is, for instance, the existence of sufficiently many stars of the type of our sun whose life span and radiation energy make organic life possible on a planet of appropriate size. Though there are such intermediate links to bring about the inferences in question, the very conclusion has to be of a cosmological nature. Life on earth is possible only under the assumption that the earth's distance from the sun is, within a very narrow margin, the one it in fact is. 20 But this distance is not a cosmological fact.
Today nobody will risk his neck for the hypothesis of the anthropic uniqueness of the universe. We may well ask, though, what the hypothesis could mean to us if it were true. The scientist probably will be irritated mostly by the fact that an anthropic inference is not a causal explanation. Human existence is, of course, not the cause of the global and fundamental structure of the universe, even if the latter were uniquely determined by the former (and the laws of nature). However, we have to put up with the situation that a physical cause of the universe as a whole will hardly be found anyway. We cannot demand of the anthropic principle what is impossible on general grounds. A second option is a teleological interpretation of the inference: The universe is such and such in order to bring about mankind one day. This interpretation, sometimes called the strong anthropic principle, 21 goes very far indeed. For it makes hardly any sense without an additional assumption of a non-physical kind telling us what is behind the universe.
But even without a causal or teleological interpretation the anthropic principle can mean something to us if we are prepared to take seriously its premise of our own existence. For us this premise is unconditional and, therefore, any of its consequences receives the necessity that the universe as we perceive it, though it may be different from what it is, could not be perceived other than it is perceived by us. As scientists we must be content with this insight, and as human beings I think we can be content.
20 Hart 1978, esp. p. 36
21 Carter (in no. 18) makes another distinction: The anthropic principle is weak or strong according to whether its conclusion refers to our existence (age of the universe) or not (gravitational constant).
Inseparability in Quantum Theory
It is well known that human perception has become controversial also in another physical context: in quantum theory. And just as, according to the anthropic principle, our own existence is not sharply contrasted with, but rather woven
into reality, so also the role of the observer in quantum theory reminds us of many-sided connections within reality. We have heard in a previous quotation how this many-sidedness can be expressed with particular succinctness in a formula of self-consistency according to which - I now quote from a quantum theorist 22 - "nature is as it is because this is the only possible nature consistent with itself." This abstract formulation can be given an ontological illustration by the idea that reality is, as it were, a monolith. Previously I confronted the rationalistic view of total coherence with Wittgenstein's logical atomism. If both positions are given an ontological interpretation they could be called "ontological monism" and "ontological atomism" respectively. The former could be illustrated by the picture of reality being a monolith. The latter is more than an analytic method to depict the phenomenal variety of reality. Rather it takes this variety as a symptom of independent existences, grounded on an atomic basis. What ontological atomism claims and ontological monism denies is then, for instance, that a universe without this piece of paper cannot only be imagined but may actually exist. Whoever has no quarrels about the meaning of these words will even find this statement extremely plausible. To that extent most of us in our unphilosophical moments appear to be ontological atomists. The trouble, however, is this: Once we make this step the matter runs away with us. There is then no reason why we should not continue the destruction until we are left with one single atom, or even: nothing. Thus the merit of ontological monism is that it saves us from the conclusion that the world in which we live is not more likely than the non-existence of anything at all. Does modern physics yield any argument in favour of this position? At the beginning I emphasized the tremendous success of Galileo's method in physical science. Clearly this success speaks against ontological monism.
For it means that we are allowed to take the part for the whole at least within certain limits of error. However, I also emphasized that physicists recognize these limits and again and again have taken them into account. A man like Schrödinger, asking how we can come to make precise predictions about the future behaviour of a physical system, argues 23 that "it may be, and if we are entirely strict about it, it certainly is the case that we are forced to extend the system considered to the entire universe." This position is re-emphasized by the recent interest of physicists in what has come to be called "deterministic chaos".
22 Chew 1968
23 Schrödinger 1932, p. 2
However, the most amazing support for ontological monism came from quantum theory. The main lesson of modern atomism is that the continued division of matter leads to objects showing a behaviour totally unexpected from all we know about reality according to classical physics. On the atomic level we meet with so-called quantum phenomena which according to Bohr show features of individuality or wholeness completely foreign to
classical physics. 24 The new non-classical unity of a quantum phenomenon consists, according to the Copenhagen interpretation, in the inseparability of a quantum object, e.g. one atom, and an apparatus preparing the object and measuring one of its observables. "The essential wholeness - says Bohr - of a quantum phenomenon finds indeed logical expression in the circumstance that any attempt at its well-defined subdivision would require a change in the experimental arrangement incompatible with the appearance of the phenomenon itself." Strictly speaking even this wholeness of object and experimental arrangement is given to us only if one has decided to describe certain parts of the arrangement, indispensable for the interpretation of the experiment, in the language of classical physics. We here encounter the very reason for the difficulties connected with the observer in the quantum theoretical description of reality. It consists in a new conception of a special but all-important part-whole relation. In classical physics the description of the momentary state of a physical system consisting of subsystems is the bare conjunction of the corresponding state descriptions of those subsystems. If the subsystems interact then we cannot in general predict the future behaviour of any subsystem from its initial state alone. As we have seen previously, as regards interaction there may be very sensitive dependences between the parts of the total system. But the momentary state of any subsystem is entirely independent of the simultaneous states of the other subsystems. Equivalently, there are no states of the total system other than those conjunctions of the subsystems' states. In quantum theory the situation is entirely different.
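The contrast between the two part-whole relations can be put in a short formal sketch. The notation is not taken from the text and is introduced here only for illustration: S_1 and S_2 stand for the classical state spaces of two subsystems, H_1 and H_2 for their Hilbert spaces, and the spin singlet serves as a standard textbook example of a state of the whole that is not a conjunction of states of the parts.

```latex
% Classical composition: every state of the whole is a pair of subsystem states.
S_{12} = S_1 \times S_2, \qquad s_{12} = (s_1, s_2).

% Quantum composition: the state space of the whole is the tensor product,
% which contains more than the product states \psi_1 \otimes \psi_2.
\mathcal{H}_{12} = \mathcal{H}_1 \otimes \mathcal{H}_2, \qquad
|\Psi\rangle = \tfrac{1}{\sqrt{2}}
  \bigl( |{\uparrow}\rangle_1 |{\downarrow}\rangle_2
       - |{\downarrow}\rangle_1 |{\uparrow}\rangle_2 \bigr)
\neq |\psi_1\rangle \otimes |\psi_2\rangle
\quad \text{for all } \psi_1, \psi_2.

% Reduced state of subsystem 1: a mixture, although the total state is pure.
\rho_1 = \operatorname{Tr}_2 |\Psi\rangle\langle\Psi| = \tfrac{1}{2}\,\mathbb{1}.
```

On this sketch the total system is in a definite state while neither subsystem is. What the total state fixes is only a correlation: if a measurement finds spin up on one side, spin down will be found on the other.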
Schrödinger 25 described it in the following words: "When two systems, of which we know the states by their respective representatives, enter into temporary physical interaction due to known forces between them, and when after a time of mutual influence the systems separate again, then they can no longer be described in the same way as before, viz. by endowing each of them with a representative of its own. I would not call that one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought."
24 A review of Bohr's relevant remarks in Scheibe 1973c, Ch. 1.
25 Schrödinger 1935b
Thus, given a quantum theoretical system consisting of subsystems the latter are in general not in well defined states even if this is true of the total system. Rather the information contained in the total state goes into a complicated net of correlations between the subsystems. We can make the relative statements that, if by an appropriate measurement we were to find subsystem I in such and such a state then (if only two subsystems occur) subsystem II would be found in such and such a state, where the latter is uniquely determined by the former. The impossibility of making absolute statements on the states of the subsystems is an effect of wholeness in the following sense: As long as we do not make a measurement we cannot in general think of a subsystem of a physical system as having a separate
existence of its own. To the extent to which there is mutual interaction between systems in the universe we therefore cannot isolate any one of them. Only the universe as a whole would be in a definite state. The interaction, though it brings about the phenomenon, yet would not constitute it. It is the momentary entanglement of the subsystems that is at stake. Under the assumption of general validity of quantum theory this entanglement would be an excellent explication of our metaphoric description of the world's being a monolith.
In this paper I tried to show that usually rather vague philosophical aspects of wholeness can be given more or less precise interpretations in ordinary science. This is not to say that all such aspects have been or can be considered in the way here suggested. The claim is only that we need not go outside ordinary science in order to at least exemplify non-trivial cases in point. Likewise, what was said is not to say that all case studies given are on the same level of clarity and precision. Rather all of them enter ground that does not belong to the well established parts of physics. But this was to be expected. Conceptions of wholeness by their very nature provoke speculation. At the same time they allow us to speculate in a fairly controlled way. We have not been dealing with arbitrary spare time activities of scientists that could have been performed by the layman as well. Though more or less speculative, all results presented have been obtained by making full use of the expertise of one or the other scientist. Indeed it is precisely this mixture of speculation and science that makes our subject so attractive. One last remark deserves special emphasis. I am anxious not to let you go away with the impression that I gave you a conceptual analysis of the notion of wholeness and its relatives. I emphatically did not do this.
I did not do it because there is something boring about analytical considerations if presented outside the community of specialists. I did not do it also for another reason. Even if this audience had been one of specialists, the presentation of a logical analysis meeting the requirements of the profession and, at the same time, covering all the aspects I have touched upon would have been out of the question, simply because no such thing is available. The most I could do in this respect thus was to convey the impression that within the scope of analytical philosophy such an analysis belongs to the urgent desiderata hopefully to be developed in the not too distant future.
1.3 Kant's Apriorism and Some Modern Positions*
The term a priori and its counterpart a posteriori are of medieval origin. 1 In the fourteenth century they had assumed a proof-theoretical meaning: In a proof a priori we infer an effect from its cause. In a proof a posteriori we proceed in the opposite direction; a cause is inferred from its effect. This opposition mirrors the Aristotelian distinction between what is prior in nature (πρότερον τῇ φύσει) and what is prior in relation to us (πρότερον πρὸς ἡμᾶς). 2 For Aristotle what is prior in relation to us are those things which are nearest to sense-perception. Prior in nature are the universals. Identifying causes with universals and effects with individuals we get the scholastic opposition. This correspondence already foreshadows an occasional usage to be found in Leibniz. He calls a proof from primary truths a proof (probatio) a priori, paraphrasing it as being independent of experiment (independens ab experimento). 3 On the other hand, Leibniz says that contingent truths can only be known a posteriori by experiments (nempe per experimenta). 4 These sayings already approach the modern terminology that was established by Kant: Knowledge a priori is knowledge independent of experience, knowledge a posteriori is knowledge obtained by experience. This at least is the way we usually talk in the Kantian tradition. However, already Kant's definitions were not meant to imply that, just as a priori knowledge in fact is free from experience, experience in turn would be free from a priori elements. On the contrary, much of what Kant has to say in this context was meant to point out the a priori elements in experience. For this reason part of the theme of our conference - "The Role of Experience in Science" - is the role of the a priori in science, and it will be this more restricted theme that I am going to talk about tonight.
Since it is Kant to whom we owe the most involved treatment of the matter that has ever been given I may be allowed to start with him and only afterwards proceed to some recent positions. Although my considerations will thus be focussed on the a priori in Kant's sense we shall not lose sight of its Aristotelian origin.
* First published as Scheibe 1988a
1 Mittelstrass 1977
2 Aristotle, Anal. Post. A, 71b33ff.
3 Couturat 1901, pp. 518f.
4 Grua 1948, p. 304
I
In his Critique of Pure Reason Kant does not develop a theory of science in the modern sense. His starting point is the deplorable state of metaphysics, and the central question of his book becomes: How is metaphysics as science possible? Accordingly, far from viewing science as in need of any help (least of all from metaphysics), Kant attempts to answer his major question by looking to science for help to rescue metaphysics. Looking back into
history he finds three disciplines that sooner or later have entered, as Kant expresses himself, "the secure path of a science." They are logic, mathematics and physics. What Kant may have learned from logic for his enterprise he does not tell us explicitly, and I shall come back to this question at a later stage of my presentation. As regards mathematics and physics Kant is quite explicit about those features of them that he wants to imitate in his "future metaphysics that can claim to be a science". For mathematics Kant points out: "A new light flashed upon the mind of the first man ... who demonstrated the properties of the isosceles triangle. The true method ... was not to inspect what he discerned in the figure or in the bare concept of it and from this, as it were, to read off its properties; but to bring out what was necessarily implied in the concepts that he had himself formed a priori and had put into the figure in the construction by which he presented it to himself." 5 The situation in the natural sciences being similar to that described for mathematics, Kant now tries "to imitate their procedure so far as the analogy which they bear to metaphysics may permit." Having assumed hitherto that all our knowledge must conform to its objects, philosophers have failed to extend this knowledge by means of a priori reasoning. Kant's proposal is "whether we may not have more success in the tasks of metaphysics, if we suppose that objects must conform to our knowledge." 6 Elaborating this famous proposal to revolutionise metaphysics Kant takes it for granted that metaphysical knowledge is knowledge a priori and is to be expressed in true judgements a priori. Moreover, trying to reverse the traditionally assumed relation between objects and our knowledge of them Kant seems to be on the way to discover a new kind of a priori judgements. But what does he mean by this term?
5 Kant ²1787, B XI f. The translation, here as elsewhere, is Kemp-Smith's.
6 ibid. B XVI
As predicated of judgements, a priori and its opposite "empirical" are used by Kant to denote a distinction with respect to the source or the origin of the knowledge claimed in the judgement. Sources of knowledge Kant frequently identifies with one or the other of the various cognitive faculties that he is talking about. In the case before us the source is observation (Wahrnehmung), and a judgement is a priori if it can be verified without having recourse to observation. I wish to emphasise that Kant is fairly consistent in maintaining this meaning of a priori. Whenever we find him saying that judgements of such and such a kind are a priori or - vice versa - that all judgements a priori also have such and such other properties he is not giving a redefinition but is making a statement. Such statements, claiming a connection between the a priori and some other property of judgements, are of some interest for us because modern treatments of related problems often can be understood as modifications of these statements, loosening the connection claimed in them. Let me illustrate this by a preparatory example before I come to the main case that we shall be occupied with for the rest of this paper. Kant mentions
two criteria for judgements to be a priori: necessity and strict universality. 7 These criteria are not directly related to Kant's central message but they are constantly used in his arguments. The following curiosity concerning them may first be pointed out. It is as certain as anything can be that Kant thought his criteria of necessity and universality to be not only sufficient but also necessary for a judgement to be a priori. 8 However, in the passage where they are systematically introduced the criteria are formulated only as being sufficient. This is remarkable because their necessity does not seem to be quite as evident. It seems not quite as evident that every judgement whose verification is independent of observation is a necessary judgement as that every necessary judgement is a priori. The matter is interesting also for another and more important reason. As the post-Kantian development of science and philosophy has shown, scientists and philosophers have become more and more reluctant to accept any absolute dichotomies. Now, in the definition under discussion the absoluteness of the distinction between the a priori and the empirical is equivalent to an absolute concept of observation. But today we are used to relativising the latter in various ways, and if we look at what this does to the former the following turns out. Whereas logical necessity will remain sufficient for a judgement to be a priori however relativised, there are fairly obvious counterexamples for the converse relation. Perhaps the best known example from the history of physics is geometry. Although there is no general agreement, today most physicists look at geometry taken by itself as being empirical.
On the other hand, geometry relativised to any of its well known extensions - classical mechanics, electrodynamics, quantum mechanics etc. - certainly is a priori: Every geometrical statement can in principle be verified (always in the broad sense) without having recourse to observations specific for any of those extensions, i.e. without measuring masses, forces, charges, spin etc. Correspondingly, any of those extensions is empirical with respect to geometry in the sense that at any rate their laws cannot be verified without having recourse to geometrical observations. As I mentioned, there are other relativisations of the a priori maintaining Kant's definition and relativising the concept of observation. 9
7 ibid. B 3 f
8 This has been denied by van der Waerden in his (1971). The price, however, paid for this denial seems to be much too high, see p. 58 and 59 of the paper.
9 See part III below.
But I have to come back to Kant and remind us of the second important feature of metaphysical judgements: besides being a priori they have to be synthetic. "Synthetic" very roughly meaning "having content", it was a question worth asking how synthetic a priori judgements are possible. Kant's answer realises his idea to fill metaphysics with scientific content by partially reversing the epistemological order of objects and our experience of them. The judgements in question amount to knowledge - this is the answer 10 - if and only if they formulate necessary conditions of the possibility of experience in general and, at the same time, conditions of the possibility of the objects of experience. Obviously, the only-if-part of this thesis is Kant's solution of his problem to make metaphysics scientific. At the same time, it turns out that thereby metaphysics is transformed into epistemology. We shall not be concerned with this part of the thesis. Calling judgements formulating conditions of the possibility of experience transcendental, the other part of Kant's thesis can be rephrased by saying that all transcendental judgements are synthetic a priori. It is obvious that a transcendental judgement is meant to get, so to speak, its synthetic touch by expressing, not any experience itself, but only a condition of its possibility in general. At the same time, it is this weakening that is expected to make room for the a priori character of the judgement. The statement that all transcendental judgements are a priori is another example of a statement where the property of being a priori is connected with some other property, and it is this statement that we shall pursue through some of its modern modifications. I will, therefore, conclude this brief review of Kant with the question what made him believe this statement. Of course, I don't know the answer because it is buried in the depths of Kant's reasoning. But the following remark may be of some help. Previously I mentioned that Kant may have been taught a lesson not only by mathematics and physics but also by logic. As is well known, at Kant's time logic was in a state no less deplorable than that of metaphysics. In spite of this Kant is surprisingly clear about the nature of logic in general: 11 Pure logic, as he calls it, is not about what men happen to think all the time. It is not concerned with the empirical conditions of making inferences but with the conditions to which inferences must conform if they are to be valid.
10 Kant ²1787, B 193 ff.
11 ibid. B 74 ff.
12 The statemental character of Kant's principles is evident in spite of his occasional indication to give them also a normative function.
Put in modern terms, for Kant logic was a normative discipline, and it is for this reason that its theorems are a priori. If we now remember 1) that transcendental logic is introduced into the Critique of Pure Reason by reminding us of these virtues of pure logic, 2) that it is further developed in close contact with the latter and, last but not least, 3) that it has to show how objects conform to our knowledge, it is suggestive to view also transcendental principles as consequences of certain norms or conditions to which our thinking must conform if we are to get knowledge, including empirical knowledge. It is true that much of what Kant said sounds as if he were telling us so many facts about the human mind and its cognitive faculties. 12 But this may be explained by his belief that those transcendental norms cannot be chosen arbitrarily. He even entertained the view that they are unique. So what I want to suggest is that in Kant's transcendental philosophy there is this normative aspect
which in turn makes us understand why transcendental judgements must be a priori.
II
The position that, from a modern point of view, is usually seen as being the very negation of Kant's Apriorism, is logical empiricism. I don't want to discuss logical empiricism in this talk. However, as a background for the review I in fact want to give of some other positions it may be useful to briefly remind us that strictly speaking logical empiricism is no direct antithesis of Kant's position if the latter is characterised by way of assertions. Modern analytical or linguistic philosophy frequently does not just deny traditional philosophical statements. It rather denies the problems to which those statements were supposed to be the answers the status of "theoretical" problems. And very often it is suggested to give them a practical and, in particular, a normative status instead. For instance, Carnap explicitly refused to define empiricism by a statement: "It seems to me - he says - that it is preferable to formulate the principle of empiricism not in the form of an assertion - ... 'all synthetic sentences that we can know are based on ... experiences' ... - but rather in the form of a proposal or requirement. As empiricists, we require the language of science to be restricted in a certain way; we require etc." 13 The same attitude can be seen in Carnap's criticism of the use of universal words and his method of translating pseudo-object-sentences into syntactical sentences. Again, Carnap's later distinction between external and internal questions regarding linguistic frameworks is meant to reinterpret philosophical problems as practical problems of language choice as opposed to theoretical questions that can meaningfully be raised only after that choice has been made. Now, by what I came to say on the possible normative aspect in Kant I did not want to imply that I could map Kant's principles of pure understanding onto some system of language choice. But you will recall that Kant tried to derive his table of categories by looking at the table of judgements.
For him the possible forms of judgements had epistemological and even transcendental relevance. Conversely, the modern strategy of transforming philosophical problems into practical problems of language choice will not exclude a reidentification of synthetic a priori residues as features of those languages. However, as I said, I don't want to go into any details about logical empiricism. Still, the first modern position I want to discuss is an empiricist position in a not too narrow sense. It is the view that the solution to Kant's problem about the preconditions of experience and their a priori character was implicitly contained in Darwin's theory and can be made explicit by the present theory of biological evolution. In fact this view is not as new as recent activities, brought under the heading of an "evolutionary epistemology", might make one believe.
13 Carnap (1936-7), p. 33.
Let me tell you a nice story about two leading figures of south-west German Neo-Kantianism: Windelband and Rickert. Both were professors in Heidelberg, Rickert being the successor of Windelband. But the story refers to an earlier period, around 1890, when Windelband was still in Strasbourg and Rickert was about to take his doctorate with him. One day in Windelband's seminar - and now I follow a contemporary report 14 - "Rickert had to give a talk on the logical laws of thought. The talk was conceived entirely along the lines of the then dominating Darwinistic theory, according to which the laws of thought were viewed as useful adaptations to reality. Their non-observance would be a disadvantage in the struggle for existence, continuous selection was going on also with respect to our understanding, and it would be as difficult for us in our times to think falsely as in former times it had been difficult to hit the truth. There could be no doubt that evolution would continue in this way. Windelband - so the report concludes - quietly let Rickert read his paper to its very end and then said nothing but: 'Alas, dear Mr. Rickert, if only it were the case (Ach, lieber Herr Rickert, wär's doch so).'" Although Kant is not mentioned in this story, there can be no doubt as to what Windelband's reaction meant in regard to him, and as to Rickert, the event is said to have been his Damascus. It is different with a less euphoric statement made by Ludwig Boltzmann. At about the same time Boltzmann writes: 15 "I am convinced that the laws of thought came into being by adaptation of the connection between our internal ideas of objects to the actually existing connection between the objects ... The existing laws of thought will be nothing but inherited habits of thinking in the sense of Darwin ... We may call these laws of thought a priori because through an experience of the species, lasting for many thousand years, they became innate in the individual.
However - continues Boltzmann - it seems to be but a logical slip when Kant infers from this their infallibility in all cases ...". It is obvious that from Boltzmann's formulation, according to which the experience of the species becomes innate in the individual, it is but one step to Konrad Lorenz' famous equation of the phylogenetic a posteriori and the ontogenetic a priori. Lorenz, too, likes to present his theses as modifying and even destroying Kant's view that the preconditions of experience are a priori. In his original paper of 1941 Lorenz gives his hypothesis the wording: 16 "the 'a priori' is based on phylogenetically developed, hereditary differentiations of the central nervous system that ... determine [our] dispositions to think in certain forms. Clearly - he continues - this view about the 'a priori' as being an organ amounts to the destruction of its concept: Something that resulted from a phylogenetic adaptation to the laws of the natural external world, in a sense, came into being a posteriori, even if this happened in a manner completely different from that of abstraction or deduction of previous experience", i.e. different from the way human individuals obtain empirical knowledge.
14 Hensel 1947, p. 404.
15 Boltzmann 1979, pp. 179, 252, 253.
16 Lorenz 1941, p. 96.
If you add to this that for Lorenz every process of adaptation is a process of cognition, because it is essentially an information transfer, then it becomes clear that evolutionary epistemology is an extreme form of naturalising epistemology. Sometimes one even gets the impression of having arrived at quite ordinary biological science, were it not for these repeated and deliberate wordings of major assertions in philosophical terms. The resulting ambiguities are perhaps most evident from that feature of evolutionary epistemology that has been called its hypothetical realism. 17 On the one hand, this realism is pronounced as an anti-idealistic philosophical creed common to all, or almost all, modern scientists, expressing their basic attitude towards nature as the object of their research. On the other hand, this realism is called hypothetical because its advocates claim to have empirical arguments in favour of it. Lorenz talks about "the factuality of the great evolutionary process" in the course of which all creatures together with all their various cognitive faculties have evolved as the result of a physical interaction with their respective environments. 18 However, although the evolutionary aspect extends the causal embedding of cognitive faculties and acts far beyond what was assumed in the classical theories, one still fails to see how this supports the position as a realism and, in particular, how it could be falsified as such. Lorenz' views, recently further developed by Gerhard Vollmer 19 and others, have met with sharp criticism on all sides.
The criticism can be reduced to what presumably had already been Wittgenstein's reaction to the first movement of Darwinian epistemology: "The Darwinian theory - so Wittgenstein in Tractatus 4.1122 - has no more to do with philosophy than has any other hypothesis of natural science." Mutatis mutandis, the present critics seem to argue: Go on doing your ethology, your neuro-physiology, your genetics of the brain etc. What you will be doing there will be submitted to critical philosophy of science just as it has been and still is the case with respect to any other products of science. But it will not in turn yield any significant contribution to this kind of critical epistemology. It will not become its own judge. 20 Justified as it may be, I don't think we can be satisfied with this criticism. It tends to separate fields of research by explicitly arguing that they have nothing to do with each other although everybody, in the depths of his heart, has the sure feeling that they do have to do with each other. To a large extent philosophy of our century has been analysis without a subsequent synthesis. Logic and language, for instance, have been separated from thinking, and this development has been praised as a success which in a sense it was.
17 Lorenz 1977, pp. 17 ff.
18 No. 16, pp. 97 ff.
19 Vollmer 1985.
20 See, for instance, Stegmüller 1984.
However, biologists do not seem to have participated in it as long as we can find arguments in their books to the effect that we cannot draw false conclusions "because the excitation processes in our brain are such that from them only that excitation constellation results which is the basis of the [right conclusion] ... the laws of logic are the laws of the cerebral excitations." 21 Now, the genetic reason why to a present-day philosopher this sounds absurd is that it is an isolated end product of the decay of traditional epistemology, where every author at least attempted to combine a host of epistemologically relevant aspects although nobody ever succeeded in producing a consistent whole. Evolutionary epistemology, I think, can be subsumed under a very comprehensive but none the less one-sided world view. This - as it may be called - cosmological view neglects the Aristotelian priority for us in favour of the priority in nature, it denies any transcendental basis of human knowledge in the sense of Kant, it expects science to free man from "the bounds of sense" by whatever methods we may come across. And part of this liberation would precisely consist in a scientific investigation of those bounds, i.e. of those biological preconditions of experience. The view has perhaps never been expressed more clearly than by Russell when he portrayed himself as standing in opposition to Kant: "I reverse the process - said Russell - which has been common in philosophy since Kant. It has been common among philosophers to begin with how we know and proceed afterwards to what we know. I think this a mistake, because knowing how we know is one small department of knowing what we know. I think it a mistake for another reason: it tends to give to knowing a cosmic importance which it by no means deserves ...
I accept without qualification the view that results from astronomy and geology, from which it would appear that there is no evidence of anything mental except in a tiny fragment of space-time, and that the great processes of nebular and stellar evolution proceed according to laws in which mind plays no part." 22
III
Diametrically opposed to this outlook of evolutionary epistemology is the second modern position I want to review: constructive philosophy of science. As opposed to the cosmological position it may be called an anthropological view where, as a matter of course, in both cases no specific cosmology and no specific anthropology is implied. What is meant is that the imaginative picture underlying the anthropological view is precisely not the universe happening to contain man during some small period of its evolution. Rather, it is man finding himself in the world and trying to understand his existence by developing his abilities and forming his environment. Accordingly, the firm ground on which all thinking, including science, rests is human life in the ordinary sense.
21 Rohracher 1967, p. 115.
22 Russell 1959, p. 16.
Partly relying on continental philosophy characterised by the names of Dilthey, Husserl and Heidegger, Paul Lorenzen, the founder of constructive philosophy of science, has emphasised that "thinking has to start from life, from the practical situations in human life. All thinking is a sublimation of what we do in practical life all the time. The philosopher does no longer misunderstand himself - as it was common in modern times since Descartes and Locke - as being a mind having first to take cognisance of the world by means of sensations, intuitions and inferences. Rather the world is immediately given and immediately at hand ... Philosophy has acquired a new [kind of] immediateness." 23 Having this starting point, constructive philosophy of science is by no means developed, as one might suspect, as an anti-scientism. On the contrary, philosophy is basically understood as being philosophy of science. There are, however, several peculiarities as to what philosophy of science should be, and perhaps the most important one is this. Neither is philosophy expected to answer last questions left unanswered by science nor can science answer genuine philosophical questions. In a very strict sense philosophy is prior to science and has to prepare the ground for it. According to Lorenzen the legitimate heir of Aristotelian First Philosophy can only be "the construction of a conceptual framework within which the [traditional questions of the evolution of the stars, of life on earth, of the soul and of mind] can meaningfully be posed ... and that [necessarily] precedes any empirical cosmology, geology, biology etc." 24 Put in technical terms, philosophy has to be developed not as meta-science but as proto-science. Using Russell's phrase, the old Kantian order is restored: priority is given to how we know over what we know.
Indeed, if certain re-interpretations are allowed, then the position under discussion is, if not the, then at any rate one modern version of Kant's Apriorism. In particular, the principle that every transcendental judgement is a priori, meaningless in evolutionary epistemology, can be maintained. As I have suggested, to view Kant's transcendental logic as a normative discipline would not be completely beside the point even as an interpretation of Kant. In constructive philosophy of science, although the matter is not formulated in these terms, transcendental judgements in the sense of conditions of the possibility of experience would be all the consequences of the synthetic norms that we find ourselves forced to introduce in a systematic reconstruction of science. And that there be such norms, i.e. norms besides linguistic conventions, is one of the basic tenets of constructive philosophy of science. It results from the strategy to maximise our understanding of the world by exploiting the active role man plays not only in life but also in science. Here, too, is a relation to Kant, who is quoted as having repeated the old saying that "we properly understand nothing but what we can produce ourselves if the material were given to us."
23 Lorenzen 1968, p. 26.
24 Lorenzen 1974, p. 131.
It is, of course, its emphasis on the normative aspect of experience and science that justifies the classification of constructive philosophy of science as an anthropological world view. For norms can only occur where human actions come into play. But there is one further, very important anthropological feature of our position. It is its insistence on methodical order. Within the cosmological world view the final state of science would be seen as yielding a description of the universe, objective in the sense that, although also containing a description of man and his achievements, it would not contain the methods by which just this picture of the universe has been obtained. By contrast, constructive philosophy of science aims at a state of science explicitly showing the methods by which all its results have been obtained and, moreover, showing them in due order. In consequence, since all scientific methods have their roots in practical life, the only reasonable aim can be to show science as having become part of human life. Human life will never be merely the object but will always belong to the framework of science. As regards the order in question, the requirement simply is that, if not in the daily research work, then at any rate in a final presentation of science we must avoid circularity. This sounds very simple indeed because according to a scientific folk view non-circularity is the hallmark of scientific thinking. And yet it is mainly with this requirement that constructive philosophy of science is apt to become critical of science as it is and, moreover, to provoke the solid resistance of the scientific community as it is. In order to show how this comes about let me illustrate the principles mentioned so far by quickly running through proto-science according to its present state of development. 25 Proto-science is roughly subdivided into logic, mathematics and proto-physics in this order, and this ordering seems to find general agreement.
Logic is given a dialogical interpretation that is a particularly lucid example of the pragmatic intentions of constructive philosophy of science. The dialogues in question are governed by rules introducing the logical connectives by telling us how we may attack or defend statements governed by the respective connectives. Complex statements lead to extended dialogues and in some cases can be defended by a proponent irrespective of the meaning of their primitive constituents, i.e. by the sole consideration of the rules introducing the connectives. By definition these are the logically true statements. The essence of logic, therefore, turns out to be a well-defined kind of success: the winning of a game that deserves our interest for pragmatic reasons: the pervasive role of the connectives in ordinary language. It is similar when we come to arithmetic, the basic discipline of pure mathematics. There is, however, the important difference that mathematical truth can no longer be defined on the sole basis of linguistic rules. In this sense it is synthetic truth but it is not synthetic truth in any model-theoretic sense. Take, for instance, the statements of the form that one number is greater than another one. Their pragmatic justification comes from their frequent use in daily life.
25 Lorenzen/Schwemmer 1975.
Their truth as mathematical statements, however, is not based on interpretation. On the mathematical level the relation "greater than" is recursively defined by certain rules of a calculus, and the only statements to be found on that level are statements of derivability of pairs of numerals according to those rules. Thus it is again some kind of success - the production of a derivation based on rules - that leads to a priori and even synthetic truths, and it turns out that along these lines a considerable part of analysis - constructive analysis - can be reconstructed strictly observing the principle of methodical order. 26 How far can this be pushed ahead? I think advocates of the program under discussion would be anxious to emphasise that a calculus defining, say, a primitive arithmetical relation would not depend, as to its existence, on how much chalk there is available and how smooth our blackboard is. On the other hand, entering proto-physics it is very natural now to extend the a priori constructions of logic and mathematics by hardware productions of the most general kind. For here, too, we must set up 1) norms telling us what the things to be produced should be like and 2) rules of production such that if they are observed the resulting products will fulfill those norms. Obviously, in this case it does matter how accurate the fulfillment is. But not only are technical productions the natural approach to physics. Rather, technics is defined to be its very object. Physical theories do not describe or explain nature. Instead a physical proposition invites us to realise an instrument or any device whatsoever and to make it function in the way prescribed by that proposition. Quoting Peter Janich, who has developed a major part of proto-physics: "the only thing that we can know about nature is that we are successful with our physics."
27 Thus in perfect analogy to the situation in mathematics with respect to the 'world of numbers', "not physical sentences themselves formulate knowledge about nature, but the meta-sentences by means of which we state that certain physical sentences are (technically) successful. The rest is speculation or meaningless speech, so in particular the interpretation of physical laws as 'laws of nature'". Thus we again end up with the pragmatist's criterion of practical success as the fulfillment of science beyond which we cannot reach. However, my last two points before leaving constructive philosophy of science are more specific ones. They concern measurement and physical law in their mutual relationship. Even if one does not make physics an entirely normative science, as transpires from the last quotation, everybody will agree that measurement is absolutely essential and that technics and norms and production rules enter physics through measurement. To that extent physics certainly is a technical and normative science. But besides measurement there are also the laws and theories of physics, and these seem to make it an empirical science. Now there is a twofold relation between measurement and law: Measurements are used to test laws, and laws are used to perform measurement.
26 Lorenzen 1965.
27 Janich 1977, p. 311.
It is obvious that this double relation is apt to lead into serious trouble with respect to the principle of non-circularity. According to this principle we would have to extend physics step by step, strictly observing a linear order. Thus if at any stage of this reconstruction a new quantity is introduced by some measuring procedure, the latter is not allowed to use a law to be introduced only at a later stage. Correspondingly, if at a given stage a new law is introduced, it is not allowed to test this law by means of measuring procedures that have not been introduced up to this stage. Now, the possibility cannot be excluded that one day physics can be presented according to such methodical principles. Its present state, however, as seen from the constructivist's point of view, can only appear as a veritable nightmare. There are thousands of cases of more or less serious inversions of what even a physicist would feel to be a natural order. For instance, in Millikan's method of measuring the charge of the electron a length (!), the diameter of the oil drops carrying the electrons, is measured by using, apart from Newton's second law, his gravitational law, Stokes' law and Archimedes' law. There is only one generally acknowledged principle controlling this kind of practice, and this is consistency. To mention but the most simple case, if a quantity is measured by two different methods then the physicist will check whether he has got the same result. It has to be admitted, therefore, that the present state of physics makes it virtually impossible to separate the a priori or, for that matter, normative contribution to physical science from the empirical one. In trying to disentangle the very complicated situation, constructivists tend to require rather strong norms for the properties of measuring instruments.
28 Even Euclidean geometry is reintroduced into physics in this way, and it is at present not clear how this can be reconciled with general relativity. This aprioristic tendency is a second point where constructive proto-physics becomes critical of physics as it is. The present controversy on both points is somewhat hopeless because either side rejects important parts of what the other side takes for granted. It even seems that the only remedy might be that both parties exchange their roles for a while so that, in particular, the constructivist philosophers would be given the opportunity to show how empirical physics is possible on the basis of their principles.
IV
In the last part of my paper I shall be concerned with the position developed by C. F. von Weizsäcker. 29 Although radical in other respects, this position avoids the extremes we are coming from. A certain balance of the cosmological aspect of what we know and the anthropological aspect of how we know is characteristic of it. There is no definite order between philosophy and science, the former either preparing the latter or rounding it off. Methodology deserves no interest for its own sake, and the philosophical importance of science lies in its problems and results in substance.
28 Janich/Tetens 1985.
29 Von Weizsäcker 1971, 11.3; do. (1979).
Very much depends on what science will show us in the long run. On the other hand, there is no suspension of philosophical judgement. For, if at all, we have to do our philosophy now in view of the science as it is now. And therefore we shall not be spared some methodological reflection about how we should proceed. More specifically, Weizsäcker's position results from the development of two convictions with the help of two methodological principles. The convictions are, first, that all our thinking is historical but, second, that nonetheless our thinking can successfully strive for unity. The first methodological principle - the principle of reflection - says that "in developing a theory, one should make maximal use of the meaning, as understood beforehand, of those concepts without which it would not have been possible even to formulate the questions that the theory is meant to answer." 30 As a second methodological principle the principle of homogeneity requires "that a law of nature is as universal as possible." Each of the four elements introduced stands in a definite relation to some element of Kant's transcendental philosophy. This is perhaps least evident for the last mentioned principle of homogeneity because this sounds much more Cartesian. However, Kant conceived his transcendental philosophy to be a theory of objects in general. Similarly, if in the principle of homogeneity laws of nature are expected to be as universal as possible, what is expected is that physics should never be satisfied with statements about objects that have to be premised by certain empirical conditions in order to be valid. The final laws of nature will talk about objects in an empirically unconditional way. How they will manage this in a manner even coming close to Kant's transcendentalism will become clear in the course of my presentation.
In the principle of reflection we are invited to make use of the meaning of concepts as understood beforehand. This seems to imply a conceptual apriorism as we find it in Kant. As opposed to his treatment of a priori judgements, Kant never defines what he means by a concept a priori. But since his categories are among them we must assume that they, too, have an absolute function in our understanding. By contrast, in Weizsäcker's view, although again reminiscent of Kant, the meaning of a concept is understood "beforehand" only in a relative sense. Certainly, at any given time there are concepts presupposed in much more of what we think than are others. But a timeless order of presuppositions would be incompatible with the first mentioned conviction of the historicity of our thinking. This inevitable temporal character of human thinking and theorising is - to put it bluntly - a flat denial of Kant's Apriorism. As for many others, this denial was provoked by Kant's complete failure to establish Euclidean geometry as an a priori part of physics by assuming space to be a pure intuition. It was provoked by this event, as one should add, in combination with the conviction that once such a thing has happened it is hopeless to start running fights on the same front.
30 Von Weizsäcker 1971, p. 195 (1980, p. 157).
In the twenties this had been done by men like Hermann Weyl and even Carnap in their attempts to find an a priori foundation for some weakening of Euclidean geometry. But people were not convinced by these efforts. It may be mentioned in passing that Weizsäcker's Anti-Apriorism, in a sense, goes as far as to include logic. Much impressed by quantum theory he followed von Neumann's idea that even logic may have been shaken by the development of physics. So here we seem even to have reached a point of no return, so to speak. However, the second conviction I have introduced to characterise the position brings us back to Kant, if it is combined with the positive aspect of the first. This was that human thinking is basically historical, and up to this point we have only exploited its negative consequence that man is unlikely to find any timeless, a priori truths. However, what we can do is to integrate at least part of our history and entertain hypotheses about where it leads. If this is done for the history of the natural sciences - perhaps the only case in which it can be done - then the hypothesis suggests itself that we gradually approach a unified theory of nature. Not knowing now, of course, which theory this will be, the question is whether we don't have any conceptual means to suspect what this final theory will be like - what kind of theory this will be. Strengthening the hypothesis of unification the answer is: the final theory will tell us nothing but the conditions of the possibility of experience, and this will constitute its unity. Let me quote von Weizsäcker: "To render this unity of physics comprehensible is the task contemporary physics sets before philosophy. We can refuse the task as too difficult but we cannot ... reduce it to a lesser task.
The program that Kant formulated for classical physics will to-day prove either to be unrealisable or to have been realised as soon as self-evident assertions concerning the preconditions of the possibility of experience have led to the construction of precisely that unified physics at which the contemporary development so obviously aims." 31 Now, seen in the light of Kant's transcendentalism, the remarkable thing about the position under discussion is that it takes up Kant's fundamental idea of the conditions of the possibility of experience without feeling committed to view these conditions as being known a priori. Moreover, not only do we find these conditions only in the course of experience, their complete system will not be known until physics is finished, and then it will be known as the system of its fundamental laws. Weizsäcker's position may therefore be called an "independent transcendentalism" in the sense that it is a transcendentalism independent of its Kantian binding to apriorism. It is an empirically based transcendentalism. The preconditions of the possibility of experience are looked at in the manner in which Peirce looked at truth. Just as for Peirce truth was that "opinion which is fated to be ultimately agreed to by all who investigate", the preconditions of experience are viewed as the final result of basic physical research. In both cases it is, of course, allowed and possible to make hypotheses at any time not only about the kind of thing to be approached but also about its details.
31 Von Weizsäcker 1971, p. 192 (1980, p. 155).
In Weizsäcker's case this is what is explicitly allowed by the principle of reflection. It takes care of the fact that the development of physics will gradually and convergently reveal its own preconditions. For instance, part of what we mean by experience seems to be that we learn from the past for the future. We expose our theories to empirical tests, meaning thereby that we try to confirm or disconfirm their predictions. In this way reflection on the meaning of experience shows that historical time with its basic structure of past, present and future may be among the final preconditions of experience. It is obvious that independent transcendentalism implies or perhaps even is some kind of what conventionally is called reductionism. Moreover, it seems to imply a very strong reductionism that goes far beyond what mutatis mutandis Kant had aimed at. As is well known, Kant's critical philosophy is characterised by a certain indecision as to how much of physics could (or should) be deduced from general principles of understanding. And in my introduction I have emphasised that at least Kant's original intentions were anything but reductionist. By contrast Weizsäcker is quite explicit about what a final theory of physics in his sense would have to achieve: "Such a theory would have to allow us to deduce, in principle, the ... structure of the Lorentz group and of quantum mechanics, the existence and number, the masses and interaction constants of the ... elementary particles ..., each and every line in the spectrum of iron etc. etc. ..." 32 Moreover, by a deliberate fusion of physics and philosophy, all this would not have to be deduced from a physical theory in the ordinary sense, but from a theory that, having the unique status of being final, tells us nothing but the conditions of any possible experience.
It is obvious that this is a program no less ambitious than that of constructive philosophy of science, and, correspondingly, physicists may be no less reluctant to accept it. Without being in the position to enter a substantial discussion let me conclude this talk by some comments that may make the program more plausible than it seems at first sight. The comments will roughly follow a suitable subdivision of the matter into three successively increasing claims: local reduction, completability of physics and transcendental character of the final theory. 33 Perhaps the strongest argument that can be given comes from the development of the natural sciences in modern times. It can hardly be denied that this development in a sense shows features of reduction and unification. This can already be seen from the creation of such intermediate fields as biochemistry, physical chemistry and even biophysical chemistry. Weizsacker likes to use a model that has been invented by physicists under the impact of the development of physics in our century. According to this model the development can roughly be described by a series of closed theories. A closed 32 33
32 ibid.
33 Scheibe 1987e; 1986c; 1986b.
theory is a theory that cannot be improved by small changes. In the series each theory contains its predecessor as a limiting case. These are particularly clear examples of local reductions. Similar models have been suggested by philosophers of science, e.g. by Lakatos in his theory of scientific research programs. On the other hand, much scepticism has been articulated by historians of science and scientists, in particular by chemists and biologists, as to the appropriateness of such reductive models when applied to historical reality. The questions raised in the controversy that was to follow and is perhaps best known by the names of Popper and Kuhn mark out the point of closest contact between the main stream of recent philosophy of science on the one side and Weizsacker's program on the other. From the viewpoint of independent transcendentalism the most interesting aspect of local reduction, i.e. reduction of one theory to another one, is the appearance of a condition or a set of conditions formulated in terms of the reducing theory and being the conditions under which, as seen from the reducing theory, the reduced theory is approximately valid. If, for illustration, our theories stand in the order of historical succession the conditions in question, being expressible only in the later theory, were not known at the time of the earlier theory. In a sense they were tacit presuppositions or preconditions of the experiences formulated in the earlier theory. I wish to emphasise at this point that independent transcendentalism, insofar as it contains the idea of a final unity of physics, is quite prepared to accept that a local reduction may involve dramatic conceptual changes as they have been the main issue in the controversy mentioned. The only demand that has to be made is that the reducing conditions can be understood as the explicit versions in the reducing theory of those tacit presuppositions of the reduced theory. 
In particular, their validity may become the explicit presupposition for the approximate definability of a basic concept of the reduced theory within the reducing theory. Since that validity can itself be assumed only approximately without contradicting the reducing theory a considerable conceptual change may be involved in such a case. However, being analysed in the way indicated there would be no mystery about it. The foregoing argument concerned the contingent conditions that together with the laws of a theory seem to have that preconditional or presuppositional function with respect to the concepts and laws of the reduced theory. In a final argument we have to consider to what extent the credit for this function has to be given to those laws themselves. For it is only the answer to this question that will have an immediate bearing on the main thesis of independent transcendentalism which was that the fundamental laws of a completed physics will formulate nothing but the preconditions of any possible experience. Now, assuming an admittedly somewhat vague absolute distinction between lawlike and contingent propositions, the first thing to be observed is that in local reduction for the derivation of the laws of the reduced theory the presence of the laws of the reducing theory is absolutely essential: A
proper reduction of a lawlike proposition to a contingent proposition seems out of the question. Therefore, there will be an essential reductive connection between any special law and the final laws of physics. On the other hand, the presence of a contingent component is also important because it is only this component that expresses the limited range of validity of the reduced law. In this way local reduction shows that the content of the reduced law can be subdivided into a contingent and a lawlike component. This in turn shows that the reducing law has greater universality than the reduced law or - equivalently - that the lawlike content of a law decreases in the course of the development of physics to unity. It seems, therefore, at least plausible that at the very end of this development, if it exists, the fundamental laws cannot meaningfully be conditionalised and thus are themselves the preconditions of any possible experience.
In this talk I have tried to review three epistemological views in their respective relation to Kant's apriorism, confined to the theorem that every transcendental judgement is a priori. We have seen that this apriorism survives in the modified form of a normative apriorism in constructive philosophy of science, whereas it is rejected by the two other views, although for different reasons. Evolutionary epistemology concentrates on the biological preconditions of experience in the sense of material preconditions investigated in biology and in the theory of biological evolution. In this way transcendental judgements tend to become part of biology as an empirical science and thus lose their a priori character. Independent transcendentalism retains the original generality of transcendental judgements and perhaps their epistemological status as well. But it denies apriorism and provides transcendentalism, so to speak, with an eschatological aspect.
I also used the admittedly crude distinction between a cosmological and an anthropological world view. It roughly corresponds to the Aristotelian distinction between what is prior in nature and what is prior for us. We found the cosmological view behind evolutionary epistemology, the anthropological behind constructivism. For the former, life and the observership of man are only accidents in a universe independent of observers; for the latter, our knowledge about the universe is a knowledge about man in the first place. The advent of quantum theory has favoured an intermediate view. "Quantum mechanics - as seen by Wheeler - has led us to take seriously and explore the ... view that the observer is as essential to the creation of the universe as the universe is to the creation of the observer." Independent transcendentalism, also much impressed by quantum theory, takes a similar stand. In subsuming our three positions under these only very vaguely defined views I do not want to deny any of them the capability of developing very powerful methods of argumentation to be applied within them. I only wanted to remind us that it might be very difficult to argue for any of these positions taken as a whole, and that in our decision for or against them we might very well depend on those imaginative pictures of the universe.
It goes without saying that there are other epistemological positions now before the public that lend themselves to a presentation in the light of Kant's approach. Again, the four positions I have chosen to talk about are either well-known or were created by members of our Academy who are very well equipped to speak for themselves. But I thought that a certain synopsis given beforehand might be of some help for further discussion during our meeting. Finally, you may have noticed that all positions of my review are of German origin. But my choice to make such a selection the subject of the opening address may be justified by the fact that - to the best of my knowledge - this is the first meeting of the Academy to be held on German soil. Let us hope that it will be a success.
1.4 C. F. von Weizsäcker and the Unity of Physics*
I
One of the classical problems of Western philosophy and science that are still topical today is the problem of the unity of knowledge. Ever since Plato philosophers have time and again made the unity of knowledge a central requirement, and to a growing extent science has - at least for parts of what was known at the relevant time - produced such unity. We seem to have a need to achieve a unity in the field of theoretical cognition, one that is similar to the metaphysical need that is so often invoked. And with respect to the former we appear to be in a better position than with respect to the latter. But 'unity of knowledge' exists also as a problem: At least there are the questions concerning what we mean by such a unity, to what extent it is achievable in the sense in which it is meant, and whether it is epistemologically speaking even desirable.
Carl Friedrich von Weizsäcker is one of those thinkers who hold a conception of unity with respect to the realm of our knowledge of nature. It is no exaggeration to claim on behalf of this many-sided thinker that the problem of unity has formed and continues to form the centerpiece of his philosophical efforts, inasmuch as these efforts concern the natural sciences and in particular physics - Weizsäcker's point of origin. He himself has said that "[to] render [the] unity of physics comprehensible is the task contemporary physics sets before philosophy. We can refuse the task as too difficult but we cannot ... reduce it to a lesser task".1 This view - that currently the problem of unity is the problem of a philosophy of physics - must be understood against the background of Weizsäcker's assessment of the state of today's physics. This assessment is above all characterised by 1. the claim "that physics is characterised by a greater real conceptual unity today than at any time in its history" and 2. the assumption "that completing the conceptual unity of physics is a finite task."2 More than that, Weizsäcker sees the point in time at which physics could be completed in this sense in the not too distant future.
These are far-reaching theses regarding the state of physics, and the said philosophical task, "of making the unity of physics comprehensible", is a difficult one by Weizsäcker's own estimation. One will therefore want to look around for allies. Does someone who today advocates such theses, and who takes on this task, find himself or herself all alone? It almost seems that way. It is true that in the last hundred years physicists have from time to time evolved theses regarding a partial completion or completability of physics. Just recently Stephen Hawking has gone quite far in this regard.3 Envisaging the goal of physics as "a complete, consistent, and unified theory of the physical interactions", Hawking sees "some grounds for cautious optimism that we may see a complete theory within the lifetime of some of those present here". But on the whole there have been very few such voices from the field of science, and there simply does not exist a conception of a complete physics as the basis of all natural sciences - neither in the sense of a thoroughgoing, explicit consciousness of all scientists of working in the service of this conception; nor in the weaker sense, in which it is current in other circles, that there exists a group of experts which is already in possession of such a conception and which would at any time be ready to provide reliable information about it.
* Dedicated to C. F. v. Weizsäcker in honor of his 80th birthday. First published as Scheibe 1993c, translated by H. J. Wilhelm.
1 Weizsäcker 1971, p. 192 (1980, p. 155)
2 ibid. p. 207 (p. 168)
The prospects do not turn out to be much more favourable when we look across from the field of science to contemporary philosophy. It seems to be all but forgotten that the last philosophical orthodoxy in matters of science - the logical positivism that reigned from the 1930s until the 1950s - classified the unity of science not merely as an incidental topic, and not even merely as an important topic, but rather as the supreme goal of the efforts of science and of the philosophy of science in general.4 At that time, a grandiose reductive program was envisaged not only for the natural sciences in the narrower sense, but one that included the solution of the mind-body problem as a reduction of psychology to physics. Since the beginning of the 1960s, however, with the emergence of the historically oriented and at the same time historically relativising conceptions of natural science of Th. Kuhn, Feyerabend and others, the positivist ideal of unity in particular has been either expressly attacked or indirectly undermined.
Feyerabend noted that "the plurality of theories must not be regarded as a preliminary stage of knowledge that will at some time in the future be replaced by the 'one true theory.' Theoretical pluralism is assumed to be an essential feature of all knowledge that claims to be objective."5 Even those thinkers who developed explicitly rationalist concepts of progress, like Popper and Lakatos, rejected the idea of a final goal or even of a final state of a convergent development of the natural sciences, for fear of historicist theses of determination.6 Thus neither scientists nor philosophers seem to be particularly interested in von Weizsäcker's philosophical goal, "of making the unity of physics comprehensible", and to this extent the prospects are unfavourable. Yet it is just the aforementioned historical turn in the philosophy of science that contributed positively to the fact that this quite isolated project can be theoretically discussed at all. For in spite of the commonality of their professed objectives, Weizsäcker's philosophy stands as a quite foreign element with respect to the philosophy of logical positivism. It would probably not even have
3 Hawking 1980, p. 1 f.
4 Neurath et al. 1929; Carnap 1938; Oppenheim/Putnam 1958
5 Feyerabend 1965, p. 149; see also Kuhn 1962
6 Popper 1957; Lakatos 1970
been possible to initiate a fruitful discussion between the two. For already on very basic issues differences of opinion would have emerged that are almost impossible to bridge. A few remarks will suffice to show how far removed Weizsäcker's way of thinking is from this older orthodoxy. Recently Weizsäcker has construed the theory of science of our century as "a consciously subsequent philosophy".7 It surrenders the claim of reliable knowledge that once animated philosophy to the positive sciences, and it asks subsequently what enabled the latter to achieve such knowledge. As Weizsäcker himself remarks, however, this path was not pursued consistently to the end. It is true that the historical facticity of the sciences was presupposed. But one sought an explanation of their success that would no longer depend on the progressive substantial results of science. In an especially crass way this is true of modern forms of apriorism such as the constructive theory of science.8 But it is also true of logical empiricism. For it too sought to maintain a sharp distinction between analytic and synthetic propositions, and this tendency makes it impossible to provide the relevant explanation of success in dependence on the progress of science, something that Weizsäcker fundamentally wants to allow for. Logical empiricism was furthermore essentially a methodology which, although not anticipating the results of the sciences, develops - and this independently of the results - norms for their methods. One even strove for a monism of method, but, according to Weizsäcker, this path taken independently of the object led "not even to an authentic unity of method".9 Also, the (rather quiet) hope of the logical empiricists of finding the unity of science by way of the unity of method indicates the opposite path in comparison with Weizsäcker's procedure of understanding the unity of physics from the unity of nature.
Finally, for Weizsäcker, even the main problem of empiricism - the empirical foundation of the laws of nature - is no longer truly relevant. As he says, this "was a meaningful question at a stage of physics in which many apparently independent laws were brought forward as hypotheses ... But in present-day physics these laws are no longer independent. We accept them as necessary once we have accepted the basic theories from which they follow. And so today the only meaningful question is how we establish basic theories."10 And this is the problem of the unity of physics. As this brief confrontation reveals, Weizsäcker is not the type of philosopher of science moulded by logical positivism. For the latter's philosophy is essentially a philosophy of normal science and hence, as Weizsäcker puts it, "itself a normal science", whose fate it is "to be superseded in the next scientific revolution".11 An important proximity to Th. Kuhn, whose main
7 Weizsäcker 1985, p. 624
8 Lorenzen 1974
9 Weizsäcker 1971, p. 12 f. (1980, p. 5)
10 ibid. p. 240 (p. 194)
11 Weizsäcker 1985, p. 625
concepts I am now appropriating, is on the other hand given through the fact that Weizsäcker's thesis concerning the completability of physics in the form of a conceptually unified theory is partly a historical thesis, and that early in his scientific career Weizsäcker was able not only to help reap the fruits of a scientific revolution in Kuhn's sense - quantum theory - but was also immediately and expressly made aware of this process as a revolutionary one by his teacher Heisenberg. This meant that already quite early a certain idea of the historical development of a science towards its unity became an integral part of Weizsäcker's unity thesis. And here we find those affinities to recent tendencies in the theory of science which allow Weizsäcker's efforts to be moved into a perspective that includes them in current controversies, even if his main thesis has remained an extreme position that is shared by no one.
II
With this I come to the first part of what Weizsäcker himself regards as the three-part task "of making the unity of physics comprehensible". This part is to "show historically that physics has developed toward unity".12 To prove his point, Weizsäcker referred to a developmental model of which he himself says that it is "commonly accepted by today's physicists in their methodological reflections".13 Indeed, it is a model created by physicists themselves to account for the development of their discipline, a model that was probably first formulated by Boltzmann at the end of the last century.14 In his formulation, Boltzmann on the one hand rejects the idea that in physics one "keeps adding new basic notions and basic causes of appearances to those already established and thus in a continuous development discovers more and more of nature". In his opinion, "the development of theoretical physics was ... rather always one by leaps and bounds". On the other hand, Boltzmann does not want to suggest that this development by leaps and bounds implies that the relevant "old conception ... was completely useless". Rather, "the old theory too was useful ..., in that it too partially provided a picture of the facts". It is just "that the new conception is ... a better ... picture, a more adequate description of the facts". The two basic features of Boltzmann's idea of the development of physics - progress through leaps between phases, as it were, while preserving that which has withstood the test of time - have subsequently been accentuated in the model of the physicists so as to yield the concepts of the closure and of the limiting case of a theory. According to Heisenberg, a theory is closed if the applicability of its basic concepts in a domain of objects already entails the validity of its laws in this domain.15 A later, equivalent formulation of the
12 Weizsäcker 1971, p. 131 (1980, p. 103)
13 Weizsäcker ibid. p. 208 (p. 170)
14 Scheibe 1988b (this vol. 11.6)
15 Heisenberg 1969, p. 132 ff.
concept of a closed theory was given by Weizsäcker.16 According to the latter, a theory is closed if it cannot be improved upon by means of small changes. These two formulations are equivalent if we understand small changes to be such that they leave the sense of the concepts of the theory untouched, and instead, for example, only supply a corrective term in an equation of motion. Hence, closed theories are just those units which in Boltzmann's sense in the development of physics can only be changed through a "leap" to a new closed theory, where this leap consists of a "radical recasting of the conceptual foundations" of the theory. Such a recasting becomes necessary if, due to an expansion of the domain of application, the limits of the validity of the basic laws of the theory are made visible by the fact that the concepts of the theory become meaningless beyond these limits. Thus, for example, the limits of Newtonian mechanics - the first closed theory of modern physics - are, among other things, characterised by the limit of the applicability of the concept of absolute time and of the concept of the particle orbit. These and other concepts had to be abandoned in the theory of relativity and in quantum mechanics in favour of concepts which are meaningful even beyond those limits. And it is here that we come upon the concept of a limiting case mentioned earlier. The idea is that a complete and superseded theory becomes the limiting case of its successor. According to a formulation by Weizsäcker, in the expansion of its domain of application the theory "comes up against the limits of what it can grasp with its own concepts. Out of this crisis ... a new completed theory ... arises ...
This theory now includes the older one as a special case, and thereby delimits the accuracy within which the older theory applies in particular instances: only the new theory 'knows' the limits of the old."17 This knowledge manifests itself in certain contingent conditions which, if added to the basic laws of the new theory, will lead back to the old theory, making them in this sense the conditions of the validity of the latter. In the example just mentioned such conditions express the fact that certain parameters which are decisive factors for the problem at hand fall into the order of magnitude of the speed of light or of the quantum of action. The occurrence of these new constants indicates that the relevant limiting case conditions were not available in the old theory: "the theory does not reveal its own limits", as Weizsäcker puts it. The two features of the developmental model sketched so far do not yet immediately allow us to recognise that a development of physics subject to them will lead to a greater unity. All the same, they prepare the way for such a recognition, as can be readily seen with regard to the concept of a limiting case: By the fact that the conditions of validity of an older theory amount to a limitation of the domain of validity of the new theory, the latter also reveals itself to be the more comprehensive theory. Here we encounter a third
16 Weizsäcker 1971, p. 193 (1980, p. 156); Heisenberg 1973, p. 140
17 Weizsäcker 1971, p. 209 (1980, p. 169)
basic feature of our developmental model, one which Weizsäcker expresses as follows: "In the progress of physics, the later, closed theories on the whole contain their predecessors as special or limiting cases. This circumstance leads us to conjecture that the progress of physics steadily brings us closer to more comprehensive and thereby also more elementary premises." It then becomes possible that "the more comprehensive system also interrelates phenomena that had remained isolated from one another in the narrower system; it therefore more closely approaches the ideal of the unity of physics".18 This brings us to what with regard to the idea of unity is the most important feature of the developmental model under discussion: the increasing inner coherence of physical theories.19 Although it is probably Heisenberg's "radical recasting of the conceptual foundations" that is mainly responsible for this idea - Heisenberg speaks of the compactness of a closed theory rather than of its coherence20 -, I want - partly for reasons of facilitating understanding - to draw also on other theoretical transitions for illustration. A standard example is the unification of Galileo's law of falling bodies and Kepler's laws in Newton's theory of gravitation. Not only did the two older theories before Newton stand disconnected side by side. Already in Kepler's theory we have the incoherence that every planet moves independently of all the others. The proposition concerning the way all planets move is here the mere conjunction of the essentially equivalent propositions concerning the way each single planet moves. For Newton's theory such a decomposition is (strictly speaking) impossible. This theory is non-decomposable (or coherent) in the far-reaching sense that no partial system of a system that moves in accordance with the gravitational equations also satisfies these equations.
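The contrast just described can be made concrete in modern notation. What follows is only an illustrative sketch and is not taken from the text: the symbols (positions r_i, masses m_i, solar mass M, gravitational constant G) are assumptions introduced here, and writing Kepler's description as a differential equation is anachronistic.

```latex
% Kepler-type description: each planet i moves about a fixed Sun of mass M,
% independently of all the other planets. The statement about all planets
% is a mere conjunction of N one-planet statements (decomposable).
\[
  \ddot{\mathbf r}_i \;=\; -\,\frac{G M\,\mathbf r_i}{\lvert \mathbf r_i\rvert^{3}},
  \qquad i = 1,\dots,N .
\]
% Newtonian gravitation: the acceleration of body i depends on the
% positions of all other bodies j, so the equations are coupled.
\[
  m_i\,\ddot{\mathbf r}_i \;=\; -\sum_{j \neq i}
  \frac{G\, m_i m_j\,(\mathbf r_i - \mathbf r_j)}
       {\lvert \mathbf r_i - \mathbf r_j\rvert^{3}},
  \qquad i = 1,\dots,N .
\]
```

In the first set any single equation can be dropped without affecting the others. In the second, deleting a body k removes a term from every remaining equation, so the remaining bodies, moving as they actually do, will in general not satisfy the equations written for N-1 bodies alone.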
The development from decomposable to non-decomposable theories can have the particular succinctness that the features bound up with the decomposable theory lose their independence and - as one could say appropriately - "merge into a greater whole". Some of these cases especially serve to justify Weizsäcker's viewpoint of a unity of physics that rests on the unity of nature. The move from quantum mechanics to quantum field theory provides examples for this, which in their simplest form find expression in the schemata for the transformation of elementary particles. An older example is the unification of static electric and magnetic fields in electrodynamics. A somewhat more abstract case is Einstein's identification of gravitation and metric in the general theory of relativity. And already the special theory of relativity offers the impressive example of a previously non-existent coherent treatment of space and time. Newton's theory of space and time was an obvious case of a decomposable theory. After Newton, with the so-called Galilean space-time, a conception developed from which space has already disappeared as an independent object, although time has not. With the special theory of relativity time finally disappears as a separate quantity as well.
Later we shall see how Weizsäcker tentatively takes certain features of quantum mechanics to be ultimate features of physics, and without a doubt quantum mechanics is at least a closed theory. Thus it might seem puzzling that Weizsäcker begins by noting that quantum mechanics does not [take] an easily recognisable step beyond classical mechanics.21 Indeed, the former has a structure that is analogous to the Hamiltonian version of classical mechanics, one that as such does not reveal any conceptual reductions or unifications. Weizsäcker points out, however, that, aside from isomorphism, in quantum mechanics we have only one space of states, and he suspects that this reveals "a unity of the methodological approach of all object-descriptions". Indeed, the set of possible information about an electron is exactly equal to the respective set for a many-particle system of quantum mechanics of arbitrary complexity. And in this connection Weizsäcker could have also cited the phenomenon - completely foreign to classical mechanics - of the non-separability of states in composite systems: Instantaneous descriptions of a whole system that are complete in the sense of quantum mechanics in general do not also yield complete (instantaneous) descriptions of its subsystems. Instead, a potentially high percentage of the information about the whole system flows into contingent correlations between the subsystems. Thus in a certain sense the world becomes theoretically an indivisible whole, and using Weizsäcker's terminology one could call this "the unity of nature in the object". Nothing equivalent is known in classical physics.
18 ibid. p. 194 and p. 135 (p. 156 and 106 f.)
19 On this idea and on the simultaneous growth of contingency (see below) see Scheibe 1986b
20 Heisenberg 1973, p. 141 f.
III
With these suggestions, of course, the historical task associated with Weizsäcker's unity thesis is not exhausted - not even in the sense of history.
Furthermore, we have made use of a developmental model - the physicist's model going back to Boltzmann - which so far has scarcely been analysed by theoreticians of science. Weizsäcker himself surmises that "the so-called theory of science has not yet developed the concepts required for a description of these structures".22 In a certain sense, however, he himself has delved deeper into the whole subject matter in the second part of the task "of making the unity of physics comprehensible". In this part the task is one of "justifying philosophically that the unity of physics is possible and necessary".23 Here I want to leave completely open the question to what extent Weizsäcker has succeeded in giving such a justification. In order to assess his effort, one would have to specify more precisely what a philosophical justification consists in, and what possibility and necessity mean in connection with our topic. It is clear, however, that more is needed than merely the historical explanation
21 Weizsäcker 1971, p. 153 (1980, p. 121)
22 ibid. p. 209 (p. 170)
23 ibid. p. 131 (p. 103)
dealt with so far. "As long as physics has not actually achieved a unity that is recognisable as final" - Weizsäcker explains - "arguments from the historical development cannot prove anything either for or against the possibility of that unity."24 The philosophical thought which Weizsäcker now adds can perhaps be described as the inheritance falling due today from the joint development of modern empiricism and apriorism. It is the thought that if we are going to reach a completion of physics at all, it will be in the sense that in a final theory we no longer articulate any empirical laws, but only the conditions of the possibility of experience in general, and that it is just these conditions which in the end will provide the unity of physics. This thought thus appropriates Kant's idea of the possibility of transcendental judgements, that is, judgements with which we articulate the conditions of the possibility of experience in general. Yet, since Weizsäcker wants to see such judgements placed at the end of physics, he would avoid Kant's precipitancy of wanting to establish transcendental judgements as a priori valid judgements independently of the progress of the empirical sciences. This would thus be the empiricist share in the inheritance. For the purposes of further explication, I may point to the fact that in connection with his project of establishing metaphysics as a science, Kant attempted to set forth a system of synthetic judgements a priori that would be in a certain sense complete. With respect to the question how a synthetic judgement - that is, a proposition about the objective world with positive content - could at the same time be justifiable a priori, that is, independently of experience, Kant fundamentally saw two possibilities: In intuitive judgements the content in question is guaranteed through the so-called pure intuition which for us humans is a spatio-temporal one.
For this class of judgements Kant mainly relies on the necessity and strict universality of geometric propositions. In discursive judgements or judgements of the understanding, on the other hand, the content can be guaranteed a priori only through the fact that such judgements, while not articulating any experiences, nevertheless articulate the conditions of their possibility. The principle of causality, according to which every change is the necessary consequence of a cause, is an example of the class of these so-called transcendental judgements that has become standard through Kant. Kant regards the aprioric character of such judgements as self-evident. For how could a judgement which first constitutes the possibility of experience in turn be justifiable through experience? From Weizsäcker's point of view there are above all two things to be said regarding Kant's project. Kant had failed as far as his apriorism was concerned - both with respect to geometry and with respect to the principles of the pure understanding. As Weizsäcker states with some emphasis: "We will never maintain that any judgement is a priori true; in this sense we will
²⁴ ibid. p. 213 (p. 173)
eliminate the word 'a priori' from our vocabulary."²⁵ Accordingly, Weizsäcker seeks his conditions of the possibility of all experience - to put it briefly - not independently of the experience of science. That is of course not to say that those propositions in which the said conditions, once they have been found, can be articulated will in turn be propositions of experience. But as presuppositions of possible experience the propositions in question would occupy a different logical rank than the propositions of experience themselves, and in the face of such a ranking, ignored by Kant, his strict dichotomy of a priori and empirical must in any case be reconsidered. If Kant went too far with his apriorism in the sense of the anticipations connected with it, he was too cautious in another regard, according to Weizsäcker. It is well known that in the phase of his critical philosophy Kant was wavering as to how far he would be able to go with his transcendental foundation of physics. The "Critique of Pure Reason", the "Metaphysical Foundations of Natural Science", and the opus postumum are prominent stations on this path of indecisiveness. Not so Weizsäcker. With astonishing confidence he explains: "The program that Kant formulated for classical physics will today prove either to be unrealisable or to have been realised as soon as self-evident assertions concerning the preconditions of the possibility of experience have led to the construction of precisely that unified physics at which the contemporary development so obviously aims. Let me state it more precisely: such a theory would have to allow us to deduce, in principle, the special mathematical structure of the Lorentz group and of quantum mechanics, the existence and number ... of the so-called elementary particles ... each and every line in the spectrum of iron, and the laws of celestial mechanics. Here we are not allowed to be modest.
This road is impassable or it leads to the destination I described."²⁶ It does not require special emphasis that this program comprehends an extreme reductionism, especially when one considers that at the pinnacle of Weizsäcker's deductive structure there is supposed to be, not a physical theory in the usual sense, and not simply any complete theory, but a theory which, in having the distinguished status of a final theory, tells us nothing but the conditions of all possible experience. In light of this extreme and at the same time ambitious program of reduction one will ask for arguments which would at least make its feasibility seem plausible, even if they could not prove it. The situation regarding the historical evidence has already been presented, and in the last part it will have to be complemented with the present state of affairs. Yet the main philosophical thesis that has since been introduced - unity of physics on the basis of conditions of possible experience - seems to have made the systematic situation of evidence more critical, rather than easing it. For now we know that it is from the weakest conceivable premises that the whole of physics is to emerge.
²⁵ Weizsäcker 1979, p. 140
²⁶ Weizsäcker 1971, p. 192 (1980, p. 155)
In concluding this part, I want to elaborate somewhat an argument of which Weizsäcker gives the beginnings, in order to strengthen our confidence. The argument makes use of the third of the four basic features of the developmental model introduced earlier, that is, of the fact that the respective relative fundamental theories of physics become more universal over time, and that this increase in the degree of universality is furthermore an explicit aim. As Weizsäcker says: "The progress of physics leads to increasingly universal laws of nature; the greater universality comes about through our recognising the conditions under which the previously known natural laws are valid, i.e. recognised the conditions as special cases of more comprehensive formally possible cases."²⁷ Since thus over time the conditions of validity of every closed theory become known in the framework of a more comprehensive theory, Weizsäcker, with the help of an example, expresses the expectation "that the task of enumerating all the preconditions with respect to which classical physics was necessary and universal will be more difficult than the same task would be for, say, quantum mechanics. We would therefore today be in a much more favourable situation than Kant was."²⁸ It is obvious that this argument immediately concerns only the local reduction, that is, the reduction of a closed theory to its successor. Indeed, with regard to the goal of the argument we can make the following "profit account". At the outset every new theory presents itself as absolute. As was explained earlier, it does not know its own limits. The theory has the relative necessity of having "the last word on the matter" to which no alternative is known. The classical example of this situation is the absolute rule that Euclidean geometry enjoyed over a long period of time.
Now, the crucial phenomenon is that it happened at all in the history of physics that eventually, within the framework of a more comprehensive theory, the conditions of the validity of the earlier theory, and with it also the alternatives to it, have become known. For from such examples - examples such as the incorporation of the Euclidean geometry of space into a Lorentzian geometry of space-time - we can learn what it means for empirical concepts and propositions to have something like presuppositions, that is, preconditions, the truth of which first guarantees the sense and with it the possibility of those concepts and propositions. In this connection it must be emphasised that the presuppositional view of transcendental judgements, which has recently even been recruited on behalf of the interpretation of Kant,²⁹ can come into play in the present case precisely because the reduction of a closed theory in Heisenberg's sense amounts to a "radical recasting of its conceptual foundations". Yet, mutatis mutandis, even Th. Kuhn's and Feyerabend's view of theory-change comes into play, and it is at this point that the proximity, mentioned at the end of
²⁷ ibid. p. 198 (p. 160)
²⁸ ibid. p. 194 (p. 156 f.)
²⁹ Brittan 1978, Ch. 1
the introduction, of Weizsäcker's reflections to the current controversy about theory-change becomes especially evident. The argument delivered so far concerned the contingent conditions which together with the laws of a theory have the said presuppositional, yet eventually also deductive, function with respect to the concepts and laws of the reduced theory. In a further argument we must consider to what extent this function is due to the laws themselves. For only an answer to this question will be immediately relevant for the main thesis, that the laws of a final theory of physics will formulate nothing but the conditions of possible experience. The argument presupposes what is admittedly a vague absolute distinction between law-like and contingent propositions of physics. First, we observe that in a local reduction the presence of the laws of the theory to be reduced is indispensable. However we want to construe this distinction, the reduction of a law-like proposition to a contingent one is not an option. Hence there will be - now from a global point of view - a reductive connection between each special law of physics and those fundamental laws. On the other hand, in a local reduction the presence of a contingent component is just as indispensable, since it is this component (and not the law-like one) which brings to pass the limitation of the domain of validity of the reducing theory to that of the reduced theory. Thus the local reduction shows that the content of the reduced law can be subdivided into a law-like and a contingent component. Again from a global point of view, this means that in the course of the development of physics towards a unity the last law in each case will be subject to less contingent limitations than its predecessors.
Hence it becomes plausible that at the ultimate end of this process, if such exists, the fundamental laws can no longer meaningfully be said to be empirically conditioned, and will thus themselves be the preconditions of all possible experience. To avoid misunderstandings, I want to make it clear in an epilogue to this argument that Weizsäcker at times gives his thesis of reduction a formulation that is somewhat ambiguous. An example of this is the following formulation: "Anyone who could analyse with sufficient acuity under what conditions experience is at all possible, would have to be able to show that these conditions already entail all general laws of physics."³⁰ For this claim to agree with his idea of reduction, the word "entail" must evidently be read as "entail with the help of the appropriate contingent additional assumptions". This abbreviated way of talking is frequently found in the discourse of physics, for example, when one speaks of the "observable consequences" of a theory. Such consequences exist only if besides the theory other observational propositions are used as premises, in predictions, for example.
³⁰ Weizsäcker 1971, p. 217 (1980, p. 176)
IV
I can now merely provide an outlook onto the final part of Weizsäcker's task, "of making the unity of physics comprehensible". In this part the task is one of "demonstrating physically (and ... at the same time mathematically and logically) what concrete form this unity should take".³¹ Obviously it is this most important part which should no longer contain any promises, unredeemed speculations, or programs that have not been carried out, but merely the logically, mathematically, and physically concrete solution of the task. Unfortunately, nothing of the sort exists. And this has often earned Weizsäcker's project the reproach (especially from physicists) that it contains nothing sufficiently concrete, nothing really tangibly new, and hence nothing which others would want to, or indeed could, use as a starting point. Under these circumstances it is now especially important to elucidate the true intentions of the author, by making full use of those hints which support the hope that these intentions can already now be realised at least partially. When Weizsäcker says that the task which the physical science of our time sets for philosophy is one of making the unity of physics comprehensible, then this formulation resounds with the expectation that in our century physics has won a chance, greater than ever before, of completing its conceptual unity. And Weizsäcker sees this chance in quantum theory. It is thus imperative for us to understand how one could arrive at this opinion. And this understanding must be brought about in light of the main philosophical idea just described with special consideration of the position occupied by quantum theory in the development of physics towards theories of increasing universality described earlier. To put the question methodically: If one defers the philosophical coronation of physics to its ultimate end, as has been done here, what can be done now to promote the transcendental task?
For this purpose (and also quite generally) Weizsäcker has formulated a methodical principle, according to which in such an undertaking we should "make maximal use of the meaning, as understood beforehand, of those concepts without which it would not have been possible even to formulate the questions that the theory is meant to answer".³² Since now the transcendental task consists in the search for the conditions of possible experience, and since in discharging this task we must make use of the sense of the concept of experience that is already intelligible to us, we must first inquire after this sense. According to Weizsäcker, experience at any rate means that we learn from the past for the future. Thus here we are immediately confronted with the fundamental temporal structure of past, present, and future - the historicity of time, as Weizsäcker calls it.³³ And it is simply analytically true that experience thus defined has this structure of time as the condition of its possibility. From this it becomes intelligible that precisely quantum theory
³¹ ibid. p. 131 (p. 103)
³² ibid. p. 195 (p. 157)
³³ Weizsäcker 1948, I. Lect.
should have been a decisive step in the direction of the unity of physics in the transcendental sense.³⁴ For it is the historicity of time which at the same time determines the ontological and epistemological structure of the world: The past is closed and factual; it contains the facts, and in it is found what in principle could have been known. Propositions about the past are accordingly true or false. The future is open; it is the realm of what is possible and known fundamentally only with probability, which in the present - the seam of past and future - becomes phenomenally actual (or: perceived). If one takes this addition of the historicity of time - the truth-valuedness of what is past and the probability of what lies in the future - as a new standpoint and asks where this structure has taken on scientific form, the answer will be: in quantum theory, and specifically in its Copenhagen interpretation. For classical physics, fitted with a foundationalist claim, it was essential that in principle it could make do without the concept of probability, using a two-valued logic throughout. Only with quantum theory did it become apparent that - again on a fundamental level - the concept of probability is in a very precise sense indispensable, namely, indispensable in its use for predictions, that is, for contingent propositions, relating to the respective future, about the outcome of possible measurements. Furthermore, according to the Copenhagen interpretation, the only other contingent propositions occurring in quantum mechanics are those that state the result of performed measurements. Hence, in quantum mechanics we meet again, in rigorous mathematical form, precisely that dual structure of the facticity of the past and of the probability of what lies in the future.
In order to express the fundamental character of these relations, in the sense in which they concern the conditions of possible experience, Weizsäcker wants to see the common structure of the historicity of time and of quantum theory treated in the framework of a logic of temporal propositions. In accordance with the view of the Copenhagen interpretation, that in the description of a measurement classical physics must have a part, this logic would agree with classical logic as far as perfect tense propositions about the past are concerned. For propositions in the future tense, however, it would from the start be a logic of merely modal or probabilistic valuations of the respective predictions. The propositions of probability are then propositions in the present tense about the outcome of a measurement dated in the future. Accordingly, they also satisfy classical logic. Only the propositions valued through probabilities (instead of truth values) behave deviantly. Here it must be recognised that no theory, hence also not logic, can tell which laws it does not satisfy. We are familiar with the deviations of quantum logic in the form of the quantum-theoretical indeterminism. The indeterminism itself is not a part of logic. As we know it from quantum theory, this indeterminism cannot be formulated as a law of such a theory. It states that the ortho-lattice of the subspaces
³⁴ Weizsäcker 1985, passim
of a Hilbert space does not allow for a truth-valuation that would respect the structure of the ortho-lattice. Accordingly, Weizsäcker regards this indeterminism not even as one of the preconditions of possible experience. But one can meaningfully ask what is the most powerful logic of future tense propositions that is compatible with it. And this logic - this is just what the indeterminism states - cannot be classical logic. The logic in question is not known today, or more precisely: it is not known with the inclusion of an appropriate proof of completeness.³⁵ But one can give a preliminary sketch of its structure without guarantee of completeness (and hence without a properly binding interpretation). Its integration into the full temporal logic in Weizsäcker's sense could perhaps look as follows: Propositions in the future tense referring to a fixed point in time in the future can be combined without restrictions through quantum-logical operations that correspond to the classical "and", "or", and "not". Yet, these propositions do not occur independently at all. Here is a sketch of a simple logical calculus which nevertheless rests on these propositions: Its formulae are:
(1) Na, where a is a quantum-logical proposition and "N" stands for "necessarily";
(2) all combinations of the formulae (1) that are formed using classical conjunction or disjunction.
Of these formulae the following would be logically valid:
(I) all Na with a quantum-logically valid a;
(II) certain formulae which create a connection between the quantum-logical and the classical conjunctions and disjunctions;
(III) all classical inferences from the formulae (I) and (II), in particular all classical tautologies.
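For orientation, the calculus just sketched can be written out schematically. The notation below is purely illustrative and is not taken from the original text: L(H) is assumed to denote the ortho-lattice of subspaces of the Hilbert space H, the symbols ⊓, ⊔ and ⊥ stand for the quantum-logical "and", "or" and "not", and ∧, ∨ for the classical connectives. The bridge laws of clause (II) are left open here, as they are in the text. The final display gives one possible reading of the indeterminism, on the assumption that "respecting the structure" means preserving all three lattice operations and that H has finite dimension of at least 2.

```latex
% Illustrative notation only (assumptions, not from the original text):
%   a, b                      quantum-logical propositions, elements of L(H)
%   \sqcap, \sqcup, ^{\perp}  quantum-logical "and", "or", "not"
%   \wedge, \vee              classical conjunction and disjunction
%   N                         "necessarily"
\begin{align*}
  \text{propositions:}       \quad & a ::= p \mid a^{\perp} \mid a \sqcap b \mid a \sqcup b \\
  \text{formulae (1), (2):}  \quad & \varphi ::= N a \mid \varphi \wedge \psi \mid \varphi \vee \psi \\
  \text{validity (I):}       \quad & \models N a \quad \text{if $a$ is quantum-logically valid} \\
  \text{validity (II):}      \quad & \text{bridge laws linking $\sqcap, \sqcup$ with $\wedge, \vee$ (not specified in the text)} \\
  \text{validity (III):}     \quad & \text{whatever follows classically from (I) and (II)}
\end{align*}
% Indeterminism under the assumed reading:
% for finite \dim H \ge 2 there is no map v : L(H) \to \{0,1\} such that
\[
  v(a^{\perp}) = 1 - v(a), \qquad
  v(a \sqcap b) = \min\{v(a), v(b)\}, \qquad
  v(a \sqcup b) = \max\{v(a), v(b)\}
  \quad \text{for all } a, b \in L(H).
\]
```

On this reading, three distinct one-dimensional subspaces of a common plane already suffice to rule out such a map, since any two of them have the plane as their join and the zero subspace as their meet.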
This calculus could be expanded to cover propositions in the perfect tense. With the help of an additional, elementary operator one would state that by means of measurement the perfect tense propositions have been established as true. For this case too laws would then have to be established.³⁶ Looking back in conclusion to the project as a whole, insofar as it has been sketched here, we were able to see the development of physics, without distortion, as a development in the direction of a unity. Today no one is able to say whether this development will at one time have a completion in the form of a unified theory of elementary particles and their interaction, or even whether it will have a completion in any form whatsoever. Yet without question this development is also one towards ever greater universality of the respective basic laws, such that the empirical content of the theories shifts
³⁵ Concerning the state of the so-called quantum logic see for instance Mittelstaedt 1986 and Mittelstaedt/Stachow 1985
³⁶ For this issue see Scheibe 1964
more and more into contingent additional assumptions that must be explicitly established in each case. In light of the Newtonian theory there were certain facts which allowed the Keplerian laws to come into view. Hence in light of a future physics there could again be certain facts due to which quantum theory plays the role that we have found for it. It is thus an idea to be taken seriously that the entire empirical content will have eventually migrated to contingent additional assumptions and that ultimate fundamental laws formulate nothing but the conditions of possible experience. It is again difficult to say what role quantum theory will be accorded in this. Precisely if it is a closed theory, as Weizsäcker assumes, there could be surprises, if it should not be the final word after all. For in this case we would be just as unable to predict what we shall then be confronted with as was the case in the last century, or even more for Kant, with regard to quantum theory. Most of all, one would want to wish on behalf of Weizsäcker's project a further development of physics itself, one that would allow us to understand better how what he wants to bridge, such as subject and object, experience and being, action and givenness, go together as what is near and what is far away.³⁷
³⁷ The question how Weizsäcker's ideas presented here are connected with the views of other physicists and philosophers on the development of physics is expressly not the topic of this paper.
1.5 Between Rationalism and Empiricism: The Path of Physics*
I begin with the following quotation: "What led me to my science and what fascinated me from a young age was the, by no means self-evident, fact that our laws of thought agree with the regularities found in the succession of impressions we receive from the external world, that it is thus possible for the human being to gain enlightenment regarding these regularities by means of pure thought ..."¹ You will surely agree that these are not the words of a positivist. Yet they are the words of a physicist. We find them at the beginning of Max Planck's autobiographical account of his life as a scientist. I am beginning my lecture with these words because, in a preliminary way, they confirm the thesis I shall want to defend. The thesis is that against the background of the classical opposition of rationalism and empiricism we can make sense of present-day physics at least as well from a rationalist position as we can from an empiricist position. In fact, it will become apparent once more that a satisfactory epistemological foundation of present-day natural science, and of present-day physics in particular, would ultimately consist of a synthesis of elements of both positions. Here I shall emphasize first of all the rationalist over the empiricist element, for the last philosophical school that was both relevant and influential - logical positivism or empiricism - held it the other way round, and was not particularly successful with its approach. All the same, the general trend is still to follow the lead of the 19th century in aligning natural science with empiricism rather than with rationalism. I further say that the idea of a synthesis suggests itself anew because we are all too familiar with the opposition in question and with the ever changing variety of attempts to overcome it. Yet it seems that philosophers cannot be convinced to give up their exaggerations.
Rather, it seems that in philosophy exaggeration is a fast-acting, even if not especially long-lasting, recipe for success. As soon as Kant had offered what is perhaps the most significant synthesis of this kind, the young Schelling appeared on the scene to declare that the point of his philosophy of nature was "the conviction that there is such a complete opposition between the empirical and the theoretical that there cannot be a third element in which both could be united ... In what is now called physics the empirical and science are run together in a motley ... Our purpose is ... to separate science and the empirical as soul and body [are separated], and to strip the empirical of all theory by not admitting into the science anything that is not constructible a priori ..."² While here rational cognition and experience are thus to be transformed into a radical opposition, we hear a more conciliatory tone in Planck's autobiographical testimonial, one that suggests a harmony,
* First published as Scheibe 1994a, translated by H. J. Wilhelm
¹ Planck 1990, p. 9.
² Schelling 1927, p. 282 f.
rather than an opposition. Without now wanting to analyze Planck's words more closely, it is his tone, and not the shrill trumpet blast of Schelling, which will serve as my keynote. At the present time the triangle of rationalism, empiricism, and physics seems scalene and oblique-angled. What could help us adjust it? If we could rely on a cooperation between physics and philosophy, there would perhaps already exist a generally binding solution. But such cooperation has never taken place, and regrettably it must be said that precisely since the days of German Idealism physics and philosophy have had a strained and disturbed relationship. In his inaugural address as principal of Heidelberg University Helmholtz describes the situation as it presented itself in the middle of the last century: "The philosophers accused the natural scientists of narrow-mindedness, and the latter accused the former of senselessness. The natural scientists now began to place a certain emphasis on the claim that their work was completely free of all philosophical influences, and soon it came to the point that many of them, among them men of outstanding importance, condemned all philosophy as useless and even as harmful revery."³ Indeed, this bad atmosphere persisted until towards the end of the century. It was the problems surrounding the atomistics of the kinetic theory of gases, the discovery of the quantum of action, as well as finally Einstein's special theory of relativity and his general theory of gravity which forced physicists in the first half of our century to consider new epistemological questions and to discuss old questions anew, and to answer these questions in this or that way. There followed an exciting period in which physicists turned to philosophy with questions specific to their field. But apart from rudimentary beginnings it never came to a real dialogue between physicists and philosophers. The assessment of this period by people outside the field of physics was mixed.
Adolf v. Harnack showed himself to be impressed, if we can trust the words that are attributed to him: "People complain - he is supposed to have said - that our generation has no philosophers. Quite unjustly: it is merely that today's philosophers sit in another department, their names are Planck and Einstein."⁴ Others saw it differently. Gilson comments on the awakening of the physicists with the sarcastic remark: "Nothing equals the ignorance of modern philosophers in matters of science, except the ignorance of modern scientists in matters of philosophy."⁵ We find a balanced judgment from within the camp of the physicists itself when we look at what Sommerfeld said in 1948 in a summarizing remark about the period: "In the 20th century the relationship between physics and philosophy changed fundamentally. Right at the outset, in the year 1900, Planck discovered the quantum of action ... With that he gave philosophy the hardest nut to crack with which it will be struggling for a long time to come ... The most decisive step toward a
³ Helmholtz 1903, p. 164.
⁴ Quoted from Seelig 1952, p. 45, as well as Sommerfeld 1949, p. 99 (1955, p. 37).
⁵ Quoted from Jaki 1966, p. 341.
philosophically enhanced physics was taken by Einstein in 1905. Since Einstein there no longer exists an alienation between physicists and philosophers. Physicists have become philosophers and philosophers are careful not to get into conflict with physics."⁶ As stated earlier, however, a joining of interests never came about. Since the middle of the century the remaining epistemological problems have fallen into the hands of specialized philosophers of science. What they did with these problems, and whatever else moved philosophy of science in general, has left physicists for the most part indifferent. In a recently published book Steven Weinberg calls philosophy of science "at its best ... a pleasing gloss on the history and discoveries of science."⁷ In view of this not very encouraging general situation, we must acknowledge that a satisfying synthesis of rationalism and empiricism that would do justice to physics still eludes us. Hence in what follows I will merely present and discuss some aspects of the problem in a somewhat impressionistic fashion. Apart from detailing the relevant issues in physics itself, I will try to articulate mainly the physicists' understanding of the matter in question, leaving the contributions of philosophers more as a foil in the background. According to the traditional understanding of the matter, the opposition between rationalism and empiricism is in the first instance of an epistemological nature. But it would be inappropriate in the context of contemporary physics, if we were to discuss once more the nihil in intellectu quod non prius in sensu of Thomas Aquinas, or if with Kant we were to "pursue the pure concepts to their first germs and rudimentary forms in the human understanding". According to my impression, however, it would make sense for philosophy to return once more to the issues and questions that were discussed by people like Bradley, James, and Russell at the beginning of the century.
And although I do not want to do that in this paper, I think it will become apparent that it is more with the epistemological questions relating to logic and ontology that contemporary physics is concerned. The various aspects of our topic to be discussed will be seen to share one common feature in that they represent oppositions, polarities, complementarities etc., the two sides of which can readily be identified as being respectively more rationalist or more empiricist in nature. In each case it will have to be specified how physics is able to mediate between these extremes. It is in this sense that in the title of the paper I speak of the path of physics as one which passes in between rationalism and empiricism - somewhat like the path between Scylla and Charybdis.
I
Perhaps the best way to see that physicists actually lead such an intermediary existence is first to direct our attention to a great and influential physicist of our century whose scientific experience and eventually attained view on
⁶ Sommerfeld 1968, p. 640.
⁷ Weinberg ²1994, p. 167 (1993, p. 174).
things are characteristic of his profession. Long into the future our century and its physics will be associated with the name of Einstein. In the first instance this is on account of Einstein's achievements in physics. It is precisely these achievements, however, that have also left their creator in the kind of philosophical tension which we are discussing. Philipp Frank follows Planck in distinguishing a metaphysical and a positivist basic approach in the philosophy of nature, and he reports that "each of these regards Einstein as its chief advocate ...".⁸ Some have seen clearly that such a monopolization would be inappropriate. One interpreter says that "Einstein's position cannot be labelled by any one of the current names of philosophic attitudes; it contains features of rationalism and extreme empiricism ...".⁹ Einstein fundamentally agreed with this description. But he added, as if apologizing, that "[a] wavering between these extremes [i.e. between rationalism and empiricism] seems to me unavoidable."¹⁰ We shall want to see how this sentence is to be understood. To achieve this understanding one must first of all be mindful of the fact that Einstein underwent a philosophical development which was essentially determined by his own physical research. He himself writes in a letter: "Coming from sceptical empiricism of somewhat the kind of Mach's, I was made, by the problem of gravitation, into a believing rationalist ..."¹¹ The most important point of this testimonial, to which many of the same sense could be added, is the acknowledgment that this change of mind was brought about through the work on the problem of gravitation. Before we turn to this passage, however, we want to note that this turn away from empiricism is to be understood merely as an integration of certain rationalist elements into an anti-rationalist early empiricism such as the young Einstein might have held.
In a certain sense Einstein as a physicist naturally remained an empiricist throughout his life. Still in his later years he expresses the conviction that "the concepts and propositions get 'meaning' or 'content' only through their connection with sense-experiences" and that "the degree of certainty with which this connection ... can be undertaken, and nothing else, [differentiates] empty phantasy from scientific 'truth'".¹² It fits completely with this demand that throughout his life Einstein was vitally interested in an empirical confirmation of his own theories and that at times he was even anxiously worried about such confirmation.¹³ This was due to the fact, as we shall come to understand more fully later, that with his gravitational theory Einstein had taken an extraordinarily high risk. Thus Einstein remained an empiricist. Over time, however, his empiricism developed into a rationalistically purified empiricism in the sense that he had a
Frank 1949, p. 271 (1955, p. 173). Margenau 1949, p. 245 (1955, p. 153). Einstein 1949 b, p. 680 (1955 b, p. 505). Quoted from Holton 1973, p.241; see also Holton 1981, p. 230 f. Einstein 1949 a, p. 13 (1955 a, pA). See the excellent account in Hentschel 1992.
I.5 Between Rationalism and Empiricism
low opinion of theoretically isolated experiments, 14 that he rejected as unfulfillable the operationalist demand of an entirely empirical interpretation of a theory, 15 and that he (reportedly) held the opinion that "theory first [decides] what one can observe." 16 Einstein probably also would have subscribed to the statement, attributed to Eddington, that "one should never believe any experiment until it has been confirmed by theory." 17 Such statements do not signify an abandonment of the fundamental principles of empiricism. Rather, these statements were meant to set the inner qualities of a physical theory into the proper light at a time when this was still necessary. Now, how is this turn connected to the work on the problem of gravitation? Here we can only focus on the general aspect of the problem which became apparent with Einstein's solution to it. And this was a certain opposition of theory and experiment. It is appropriate to help our understanding of Einstein's, in any case aphoristic, remarks on the topic by appealing to the bluntness to which he could rise in conversation more than in writing. Einstein is reported to have said to the physical chemist Herman Mark: "You make experiments and I make theories. Do you know the difference? A theory is something nobody believes except the person who made it, while an experiment is something everybody believes except the person who made it." 18 This is in the first place the self-irony of a man who throughout his life, in light of the difficult situation of empirical proof for the general theory of relativity, pointed out its inner qualities, and who on the other hand, as if through an irony of fate, gained world-wide fame overnight through one critical, yet for his purposes favorable, observation.
19 Yet at the same time it is also a brilliant social reflection of the general position of the theoretical physicist who, in the absence of an intellectual intuition, is left to make independent uses of reason and of the senses, of theory and of experiment, as of two crutches, and who moreover must defend himself against those who would want to steal from him, be it as philosopher or as experimental physicist, one of the crutches. Einstein saw two epistemological insights confirmed or gained as a result of his work on the general theory of relativity. The first had already been formulated, in a way that set the standard for our century, in 1908 by Planck. 20 In Einstein's terminology, this is the insight that 1) the theoretical progress of physics consists in an increase in its logical uniformity, but that 2) this gain demands a high price in the form of a loss of proximity to experience: "... it must be conceded - Einstein says - that a theory has an important
14 Einstein 1907, p. 439.
15 Einstein 1949 b, p. 679 (1955 b, p. 504).
16 Quoted from Heisenberg 1969, p. 92.
17 Quoted from Weinberg ²1994, p. 128 (1993, p. 134).
18 Quoted from Holton 1980, p. 57.
19 I am referring to the first observation in 1919 of the deflection of light at the edge of the sun.
20 Planck 1949 passim.
advantage, if its basic concepts and fundamental hypotheses are 'close to experience' ... Yet, as the depth of our knowledge increases, we must give up this advantage in our quest for logical simplicity and uniformity in the foundations of physical theory." 21 Thus the general theory of relativity yielded a unification of gravitation and forces of inertia and in a weaker sense also of gravitation and spatio-temporal metrics. But all successful empirical confirmations of the theory so far have been made in those unfavorable areas where the results of the theory deviate only very minimally from those of the Newtonian theory. The areas of drastic deviations still elude observation and perhaps do so as a matter of principle. 22 Thus theory and experiment - these two mutually complementing elements of our cognition of nature - can also fall into a mutually exclusive relation in the progress of physics. While, according to the first insight, the alienation between reason and the senses tends to be seen as grounded in the senses, according to the second insight, this is due to a peculiarity of reason. The highest epistemological principle which ultimately explains Einstein's view is that our entire conceptual world is a free invention of the mind, one that can be justified neither by appealing to its nature nor by any other a priori principles. This principle is directed against classical empiricism as well as against Kant, and it is perhaps for this reason that it aptly captures the basic attitude of today's physicists. We hear the man who perfected our contemporary conception of space-time say against Kant that the latter unfavorably influenced developments "by according the spatio-temporal concepts and their relations a special position with respect to other concepts." 23 Einstein also rejects the other a priori concepts because, according to the Kantian doctrine, they belong to the unalterable given facts of our cognition.
According to Einstein, empiricism in turn makes the mistake of believing "that the fundamental concepts and postulates of physics ... could be deduced from experience by 'abstraction' - that is to say, by logical means." 24 Yet precisely the general theory of relativity has shown once and for all that this is impossible. Our conceptual world is logically completely cut off from our perceptions, and it is this separation that gives rise to the fundamental tension: 25 On the one hand, we have the confusing embarras de richesse of the conceptual world. The physicist tries to escape this by referring as directly as possible to the sensible world. "In this case his attitude is empirical." On the other hand, the physicist experiences the risk of this procedure without any logical path from the empirically given to the world of concepts. "His attitude becomes then more nearly rationalistic, because he recognizes the logical independence of the system." Now there
21 Einstein 1950, p. 15.
22 Cf. Will 1981.
23 Einstein 1924, p. 1691.
24 Einstein 1989, p. 116 (1954, p. 273).
25 Einstein 1949 b, p. 680 (1955 b, p. 504 f.).
arises the danger of losing all contact with the world of experience. As stated earlier, however, an oscillation between the two extremes seems unavoidable.
II
Thus here we have this first tension between theory and experiment, one that results from an overexertion of our senses in the extreme flights of reason, as it were. I shall return to this idea. Now we first of all want to enrich our conceptual arsenal by introducing the related pair of ideas of coherence and contingency. Within the rationalist system of Leibniz we find this pair in the form of the truths of reason and the truths of fact. Yet the term 'coherence' gives more direct expression to the fact that with our physical theories we want to establish certain connections and ultimately advance to totalities. And the concept of contingency is in one, for our purposes important, respect broader than the concept of a fact which excludes what is possible and yet has not actually occurred. Thus I choose this terminology over that of Leibniz. The idea of a coherent system was already conceptualized by Aristotle. Although Aristotle conceived this idea for poetics rather than for science, we shall see that both of these touch at least where they satisfy Aristotle's criterion. For Aristotle the unity of the dramatic play is guaranteed in the unity of its action, insofar as the latter "forms a complete whole, the various parts of which - the particular events - can be so closely tied to one another that their rearrangement or removal would make the whole come apart and become confused." 26 This idea, that in a perfect work of art nothing - not the smallest detail - may be changed without ruining everything, established a tradition not only in aesthetics. Within philosophy a version of this idea was developed which explicitly refers to the cognition of reality, and even to reality itself. An important representative of contemporary philosophical rationalism, Brand Blanshard, describes the "ideal of completely coherent knowledge" - the final aim of all thought - as "knowledge in which every judgment entailed, and was entailed by, the rest of the system ...
The integration would be so complete that no part could be seen for what it was without seeing its relation to the whole." 27 Can the physics of today offer a knowledge which would allow one merely to hope that it might be developed into a system which would satisfy the demand of complete coherence? In spite of all the successes of modern natural science in the last four hundred years, this question is likely to leave one rather subdued, and one must gather the courage to answer it in the affirmative by recalling the specter of the other extreme. The other extreme is a view that has gained influence since Hume and that has become known in its most recent form, under a name coined by Russell, as 'logical atomism'. Logical
26 Aristotle, De Poetica, chap. 8.
27 Blanshard 1939, vol. II, pp. 264, 266.
atomism is an ontology derived from the linguistic form of modern logic. Wittgenstein expressed this ontology in the following striking propositions: 28
1.2 The world divides into facts.
1.21 Each item can be the case or not be the case while everything else remains the same.
6.3 The investigation of logic means the investigation of all lawfulness. And outside of logic everything is accidental.
This is obviously a position of a total contingency of the world, and you will not be surprised to hear that Blanshard described it as "the most formidable attack ever made on reason as an independent source of knowledge." 29 And indeed: Even from the modest standpoint of modern physics one would have to say that this other extreme is even less acceptable. We cannot meet the strenuous efforts of our natural sciences, this struggle to find law and order, with the remark that before the throne of logic all are equal - from Newton's laws of mechanics all the way down to the most trivial observations regarding my current sense-impressions. Thus here we have a further polarity with which physics must cope. Indeed one could say that it is with the combination (familiar to every physicist) of physical laws on the one hand, and initial and boundary conditions on the other, that physics has reacted to this challenge. This dualistic basic structure of a physical theory was first realized in Newton's celestial mechanics where juxtaposed to the unconditional claim of the validity of the law of gravitation we have the complete arbitrariness of all the positions and velocities of the bodies involved. But also the newer theories of the last two centuries, especially electrodynamics and quantum mechanics, are constructed according to the same principle: that juxtaposed to the laws claimed to be valid in the respective cases we have contingent propositions, the validity of which is left open by the theory.
The philosophical idea of coherence is realized very well in Newton's gravitational equations. Everything these equations say about the gravity of a body amounts to no more than to say how the body behaves in a system of gravitating bodies. Consequently the body is what it is in relation to all other bodies. Moreover, strictly speaking, no partial system of a Newtonian system can for its part move in accordance with the gravitational equations. It is hence impossible to remove a partial system such that the rest would continue to move in the same manner as it would have without the intervention. Physical laws of mutual interaction of the type of the Newtonian gravitational equations satisfy the Aristotelian idea of a totality with a precision which other areas of intellectual concern, those that like to boast of their holistic thinking, can only dream of. 30
28 Wittgenstein 1961, p. 11 and 75.
29 Blanshard 1962, p. 92.
30 See Scheibe 1987b (this vol. I.2).
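The coherence claimed above for Newton's gravitational equations can be made explicit in standard textbook notation. The following is only a sketch: the symbols $m_i$ (masses), $\mathbf{r}_i$ (positions), $G$ (gravitational constant) and $N$ (number of bodies) are introduced here for illustration and are not part of the original text.

```latex
% Newtonian equations of motion for N gravitating point masses
% (notation assumed for illustration: m_i masses, r_i positions, G constant)
m_i \,\ddot{\mathbf{r}}_i
  \;=\; -\,G \sum_{\substack{j=1 \\ j \neq i}}^{N}
        m_i m_j \,
        \frac{\mathbf{r}_i - \mathbf{r}_j}
             {\lvert \mathbf{r}_i - \mathbf{r}_j \rvert^{3}},
\qquad i = 1, \dots, N .
```

Each body's acceleration is a sum over all the other bodies. Removing one body, say the $k$-th, deletes the $j = k$ term from every remaining equation, so in general the rest of the system no longer moves as it did before. This is the sense in which no partial system moves, strictly speaking, in accordance with the equations on its own.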
On the other hand, the aspect of contingency cannot be eliminated from physics, since its theories typically do not answer every question that they make it possible to pose. Here the answers remain relative, i.e. they always demand contingent data which together with the laws provide the basis for drawing conclusions. But just here it is also possible to counter the empiricist argument that physical laws are contingent with respect to logic. Given the nature of our logic, this is true. Yet there nonetheless exists this objective difference: the respective laws make the respective contingent descriptions redundant, but not the other way around. A theory is coherent to the degree to which it makes its contingent descriptions redundant. An example of a theoretically very far advanced type of coherence is determinism. Taking celestial mechanics as its model, it states that with the help of the laws of the theory, the description of the temporal development of all parameters of a physical system is reducible to the statement of a small finite amount of data at a single point in time. This is fantastic, but there remains, as was said, an unexplained remainder. This failure of scientific reason, however, is normally accompanied by the empiricist consolation that the openness of the initial and boundary conditions makes possible a multiplicity of applications of the theory in question, thereby securing its empirical content.
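The division of labor between law and contingent data in a deterministic theory can be sketched as an initial value problem. The notation is an assumption made here for illustration and does not appear in the original: $x(t)$ denotes the state of the system, $F$ the law, $x_0$ the contingent data at the time $t_0$, and $\Phi$ the resulting flow. It is also assumed that $F$ is regular enough (for example, Lipschitz continuous) for the solution to be unique, at least for some interval of time.

```latex
% Determinism as an initial value problem (illustrative notation):
% the law fixes F, the contingent data fix x_0, and together they fix x(t).
\dot{x}(t) = F\bigl(x(t)\bigr), \qquad x(t_0) = x_0
\quad\Longrightarrow\quad
x(t) = \Phi_{t - t_0}(x_0) .
```

In celestial mechanics with $N$ bodies, $x$ collects the $3N$ position and $3N$ velocity components, so that $6N$ numbers given at a single instant make the description of the whole temporal development redundant. The theory itself says nothing about why $x_0$ has the value it has, and this is the remainder left unexplained.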
III
The matter looks different, however, when we enter the area of cosmological applications or inquire into the reason of the value of (dimensionless) constants of nature. In this case the openness of a theory seems to lose in value, for now we would be dealing with unique objects. Nonetheless, in dealing with these objects we do not have at our disposal a different kind of theory than the one just described. In fact, we face the problem of uniqueness and universality already in the usual applications of theories, and it is with this area that I shall continue to be concerned. In my third argumentative step I want to elucidate the concept of a law of nature in two respects that are important for our topic. I want to call these the extensional and the intensional dimension of a law of nature. According to its form, a law of nature states that all physical systems of a given kind behave in this and that manner. All bodies within the gravitational field of the sun move in ellipses with the sun in a focus; all gases have pressure, volume, and temperature in accordance with van der Waals' equation; all hydrogen atoms behave in accordance with Schrödinger's equation with Coulomb potential, etc. Two things coincide in a law of this form: 1) the proper content of the law, and 2) its universal form. The content is what the physicist must work out, and it refers exclusively to the particular system. The fact that in addition all systems of a certain kind satisfy this determination of the content adds nothing to the physical content. Nevertheless this second part adds something of eminent importance: it sublimates
that content into a law. Now, the problem is that, as we have separated them here, these two parts of a law work against each other, as it were, if we make the innocuous assumption that it is the content of the law which expresses connections within reality, and that the universality of the law is to be understood in an innerworldly manner. With this term of innerworldliness we have not already arrived in the field of theology; rather, we are still firmly on the grounds of physics. Nevertheless something peculiar is happening. First of all, when we find out about the earth that it moves in an ellipse around the sun, or about the air in this room that it conforms to van der Waals' equation etc., then, of course, in each of these particular cases we form an innerworldly proposition in the straightforward sense that we speak about objects in the only universe accessible to us, using the empirical concepts of the science that is responsible for those propositions. At the same time, with such a proposition we already express a connection, between the sun and the earth, between the three parameters of the air, between proton and electron in the hydrogen atom etc. - in any case, a connection or a dependence between particular objects or their properties, as we encounter them in the world. And just this connection in a single system is the coherence of which I spoke in the previous section. So far so good. If we move from the particular proposition over to the universal law, however, we see how this innerworldliness and this connection fall into a competition with certain features of a law which appear together with its universality. On the one hand, we are tempted to secure the lawful character by not conceiving the particular cases comprehended by the universality in an innerworldly sense, but rather by distributing them, as we could say following Leibniz, across different possible worlds, of which our world is only one.
Using an immediately comprehensible terminology, one could call this view the counterfactual view, and one could oppose it to the innerworldly view (which of course likewise merits discussion) not only of the singular but also of the law-like propositions. This polarity exists in the extensional dimension of a law of nature, and it is clear in what sense it must be taken in order to appear as a special case of our standard law. On the other hand, the dignity of a law of nature requires that the particular cases comprehended in it be independent of one another. And this demand is in a certain way opposed to the fact that with our law we want to establish precisely a connection, i.e. dependencies. This opposition is more determined by the content of the law, and this is why I say that it belongs to the intensional dimension of the law. Both oppositions - this intensional dimension as well as the extensional one mentioned earlier - represent genuine challenges for any attempt at a consistent formulation of physics. Yet they are generally not discussed in this form in contemporary philosophy of science. Thus not only physicists have reason to mock the current state of this art. In what follows I can indicate only very briefly the difficulties at hand. 31
31 See Scheibe 1991b (this vol. IV.16).
The opposition between the innerworldly and the counterfactual view of a law of nature concerns the problem of how we want to deal with the universality of a law in light of the indubitable fact of the uniqueness of world-events. We cannot deal with this question without demanding concessions on both sides. The rationalist, counterfactual view must admit what is granted by the other view, namely, that the empirical test of a law is possible only in our factually given world. The empiricist, innerworldly view must concede what is granted by the counterfactual view, namely, that the particular instances of a law constitute at least approximately independent cases, as if they belonged to different worlds. Not for the meaning but for the testing of a law both views depend on this world doing us the favor of containing independent cases of good approximation, i.e. that it is representative of the universality of the law. The progress of physics shows that this is the case, and this in itself is a non-trivial proposition about our world. If we inquire about the arguments in favor of one or the other of the two views, we find that their quality depends on the physical problem addressed. On the difficult question of absolute movements, Mach still took the Ptolemaic and the Copernican views as "equally correct", arguing in a typically empiricist-innerworldly manner: "The world-system is not given to us twice, with the earth at rest and with the earth in rotation, rather, [it is given] only once with its exclusively determinable relative movements. Hence we could not say what it would be like if the earth did not rotate." 32 On the other hand, how would an empiricist respond, if a representative of the counterfactual position asked: Granted, between Mercury and Venus there is no other planet. But suppose we had the power to insert a small planet into this quarter (with suitable initial conditions). How would it move? There is no doubt that Kepler's answer would be given.
But with this answer the refusal of our empiricist to abandon his innerworldly position would to this extent be a purely dogmatic insistence on principles. With regard to the concrete case he would have acknowledged what his rationalist opponent is after. The strongest argument in favor of the counterfactual position, however, does not depend on such details. Rather, it results from the historical fact that physics has become a successful science ever since it has applied a very specific method, one discovered by Galileo. It is often said that since Galileo physics has become successful on account of the combined deployment of mathematical laws of nature and of controlled experiments. Perhaps this statement already implies the main point, but it certainly does not yet make it explicit. The proper secret of the success of physics in this context lies in a quite radical fictionalism, one that reaches far beyond the question of the universality of a law of nature which we are addressing here. For it is not only a matter of bringing out the non-actualized instances of a law of nature. Rather, this fictionalism concerns the full content of physics. Our physical theories are fictions in a twofold sense, in that they 1) abstract from almost
32 Mach 1912, p. 226.
everything that constitutes reality in the full sense, and 2) usually even treat what is left over in a counterfactual manner. Yet, when we employ them correctly, we operate successfully with these fictions. For they are simple enough for us to be able to estimate that the errors made from the outset are small enough to yield from the theory the intended effect with sufficient accuracy and in some cases with fantastic accuracy. While thus the assumption of possible worlds seems to be of a piece with the attitude of this fictionalism, the associated independence of the respective contingent relations leads us back once more to the other, the intensional dimension of a law of nature. In terms of the goal intended in the formulation of a law of nature, i. e. the goal of establishing connections, this independence is really a nuisance. For as such this independence limits the establishment of connections. In a certain sense there is thus a mutual exclusion which can border on the paradoxical. On the basis of the connection established in it, a law can have the consequence that strictly speaking no two of its instances can be realized independently. Such is the case for example with Newton's theory of gravity, although on the other hand this theory allows excellent approximations of independent instances which are also realized. This is of considerable importance for scientific research. Thus Kepler's laws belong to the Newtonian approximations. For their strict validity it would be necessary that the planets moved independently of each other, each 'in its own world'. This they do only with a certain approximation. And if one had not thought of the idea that they possibly influence each other, we would still today not have progressed beyond Kepler. If, on the other hand, Kepler had not seen them as independent, we would probably not even have arrived at Kepler's laws. 
In this way supplementation and exclusion and with it the complementarity of the two views contribute to the progress of physics. Nevertheless: for philosophical rationalism, which has as its goal the establishment of a complete coherent system of nature, this more empiricist concept of a universal law of nature represents at best an interim solution which accounts for the incompleteness of the system.
IV
We would have hardly touched on the topic of a synthesis of rationalism and empiricism that is adequate for physics, if we did not consider it in light of the idea of the unity of physics. Indeed, the concept of unity is so closely associated with the demands of scientific reason that the rationalist tradition of antiquity already made this concept a central focus of philosophy and brought it to bear against the empiricist idea of plurality. Even today, with reference to the natural sciences, we are alive to the thought of a possible completion of physics in a unified theory as well as to the empiricist counterdemand of a theoretical pluralism. According to C. F. v. Weizsäcker the aim of unity would be realized "as soon as self-evident assertions concerning the preconditions of the possibility of experience have led to the construction
of precisely that unified physics at which the contemporary development so obviously aims." 33 Against this view, the theoretical pluralism espoused for a time by Paul Feyerabend 34 maintains that the empirical character of natural science can only be seen to be guaranteed, if the permanent state of natural science is conceived of as consisting of a plurality of competing theories. As a rule physicists have expressed themselves cautiously on this matter, holding a variety of opinions. The fact that physicists are relatively unanimous in their striving for a unified theory of the elementary interactions must not lead us to think that with such a theory we would at once also gain a philosophical conception of the unity of all natural sciences. Nonetheless, the idea of a unified final theory does appear in the dreams of some physicists. A book has recently appeared by Steven Weinberg which freely admits this already in its title: Dreams of a Final Theory. In such dreams physicists speak of the fact that they are not satisfied (and never have been satisfied) merely to describe nature as it is, but that they also want to know why it is the way it is. 35 On the other hand, physicists are aware, at least when they awake from their dreams, that this ideal of rational cognition cannot be attained, and for this reason they moderate their demands in two respects. The ideal of an absolute inner necessity of physical knowledge is replaced with the weaker requirement of a certain logical isolation of the respective theory, 36 and some physicists apply this requirement even to theories which we already possess. But this means, as we shall see, that in these cases one rejects the idea of a completely comprehensive theory in favor of the idea of a more or less loose bundle of partial completions of physics. The first, almost prophetic, thoughts on this matter date back to a time when a theory of the elementary particles was not even in sight.
Already in 1908 in his famous lecture, entitled 'The Unity of the Physical Picture of the World', Planck reappropriated Aristotle's old idea in light of the development of modern physics. 37 There he compares the older physics with a collection of paintings from which one could remove any picture "without taking anything away from the others. This will not be possible in the future physical picture of the world. One will not be able to leave out any feature of it as inessential ...". Already in 1919 Einstein was able to apply a similar idea to an existing theory, his own general theory of relativity: "The chief attraction - says Einstein - of the theory lies in its logical completeness. If a single one of the conclusions drawn from it proves wrong, it must be given up; to modify it without destroying the whole structure seems to be impossible". 38 And thirty years later Einstein says of his so-called unified field theory: "In favor of this
33 Weizsäcker 1971, p. 192 (1980, p. 155).
34 Feyerabend 1965.
35 See for example Einstein 1929, p. 126; Chew 1968; Ellis 1979, p. 533; Weinberg ²1994, p. 219 (1993, p. 227).
36 This expression is used by Weinberg in ²1994, p. 236 (1993, p. 245).
37 Planck 1949, p. 45 f.
38 Einstein 1989, p. 131 (1954, p. 232).
theory are its logical simplicity and its 'rigidity'. Rigidity means here that the theory is either true or false, but not modifiable." 39 This application of criteria of wholeness or of unity to already existing theories raises the question whether theories which satisfy such a criterion can still figure as a part of a larger whole at all. One is here reminded of the coherence theory of truth which allows only one single, completely coherent system to be true. We can say with some certainty that Einstein believed in the coherence theory. 40 At the same time he was unsure about the end of physics: We do not know whether we will ... ever attain a definite system. "If one is asked for his opinion, he is inclined to answer no. While wrestling with the problems, however, one will never give up the hope that this greatest of all aims can really be attained to a very high degree". 41 Before investigating further, I should mention that Einstein's concept has not been forgotten. Weizsäcker uses a similar concept to which I shall return shortly. In his new book Steven Weinberg uses what quite precisely amounts to Einstein's concept: "I do not mean to suggest that the final theory will be deduced from mathematics - our best hope is to identify the final theory as one that is so rigid that it cannot be warped into some slightly different theory without introducing logical absurdities like infinite energies." 42 Since we do not yet have the final theory, Weinberg must test his concept on other theories, and in this regard quantum mechanics seems to him to be the most promising candidate. He reports on his attempt to change quantum mechanics by making a small non-linear correction. 43 Weinberg failed profoundly in this attempt, and he surmises therefore that quantum mechanics not only satisfies his criterion, but that it could even be "a precisely valid feature of the final theory". This remark is of some significance to an important question which I shall now discuss.
The most extensive reflections on our present problem on the part of physicists go back to Werner Heisenberg. 44 These reflections operate with the concept of a closed theory. For Heisenberg a theory is closed, if "with the same degree of precision with which phenomena are describable with the concepts [of this theory] the laws [of the theory are valid] as well". 45 At first glance, this concept of a closed theory seems intended to capture something completely different from the concept of Einstein and Weinberg. According to a consideration of Weizsäcker, who developed the latter independently, this
39 Einstein 1950, p. 15.
40 This becomes apparent for example in Einstein 1924, p. 1685 f. and Einstein 1949 b, p. 668 (1955 b, p. 496).
41 Einstein 1979, p. 68.
42 Weinberg ²1994, p. 17 (1993, p. 24 f.).
43 Weinberg ²1994, p. 85 (1993, p. 92 ff.). For the original literature and a discussion see Peres 1989.
44 For an overview see Scheibe 1993d (this vol. II.9).
45 Heisenberg 1969, p. 135.
is not the case, however. 46 For Weizsäcker a theory is closed, if through small changes it cannot be improved. This differs only negligibly from Einstein's and Weinberg's formulation of our concept. The question is only what are small and what are large changes in all of these formulations. Now, the most noticeable components of a theory are its concepts and its laws. Hence it would seem justified to speak of small changes in cases where one only changes the laws, e.g. through a corrective term, and to speak of large changes where already the conceptual framework is being modified. On this understanding, the two criteria are indeed equivalent. For according to Heisenberg's version the validity of the laws is tied to the applicability of the concepts. If one wanted to improve the laws of such a theory, one could only do it by changing the concepts, that is, one could not do it by effecting a small change. If a theory, however, is not closed in Heisenberg's sense, then one can modify its laws without changing the concepts. And in that case Weizsäcker's condition would also not apply. Heisenberg developed his concept of a closed theory - and this is now the aforementioned important point - expressly not for a characterization of the 'final theory'. He probably did not believe in the existence of such a theory. 47 But he recognized the special feature of some theories which we already have, as for example Newtonian mechanics, classical electrodynamics, and quantum mechanics - namely, that we cannot develop these theories further in the usual sense. In order to improve them, we cannot merely extend the range of their application, we cannot merely modify their laws; rather, the only change still possible is a change of their concepts and with that, properly speaking, we would abolish the theory. Likewise, combining two such closed theories into a higher unity is inconceivable, and no progress can eliminate them.
"The edifice of the exact natural sciences can scarcely become a coherent unity in the naive sense previously hoped for ... Rather, it consists of individual parts, each of which, although standing in the most manifold relations to the others ..., nevertheless forms a self-enclosed unity." 48 Heisenberg's view thus represents a twofold compromise between the extremes of unity and plurality: with regard to the concept of a final theory and with regard to its field of application. Weinberg on the other hand does not make the second concession. This requires him to demonstrate the provisional status of, say, Newtonian mechanics through its reduction to the final theory. One could try to proceed on a third path by taking the concepts of rigidity in a comparative sense, thus neutralizing them with respect to the problem of the final theory. Weinberg's own remarks tend in this direction, as when he compares the Einsteinian with the Newtonian theory of gravity, as well as his own theory of electro-weak interaction
46 Weizsäcker 1971, p. 193 (1980, p. 156).
47 Heisenberg 1971, p. 306 ff.
48 Heisenberg 1934, p. 702.
84
1.5 Between Rationalism and Empiricism
with the older theory of Fermi. 49 Thus for example Einstein's theory is more rigid than Newton's theory in the sense that features of the latter that are easy to modify turn up in the former as features that are difficult to modify. It is a feature of Newton's theory that the gravitational force acting on a body is proportional to its inertial mass and inversely proportional to the square of its distance from the center of gravity. These assumptions can be modified as desired, without affecting the foundations of classical mechanics. Not so in Einstein's theory. Here they are compelling consequences of quite fundamental assumptions of a geometrical nature (and additional conditions). A geometrical theory of gravitation which would lead, for example, to a different dependence than to the quadratic dependence either does not exist at all or would look substantially different than the Einsteinian one. The conceptual field under discussion has seen little investigation, and Weinberg does not leave us much hope for success in such an investigation when he says of the problem in question that it is a matter of aesthetic taste. Albeit with certain reservations, Weinberg expressly compares the situation in science with the situation in art. 50 Just as the arrangement of the figures on Raphael's "Holy Family" leaves nothing to be desired, we readily follow Einstein's steps of reasoning through the edifice of the general theory of relativity. Here one can observe very well how an agreeably far-reaching rationalism applied to concepts of physics is immediately withdrawn as soon as meta-physical concepts are in the picture. But we do not have to be all that pessimistic about the matter. What we really need to do is to apply relationships which we have already worked out for logical theories to empirical theories. For classical two-valued propositional logic for example, the concepts of conjunction, disjunction, negation etc. 
are objects of definitions, on the basis of which in turn the validity of the laws of this logic is shown to rest. We have thus a clear case before us where, in Heisenberg's sense, the applicability of concepts brings with it the validity of laws pertaining to those concepts. A change of this logic would only be possible as a change of its logical concepts. And this is precisely what had to be done when for this or that reason one left classical logic and introduced intuitionist logic, quantum logic, or other non-standard kinds of logic. In the case where one is dealing not with logical, but with empirical theories, the main difficulty is that here one cannot define the concepts in such a way that something would already follow from them. For this reason one would first have to address the question, whether the claim that an empirical theory is closed (or rigid etc.) is for its part an empirical claim or not. While we are far from wanting to find a definition of the concepts in question, answering such weaker questions could contribute to the clarification of the matter.

49 Weinberg ²1994, p. 105 and 123 (1993, p. 111 ff. and 129 f.).
50 Weinberg ²1994, Chap. VI; it is not said which "Holy Family" of Raphael is meant.
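The logical case described above can be made concrete with a small sketch. It is only an illustration, not part of the original lecture: the function names (`neg`, `conj`, `disj`, `is_valid`) are hypothetical, and the connectives are fixed by nothing but their truth-table definitions. Once the concepts are defined in this way, the validity of the familiar laws follows by exhaustive checking, and a law could only be made to fail by redefining a connective.

```python
from itertools import product

# The "concepts": classical connectives, fixed purely by their truth tables.
def neg(p):
    return not p

def conj(p, q):
    return p and q

def disj(p, q):
    return p or q

def is_valid(formula, arity):
    """Return True if the formula is true under every assignment of truth values."""
    return all(formula(*values) for values in product([False, True], repeat=arity))

# The "laws": they hold as soon as the definitions above are accepted.
excluded_middle = lambda p: disj(p, neg(p))
de_morgan = lambda p, q: neg(conj(p, q)) == disj(neg(p), neg(q))

# A candidate that is not a law of classical logic, for contrast.
not_a_law = lambda p, q: disj(p, q) == conj(p, q)

print(is_valid(excluded_middle, 1))  # True
print(is_valid(de_morgan, 2))        # True
print(is_valid(not_a_law, 2))        # False
```

No comparable check is available for an empirical theory, which is exactly the difficulty noted in the text: there the concepts cannot be defined in such a way that the laws already follow from them.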
In the first part of my lecture I explained Einstein's oscillation between rationalism and empiricism partly with reference to the reciprocity between the logical unity of a theory and its proximity to experience. I also spoke of a very high risk which Einstein incurred with his general theory of relativity. We can now better understand the significance of these statements. It has always been a concern of empiricism to be able to test the propositions of a theory individually and independently of one another. It seemed that this was the only way to secure the empirical content of a theory and to guarantee a cumulative gain of the natural sciences without incurring too much of a risk of having to start all over again in case of an empirical refutation. Within the more recent philosophical discussion it was Quine who opposed this view. "Science - he says - is a unified structure, and in principle it is the structure as a whole, and not its component statements, one by one, that experience either confirms or shows to be imperfect ... The unit of empirical significance is the whole of science." 51 It appears that this statement differs from the rationalist ideal of a completely coherent system only in that in place of truth it speaks of empirical confirmation. Specifically, a translation of the last sentence would then simply read: The unit of that which is capable of truth is the whole of science. And this appears to be just the view of the coherentist. Yet in spite of this seemingly formal proximity with Quine we are in fact a long distance away from the coherence theory. For the Quinean holism is easily conceived as a standpoint faute de mieux - possibly by Quine himself. With an air of regret, so to speak, we acknowledge that science is only confirmable as a whole. Whoever thinks this way, however, is still under the sway of empiricism.
For on this interpretation holism is based on the idea that it is possible to absorb apparent empirical refutations in one place by making adjustments in the system in another place. But this would merely be an empiricism supplemented with conventionalist strategies. And if that were the only concern, we would indeed have cause to complain. In what was said earlier we have become familiar with a completely different and far bolder interpretation of the Quinean statements - if one wants to formulate it this way for the sake of contrast. We sacrifice the more exact empirical analysis, not for the sake of a science that remains to be modified at will by conventions, but rather for the sake of a system that is as coherent as possible. Here it is the associated rational gain that compensates for the risk of having to start all over again in case of failure.

51 Quine 1976, p. 211 and 1961, p. 42.

In a short epilogue I want to recall that this was not intended to be a lecture about reason, but rather about physics as a product of reason - or more precisely: a common product of reason and sense-experience. We have seen how both of these elements take something away from each other, something that reappears in ever changing guises. It seems reasonable to understand nature in terms of empirically remote objects and of empirically doubtful
schemes of order. Yet in this way physics risks the danger of withdrawing too far from the secure ground of experience. It seems reasonable to build up our knowledge about nature as a coherent system, one that as such is as complete as possible. Yet the logic of our language allows this only by way of contingent assertions, without allowing us to rid ourselves of them in the end. We are searching for functional dependencies in the laws of nature, yet these can only be confirmed on the basis of certain independencies. Our best way of understanding a law of nature is in terms of its validity in many worlds, yet we have only one world for its realization. We desire inner necessity of our theories, but we must be content with the certainty of having now and then reached an oasis, as it were, from which to depart would not be impossible, but unreasonable. Thus it seems that our accomplishments can only be represented as compromises between extremes, extremes which cannot be realized simultaneously, even though each for itself would represent a considerable value. And the more frequently we arrive at this conclusion the more it seems reasonable to suspect that with these extremes we are allowing ourselves to be guided by traditional categories which are wholly inadequate to what we can actually accomplish. Einstein's "oscillation between the extremes" is perhaps not unavoidable after all. Then it would be better to search for something that from the outset is more adequate to our possibilities. Although this is the old insight of epistemology since Descartes and Locke, as a philosophical insight, it is also one that is forever new.
II. The Philosophy of the Physicists
It is a fact, generally accepted but still not sufficiently appreciated, that during the first half of our century the physicists, in connection with problems in their discipline, developed a philosophy of their own - some physicists only, of course, and not all in the same way. Nevertheless, the event is remarkable and deserves to be made the subject matter of a larger monograph. The modern physicist with Galileo as his prototype is inclined by nature to avoid philosophical questions, or at any rate to keep physics free of them. Therefore, whenever he violates this rule, there certainly must be good reasons for doing so. As is well known, for the movement under discussion the reasons were mainly the establishment of the special and general theory of relativity and, even more so, quantum theory. In the articles of Ch. II it is not so much these theories themselves but rather some consequences of their establishment that are selected for closer inspection: the clarification and securing of an appropriate concept of progress ([6] and [9]), Schrödinger's reaction to the new physics ([7]), the endeavors of Einstein to obtain a physically justifiable understanding of reality ([8]) and the incorporation of quantum theory into Planck's idea of scientific realism ([10]). One aspect under which the new physical theories became of philosophical importance was the unusual relation they had to their predecessors, e. g. quantum mechanics to classical mechanics and general relativity to Newton's theory of gravitation. Especially in the case of quantum mechanics the relation was not that of a small change in the basic equations but a complete rupture of the conceptual framework of its predecessor. Already Boltzmann, impressed by the emergence of classical electrodynamics in the 19th century, had argued that progress in physics is not always smooth but happens by more or less discontinuous leaps ([6]).
The leaps come about because, on the one hand, the scientists are in principle free in their proposals of new hypotheses or even new theories. On the other hand, they will not make new proposals unless they are convinced that something went wrong with the old conceptual or propositional equipment. One aspect of progress, therefore, is always a correction of the older ideas, and it becomes a problem under which conditions the correction can still be viewed as progress and not just a change or even a step back. There is no question that the relation between the theory superseded and its successor has to be understood as a kind of
reduction of the former to the latter (cf. Ch. V). But the existing concepts of reduction did not seem to be sufficiently liberal to cover also such serious cases as mentioned. Not only philosophers and historians of science but also physicists have raised doubts about reducibility in these cases. 1 Doubts were raised by Heisenberg in the form of his concept of a closed theory ([9]). There are theories, Heisenberg argued, which develop a significant resistance against their improvement. By definition a theory is closed if the applicability of its concepts already entails the validity of its laws. A revision of the laws, therefore, is not possible without a simultaneous revision of the basic concepts, and this complicates the comparison of the revised theory with its predecessor and may even render a reduction impossible. According to Heisenberg, Newton's mechanics and classical electrodynamics are closed theories, and since they do not cover all of physics there would be no final theory, such a theory being an improvement of every other. As a consequence, belief in the existence of several non-equivalent closed theories does not allow the belief in a development of physics towards unity. Einstein, unlike Heisenberg, did not seem to be aware of this: he coined the concept of the rigidity of a theory which is essentially equivalent to Heisenberg's closedness. In Einstein's view his theory of general relativity is rigid. A recent revival of the concept came from Weinberg, another spokesman for the unity of physics (cf. [5], nos. 39 and 42). Schrödinger ([7]) had completely different worries about the new physics. Among the physicists of his time he may have been the most learned in philosophical matters. On the occasion of an (early) call to a secluded university in an (k. u. k.) Austrian province, he was on the verge of accepting it simply because that would have given him the opportunity of following up his philosophical inclinations.
For Schrödinger - as for Heisenberg - quantum theory had been a philosophical challenge of the first rank. But - unlike Heisenberg - he never accepted the Copenhagen interpretation. Moreover, he did not feel urged to draw far-reaching philosophical conclusions from the new situation in physics. To find out how far one had to go Schrödinger asked himself what had been the proper characteristic of western scientific thought since the time of the Greeks. His answer was that European science and, in particular, modern physics, are characterized by the satisfaction of two epistemological postulates: the postulate of objectification and the postulate of understandability. Part of objectification is the elimination of all means of obtaining knowledge, part of understandability is the establishment of a picture of nature showing a unique behaviour in space and time. In Schrödinger's view both postulates were violated by the Copenhagen interpretation of quantum theory, and he criticized this interpretation for precisely this reason, although he also saw that quantum theory opens a chance to loosen the restrictions implied by those two postulates in favour of his own philosophical position: an idealistic monism.

1 See also Scheibe 1997b, Ch. I.2 and 3, and 1999 passim.
The remaining papers [8] and [10] deal with the epistemological positions of three eminent physicists: Boltzmann, Planck and Einstein. All three were realists with a tendency to metaphysical realism, and this made them opponents of Mach - the great empiricist in the physics of the time. But all three also had reservations of an empiricist kind. Boltzmann was an atomist. He believed in atoms and in the scientific value of an atomic theory at a time when no empirical evidence for their existence was at hand and one had to satisfy oneself with theoretical reasoning. Boltzmann's reservation, adopted from Hertz, was that atomism could not be an absolute stand. Quite in general, physical reality can be known only by way of mental pictures that we make ourselves of the objects. The question whether atoms (or whatever) exist can, therefore, only be the question whether the theories based on such pictures are empirically successful or not. Planck and Einstein, too, were atomists but no longer as pioneers. Planck, the true inventor of scientific realism, founded realism mainly on the empirical success of science, especially physics. He saw "the task of physics in the exploration of the external real world (reale Außenwelt)". This world, however, is not immediately and subjectively given. It is rather a hypothetical construction independent of the knowing subject. Its reality is shown by the progress of physics in the direction of a system of ever growing simplicity and unity. To this view Einstein also attached himself ("the 'real' in physics has to be conceived as a kind of program"). Against the Copenhagen interpretation of quantum mechanics he repeatedly formulated his conviction "that there is such a thing as the 'real state' of a physical system existing objectively and independently of any observation".
In spite of this confession Einstein sometimes gave his conviction the coherence-theoretic interpretation that the hypothesis of physical reality is possibly nothing but the assumption that the totality of our experiences admits a (unique?) logically impeccable conceptual system connecting those experiences.
II.6 The Physicists' Conception of Progress*
Modern physical textbooks occasionally give an account of a peculiar relation between physical theories. The relation, although logical in nature, concerns pairs of theories one of which, in the historical development of physics, has become the successor of the other. Thus Misner, Thorne and Wheeler in their monograph on gravitation 1 expound the view that "as physics develops and expands, its unity is maintained by a network of correspondence principles, through which simpler theories maintain their vitality by links to more sophisticated but more accurate ones." As examples they mention geometrical optics, Newtonian mechanics, thermodynamics and Hamiltonian mechanics as being 'correspondence and principle limits' of physical optics, relativistic mechanics, statistical mechanics and quantum mechanics respectively. Then they study in more detail the correspondence structure of General Relativity, pointing out four limits of this theory, one of which is Newton's gravitational theory. "In all these examples and others", summarize the authors, "the newer, more sophisticated theory is 'better' than its predecessor because it gives a good description of a more extended domain of physics, or a more accurate description of the same domain, or both." But not only is there an empirical superiority. There is also a "correspondence between the newer theory and its predecessor [giving] one the power to recover the older theory from the newer, [a correspondence which] can be exhibited by straightforward mathematics." A second example of such a correspondence relation between physical theories occurs in Rohrlich's monograph on classical charged particles 2. In the development of physics, Rohrlich writes, "Newtonian mechanics was replaced by relativistic mechanics, thermodynamics by statistical mechanics, classical by quantum mechanics." He distinguishes between two aspects of these replacements.
* First published as Scheibe 1988b.
1 Misner et al. 1973, Section 17.4.
2 Rohrlich 1965, Ch. 1.

On the one hand, the old theories, having been proved correct over a long time, did not really become wrong. They only became restricted to a limited domain of validity. "For example, Newtonian mechanics became restricted to phenomena in which the velocities are small compared with the velocity of light. It becomes an approximate theory ... But there is another aspect to a theory. While the predictions of a theory will always remain correct when used in the domain of validity ... , the foundations of the theory, its axioms and the underlying picture (model) may be radically modified by a more general theory: the notions of absolute space and absolute time are abandoned in the special theory of relativity ... In this way, the conceptual framework of every theory is eventually superseded." Because of these apparently opposed aspects, both realized in one and the same step from a theory to its successor, Rohrlich is somewhat more cautious in his description of the correspondence between successive theories than Misner,
Thorne and Wheeler seem to be. Their optimistic view that correspondence is a straightforward mathematical affair is replaced by Rohrlich's view "that the development of physical theory ... builds a hierarchy of theories [such that although] it is essential that the lower-level theory be derivable from the covering theory [i.e. the theory superseding it] this must be true not so much with respect to the axiomatic framework, which is in general not a special case of the covering theory, but with respect to certain basic equations and postulates which contain all the predictive power of the lower-level theory." So here we have two accounts of the nature of progress in physics actually presented in modern physical texts. Although there is some difference in emphasis the accounts express essentially the same view. The question I wish to answer in this paper is: Whence comes this view? My contention is that it has been developed exclusively by physicists without any recognizable influence from philosophers or historians of science. Moreover, many if not all of the ideas of central importance in matters of scientific progress recently developed by philosophers or historians of science have been anticipated by physicists belonging to the tradition I am going to point out. Yet this tradition seems to have remained almost unnoticed. In spite of the vital interest in questions of scientific development that characterizes philosophy of science during the last two decades, I did not find it reported in any of the numerous relevant contributions. It does not appear in the work that Th. Kuhn has done in the field 3, and I did not find it in I. B. Cohen's recent and fairly comprehensive 'Revolution in Science' 4. It is time, therefore, to put it before the public. The main feature of the physicists' view on progress is already evident in a short passage in an obituary for Joseph Stefan written by Ludwig Boltzmann in 1895 5. I shall quote the passage in two parts.
In the first part Boltzmann describes the development of physical theory in the following way:

The layman may have the idea that to the existing basic notions and basic causes of the phenomena gradually new notions and causes are added and that in this way our knowledge of nature undergoes a continuous development. This view, however, is erroneous, and the development of theoretical physics has always been one by leaps. In many cases it took decades or even more than a century to articulate fully a theory such that a clear picture of a certain class of phenomena was accomplished. But finally new phenomena became known which were incompatible with the theory; in vain was the attempt to assimilate the former to the latter. A struggle began between the adherents of the theory and the advocates of an entirely new conception until, eventually, the latter was generally accepted.

3 Kuhn ²1970; Kuhn 1977.
4 Cohen 1985.
5 Boltzmann 1905, pp. 94ff.
It is obvious, then, that Boltzmann anticipates here the view recently suggested by Thomas Kuhn: A physical discipline develops in alternating phases. The first phase is marked by a fairly continuous development. In it we find the physicists doing what they normally do. In Boltzmann's words they gradually articulate a theory until they have achieved a clear picture of the phenomena belonging to a certain domain governed by the theory. This is normal science in Kuhn's sense. However, as time goes on new phenomena incompatible - as Boltzmann puts it - with the theory become known. That this incompatibility is not a straightforward matter becomes clear when Boltzmann goes on to say that there is a period in which physicists try to assimilate these recalcitrant phenomena to the theory. In Kuhn's terminology this is the crisis in which the physicists are uncertain about whether the deviating phenomena are proper falsifications of the theory or mere anomalies brought about by causes not responsible for a real clash between theory and experience. Finally, when Boltzmann talks about advocates of 'an entirely new conception' that eventually superseded the old theory, this is essentially what Kuhn calls a scientific revolution. The crisis results in a revolution, and in spite of possibly long periods of continuous development these periods are regularly interrupted by sudden discontinuous changes. What recently has excited so many philosophers in Kuhn's book on the structure of scientific revolutions was already known to some physicists at the end of the 19th century. But this is only half the story. In the second part of his remarks Boltzmann somewhat mitigates the rupture between the old theory and its revolutionary successor. Thus he writes: Formerly one used to say that the old view has been recognized as false. This sounds as if the new ideas were absolutely true and, on the other hand, the old (being false) had been entirely useless.
Nowadays, to avoid confusion in this respect, one is content to say: The new way of ideas is a better, a more complete and a more adequate description of the facts. Thereby it is clearly expressed 1) that the earlier theory, too, had been useful because it gave an, if only partially, true picture of the facts, and 2) that the possibility is not excluded that the new theory in turn will be superseded by a more suitable one. Here Boltzmann rejects the view that the discontinuous step is a step from a theory now recognized as being entirely mistaken to another wholly true theory. Clearly, if such were the case then, relative to a given domain of phenomena, physics would not have a development in any proper sense. On the contrary, Boltzmann says that physics is an essentially changing - perhaps an ever changing - enterprise. At no moment is it quite right but - more importantly - seldom is it quite wrong. Thus, physical theories well confirmed and accepted for a long time, will in a restricted sense be of eternal value. For all the discontinuity characterizing the development of physics there is some continuity even in the sense of bridging over those discontinuities. To
some extent a well supported theory is preserved and is recovered from its successor. One may ask how Boltzmann came to his view on progress in physics. Since his view is introduced as correcting an earlier one, one might even ask what this earlier view was like and who held it. I shall not go into these questions. It may very well be the case that Boltzmann was the first to formulate the view in question. Let us just remind ourselves that it was in the second half of the 19th century that theoretical physics became an independent discipline. Now Boltzmann's obituary praises Joseph Stefan as having been a theoretical physicist. He himself was a theoretical physicist, and it is theoretical physics whose development he attempted to characterize in the text quoted. Thus this characterization came at a time when theoretical physics was still badly in need of a justification for its independent existence. We can see this by having a glance at the introduction to the first edition of a text book on Theoretical Chemistry by Walther Nernst 6 . It came out in 1893, and begins with an 'Introduction to some basic principles of modern natural science'. In this introduction Nernst, who was awarded the Nobel prize for chemistry in 1920, distinguishes "two widely differing methods for the discovery of a law of nature"; one empirical, the other theoretical. And he is anxious to convince the reader of the importance, if not the superiority of the theoretical method. In contrast to the empirical method of fact gathering followed by inductive generalizations, there is a second way where "thoroughgoing ideas on the nature of certain phenomena [are developed] by a purely speculative activity, [leading to] new knowledge whose correctness has to be tested by experiment only subsequently." The scientist searching for such theoretical hypotheses "is continuously in the danger of being led astray by the delusive light of unfortunately chosen principles." 
And although the development of hypotheses is necessary to deepen our knowledge of the phenomena, their eventual abandonment always has to be expected. Put to the test, "their success, though not proving their correctness, does prove [their] usefulness, while a failure displays not only [their] uselessness but [their] falsity as well." Returning to Boltzmann we may conclude, therefore, that in trying to get clear about the nature of theoretical physics he sees it choosing the second method. The discontinuities in its development, then, are just the price it has to pay for that choice.

6 Nernst 1893, pp. 2f.
7 Nernst 1911, pp. 4f.

The first to have developed Boltzmann's view further was Nernst. In the introduction to the 1911 English edition of his 'Theoretical Chemistry' 7 he reminds us first that, because of the unavoidable inadequacy of human inquiry "many a long-recognized law has had to undergo revision to meet the requirements of the progress of knowledge." Then he writes: If we consider the matter more closely, it is obvious that the law in question has retained its validity over a wide range, but that the
limits of its applicability have been more sharply defined. It can even be said that since the development of the exact natural sciences, there is scarcely one law established by an investigator of the highest rank which has not preserved for all time a wide range of applicability, i.e., which has not remained a serviceable law of nature within certain limits. We cannot say, for example, that the electromagnetic theory of light has completely overthrown the older optical theory put forward by Fresnel and others. On the contrary, now as formerly, an enormous range of phenomena can be adequately dealt with by the older theory. It is only in special cases that the latter fails; and further, there are many relations between optical and electrical phenomena which certainly exist, but of which the older theory takes no account. Hence the electromagnetic theory implies a great advance, but by no means nullifies the successes of the older theory. Generalizing Nernst concludes: So scientific theories, far from dropping off like withered leaves in the course of time, appear to be endowed under certain restrictions with eternal life; every famous theoretical discovery of the day will doubtless undergo certain restrictions on future development, and yet remain for all time the essence of a certain sum of truths. For Nernst what happens when a theory is superseded by another one is that it is restricted to a certain domain of validity. A new piece of terminology is introduced and henceforth becomes commonplace. A theory has a limited range of validity, and the limits of this range become known only after its successor theory has been established. This is what Nernst means when he says that "the limits of its application have been more sharply defined." The limitation has two aspects, one qualitative and the other quantitative. Qualitatively, it is certainly a limitation of any theory of gravity that gravitation is not the only force acting in the universe. 
The theory, in other words, is limited to certain kinds of phenomena. Quantitatively, Newton's theory of gravity, for instance, is valid only for small velocities and weak gravitational fields. The quantitative aspect undergoes refinement in Nernst's article on 'The Domain of Validity of the General Laws of Nature' 8, and has a remarkable repercussion on the revolutionary component of theory succession. It is evident in this article, published in 1922, that Nernst knew Boltzmann's paper. He maintains first that the modification of general laws of nature by no means entirely overthrows the earlier laws; rather the latter are modified only for more or less extreme cases ....
8 Nernst 1922, pp. 489 and 491f
II.6 The Physicists' Conception of Progress
However, made wiser by the lesson of relativity theory, Nernst goes on to point out some difficulties that we may have to face in describing the relation of a theory to its predecessor. With Einstein's and Newton's gravitational theories in mind he writes: The modifications that have to be made with respect to the earlier theory are so small that according to the present state of research they can be neglected except for the computation of the orbit of Mercury. But as a matter of principle every computation that astronomers have performed so far must be changed. And it is this principal aspect of the problem, not the numerical amount of the correction, that is our point. To avoid any misunderstanding: the works of Galileo and Newton are "as glorious as on that first day", but they have not brought us the final laws of the motions of the celestial bodies, and nobody would claim this for the theory of relativity .... The remarkable thing about this suggestion is that Nernst relates the two theories in question by explicit reference to the observational data. When he says that "in principle every computation that astronomers have performed so far must be changed", he means that the observational data used in one computation on the basis of Newton's theory must now be matched with Einstein's theory to obtain a corresponding computation on the basis of that theory such that the new result coincides with the earlier one within the margins of experimental error whenever the Newtonian computation had been successful. That this business of transforming the older computations into new ones is essentially approximate Nernst makes clear in the following passage: One might think that ... the laws of nature ... are valid with absolute precision in certain domains and that the matter could be settled very easily by pointing out the limits within which they remain valid. 
For all practical applications this is true enough, and it was for this reason that we could ascribe eternal values to the discoveries of Galileo, Newton, ... etc. From a strictly logical point of view, however, the matter appears much more disastrous. If a general law of nature becomes significantly inaccurate beyond certain limits, then the curse of this imprecision comes to roost on every application of the law even within these limits, though the magnitudes of the errors are below the threshold of measurement for the time being. This remark is meant only as being a qualification of the previous one. If we could increase the accuracy of measurement indefinitely we would still have to "change the computations". There is no more than asymptotic equivalence of the theories in the limiting case. Newton's theory is at best an asymptotic limit of Einstein's, nowhere precisely coinciding with it.
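Nernst's point that the Einsteinian correction enters every orbit computation, although it usually lies far below the accuracy of measurement, can be illustrated by the standard general-relativistic expression for the perihelion advance of a bound orbit. This is a textbook sketch to leading order, not a formula taken from Nernst's article; the symbols $G$, $M$ (mass of the Sun), $c$, $a$ (semi-major axis) and $e$ (eccentricity), as well as the restriction to the ideal two-body problem, are assumptions introduced here for illustration.

```latex
% Perihelion advance per revolution in the ideal two-body problem
\[
\Delta\varphi_{\mathrm{Newton}} = 0,
\qquad
\Delta\varphi_{\mathrm{Einstein}} \approx \frac{6\pi G M}{c^{2}\,a\,(1-e^{2})}
\quad\text{(radians per revolution)}
\]
```

The Einsteinian expression is nonzero for every planet. For Mercury it amounts to about 43 arcseconds per century; for the outer planets it is far smaller, but it never vanishes exactly.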
That a similar case can be made for quantum theory vis-a-vis classical mechanics had already been stressed by Einstein. In his 1914 inaugural address to the Royal Prussian Academy of Sciences he says: 9 With his quantum hypothesis [Planck] overthrew classical mechanics in the case where sufficiently small masses move with sufficiently small velocities and sufficiently high accelerations. Consequently, we view the laws of motion established by Galileo and Newton as being limiting laws only. Whereas Einstein, although allowing the old laws the status of limiting laws, yet talks of an overthrow of classical mechanics by the quantum hypothesis, Planck himself follows a much more conservative line 10. Showing that for high temperatures his radiation law approximates that of Lord Rayleigh, he emphasizes that "Rayleigh's radiation law deserves an eminent theoretical interest because it represents that distribution of energy that is obtained for the equilibrium of bodily molecules with radiation from classical dynamics without the introduction of the quantum hypothesis." Thus Planck did not only want to characterize Rayleigh's law as a limiting case - a classical limit - of his own new radiation law. It was the latter that, to a progressively minded scientist, deserved an eminent theoretical interest precisely because it was not and probably could not be derived by using only classical ideas. If Planck ascribes an eminent theoretical interest to Rayleigh's law it is because he, although in a sense the inventor of quantum theory, did not like it and in the depth of his heart never accepted it. Once quantum mechanics had been fully established by 1927 we know that Einstein, too, was reluctant to accept it. But in the earlier statement just quoted his attitude seems to be different.
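Planck's remark that his radiation law approximates Rayleigh's at high temperatures can be made explicit by a short expansion. The following is a sketch in modern notation, not Planck's own 1913 presentation; the symbols $u$ (spectral energy density), $\nu$ (frequency), $T$ (temperature), $h$, $k$ and $c$ are assumptions introduced here for illustration.

```latex
% Planck's law and its expansion for h\nu \ll kT
\[
u_{\mathrm{Planck}}(\nu,T)=\frac{8\pi h\nu^{3}}{c^{3}}\,\frac{1}{e^{h\nu/kT}-1},
\qquad
e^{h\nu/kT}-1=\frac{h\nu}{kT}+\frac{1}{2}\Bigl(\frac{h\nu}{kT}\Bigr)^{2}+\dots
\]
\[
\Rightarrow\quad
u_{\mathrm{Planck}}(\nu,T)=\frac{8\pi\nu^{2}}{c^{3}}\,kT\,
\Bigl[\,1-\frac{1}{2}\,\frac{h\nu}{kT}+\dots\Bigr]
\]
```

The leading term is Rayleigh's classical law. The bracketed correction tends to 1 as $h\nu/kT \to 0$ but differs from 1 at every finite temperature, so the classical law appears here as a limiting case in the sense discussed in the text.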
It belongs to a period when he was about to revolutionize physics through General Relativity, having, in the famous paper of 1905, introduced the idea of light quanta bearing an energy proportional to their frequency with Planck's constant as the factor of proportionality. No wonder that we find him telling his colleagues that quantum theory would overthrow classical physics. Thus, here and elsewhere we find physicists putting different emphasis on either the conservative or the progressive element in theory change. By the end of the twenties we can roughly distinguish between a 'disproof view' and a 'conceptual change view' of progress in physics. These views need not contradict each other because typically they are applied to different theory successions. Thus it seems that astronomers, cosmologists, relativity theorists are more inclined to take the disproof view whereas quantum theorists more often adhere to the conceptual change view. Let us now have a closer look at these views in turn. In an article on 'The Nature of Astronomical Research' of 1933 the Göttingen astronomer Kienle shows himself to be a typical advocate of the disproof view. It is, however, a qualified disproof view anticipating Lakatos' and also Kuhn's views on the role of anomalies in empirical science 11. Accordingly, Kienle's account of the view is prefaced by the following passage:
9 Einstein 1914, pp. 740f
10 Planck 1913, p. 164
No experiment realizes in pure form the idealized assumptions of a theory; consequently every test of the theory is possible only with a certain degree of accuracy. To decide whether the deviations between observation and theory are essential or not, whether they are cases of random perturbations of the experiment or of principal faults of the theory, - to decide this is not always an easy matter. It is only after this general warning that Kienle continues:
At any rate, progress in our knowledge of nature is always achieved where something goes wrong, where repetition and variation of the experiments always lead to deviations from the theory in the same sense. By means of a stepwise approximation we arrive at ... ever simpler and more comprehensive formulations of the basic laws. In this connection time-honored ideas sometimes have to be abandoned, and apparently well-founded laws discarded. Not, however, because they were 'false' in any absolute sense. Rather they have to yield to something new and more comprehensive in which they remain contained as approximations in their domain of validity whose limits become knowable by means of the new theory. It is obvious that in this statement several elements of the conception of progress are contained that we have met with already in the preceding quotations. The view that progress proceeds by disproof of an empirically well supported theory had been stated essentially already by Boltzmann, and it is implied by what Nernst had to say on the relation of Newton's theory of gravity to Einstein's. With the same example in mind Kienle makes the point explicitly by saying that progress is invariably launched by the clash of theory and experience. Still, the view is qualified, as it was in Boltzmann, by adding that it may be quite a delicate matter to decide whether the observational data are sound enough and clearly relevant to the overthrow of the theory. That the disproof view is still alive among physicists may finally be illustrated by an admirable piece of rhetoric from the pen of the cosmologist Hermann Bondi. In his article 'What is progress in science?' 12 Bondi describes the fate of Newton's theory of gravitation by comparing the enormous number of tests each of which the theory had passed brilliantly with its final disproof by a slight discrepancy in the motion of the planet Mercury. In Bondi's eyes:
11 Kienle 1933, pp. 115f; Lakatos 1970; for Kuhn see no. 3
12 Bondi 1983, pp. 89f. It has to be noticed that Bondi is influenced by Popper's philosophy of science. I nonetheless quote him as a spokesman of physics because I feel certain that he is physicist enough to have rejected Popper's view had he found obvious reasons for that from physics.
It had been thought that whatever in the world might be difficult, might be complex, might be hard to understand, at least Newton's theory of gravitation was good and solid, tested well over a hundred thousand times. And when such a theory falls victim to the increasing precision of observation and calculation, one certainly feels that one can never again rest assured. This is the stuff of progress. You cannot therefore speak of progress as progress in a particular direction, as a progress in which knowledge becomes more and more all-embracing. At times we make discoveries that sharply reduce the knowledge that we have, and it is discoveries of this kind that are indeed the seminal point in science. It is they that are the real roots of progress and lead to the jumps in understanding, but in the first instance they reduce what we regard as assured knowledge. Now, insofar as Bondi's argument is at all explicit this is an assessment only of the observational disproof of Newton's theory, and so supports the view we are at present considering. However, between the lines something else is at work and may explain why we are persuaded to find this disproof so exciting. In writing the passage Bondi, of course, knew what the successor of Newton's theory was, and knowing this he knew that the true importance of this disproof was that it led to a jump in understanding. However, he seems quite determined to attach great importance merely to the aspect of disproof, concluding his statement by saying that "in the first instance [the new discoveries] reduce what we regard as assured knowledge." That Bondi is in the tradition in physics that I have traced back to Boltzmann can be seen from the sequel of the passage quoted. Once again we hear the last part of Boltzmann's song.
Though there are indeed these leaps in understanding, these reductions of knowledge or whatever, nevertheless: It is, of course, important to remember that when a theory has passed a very large number of tests, like Newton's theory, and is then disproved - and we can certainly speak of its disproof now - you would not say that everything before - all those forecasts - were wrong. They were right, and you know therefore that although the theory qua general theory is no longer tenable, yet it is something that described a significant volume of experience quite well. And indeed, although we have a newer and better theory of gravitation - Einstein's theory and one or two variants of it in addition - nevertheless, whenever we do not want to carry out calculations of the motions of planets and satellites with extreme precision we use Newton's theory because it is simpler.
Let us now leave the disproof view and turn to the conceptual change view. Since there has been so much talk about meaning change during the last two decades this rubric will certainly evoke all kinds of associations. And indeed, we here meet with the fact, hard to believe, that a view similar to the
one developed by Kuhn and Feyerabend had already been outlined by the physicists, above all by Bohr and Heisenberg, 30 years earlier without this view having been so much as mentioned in the whole controversy aroused by the philosophers' version 13. The following quotation, typical of the more conventional way in which Heisenberg occasionally expressed himself, will already confirm my claim. Heisenberg occasionally talked about those strange developments, which have resulted in a change of meaning in many of the most fundamental concepts of physics ... Nature has taught us ..., by the unexpected phenomena in electrodynamics and atomic physics, that ... words or concepts have only a limited range of applicability. And when we have to go beyond this range, we are left with rather abstract concepts, ... which can be understood by the experts, but cannot be translated without ambiguity into the simple language of daily life. The new phenomena can be understood, but they cannot be understood in the same sense as the phenomena of earlier physics. The word 'understanding' itself has changed its meaning .... 14 I think these words speak for themselves vis-a-vis the meaning change debate in philosophy. However, here is not the occasion for an explicit comparison of the philosophers' and the physicists' views. Rather I shall want to show how the physicists' conceptual change view fits into the tradition I am going to outline in this paper. The best introduction to the conceptual change view is perhaps a passage from Heisenberg's Physics and Philosophy - his Gifford Lectures. There he introduces it by contrasting it with the disproof view. Having described the situation created by special relativity, Heisenberg writes: 15 Under the impression of this completely new situation many physicists came to the following somewhat rash conclusion: Newtonian mechanics had finally been disproved.
13 One exception is Feyerabend, but only in a very incidental manner. See his 1965b, p. 271, and his 1970a, p. 300
14 Heisenberg 1975, p. 161
15 Heisenberg 1958, pp. 96f (1959a, p. 84)
The primary reality is the field and not the body, and the structure of space and time is correctly described by the formulas of Lorentz and Einstein, and not by the axioms of Newton. The mechanics of Newton was a good approximation in many cases, but now it must be improved to give a more rigorous description of nature. From the point of view which we have finally reached in quantum theory such a statement would appear as a very poor description of the actual situation ... What is this general point of view finally reached in quantum theory? It is threefold and all three aspects are outcomes of the new mechanics, or - more
precisely - they are generalizations made on the occasion of the adventure of quantum mechanics or - even better - of the Copenhagen interpretation of quantum mechanics. The first and most important point is captured by Heisenberg's notion of a closed theory. According to Heisenberg the aim of physics is the establishment of closed theories. And whatever progress may be made during the period in which a closed theory is established, great progress consists in the transition from one closed theory to another that becomes its successor. What is a closed theory? There are two definitions. According to one it is a theory whose basic concepts already uniquely determine the basic laws of the theory, or equivalently - it is a theory such that whenever certain phenomena can be described by the basic concepts of the theory the laws of the theory will be valid for those phenomena, or - more precisely - it is a theory such that to the extent to which a phenomenon can be described by the basic concepts of the theory the laws of the theory will hold good for that phenomenon. The second definition alludes to the possibility of improving a theory by changing it in various ways. According to this definition a closed theory is a theory that cannot be improved by small changes. The two definitions are equivalent on account of the following consideration. If a theory is closed in the sense of the first definition then the need for its improvement will only occur when its basic concepts become inapplicable. Its improvement will then involve a conceptual change, with the size of the change being taken to be considerable. If, secondly, a theory is not closed in the sense of the first definition then the need for its improvement may already occur on the occasion of a falsification of its laws. Their correction will then be possible without a modification of the conceptual basis, i.e. it will be possible whilst involving only a small change. 
For illustration let us look at general Newtonian mechanics, Heisenberg's paradigm for a closed theory. For this case he gives the following formulation of the situation: 16 I believe that Newtonian mechanics cannot be improved at all; and thereby I mean the following: As far as any phenomenon can be described by the concepts of Newtonian mechanics, namely position, velocity, acceleration, mass, force etc., the Newtonian laws are also valid with absolute precision, and this will not change during the next hundred thousand years. More precisely I should perhaps say: With that degree of accuracy with which the phenomena can be described by the Newtonian concepts, the Newtonian laws are also valid. Since nobody has clarified it very much, Heisenberg's idea of a closed theory is still very difficult to understand. On the one side, it seems certain that the question of which theories are closed is an empirical question. From a purely logical point of view we could change Newton's second law. In retrospect
16 Heisenberg 1969, p. 135
Aristotelian mechanics, in a sense, appears as a modification of Newton's, with force being proportional, not to the acceleration, but to the velocity of a body. But Newton's mechanics turned out to be the better theory for empirical reasons. On the other side, Heisenberg's definitions give to a closed theory a quasi-analytic status. If we look for illustrations in ordinary language we would have to resort to meaning postulates. Our ordinary concepts are very flexible, and we can use them to describe widely differing situations without changing their meanings. But even in our daily use of language we sometimes are forced to violate meaning postulates. The analogy is not complete because in ordinary talk there seems to be no third category between the contingent statements usually made and the conventional rules of language. By contrast, in Newton's mechanics we have the various specializations given by the various dynamical laws, and these are actually the typical non-closed theories: we can pass from one force law to another without changing our basic mechanical concepts. According to Heisenberg there are four closed theories of physics: Newtonian mechanics, statistical thermodynamics, classical electrodynamics (including special relativity) and quantum mechanics. It was a conjecture of Heisenberg's that a fifth closed theory will come up in connection with a final theory of elementary particles. Of these theories classical electrodynamics and quantum mechanics are successors of Newtonian mechanics, the former with respect to relativistic mechanics of charged particles, the latter in an obvious sense. The still missing fifth closed theory will become a successor of each of these three theories. Up to now we know of only one closed theory having been superseded, and since the notion of a closed theory, on account of its peculiar definition, can be the better understood the more cases of successorship we know, the material at hand is anything but abounding. 
This situation is aggravated when we turn now to the second point of view that has emerged in connection with the supersession of Newtonian by quantum mechanics. It is the question of the relation between the two theories and, more generally, of the relation between any two closed theories, one of which is the successor of the other. Of course, there is the relation in time in which one theory succeeds the other, and there is all the historical stuff that such a case involves. But Bohr and Heisenberg were convinced that beyond this there must be some logical relation between the theories expressing a definite correspondence of their respective contents. Heisenberg is very definite about this: 17 Apparently progress in science could not always be achieved by using the known laws of nature for explaining new phenomena. In some cases new phenomena that had been observed could only be understood by new concepts which were adapted to the new phenomena in the same way as Newton's concepts were to the mechanical events. These new concepts again could be connected in a closed system and represented by mathematical symbols. But if physics or, more generally, natural science proceeded in this way, the question arose: What is the relation between the different sets of concepts? If, for instance, the same concepts or words occur in two different sets and are defined differently with regard to their connection and mathematical representation, in what sense do the concepts represent reality?
17 Op. cit. in no. 15, pp. 97f (p. 85)
Asking this, Heisenberg obviously intends that the relation in question is not just that the two theories contradict each other and that, therefore, the old theory is 'disproved' not only by the empirical evidence but by the new theory. In Heisenberg's words: 18 The behaviour of the atom in many experiments can be described by means of the concepts of mechanics - and in these experiments also the laws of classical mechanics correctly represent the behaviour in question ... There are, however, other experiments in which other, non-mechanical concepts are necessary for the description of the atomic state, e.g. concepts that express the chemical behaviour of the atom. In these cases no idea of the atom using mechanical pictures can be given. Therefore, the question does not even arise as to whether the laws of mechanics are valid.
In other words: There can be no disproof. For we were not mistaken about a question of truth. More intricate or, at any rate, much richer and more complete conceptual connections have been assumed to hold between classical and quantum mechanics since the days of Bohr's theory of the atom. Already then it was assumed, to give the best known example, that for high quantum numbers the orbital frequencies of an electron in the atom would approximate the radiation frequencies. This was the original assumption which became the paradigm for various correspondences between classical and quantum concepts. The existence of such correspondences was postulated in Bohr's correspondence principle which was to display "quantum theory as a rational generalization of the classical theories". 19 Although this seems to express quite a positive attitude as to the prevailing importance of the classical theories, Bohr warns us against over-simplifications of the correspondence in question. The generalization achieved in quantum theory "does not mean ... that classical electron theory may be regarded simply as the limiting case of a vanishing quantum of action." 20 In greater detail Bohr says: 21 The [asymptotic connection of atomic properties with classical electrodynamics, demanded by the correspondence principle] means that
18 Heisenberg 1942, pp. 18f (my italics)
19 Bohr 1934, p. 70
20 ibid. p. 87
21 ibid. p. 85
in the limit of large quantum numbers, where the relative difference between adjacent stationary states vanishes asymptotically, mechanical pictures of electronic motion may be rationally utilized. It must be emphasized, however, that this connection cannot be regarded as a gradual transition towards classical theory in the sense that the quantum postulate would lose its significance for high quantum numbers. These rather technical remarks of Bohr's harmonize with the more general sayings in which he wants to emphasize the principal impact that quantum theory has had on human knowledge. À la Heisenberg, Bohr could say, for instance, that "the extension of physical experience in our days has ... necessitated a radical revision of the foundation for the unambiguous use of our most elementary concepts." 22 Thus, although the correspondence principle certainly represents the conservative component in the Copenhagen conception of progress - the bow, so to speak, to the time-honoured but finally superseded predecessor -, yet all of Bohr's attempts at an adequate general formulation of the principle emphasize that the correspondence is between two fundamentally different theories. In one of these attempts he even separates "the demand of a direct concurrence of the quantum mechanical description with the customary [classical] description in the border region where the quantum of action may be neglected." And thus the correspondence principle proper expresses "the endeavors to utilize in the quantum theory every classical concept in a reinterpretation which fulfills this demand without being at variance with the postulate of the indivisibility of the quantum of action." 23 So far I have let Bohr answer Heisenberg's question about the relation between a progressive succession of closed theories. And I will leave it at that because in the book I have quoted from, Heisenberg's answer is very brief: he just uses the formula of the limiting case.
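The best-known example of correspondence mentioned earlier, the agreement of orbital and radiation frequencies at high quantum numbers, can be illustrated for the hydrogen atom in Bohr's model. This is a textbook sketch, not a formula taken from the passages of Bohr quoted here; the Rydberg constant $R$ (in wavenumber units), the quantum number $n$, and the restriction to the transition $n \to n-1$ are assumptions made for the illustration.

```latex
% Radiation frequency of the transition n -> n-1 versus orbital frequency
\[
E_n=-\frac{hcR}{n^{2}},\qquad
\nu_{\mathrm{rad}}(n\to n-1)=\frac{E_n-E_{n-1}}{h}
 =cR\,\frac{2n-1}{n^{2}(n-1)^{2}},
\qquad
\nu_{\mathrm{orb}}(n)=\frac{2cR}{n^{3}}
\]
\[
\frac{\nu_{\mathrm{rad}}(n\to n-1)}{\nu_{\mathrm{orb}}(n)}
 =\frac{n\,(2n-1)}{2\,(n-1)^{2}}
 =1+\frac{3}{2n}+O\!\left(\frac{1}{n^{2}}\right)
 \;\longrightarrow\;1
 \quad (n\to\infty)
\]
```

The ratio tends to 1 for large $n$ but differs from 1 for every finite $n$. This is at least consistent with the asymptotic character of the connection that Bohr stresses.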
Since we have seen that Bohr warns us to apply this formula only with great care, it may even seem that the two men had different opinions on the matter. But this is not the case, and apparent differences are only due to occasional variations of expression. 24 These variations are quite excusable in view of the tremendous difficulties presented by the case. The difficulties were moreover aggravated still further by the third point of view I mentioned above. The two other points of view, taken together, easily lead to the conclusion that a progressive succession of closed theories in Heisenberg's sense contains the two major parts of progress that were introduced by Boltzmann: Progress proceeds by leaps, by jumps in understanding, but there is also an element of continuity taking into consideration the merits of the superseded theory. With the advent of quantum mechanics, however, a new element entered, and this directly leads to the
22 Bohr 1963, p. 9
23 Op. cit. in no. 15, p. 110
24 On matters controversial between Bohr and Heisenberg at times see Folse 1985, Ch. 3, sect. 7
third point, which makes it difficult if not impossible to characterize theory progression by disproof and contradiction or, at least, merely by disproof and contradiction. The third point of view has been called a paradox by Heisenberg. 25 The paradox is that, according to the Copenhagen interpretation, the experiments designed to test quantum mechanics must be described in classical concepts, or - more generally - the experiments designed to test a closed theory must be described in terms of its predecessor, i.e. in terms whose limitations in principle are revealed by the successor theory. Thus the terms whose limited applicability has been made apparent by a new closed theory belong in a sense to the preconditions of its own applicability. Bohr has repeated over and over again that "however far the phenomena transcend the scope of classical physical explanation, the account of all evidence must be expressed in classical terms. The argument is simply that by the word 'experiment' we refer to a situation where we can tell others what we have done ... and that, therefore, the account of the experimental arrangement and of the results of the observations must be expressed in unambiguous language with suitable application of the terminology of classical physics." 26 Heisenberg has generalized this by saying: "Even if the border lines of a 'closed theory' have been passed, i.e. if new regions of experience have been systematized by new concepts, still the conceptual system of the old closed theory constitutes an indispensable part of the language in which we talk about nature.
The closed theory belongs to the preconditions of further research: We can express the result of an experiment only in terms of a previous closed theory." 27 Although the third part of the Copenhagen view of progress in physics is fairly controversial even among physicists, I have included it because it certainly is an integral part of the Bohr-Heisenberg conception which, as regards the two other parts, obviously belongs to the tradition I am describing in this paper. It is in fact the climax of that tradition, and I will conclude my review with two more recent citations to round off this account. There is, first, a particularly well-balanced description by von Weizsacker who, having been a disciple of Heisenberg, belongs to the Copenhagen Circle in a wider sense. Speaking of the development of physics toward unity Weizsacker says: 28 In this process, the earlier theories are modified by the later ones. But they are not really overthrown; rather their domain of validity is delimited ... we can describe this successive self-correction of physics roughly as follows: an older closed theory - classical mechanics, for
25 Op. cit. in no. 15, Ch. III.
26 Bohr 1958, p. 39
27 Heisenberg 1948, p. 335
28 Weizsacker 1971, p. 208f (1980, p. 169f)
instance - adequately describes a certain domain of experience. This domain, we later learn, is limited. But so long as the particular theory is all physics can say about that domain of experience, physics simply does not know its borders; the theory does not delimit its own validity. For this very reason, the completed theory serves also as the initial scheme for the opening up of a much wider domain of experiences. Somewhere within this wide domain it then comes up against the limits of what it can grasp with its concepts. Out of this crisis in the initial basic scheme a new completed theory finally arises - special relativity, for example. This theory now includes the older one as a special case, and thereby delimits the accuracy within which the older theory applies in particular instances: only the new theory 'knows' the limits of the old. The new theory in turn is an initial scheme with regard to a still wider domain of experiences, whose borders it might intuit but cannot sharply delimit. Weizsacker uses this picture in order to convince his reader that "physics is characterized by a greater conceptual unity today than at any time in its history." Essentially the same picture is used by Sambursky to demarcate the development of modern physics against the situation in antiquity. In his "The physical world of the Greeks" Sambursky, himself a trained physicist, says: 29
29 Sambursky 1956, p. 97
Whoever adopts the theory of relativity may in every actual case fall back on Newton's theory as a first approximation to reality, an approximation which is frequently quite sufficient for the description of the facts. In spite of the theoretical and philosophical difference between classical mechanics and the new theory, and in spite of the formal difference in their mathematical method, the former is still included in the latter as a first approximation. History of science in the past three hundred years is characterized by a chronological and almost organic sequence in its development which was not found to anything like the same extent in Greek science. It is above all the history of physics from Galileo to our time that makes us realize that science advances towards reality so to say by concentric approximations - each theory containing its forerunner as a 'special case'. On the other hand there is nothing to bridge the gulf, e.g. between Aristotle's conception and that of Democritus before him. On the whole, one would rather say that many of Aristotle's ideas, as much as one admires their intellectual acumen, were in the nature of a regression from those of the early atomists, ... With this quotation I have brought my account of the physicists' conception of progress to a conclusion. Perhaps the reader may wish to know
what is the purpose of such an account. The answer is that, as far as there are historical facts, some of them are worth knowing, and that in my view the tradition I have centered on deserves to be better known than it is. It is badly in need of further analysis, both logical and philosophical. Whereas a man like Heisenberg certainly was convinced that what he had said about the notion of a closed theory was, on a general level, all that conceivably could be said about the matter, and similarly Bohr vis-a-vis his correspondence principle, and so every other physicist I have mentioned, a philosopher of science would still ask: But what on earth is the relation between concepts and laws as conceived in the notion of a closed theory? What is the relation between two successive closed theories? And so on, with respect to many other notions involved in the physicists' conception of progress. Indeed, given the viewpoint of philosophy of science the physicists' account is hopelessly incomplete, and this can be said without detracting one bit from their reputation as physicists or belittling the merits of the admittedly incompletely developed conception in question. It is the philosopher's task after all to answer the philosopher's questions. Part of the analysis required would, of course, be a comparison of the physicists' view with the corresponding ideas developed in philosophy of science. Naturally, this comparison would include questions of mutual dependence. As was said at the beginning, the physicists' view was developed independently. Conversely, the dependence of philosophy of science should be treated on three different levels. Many relevant contributions pay attention to this or that part of physics. Only few papers refer explicitly to this or that view of some physicists on the matter. And almost no philosopher of science seems to be aware of the particular tradition that has been noted in this paper. 
We have heard Boltzmann making two fundamental distinctions in the description of physical progress. He distinguishes, first, between two alternating phases in the development of theoretical physics, one normal and the other revolutionary. Second, as to what happens during the subsequent, revolutionary phase, he distinguishes between an element of discontinuity and one of continuity. The first distinction does not seem to have been further developed by the physicists after Boltzmann. But I have already suggested that in making it Boltzmann anticipated Th. Kuhn. It is similar with the various branches into which Boltzmann's second distinction has been developed by the physicists. In reading Kemeny and Oppenheim's famous paper "On Reduction" 30 one is, if only dimly, reminded of what Nernst in a more appropriate way had to say on the role of observational data in progressive theory succession. The disproof view taken together with the limiting case idea reappears in Popper's work, and Popper does even seem to be semi-aware of his indebtedness to Bohr. 31 Finally, the concept of theory incommensurability as suggested by Feyerabend and Kuhn 32 seems quite close to Heisenberg and Bohr's conceptual change view. Although the first-mentioned authors are well informed about the Copenhagen interpretation of quantum mechanics they did not discuss the relation between their concept of incommensurability and the corresponding physicists' one. This is the more surprising since here perhaps the most interesting perspectives open up. For one thing, the usual incommensurability and, in particular, complementarity between quantum mechanical observables essentially satisfy the conditions defining Feyerabend's language incommensurability. The latter is an interpretational incompatibility between two statements in the sense that under no circumstances can the statements be simultaneously meaningful. According to the Copenhagen interpretation a statement of the form 'an electron has position x' or 'an electron has momentum p' becomes meaningful under the presupposition that a position or momentum measurement has been made. Since these measurements exclude each other we have here before us a case of meaning incompatibility. One would even be inclined to say that it is the best elaborated case we know of and that, therefore, incommensurable languages can be united in one theory. Consequently, there is no reason in principle to be horrified by incommensurabilities. Moreover, as I understand him, Bohr always meant his concept of complementarity to become a fundamental epistemological concept without which all striving for unity would be doomed to failure. Further investigations on the relation between the two conceptions may, therefore, throw more light on the important issue of theoretical pluralism as an alleged antithesis to the unity of science.
Acknowledgement - I am grateful to my colleague and friend Joe Lambert for having served as a post-editor of this paper in order to make the most of my poor English. It is not his fault if the result leaves much to be desired.
30 Kemeny/Oppenheim 1958
31 Popper 1972. On p. 202 Popper suggests calling 'the demand that a new theory should contain the old one approximately ... (following Bohr) the "principle of correspondence"'.
32 See the works mentioned in no. 3 and 13
II.7 Erwin Schrödinger and the Philosophy of the Physicists*
I
Erwin Schrödinger, whose hundredth birthday we celebrated not long ago, was awarded the Nobel prize for physics in 1933. He was honored "for the discovery of new fruitful forms of atomic theory" that he had made several years earlier and first published in his famous papers "Quantisierung als Eigenwertproblem." 1 The community of physicists decided to call the basic dynamic equation of quantum mechanics the "Schrödinger equation". The view, associated with his equation, that quantum mechanics is a theory of the temporal changes of state in a physical system is known as the "Schrödinger picture". All this seems to show that Schrödinger was deeply involved with the new atomic theory - the most important advancement in physics in this century. On the other hand, it is a well-known fact that he did not accept the orthodox interpretation of quantum mechanics. And he was not the only one to find himself in this situation. Planck and Einstein had already been awarded the Nobel prize for achievements marking the very beginnings of quantum theory. Yet neither man was satisfied with the interpretation of a theory which was viewed as definitive by the majority of physicists. This coincidence indicates that something peculiar was going on - something transcending physics proper but still belonging to it in a broader sense.
What was at issue here is easily related in general terms. Physics is grounded not only on the laws of nature that it attempts to discover but also on tacit assumptions of a more general kind. There is nothing unusual about this, and it is something most physicists are unaware of. The special nature of physics, the exception, only results from physicists becoming aware of one or the other tacit assumption. A change usually occurs when fundamental difficulties arise in the discipline, as was the case in atomic physics at the beginning of the century. It became so difficult to understand the structure of the atom and the emission and absorption processes of light as well as the strange dualistic behavior of free radiation fields and particles that it was unlikely that a solution could be found within the bounds of classical physics. The behavior of matter on an atomic scale motivated instead a radical revision of physical thought that would break with some of the epistemological, ontological and even logical assumptions underlying traditional physics. Of course only physicists knew which modifications were to be made and, correspondingly, which new foundation physics would obtain. They were thus forced by the very development of their discipline to philosophize, not in the trivial sense in which everybody has his world view, but in order to overcome a serious foundational crisis in their discipline.
* Originally published as Scheibe 1991a
1 1927a (Articles quoted without the author are by Schrödinger.)
If in the title of this paper I allude to the "philosophy of the physicists" it is not a particular philosophical doctrine that I am referring to - no more than such is meant when we speak of the "philosophy of the Greeks". What I am referring to is the intellectual struggle of physicists with fundamental issues, including space and time, as they arose during the first half of this century. Though it is widely known that men like Bohr, Einstein, Heisenberg, Born, Jordan, Planck and, last but not least, Schrödinger were involved in this struggle and though many papers dealing with special issues in this field were published, the history of the philosophy of the physicists has yet to be written. I cannot, of course, do this in this paper. But I would like to present the following considerations focusing primarily on Schrödinger within this larger context. For this physicist movement did in fact contribute significantly to the philosophy of science. Its occasional dilettantism is on the whole less irritating than much of the philosophical work done in the field by professionals.
Schrödinger's contribution is governed - as I said in the beginning - by his basic opposition to the orthodox view of quantum mechanics. From what I then said, it follows that this opposition is located in that domain between physics in the narrower and physics in the wider sense, where even this discipline with its high level of unity has gaps that can be filled only by more or less philosophical assumptions. I say "more or less" because, as we shall soon see, though Schrödinger was deeply convinced of the importance of a genuine philosophical understanding of the world, he expressed doubts as to the philosophical relevance of the problems posed by quantum mechanics and their orthodox solution. Therefore, if one wants to ask, as I will do in the following, what Schrödinger's opposition was like and what led up to it, one has to traverse the vast field between quantum jumps and Vedanta. The reader will understand that this can only be done in "quantum jumps."
II
In an essay, in which he compares the philosophy of Bohr, Heisenberg and Schrödinger, Abner Shimony says he believes that Schrödinger "was the most remarkable philosopher among the physicists of our century." 2 I have no immediate objections to this statement. Is this, however, supposed to mean that Schrödinger has an important philosophical message in connection with the foundational crisis of physics mentioned before - a message comparable with Bohr's idea of complementarity or Heisenberg's modal interpretation of quantum mechanical reality? To answer this question let me first stake out the field with two cornerstones that we can rely on. For one thing, Schrödinger saw the development of physics beginning at the turn of the century as being in a state of genuine crisis. In Nature and the Greeks 3 he writes:
"My point is this: The modern development, which those who have brought it to the fore are yet far from really understanding, has intruded into the relatively simple scheme of physics which towards the end of the nineteenth century looked fairly stabilized. This intrusion has, in a way, overthrown what had been built on the foundations laid in the seventeenth century, mainly by Galileo, Huygens and Newton. The very foundations were shaken."
2 Shimony 1983, p. 215
3 1954, p. 15
On the other hand, both Schrödinger and Dirac, with whom he won the Nobel Prize, were convinced that the crisis could not be overcome by the quantum mechanics of the late twenties. Schrödinger ironically compared quantum jumps with the epicycles of Ptolemaic astronomy 4 and he belittled the "pulling down of the frontier between observer and observed" as being "a much overrated aspect without profound significance." 5 These quotations show that while Schrödinger took the crisis of physics seriously, he seems to have been inclined to locate a possible solution in the philosophical considerations of physics closer to physics than to philosophy but by no means closer to the orthodox philosophy of quantum mechanics. We can learn more about his thinking when we examine his general view on the relationship between science and philosophy. From his Berlin period there is the following remark (one I would describe as conventional) that he made in connection with the "present raise of the causality problem":
The old alliance between philosophy and natural science is formed anew ... The further natural science progresses the less it can do so without philosophical critique ... 6
Two remarks he made one year before his death are even more revealing. In a letter 7 he writes:
It would be ... quite wrong to say that the development of science as a whole has no influence on philosophical thought; but never in the way that one clear-cut ... theorem of one special science has a logically irrefutable consequence in philosophy.
4 1952, p. 481
5 op. cit. in no.3, p. 34
6 1929a, p. 317
7 Bertotti 1985, p. 85
Schrödinger also expresses himself very clearly in the preface to his philosophical legacy Meine Weltansicht. 8 Here he discloses to the reader not only that as a young man, he was on the verge of doing physics just for his livelihood whereas in his spare time, he would do philosophy as his proper concern. He also says in well-weighed words that now in the old man's book "he would nowhere talk about acausality, wave mechanics, uncertainty relations, complementarity, expanding universe, continuous creation, etc." because "these things seem to have much less to do with the philosophical world view than is popular today." 9 Here again he is speaking in clear terms. The reader is told that the foundational crisis of physics which affected Schrödinger as a physicist hardly touched his private philosophical world view. We will now seek to gain more than just a general perspective by briefly examining this view.
Schrödinger must have been an introvert for whom it was natural to interpret the world as it appeared to him mentally and emotionally. Of course, this did not make him a solipsist: there were those other individuals, each with his own inner life. There was also the possibility of communicating with our fellow man, this communication being of particular importance in the natural sciences. To Schrödinger, the fact that many independently existing human beings can come to believe that they live in one common world was the great miracle of the world. His main philosophical problem therefore was: How, if at all, can we explain this miracle?
The usual explanation, which Schrödinger assumed that also the majority of his colleagues endorsed, is that there is a real world of objects out there to which, among many other things, our own bodies belong. Schrödinger did not accept this view. 10 He found it naive and not sufficient to explain the miracle. Above all he emphasized that it is at least as metaphysical and as mystical as the view that he accepted. It is metaphysical because this supposed world of real objects would by definition not be observable. And it is mystical because the alleged causal nexus connecting real objects with perceptions of conscious beings would indeed be entirely mysterious. However, in the last resort it was moral and religious grounds that led Schrödinger to solve this main problem with an idealistic monism. Influenced by philosophers like Spinoza and Schopenhauer and, on the other hand, by the Indian Vedanta, he believed in a universal spirit of which the various individuals were only aspects and whose unity would guarantee the community of common experience that was to be explained. Nature, however, as we generally come to know through our senses, has no detached existence. It is made from the same stuff as that universal spirit, i.e., our sensations. Schrödinger on the great universe which cosmology presents to us and the romance of a world that eventually produces brains so that it can be understood: "To me personally all this is maya, even if very lawful and interesting maya." 11 What really counts in view of such a beautiful delusion is the moral content of the doctrine of the identity, the numerical identity of the self with the other selves and the religious consolation to partake in a timeless, eternal existence.
8 1961a
9 ibid. p. 115ff; also op. cit. in no.7, p. 94
10 op. cit. in no.8, pp. 153ff
11 ibid. p. 175
III
Once one has realized that this was Schrödinger's world view and that he took it at least as seriously as physics, it becomes clear what the difficulties of his discipline could ultimately mean to him. On the other hand, Schrödinger was not the man to use eastern wisdom to make his western life easy. This could be seen as the basic contradiction of his thinking: on the one hand, his affinity to eastern thought, which, when established as a religious form of life, has prevented a scientific world view from being established, and on the other hand, his affiliation to western culture where science had been created and had developed - also assisted by Schrödinger - to the sophistication of wave mechanics. It was to be expected that Schrödinger wanted to find out in what particular sense the scientific world view was to be able to endure this tension. And to this end he made considerable intellectual efforts by going back to the sources - to Greek philosophy. 12 Whether his clarification is justifiable from the viewpoint of the historian need less concern us than what it led up to. It resulted in the view that the scientific world view is based on the realization of two postulates: the postulate of comprehensibility of nature and the postulate of its objectification. Schrödinger also pointed to a connection: the postulate of objectification has to be satisfied in order for nature to be understood. Of course, one must first explain what these postulates are all about. But already at this point it should be said how matters will evolve in view of our main problem. Schrödinger reached the conclusion that quantum mechanics violates both postulates, and it is for this reason that he criticized it. He did this although 1) he saw the restrictions that were imposed on the scientific world view precisely by the fulfillment of these postulates, 2) felt that they were restrictions affecting his own monistic world view and 3) again and again was on the verge of realizing that quantum theory could eventually loosen the restrictions a bit.
Schrödinger's assumption that nature can be understood and his postulate that a physical theory would allow a maximum of understanding, belong to the tradition of understanding in terms of pictures or models that dominated physics for some time and can be traced back far into the 19th century. 13 Speaking in terms of pictures or models made it possible for physicists to turn against the rather gross dichotomy of false and true statements. Realizing that truth is rarely ever attained one had to look for something that mitigated the only alternative of being false. One came to say instead that physical pictures do more or less justice to the facts or that they are more or less useful. The question, which criteria could be accepted for ever better adaptation or what pictures were to be admitted at all, was not discussed for a long time. Even Schrödinger took up this question only when quantum theory made it evident that the kind of pictures used in classical physics led to serious trouble if applied on the atomic scale.
Schrödinger clearly saw that the use of pictures in physics cannot be specified in exact terms. He attempted to describe the assumption of comprehensibility as a solution situated between animism and positivism. Animism, being the archaic paradigm of understanding, Schrödinger considered worth mentioning in view of the revival of the causality debate. He wanted to remind people that any attempt to disclose the nature of causality would inevitably end up in primitive animism. It was hopeless to attain a causal understanding of nature. On the other hand, positivism was too modest. Understanding can only be attained if our pictures of nature show features of wholeness and "gestalt". This, however, never obtains if one follows positivism and restricts science to the observable, to sense perception or the like. With this restriction we sacrifice objectification and no coherent picture emerges. All connections are lost, and science is without theoretical orientation. It is here where the aforementioned connection between the two postulates becomes particularly perspicuous: We have to objectify in order to understand. The pictures are the end, not the means. As Schrödinger put it:
As I find it, in physics ... the appreciated result of our endeavors is a clearly drawn, pictorial overall view of the object investigated, completely understood in its inner connections. (These connections) would be entirely destroyed if we were to scruple about truthfulness and formulate all our statements in such a manner that their direct relation to sense perception is displayed. 14
Objectification means above all the elimination of human consciousness from the world view of physics. Even for Schrödinger the world of real objects, rejected as a metaphysical entity, was to be accepted as a fiction. However, the elimination of consciousness was more a matter of fate than a postulate: Strictly speaking we cannot, not even outside the world of science, objectify consciousness. We do not find it in our world view "because it ... is this world view." 15 But there are things related to consciousness and less comprehensible: there are its activities and abilities, there are sensations and memory and the like. With respect to this wider range, objectification means to obtain pictures that describe nature without giving any hints as to how we came to know all this. For instance, the measuring instruments would not appear in these pictures in their functioning as measuring instruments. And not even the question of measurability would be raised by the pictures.
Hardly any physicist of the classical period - says Schrödinger - inventing a model has had the imprudence to believe that its variables were measurable. 16
Physics has to describe nature itself, not its relation to a knowing subject.
12 op. cit. in no.3, and 1948
13 Locus classicus is Hertz 1894. In addition to Schrödinger's papers referred to in no.12 see also 1928
14 1948, p. 425
15 ibid. p. 443
IV
We are now ready to go one step further - perhaps the last practicable one - in answering our main question: what made Schrödinger oppose the orthodox interpretation of quantum mechanics? So far we have reduced this question to the following one: In what respect does quantum mechanics violate the postulates of comprehensibility and objectification? As regards comprehensibility it is in order to touch upon two subjects: determinism and description of space and time.
According to the orthodox view, quantum mechanics is an essentially indeterministic, irreducibly probabilistic theory. Some would even declare this feature as being the major one that distinguishes quantum theory from classical physics (including relativity theory!). It is therefore worth mentioning that Schrödinger was far from putting indeterminism on his index of quantum theory. He was an admirer of Boltzmann, the inventor of statistical physics. His own papers on this topic fill one of the four volumes of his Collected Works. Once he had developed the view, which was summed up in the preceding section, statistical mechanics became the paradigm of understandable physics for him. Even in his last published paper (1958) Schrödinger considered the possibility that energy might be a statistical concept like temperature or entropy. 17 All this does not say anything definite about Schrödinger's views regarding indeterminism. There are, however, two arguments in favor of indeterminism that were known before the advent of quantum mechanics. One of them, going back to Boltzmann and Poincaré, shows that even an arbitrarily small imprecision in our knowledge of the initial conditions of a mechanical system leads, in the majority of cases, to complete ignorance about those conditions at a later time. This argument has been used by Born for appeasement regarding quantum mechanical indeterminacy. Today it has become integrated into chaos research. 18 The second argument stems from Franz Exner, one of Schrödinger's teachers. This argument forms the core of the inaugural lecture Schrödinger held in 1922 in Zurich. 19 His argument runs as follows: Determinism on a molecular level is not necessary. If macroscopic pressure can be explained statistically, why shouldn't this also apply to energy? Determinism is a habit of thought that has "evolved in millennia of observing precisely those regularities which we know today not as causal but as directly statistical regularities." Moreover, we know about this macroscopic regularity that "it would exist even if the course of every individual molecular process were decided by throwing dice." Therefore, nothing forces us to assume molecular determinism, and this, if anything, shows the power of statistical explanation. There are even arguments that make this determinism improbable. For it would lead to the following schizophrenic situation:
"In the realm of appearances, clear comprehensibility - behind it a dark, eternally incomprehensible power, a mysterious "must" ... Such a double foundation of regularity in nature is improbable by itself. The burden of proof lies on the advocates, not the skeptics of absolute causality. For to doubt it is much more natural today."
Did this seem more natural five years later when it became clear what quantum mechanical indeterminism looks like? Schrödinger never denied the heritage of Boltzmann and Exner. He dealt with the issue even after 1927. But his remarks - so it seems - became somewhat more cautious. Already in 1927 he presented the following interesting argument:
I shrink back from this conception ... because one should demand of a theory which postulates an absolute, primary probability ... that at this price it should free us from the old 'ergodic difficulties' and make us understand the irreversibility of natural processes without further assumptions. 20
Presumably it was not so much the primary probabilities by themselves Schrödinger shunned but the reason why an irreducible indeterminism had become necessary. The reason was the apparent impossibility of describing nature on the atomic scale in the classical way, i.e., by giving an account of what is actually going on at every point of a continuously connected region of spacetime. Schrödinger realized that the phenomena embraced by the so-called wave-particle dualism presented a serious difficulty that somehow had to be overcome. But he was not ready to accept the orthodox solution in this respect: The renouncement of the classical spacetime description, the intrusion of discontinuity, of quantum jumps and the new kind of identity of particles destroyed the ideal of nature as something understandable. "There are, - Schrödinger says - as it were, gaps in our picture." 21 However, possibly with the exception of a period around 1930, he firmly believed that the gaps could be eliminated. Already in one of his original papers on wave mechanics, Schrödinger speculated whether the processes in the atom can be integrated into the space-time form of our thinking. In an almost Wittgensteinian manner he says:
From a philosophical point of view I would consider a definite decision in this sense to be a complete surrender. For we cannot really alter the forms of thinking, and whatever we cannot understand within them we cannot understand at all. There are such things, but I do not believe that atomic structure belongs to them. 22
16 1935a, p. 486
17 1958. This idea, formed already early by Schrödinger, belongs to a consistent wave mechanical interpretation of the microphysical world.
18 Smoluchovski 1918; Born 1955; Born/Hooton 1955
19 Exner 1919, part IV, esp. pp. 691ff; 1929b; I find Forman's description of the development, in which he includes also Weyl, Reichenbach and others, as being a turn towards acausality misleading. See Forman 1971, esp. III.3.
20 1927b, p. 279. See also 1929c; 1929a; 1930, esp. p. 24; 1932
21 1961b, p. 27
In a similar vein, in his last paper of 1958 he emphasizes:
We do feel a yearning for a complete description of the material world in space and time, and we consider far from proven that this aim cannot be reached. 23
Schrödinger's wave mechanics was his attempt to reach the aim. Wave mechanics represents the physical portion of the whole argumentation, and its importance was underlined by the Nobel prize. That the prize was divided between Schrödinger and Dirac, who favored particles, was an apt symbolization of the wave-particle dualism. Today the merits of wave mechanics are being assessed by quantum field theory. Indeed there was at least one good reason for Schrödinger to reject quantum jumps: his own equation, the basis of quantum mechanics, shows us that an atom once in an eigenstate of the energy has to stay there forever. The spontaneous emission and absorption of light is not explained by quantum mechanics. On the other hand, the theory which does explain it - quantum electrodynamics - certainly is a quantum theory. Therefore it is hard to reconcile with the postulate of comprehensibility and its realization in pure wave mechanics. 24
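The remark that, by Schrödinger's own equation, an atom once in an eigenstate of the energy has to stay there forever can be made explicit in a few lines. The following is only a sketch under assumptions that are not stated in the text: a time-independent Hamiltonian H with orthonormal eigenstates \phi_n and eigenvalues E_n, and the notation \psi(t) for the state at time t.

```latex
% Assumptions (not in the original): H is time-independent,
% H\phi_n = E_n\phi_n, and the \phi_n are orthonormal.
\begin{align*}
  i\hbar\,\frac{\partial\psi}{\partial t} &= H\psi, \qquad \psi(0) = \phi_n \\
  \Longrightarrow\quad \psi(t) &= e^{-iE_n t/\hbar}\,\phi_n , \\
  \bigl|\langle \phi_m , \psi(t) \rangle\bigr|^2
    &= \bigl|e^{-iE_n t/\hbar}\bigr|^2 \,\bigl|\langle \phi_m , \phi_n \rangle\bigr|^2
     = \delta_{mn} \quad \text{for all } t .
\end{align*}
```

On these assumptions the state changes only by a phase factor, so the probability of finding the atom in any other energy eigenstate remains zero at all times. A transition between levels would therefore require something beyond this equation, which appears to be the point of the remark about quantum jumps.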
v In conclusion we have to ask what has become of the postulate of objectification. It was already mentioned that this postulate was seen as serving the realization of the postulate of comprehensibility. As quantum theory violates the latter, it is to be expected that there are difficulties already with the former. And this has proven to be true. Moreover, the Copenhagen interpretation of quantum mechanics, the main target of Schrodinger's critique, has completely reversed the matter: the unattainability of the postulate of comprehensibility - the inapplicability of classical pictures of the atom - was explained by the impossibility of objectifying the processes: The observer had to be included more or less in the description of quantum phenomena. I say "more or less" because there is a certain latitude in assessing both the matter itself and its various interpretations. Correspondingly, Schrodinger shifts the focus of his critique back and forth. But it is always assumed that at some 22 23
24
1926, pp. 117£ (Italics mine) op. cit. in no.17, p. 509 For a recent vindication see Dorling 1987. See also Wessels 1975.
II.7 Erwin Schrodinger and the Philosophy of the Physicists
117
point one enters the sacrosanct district of consciousness, the atmosphere of subjectivity. 25 At face value the Copenhagen view seems to open up possibilities undreamtof. On account of his philosophical position, Schrodinger was convinced of the narrowness of the outlook which science had imposed on itself by the fulfillment of the postulate of objectification. He expresses himself in clear terms: .. .I consider science an integrating part of our endeavor to answer the one great philosophical question which embraces all others, ... who are we? And more than that: I consider this not only one of the tasks, but the task, of science, the only one that really counts. 26
Consequently, when quantum theory seemed to force a renouncement of the objective description of nature Schrödinger at first hesitated. In a lecture of 1930 he admits that it is

... a painful reduction of our claims to truth and clarity that our symbols and formulas and the pictures associated with them do not represent an object independent of the observer but only the relation subject:object. But - Schrödinger continues - is not this relation the only reality we know of strictly speaking? Does it not suffice if it finds fixed, clear and entirely unique expression ... ? Why do we have to eliminate ourselves by all means ... ?²⁷

Schrödinger never pursued these questions from the same perspective. On the contrary: the door that was open for some time is definitely closed by 1935. In this year the now famous paper by Einstein, Podolsky and Rosen was published.²⁸ Moreover, it is Schrödinger to whom we are indebted for the first thorough analysis of the situation. He at once realized the devastating epistemological consequences of inseparability: To most quantum mechanical pure state descriptions of a system composed of two subsystems no (pure) state descriptions of the subsystems correspond:

I would not call that one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought.²⁹

However, precisely this break was too radical for Schrödinger: his cat paradox probably has become the best known expression of this attitude.³⁰ In spite of the gravity of the case Schrödinger could not believe that the intrusion of the "observer" pointed in a direction where the deep philosophical problem of the relation between subject and object is hidden.

It is not very easy to say why I do not believe it. I feel a certain incongruity between the applied means and the problem to be solved.³¹

25 It is not always clear which view Schrödinger is attacking. He seldom uses names or gives references. On the other hand, the views of the members of the Copenhagen school diverge. See op. cit. in no. 2 and E. Scheibe 1989b
26 op. cit. in no. 21, p. 51
27 op. cit. 1930 in no. 20, p. 26. This lecture, given in Munich 1930, was not published until after Schrödinger's death.
28 Einstein et al. 1935. Schrödinger's reaction was the paper referred to in no. 16.
29 1935b and c; 1955
30 op. cit. in no. 16, p. 489
Schrödinger's argument seems to have been: measured against the infinite problem of consciousness, what we learn from quantum theory is too small to compensate for the loss with regard to the two epistemological postulates in question. Consequently, in his later work on the philosophical implications of quantum theory Schrödinger always presented the matter against the background of the classical modes of understanding, thereby trying to reduce it to absurdity. He rarely attempted to tackle the problem of quantum theory on the grounds of his own philosophical position.³² Perhaps he somewhat thoughtlessly lost a chance that presented itself to him of all persons. Perhaps it was his own fault if in this matter not he himself but only his cat became famous.
31 op. cit. in no. 21, p. 51f
32 See op. cit. in no. 17, pp. 507ff
II.8 Albert Einstein: Theory, Experience, Reality*

At the beginning of his contributory essay to the Einstein issue of the Library of Living Philosophers¹, Philipp Frank distinguishes "according to Max Planck, two conflicting conceptions in the philosophy of science: the metaphysical and the positivistic conception. Each of these regards Einstein as its chief advocate and most distinguished witness."² This minor paradox admits of a historical and systematic solution. For one, it is no question that Einstein went through a philosophical development and that this began with Mach³. Einstein acknowledged this gratefully, just as he made no secret of his having veered away from Mach later on. He says in his autobiography⁴: "In my younger years ... Mach's epistemological position also influenced me very greatly, a position which today appears to me to be essentially untenable." The change which took place in Einstein and which concerned mostly his understanding of reality as a physicist is probably partly to be traced to the nature of his own activity in physics as the author of the general theory of relativity. Einstein sensed this himself, as can be gathered from his statement: "Coming from a skeptical empiricism of the Machian sort, I was made into a believing rationalist by the gravitation problem."⁵ Since this transformation occurred mainly in his Berlin days, the influence of Planck should also be taken into account. Planck had likewise started as a disciple of Mach. His departure from the Machian position culminated in the famous Leiden lecture of 1908⁶, by means of which scientific realism was founded, and that could not have escaped Einstein's notice. Towards the end of the 1920's at the latest, Einstein entered the realistic tradition which led from Hertz and Boltzmann across Planck to Einstein himself.

Systematically, however, to characterize Einstein's view of an epistemological theory as a realistic one would be absolutely insufficient. Elements from very different directions rather converge here into a new unity, whose internal consistency is not at all easy to establish. On the one hand, it is certainly important for Einstein to establish - dissenting from Mach - that physics does not deal with sense data, although these constitute the only touchstone of the truth of physical theories. The object of physics is distinct from sense data and is justly that which earns the designation 'physical-real'. The concepts of our theories serve to describe such an object, and Einstein underscores now with particular emphasis that, on the other hand, these concepts are free creations of the human mind. This makes for a certain rift in his theory of knowledge - a rift which he himself tries to conceal in a makeshift way by repeatedly asserting that the conceptual grasp of the real is a miracle.

In spite of this admission, Einstein has clearly stressed the rational against the sense-based element in our knowledge. The emancipation of theoretical physics from experimental physics since the second half of the 19th century undoubtedly finds its crowning glory in Einstein's work, and it is not the least of Einstein's credits that the young physicist of today does not need to have made new and important experiments in order to be acknowledged as a good physicist. This doubtless existing tendency in Einstein's thinking to absolutize theory has also led to anecdotal excesses, such as the story reported by Ilse Rosenthal-Schneider⁷. According to this, Einstein is supposed to have said, in response to the confirmation of his general theory of relativity by Eddington's measurements of the deflection of light on the edge of the sun: "I knew indeed that the theory was correct." And to the astonished reproach that a refutation of his theory would have also been possible, he is supposed to have retorted: "In this case, I would have felt sorry for the good Lord - the theory is, after all, correct." One must not misunderstand such a remark, if it was ever made. Einstein was an unquestioning empiricist with regard to the experimental testing of theories, and the story under consideration illustrates primarily his view that a single experiment is certainly not yet entitled to decide on the correctness of a theory. The following quotation recurs therefore in varied formulations⁸:

The concepts and propositions [of a theory] get 'meaning,' viz., 'content', only through their connection with sense-experiences ... The degree of certainty with which this connection ... can be undertaken, and nothing else, differentiates empty fantasy from scientific 'truth.'

* Farewell lecture, Heidelberg, July 13, 1992. Originally published as Scheibe 1992b and translated for this volume by Charito Pizarro.
1 Schilpp 1949 (German Edition 1955)
2 Frank 1949 (1955)
3 Holton 1981, pp. 203-255
4 Einstein 1949a, p. 21 (1955a, p. 8)
5 Holton 1981, p. 230f
6 Planck 1949, pp. 28-51
In the mentioned context, Einstein also did not neglect to inform immediately the journal "Die Naturwissenschaften" about the new evidence, confirmative for his theory.⁹ On the other hand, he had already conceded in 1907 only a relatively low probability to the correctness of theories competing with his special theory of relativity if their explanations were ad hoc and concerned isolated effects only. He was not willing to accord these theories a higher probability "because their basic assumptions concerning the mass of the moving electron are not suggested by means of theoretical systems which embrace a larger complex of phenomena"¹⁰. The tendency which comes to expression here must not be misunderstood to imply that a physical theory, in order to hold, must be completely interpreted empirically. Bridgman's so-called 'operationalism' was rejected by Einstein because it "de facto has never yet been achieved by any theory and can not at all be achieved."¹¹

7 Holton 1981, p. 225
8 Einstein 1949a, p. 13 (1955a, p. 4); italics mine
9 Einstein 1919
10 Einstein 1907, p. 439; italics mine
11 Schilpp 1949b, p. 679 (1955b, p. 504)
The by and large fundamental empiricist position of Einstein stands now opposed to his rationalism which - as will be shown - dangerously approximates the supposition of a pre-established harmony between reason and reality. Probably it involves indeed more the notion that we, guided by the control through sense data, construct a 'world view' - a world view which we then call 'a picture of reality'¹²:

Insofar as the physical thinking justifies itself ... by its ability to grasp experiences intellectually, we regard it as 'knowledge of the real.'

At the outset what confronts us here is that miracle which concerns the advance from sense data to our concepts¹³:

The very fact that the totality of our sense experiences is such that by means of thinking ... it can be put in order, this fact is one which leaves us in awe, but which we shall never understand. One may say 'the eternal mystery of the world is its comprehensibility'.

For Einstein as realist the miracle which takes place during the transition from sense data to concepts was even greater, since physical concepts proper are not concepts of sense data. It must well have been this conviction which reinforced Einstein in his positive thesis, that concepts are free posits of the human mind. It is exactly at this point where Einstein's rationalism and realism merge. In addition to this there is the resistance to every form of apriorism. In the (anti-empiricist) sharp division between sensations and concepts, Einstein agrees with Kant, but the latter is his traveling companion only for a short stretch of the way. With regard to the question of the source of our concepts, Kant had thought that some concepts particularly fundamental for science are given to us a priori. This is the point where Einstein abandons Kant's road, because Kant restricts the freedom of thought too severely.
In an obituary for Ernst Mach, Einstein argues - and certainly not for the sake of rhyme - that "those concepts which prove to be useful in the order of things easily attain such an authoritative hold over us, so that we forget their earthly origin and regard them as unalterable matters of fact. They then become stamped as 'necessities of thinking,' 'a priori given,' etc."¹⁴ Einstein now finds, however, that "the path of scientific progress would be made unviable for a long time because of such misconceptions". In this critique he is certainly thinking of the fate which Kant's view of space and time suffered due to the theory of relativity. What he, however, in general maintains and, it must be admitted, repeatedly and emphatically asserts, is that the human mind is absolutely free in the creation of concepts¹⁵:

The structure of the system is the work of reason; the empirical contents and their mutual relations must find their representation by means of the conclusions of the theory. In the possibility of such a representation lies the sole value and justification of the whole system, and especially of the concepts and fundamental principles which underlie it. Apart from that, these latter are free inventions of the human intellect, which cannot be justified either by the nature of that intellect or in any other fashion a priori.

There is therefore no doubt that Einstein was arguing in favor of an enhancement of theory as opposed to empirical data. At the same time, however, he did not make the error of undervaluing the significance of experiments for physics in general. It meant only that in a time which tended more to overvalue the empirical element, theoretical reason and physical speculation also could be helped to find their rightful place. Einstein is supposed to have said to the physical chemist Hermann Mark: "You make experiments and I make theories. Do you know what the difference is? A theory is something which no one believes in - except the theory maker himself. An experiment is something which everybody believes in - with the exception of the experimenter himself."¹⁶ It was the asymmetry between the theoretical and the empirical element of knowledge - aptly expressed by the jesting comment - which Einstein again wanted to straighten out. My observations on Einstein's position regarding the relationship between theory and experience would be incomplete if I did not mention one other point, although we know of this one only at second hand. Heisenberg relates in his memoirs¹⁷ a conversation with Einstein which took place directly after a lecture in 1926 which the then 25-year-old Heisenberg gave in Berlin.

12 ibid. p. 674 (p. 500)
13 Einstein ³1984, p. 65 (1954, p. 292)
14 Einstein 1916a, p. 102
15 Einstein 1989, p. 115 (1954, p. 272)
In his lecture, Heisenberg had spoken about the new matrix mechanics which heuristically took as its starting point the admission only of "observable" quantities in atomic theory. Heisenberg reports how Einstein had reacted. He asks Heisenberg whether he earnestly believed "that one could admit only observable quantities in a physical theory." Heisenberg appears astonished that Einstein of all people should react in this way, when he was in fact the one who eliminated absolute time from physics due to its unobservability. He reminds Einstein of Mach and his principle of economy of thinking, which Einstein had once made his own. But Einstein remains unmoved. "I might probably have used this sort of philosophy," he replies according to Heisenberg, "but it is nevertheless nonsense." He then strengthens his thesis further in the following often cited remark:

... as a matter of principle, it is absolutely wrong to build a theory on the basis of observable quantities alone. For it is, in reality, exactly the other way round: The theory decides first what one can observe.

16 Quoted from Holton 1980, p. 57
17 Heisenberg 1969, pp. 90-100
Einstein's explanation of this thesis is completely harmless at the outset. He reminds Heisenberg that we do not know what the result of an experiment or a measurement is, unless we have a theory of the equipment and its way of functioning, with the help of which the experiment or the measurement will be carried out. One aspect of this circumstance is that we measure physical quantities directly only in the rarest instances. Another, graver point is that when we test a theory, to make the experiment succeed we must presuppose in certain circumstances precisely that theory which we want to supersede by the tested theory. Einstein says, namely, to Heisenberg:

... obviously you assume with your theory that the entire mechanism of radiation of light from an orbiting atom ... to the eye functions exactly in the way one has always assumed, namely, essentially according to the laws of Maxwell. If that were no longer the case, then you would not be able at all to observe the quantities which you call observable. Your claim that you introduce only observable quantities is therefore, in reality, a conjecture of an attribute of the theory which you strive to formulate. You suppose that your theory does not infringe upon the present description of the radiation process in the points which are referred to here. You may be right with that, but it is not in any way certain.

We see here how Einstein's insistence on the theoretical character of observation leads to an important methodological principle. It ought to be guaranteed that a well-tested theory belongs, within certain limits, to the prerequisites of later research or to the conditions of possibility of new experiences, even when these experiences belong to a theory which finally does not agree with the aforesaid theory.
We shall be confronted anew with this principle when we once again consider, but on a second train of thought, a particular remolding of Einstein's basic statement that the concepts of physics are free creations of the human mind. I refer to a certain conventionalism in Einstein's epistemology, which had been basically already developed by Poincaré. The latter had been impressed by the discovery of the so-called non-Euclidean geometry which occurred in the first half of the nineteenth century. It was immediately clear to all the mathematicians involved in the matter that this discovery would test severely Kant's doctrine of the apriority of space. The evidence of the logical possibility of something other than Euclidean geometry indicated nothing directly against Kant's statements on our intuition of space, of course. But then Kant had considered intuition as binding likewise for the space of physics; and Riemann's expressed thought, that is, that physical space could show deviations from Euclidity due to forces, was impressive precisely because for the first time ever, an alternative which could be seriously considered had come into view - an alternative which could break the dogma of apriority of Euclidean geometry. Now that we are concerned with philosophical analysis, Poincaré will be interesting to us because he even added a third possibility to the new possibility of the empirical nature of geometry which now stood beside that of Kant¹⁸:

The axioms of geometry are ... neither synthetic judgments a priori nor experimental facts. They are posits based on agreement; ... In other words: the axioms of geometry ... are merely disguised definitions.

Since we are concerned here with Einstein, I want to dwell further on Poincaré's position from Einstein's point of view. Einstein had clearly also seen that here one has basically three possibilities. He contrasts Poincaré's point of view, as the newest, at times with Kant's point of view and at times with that of the empiricist. At a conference in Paris (1922), it turns out from an answer to a question of Brunschvicg's concerning Kant that one had now two positions: "Kant's apriorism, according to which certain concepts preexist in our consciousness, and Poincaré's conventionalism. Both agree in that we need arbitrary (= non-empirical) concepts for the construction of science. I cannot say, however, whether these concepts are given a priori or are arbitrary conventions."¹⁹ As against this indecisiveness - which is characteristic of this transition period - Einstein had already expressed in a letter to Born (probably in 1918): "When one permits [Kant] nothing but the existence of synthetic a priori judgments then one is already trapped. I have to weaken 'a priori' to 'conventional' in order not to have to contradict; but that also does not agree with the details."²⁰ Einstein was not so much unclear on what Kant's view is, as on how far it had to be considered for the foundations of physics, precisely in competition with a conventionalist point of view.
In his review of Elsbach's book Kant and Einstein²¹ somewhat later (1924), regarding the first point he appears to be extremely amazed over the revisionistic strategy of many Neokantians (as also Elsbach's), who still call an epistemology Kantian when its theses "cannot at all claim to be valid independently of the current stage of the natural sciences". Concerning the substantial question, on the other hand, Einstein turns against the revisionism of the Neokantians who, although ready to give up "Kant's system of a priori ... standards," cleave to "Kant's problem" and after the establishment of the relativity theory still wish to continue the search for the a priori elements of epistemology. This attitude, though irrefutable, was for Einstein unnatural, because in a given empirical content, non-empirical elements cannot be unambiguously characterized although they can always be made out. In such a way the great reformer of our views of space and time arrived, on conventionalist lines, finally at the amazing statement that "Kant influenced the development unfavorably, when he put space-time concepts and their relations in a special position as compared to other concepts."

In another context, Einstein has set Poincaré's viewpoint against the empiricist view. According to the latter, "geometry" [interpreted in the usual way by means of (virtually) rigid bodies, and completed by the statement of the invariance of length during their transport] is "... to be treated as a branch of physics." On the basis of this view, the "truth" of such interpreted geometric statements could be rightfully questioned pari passu with the corresponding question concerning physical statements in a narrow sense²². Einstein thinks that Poincaré rejects this empiricist viewpoint because what has been used to define the geometric concepts, namely rigid bodies, are strictly speaking not rigid; the range of validity that arises for the geometry, however, need not be filled in by a non-Euclidean geometry, because possible deviations can instead be explained, with Euclidean geometry maintained, through physical laws in a narrow sense. In this way we obtain the conventionalist view of geometry²³:

Geometry (G) predicates nothing about the behavior of real things, but only geometry together with the totality (P) of physical laws can do so ... Thus (G) may be chosen arbitrarily, and also parts of (P); all these laws are conventions.

Which of these two positions - the empiricist or the conventionalist - has Einstein given preference in this confrontation? That is not easy to answer. Generally, that is, removed from the possibility of a special role for geometry, he has spoken for a certain conventionalism in connection with a holistic view of theories. In this point he is probably influenced by a compatriot of Poincaré - Pierre Duhem, a physicist and science historian²⁴. The latter had taught that, contrary to a widely spread empiricist attitude, the single hypotheses of a physical theory cannot be tested in isolation.

18 Poincaré 1914, p. 51f
19 Einstein 1923; italics mine
20 Einstein/Born 1969, pp. 25f.
21 Einstein 1924
A single experiment always carries with it a cluster of physical hypotheses, and a negative experimental result does not necessarily mean the failure of a previously chosen hypothesis whose validity is being tested. "Physical science," Duhem writes²⁵, "is a system that must be taken as a whole; it is an organism in which one part cannot be made to function except when the parts that are most remote from it are called into play, some more so than others, but all to some degree." Einstein, who shared this view, now points out that it brings with it a certain arbitrariness in the construction and presentation of a theory. Starting from "primary concepts" which are tied to typical complexes of sense data, we make definitions of new concepts and establish, finally, laws of nature which are for the most part related only very indirectly to those sense data. However²⁶:

The question as to which of the propositions shall be considered as definitions and which as natural laws will depend largely upon the chosen presentation. It really becomes absolutely necessary to make this differentiation only when one examines the degree to which the whole system of concepts considered is not empty from the physical point of view.

22 Einstein 1917; ²³1988; Sect. 1; see also Einstein 1922; ⁸1990; p. 11f.
23 Einstein 1989, p. 122f (1954, p. 236)
24 Cf. Howard 1990
25 Duhem 1962, p. 187f
26 Einstein ³1984, p. 67 (1954, p. 293)
Empirical holism provides, therefore, liberties for conventionalist strategies in the construction of theories, and Einstein would have been the last person not to have made use of such liberties. When, with the question on the content of a physical theory in mind, we return to the special case of geometry, we are confronted by Einstein's thought-provoking admission, that Poincaré is right "sub specie aeterni" with his conventionalist view, but that the empiricist view which is based on the usual employment of rods and clocks is to be preferred "in the present stage of development of theoretical physics."²⁷ Why do I say that this is a "thought-provoking" admission? I say this because hereby it becomes clear how provisional Einstein regarded not only the physics of his time but especially also its epistemological assessment with which we are concerned here. Only in a few professional philosophers of science in our century is one able to discover a circumspect attitude towards and an adequate sensitivity for the problems which the physicist Einstein touches on here. I would like to mention at this point, however, my honored mentor C.F. von Weizsäcker whom I remember with gratitude at this moment. He saw where Einstein was heading here and generalized his approach into a methodological principle on which his own quasi-transcendental program for a reconstruction of physics is based²⁸. Einstein found it deplorable that the special theory of relativity, in spite of all it had accomplished, distinguishes a class of objects, namely standard rods and clocks, which empirically define the fundamental concepts of (relativistic) geometry, without being themselves subject to the special theory of relativity²⁹.
With his general theory of relativity and later with the so-called unified field theory, Einstein then tried to create a theory whose geometry could be defined by means of objects which are themselves solutions of the equations of the theory - a theory which at the same time made clear why one could, under the conditions of the special theory of relativity, proceed in the way one does. The theory would in this way become its own measurement theory and could not be falsified by means of objects for which it is not at all responsible. Since in general a measurement theory belongs to the conditions of the possibility of experience relevant to a corresponding object theory, the incorporation of the former in the latter is an important step in the unification of physics based on a complete set of conditions of possible experience. In this way, Einstein's ideas can be introduced into von Weizsäcker's program.

27 Einstein 1989, p. 123 (1954, p. 236f)
28 Weizsäcker 1971, pp. 195ff (1980, pp. 157ff)
29 Einstein 1949a, pp. 58ff (1955, p. 22f)
The development of physics into what Einstein prefers to call "logical uniformity" is also, for him, a subject in which one can read his attitude towards science better than in all other conventional schemata. Unfortunately, it must be left entirely aside here that Einstein sacrificed many years of his life (in vain) to the establishment of a so-called unified field theory of gravitation and electromagnetic interaction. The following consideration is confined to the philosophical assessment of that adventure. Remaining in the tradition of the distinction between deductive and inductive physics, Einstein noted as early as 1914, in his opening speech before the Prussian Academy of Science³⁰, a certain balance between these two trends. On the one hand he points "to a group of facts for the theoretical treatment of which the principles are lacking". With this complex of facts he meant the many specific empirical laws to which one had access, at that time, in the areas of heat radiation, molecular motion and the atomic and molecular spectra, without having found a unified theoretical explanation for them. This did not come until the quantum theory of the late 20s, and therefore this was a case wherein, as a matter of historical contingency, experimental physics was far ahead of theory. At the same time, however, the reverse situation was the case in another area; the case, namely, as Einstein expressed it, "that clearly formulated principles lead to conclusions which fall entirely, or almost entirely, outside the sphere of reality at present accessible to our experience." Einstein then presents to his new colleagues the theory of relativity as one of such cases, especially the general theory of relativity - at that time still in progress - with its principle of general relativity for which, however, the body of facts was temporarily missing "to test the legitimacy of our introduction of the postulated principle."
And with this clue, Einstein can sum up "that inductive physics asks questions of deductive [physics], and vice versa, the answers to which demand the exertions of all our energies. May we soon succeed in making permanent progress by our united efforts!" In the long run, however, Einstein sees these advances as basically shifting the balance in favor of the logico-deductive part of physics:

... it must be conceded that a theory has an important advantage if its basic concepts and fundamental hypotheses are "close to experience", and greater confidence in such a theory is certainly justified. There is less danger of going completely astray, particularly since it takes so much less time and effort to disprove such theories by experience. Yet more and more, as the depth of our knowledge increases, we must give up this advantage in our quest for logical simplicity and uniformity in the foundations of physical theory³¹.

30 Einstein 1989, pp. 110-3 (1954, pp. 220-3)
31 Einstein 1950, p. 15
Precisely because older theories were still closer to direct experience, their authors had not noticed how free they were in theory formation. Einstein explains this³²:

The view I have just outlined of the purely fictitious character of the fundamentals of scientific theory was by no means the prevailing one in the eighteenth and nineteenth centuries. But it is steadily gaining ground from the fact that the distance in thought between the fundamental concepts and laws on the one side and, on the other, the conclusions which have to be brought into relation with our experience grows larger and larger, the simpler the structure becomes - that is to say, the smaller the number of logically independent conceptual elements which are found necessary to support the structure.

Einstein reminds us of Newton who still thought that the theoretical foundations of his system could be derived from experience. Generalizing, he says that at this early time the natural scientists were supposed to have been entirely filled with the thought "that the basic concepts and fundamental laws of physics could be derived through 'abstraction', that is, by logical means, from experiments". According to Einstein's conviction, these researchers were plainly mistaken. The danger of being mistaken is of course greater, the closer we are to the sense data. However, as we have already heard, there is basically no logical way which leads from these data to concepts. In what Einstein calls the "striving for the greatest conceivable logical unity of our world view", the tremendous predominance of the purely theoretical part of physical knowledge is expressed as it is represented by the great theorist Einstein. But he can also refer to the history of physics independent of his own works and point out a development towards a stratified structure of our knowledge of nature³³.
The construction of primary concepts which are directly related to sense data, and which precisely because of this "lack any logical unity", takes place in the lowest stratum. On the next level a new 'secondary system' "pays for its higher logical unity by having elementary concepts ... which are no longer directly connected with complexes of sense experiences. Further striving for logical unity brings us to a tertiary system, still poorer in empirical content". In this way a stratified structure of physics results in which "The multitude of layers ... corresponds to the several stages of progress which have resulted from the struggle for unity in the course of development. As regards the final aim, intermediary layers are only of temporary importance, they must eventually disappear as irrelevant." With the apparently unavoidable emergence of the aforementioned stratified structure of physics, however, a problem arises whose solution - as far as we know - still casts a certain light on the liberty of scientific concept formation which Einstein repeatedly evoked. On the one hand, the development of physics appears to confirm the freedom thesis brilliantly. Einstein needs only to point to his general theory of relativity and can say 34:

... which showed that one could take account of a wider range of empirical facts, and that, too, in a more satisfactory and complete manner, on a foundation quite different from the Newtonian. But quite apart from the question of the superiority of one or the other, the fictitious character of fundamental principles is perfectly evident from the fact that we can point to two essentially different principles, both of which correspond with experience to a large extent; this proves at the same time that every attempt at a logical deduction of the basic concepts and postulates of mechanics from elementary experiences is doomed to failure.

On the other hand, this absence of an "ars inveniendi" does seem to be a serious lack which can be balanced only by means of a certain intuition supported by empathy into experience. But how can we, under these conditions, make advances in physics - advances which consistently build one upon the other - and avoid any form of caprice? The answer 35:

In this methodological uncertainty, one might suppose that there were any number of possible systems of theoretical physics all equally well justified; and this opinion is no doubt correct, theoretically. But the development of physics has shown that at any given moment, out of all conceivable constructions, a single one has always proved itself decidedly superior to all the rest. Nobody who has really gone deeply into the matter will deny that in practice the world of phenomena uniquely determines the theoretical system, in spite of the fact that there is no logical bridge between phenomena and their theoretical principles.

Here an empirical argument is given in favor of the unambiguous development of physics. Einstein, however, is not at all deterred in this matter from dwelling on "higher" insights.

32 Einstein 1989, p. 115 (1954, p. 272f)
33 Einstein ³1984, pp. 67 ff (1954, pp. 293ff)
In a 1929 paper 36 he explains his ever repeated demand for the logical uniformity of a theory to the effect that "we do not only want to know how nature is ... but rather also ... want to arrive at that probably utopian and seemingly presumptuous goal to know why nature is such and not otherwise". The kinetic theory of gases is mentioned as another showpiece. Of such theories it is said in an almost paradoxical way that they succeed "in grasping empirical lawfulness as a logical necessity". Under the assurance (given in a footnote) that he does not want to convey "epistemological wisdom" but rather "a certain experience of research", Einstein then adds enthusiastically that in the kinetic theory one feels as it were "that God himself could not have determined in another way the connections as they really are. ... This is", he concludes, "that Promethean element in the scientific experience which is encapsuled in the school expression 'logical uniformity'. Here lies for me always the actual magic of scientific reflection; it is so to say the religious basis of scientific endeavor." About the puzzling phenomenon of a theory which in spite of its logical arbitrariness is definitely forcing itself on the researcher as "correct", Einstein has said: "Epistemologists are severely reproached by the physicists for not acknowledging this circumstance adequately." Unfortunately it rarely happens that physicists tell the philosophers explicitly where the shoe pinches. It is possible, however, that this is not the only reason why it is likewise rare enough that philosophers are capable of saying something helpful to the physicists. Be this as it may, Einstein took the paradox out of the phenomenon when he taught us what he meant by mental freedom. This freedom does not resemble the freedom of a novelist who can invent whole stories but rather "the liberty of a man engaged in solving a well-designed word puzzle". 37 Freedom consists in the person concerned being able to propose any word as a solution. In view of the circumstance that the puzzle has only one single word as a solution, this freedom seems to be insignificant for the solution. But if we did not have it, finding a solution would in certain circumstances become impossible for us, and in any case the wit in puzzle solving would get lost as the vocabulary decreases.

34 Einstein 1989, p. 116 (1954, p. 273f)
35 ibid. p. 109 (p. 226f)
36 Einstein 1929, p. 126. The following explanations refer to the so-called 'constructive theories' which Einstein wanted to distinguish from the so-called 'principle theories'. See Einstein 1989, p. 127f
The faith that nature, too, is a very well made puzzle for the physicists - a puzzle with an essentially unambiguous solution - is, so I think, for Einstein at the same time the faith in a reality which is independent of our sense data. To begin the last train of thought, which moves Einstein's understanding of reality to center stage, we must once again take into account the circumstance that this understanding has also participated in Einstein's general philosophical development. From a letter to Schlick 38 we can gather how Einstein, as early as 1917, complained to a philosopher about the multiple meanings of the word "real", which can apparently be used to describe a direct "reality of experience" as well as an already physically constructed "reality of events". The problem was then evidently intensified for Einstein on the occasion of his reading Elsbach's book entitled Kant und Einstein. He expresses his dissatisfaction with the position of the Neokantians and holds that a direct reality of experience as well as a so-called real external world could have a justification to exist. The opposing traditional positions of philosophical idealism and realism are now interpreted by Einstein as being "the [painful] incomprehensibility of setting up a conceptual system linking the experiences" on the one hand and "the acceptance of the reality hypothesis" on the other hand. And then the question arises as to whether a difference exists at all between these two positions, i.e. between the assumption of the coherence of that conceptual system and the reality hypothesis. According to Einstein's insight at that time, even the realist is admittedly able to recognize the miracle of that successful conceptual system "which cannot be removed from the world with the help of any philosophical sophistry", but he is certainly not able to explain it. Six years later (1930), however, Einstein once more criticizes Schlick's philosophical development as too positivistic, and argues: "I tell you quite plainly, physics is an attempt at a conceptual construction of a model of the real world as well as its lawful structure." 39 And henceforth, such formulations accumulate and become almost a cliche: "The belief in an external world independent of the perceiving subject is the basis of all natural science." 40 The question is how Einstein wished such remarks to be understood. Firstly, one must say that his view since the end of the 1920s was shaped by his antagonism towards the Copenhagen interpretation of quantum mechanics. As we have heard, it was through this that an understanding of reality arose which was irreducibly connected with observation and probability. Against this, Einstein held on to his conviction that physics does not deal with sense data or measurement results and that God does not play dice. Quantum mechanics was for him compatible with this view only when one considers it as an essentially incomplete theory. Einstein merely gave vague hints about how it could be completed. His resistance against the Copenhagen view is important for his understanding of reality because it has provided us with the relatively most precise idea of a physical system as something real in general.

37 Einstein ³1984, p. 69 (1954, p. 294f)
38 Hentschel 1986, p. 483f
This idea rests on the distinctness of the parts of a system, in the sense that the state description of the whole system consists of the simple "addition" of the state descriptions of the subsystems. The state description of quantum mechanics violates this principle in a striking way and because of this it is suspected of being incomplete. Even the statistical information about a system which is officially regarded as complete does not generally carry over to its subsystems. This peculiarity has been tested recently in very exact experiments. It disagrees with macroscopic experience even more strikingly than the temporal reversibility of fundamental laws of nature disagrees with the macroscopic irreversibility of the processes which we daily experience. One can, therefore, hardly blame Einstein when he once ironically notes that "all men, including the quantum theorists, cling to this thesis of reality, as long as they are not discussing the principles of quantum theory" 41. Nevertheless, in view of the foregoing, the present opinion on Einstein's understanding of reality could not be favorable, did it not contain certain features which are independent of the problems which quantum theory has raised. For what I still have to say from this point on, I claim such independence. The first such feature appears to be totally implausible at first glance. If one considers it important, Einstein's realism deserves to be called "programmatic" realism. We read, for instance, 42 that "the 'real' in physics is to be taken as a type of program", or that "'Being' is always something which is mentally constructed by us, that is, something which we freely posit." And with such remarks Einstein thinks that he can even refer to Kant when he says that "I ... came to understand the truly valuable which is to be found in [Kant's] doctrine ... only quite late. It is contained in the sentence, 'The real is not given to us (gegeben) but put to us (aufgegeben) (by way of a riddle).'" Whether the reference to Kant is justifiable may be left here as a moot question 43. In any case, the last mentioned remark clarifies what the earlier ones are meant to express. The task or program of physics is plainly to establish a theoretical world view. Of this continuously changing world view the following is certain in principle: Our sense data do not already constitute its meaning but rather only its ever new touchstone. Each time, its particular meaning approaches reality. Such is what is intended. But we cannot reach beyond this world picture through the assurance that it is a picture of reality. The picture remains a conceptual construct: "This conceptual construct relates ... to the real (by definition) and every further question on the 'nature of reality' appears empty." The argument seems to be this. A glance at physics shows that its content differs from sense data. To the degree that this content, in contrast to the mere given facts of the sense data, remains for us a task to be revised time and again, the distance between the two will become ever greater.

39 Holton 1981, p. 233
40 Einstein 1989, p. 159 (1954, p. 266)
41 Einstein 1955c, p. 14; italics mine
On the question of the essence or nature of reality, it is indeed only this divergence which is to be detected. Through this it will probably become understandable in what way or to what extent the real can be a program. A further question remains, however, namely what is to direct us in the realization of this program. Einstein here is completely unambiguous in his ever repeated tribute to empiricism: "The degree of certainty in which the relationship [of our concepts and statements to sense data] ... can be explained, and nothing else, distinguishes empty fantasy from scientific 'truth'." Without the "light of reason", however, we would not have physics either. Without reason we would not have concepts and thereby have no freedom in the construction of theories. This freedom was for Einstein of prime importance. According to his own judgment, "The theoretical attitude here advocated is distinct from that of Kant only by the fact that we do not conceive of the 'categories' as unalterable (conditioned by the nature of the understanding) but as (in the logical sense) free conventions." 44 In his rationalistic understanding of reality, however, Einstein resembles the later idealists more remarkably than he does Kant. In absolutely concrete contexts, e.g. in the question - a question which has become difficult due to quantum theory - of when a radioactive atom decays, he comes up suddenly with an abstract appeal to the reasonableness of our theories 45:

One may not merely ask: 'Does a definite time instant for the transformation of a single atom exist?' but rather: 'Is it, within the framework of our theoretical total construction, reasonable to posit the existence of a definite point of time for the transformation of a single atom?' One may not even ask what this assertion means. One can only ask whether such a proposition, within the framework of the chosen conceptual system - with a view to its ability to grasp theoretically what is empirically given - is reasonable or not.

In the refusal to ask what the setting of a definite moment of the decay means, we have here the typical idealistic aversion to isolated facts, and in the challenge to test the reasonableness of this setting "in the framework of our entire theoretical construction" we have the typical postulate of coherence theory, i.e. to take the possible incorporation of an event in a currently existing (and proven) system as the criterion for its reality. Einstein is not alone among the physicists in holding this view. For a long time, physicists have resisted efforts to subject their theories to too strict an evaluation of their truth or falsity. Although the self-understanding of physicists has had an anti-idealistic development for 150 years, this one bridge to the idealistic coherence theory of truth has never been completely pulled down 46.

42 The following quotations are from pp. 674, 669 and 680 of Einstein 1949b (pp. 500, 496 and 505 in 1955b resp.)
43 Einstein paraphrases an expression of Kant in the "Critique of Pure Reason", B 526f. According to this, given something conditioned, in the realm of phenomena a regressus in the series of all conditions is only proposed (aufgegeben); however, in the realm of noumena it is already given (gegeben).
In contradistinction to traditional philosophical idealism, Einstein considers the "believing rationalist" into whom the problem of gravitation turned him to be primarily not the bold speculator who attempts to conceive of the identity of mind and nature but instead the modern physicist "who seeks the only reliable source of truth in mathematical simplicity". This conception is in no way self-evident for someone who, like Einstein, not only renounced, with his thesis of free creation, every inductive aid to research but also had to experience - and indeed himself brought about - the dubiousness of a structure as mathematically impressive as Newton's mechanics. In what different circumstances Descartes still was when, going to work with a handful each of mathematics and physics, he was borne by the conviction that the fundamental laws of nature are simple! Nothing which happened in between, however, appears to have deterred Einstein ultimately. Remaining in the classical tradition, he expresses his conviction 47:

I am convinced that we can discover by means of purely mathematical constructions the concepts and the laws connecting them with each other, which furnish the key to the understanding of natural phenomena. Experience may suggest the appropriate mathematical concepts, but they most certainly cannot be deduced from it. Experience remains, of course, the sole criterion of the physical utility of a mathematical construction. But the creative principle resides in mathematics. In a certain sense, therefore, I hold it true that pure thought can grasp reality, as the ancients dreamed.

44 Einstein 1949b, p. 674 (1955, p. 500)
45 ibid. p. 669 (p. 496)
46 Cf. Scheibe 1986b
47 Einstein 1989, p. 116 f. (1954, p. 274)
In the sense which Einstein meant here, his rationalism and his realism therefore converge. All in all, what, then, was the epistemological position which Einstein had taken? He certainly did not wish to appear frivolous when he called the modern scientist, and at any rate also himself, an unscrupulous opportunist in this regard. You will recognize in his own description of the epistemological opportunist what I, in concluding, once again summarize as features of Einstein's position 48:
He appears as a realist in so far as he seeks to describe a world independent of the acts of perception; as idealist in so far as he looks upon the concepts and theories as the free inventions of the human spirit (not logically derivable from what is empirically given); as positivist in so far as he considers his concepts and theories as justified only to the extent to which they furnish a logical representation of the relations among sensory experiences. He may even appear as Platonist or Pythagorean in so far as he considers the viewpoint of logical simplicity as an indispensable and effective tool of his research.

But Einstein, of course, was not only an opportunist. He was one who admitted wonder. As we have already heard, he had repeatedly admitted that the comprehensibility of the world, and therefore that with which science deals, is, for its part, incomprehensible. In Einstein's work, in physics as well as in philosophy, that opportunism and this resignation find their unique conjunction. Einstein has expressed this himself when he says: "The [physicist] believes that ... the totality of sensory data can be 'understood' on the basis of a conceptual system of great simplicity. The skeptic would say that this is a 'belief in miracles'. It is this, indeed, but it is a belief in miracles which has proved itself in an astonishing degree in the development of science." 49 The resignation turns into belief in miracles and the object of this belief is of course not capricious deviation from natural law, not a miracle in the usual sense, but rather natural law itself. In a great researcher, "a deep conviction of the rationality of the universe" will be accompanied by "a yearning to understand, were it but a feeble reflection of the mind revealed in this world" 50. The existence of natural science, as it stands now, is due solely to the effect of the joining of this belief with this longing. These are just different words for the same thing; they are, however, words which take the form of a legacy, with which I wish to end this lecture 51:

One thing I have learned in my long life: that all our science, measured against reality, is primitive and childlike - and yet it is the most precious thing we have.

48 Einstein 1949b, p. 684 (1955, p. 508)
49 Einstein 1950, p. 13
50 Einstein 1989, p. 17 (1954, p. 39)
51 Hoffmann/Dukas 1972, p. VII
II.9 Heisenberg's Concept of a Closed Theory*

The concept to be discussed is little known. To begin with, we shall want to introduce it exactly in the sense in which its creator understood it. The only exception to this rule is the addition of a contribution by von Weizsäcker that is present everywhere in Heisenberg's reflections. Heisenberg touches on, and explicates, the concept of a closed theory in over half a dozen works that are evenly distributed over his entire productive life. 1 The first published work is from 1934. In the autobiographical account of his career as a scientist, however, Heisenberg dates the first conversation on this topic back to the year 1929. 2 The final work is from 1973. The concept is present as well between the lines in the posthumously published writing 'The Order of Reality' - although there it occurs in a context other than physics. 3

The requirements that define the concept of a closed theory are appropriately divided into three groups. The first group comprises the general logical requirements: The theory is axiomatized. Within the theory we can distinguish concepts, and laws formulated in these concepts. And the axiomatic system is consistent. In the second group we find the general empirical requirements: The concepts of the theory are empirical concepts, and within certain limits the laws of the theory have been proven empirically. I call the requirements of these two groups 'general', because one would rationally want any theory - closed or not - to meet them. The single requirement of the third group is specific to the idea of the closedness of a theory. It exists in two equivalent versions. Of the first, which I want to call the Heisenbergian version, I shall provide three variants. In doing so I shall generalize Heisenberg's formulations, which are usually given with the help of the example of Newtonian mechanics as a paradigm of a closed theory, to an arbitrary theory T, which already satisfies the said general requirements.

The following is our first formulation 4:

To the extent to which one can describe any given appearances with the concepts [of T], the laws [of T] also hold with strict validity ...

Heisenberg himself immediately revises this formulation by saying:

More precisely ... perhaps ...: The laws [of T] are valid with the same degree of accuracy with which the appearances are describable using the concepts [of T].

* First published as Scheibe 1993d. Translated for this volume by Hans-Jakob Wilhelm
1 Heisenberg 1934, 1936, 1948, 1959, 1969 (Chap. 8), 1970, 1973
2 Heisenberg 1969, p. 131
3 Heisenberg 1989
4 Heisenberg 1969, p. 135
This is, of course, not simply a variant of the first formulation, but in a certain sense a quantitative strengthening. The following formulation differs from the first only in terms of its linguistic expression 5:

Wherever the concepts [of T] can be used for the description of natural processes, the laws [of T] are exactly correct ...

Formulations such as these may be regarded as typical, as far as the existing textual material is concerned. 6 At this point I merely want to note the fact that all variants of the Heisenbergian requirements have - roughly speaking - the form of an implication, the premise of which speaks only of the concepts of a theory, while the conclusion speaks only of its laws. We further want to observe that in the premise there is talk of the possibility of the application of concepts, while in the conclusion on the other hand the validity of the laws is simply asserted.

I now turn to the equivalent version of von Weizsäcker. Heisenberg recalls 7: "During a colloquium [in Starnberg] the question was posed by [v. Weizsäcker], whence the closed theories in physics derive their persuasive power, or which criteria would justify the assumption that small corrections could no longer be made to these theories ..." Accordingly, Weizsäcker's requirement reads 8:

T cannot be corrected by means of small (or: only by means of large) changes.

Small changes to a theory concern only its laws, e.g. the addition of a corrective term, but not its concepts. Large changes, on the other hand, are those that already affect the conceptual structure of the theory. On this interpretation, Weizsäcker's requirement is equivalent to that of Heisenberg. For the latter ties the validity of the laws to the applicability of the concepts. If one wanted to change the laws of such a theory, one could do so only through a change of the concepts, that is, through a large change.
If a theory, however, is not closed, then one is free to change its laws without effecting a change in the concepts. And in that case Weizsäcker's condition does not apply either. So much for the general definition of our concept. Heisenberg regarded four theories of physics as closed: Newtonian mechanics; thermodynamics, including its statistical version; electrodynamics, including the special theory of relativity; and quantum mechanics. He expected that a future theory of elementary particles would be a candidate for a fifth closed theory. For the general theory of relativity he left the issue undecided. 9 Heisenberg's remarks about these examples of closed theories do not have the strict character of arguments to the effect that in these cases we are dealing with examples. But these remarks do suggest that, regarding the question whether the requirement of closedness is properly a logical or an empirical one, that is, whether in this regard it should be placed into the first or into the second group of our general requirements, he would have said: It is empirical. It must be shown through its application whether a theory is closed. I shall return to this question.

Weizsäcker's version of the concept of a closed theory already points to the fact that its proper field of application is the development of physics and the question of its possible unity. 10 Heisenberg seems to envisage three developmental steps of increasing complexity. The simplest step consists in the mere expansion of the domain of application of a theory, that is, the "application of already known propositions [and hence concepts] to new objects". 11 In a further step, a change of the laws can occur as well, while still leaving the concepts untouched. Heisenberg assumes that in this step the new laws even contradict the old ones. 12 The most far-reaching, revolutionary changes are those in which already the conceptual structure of the theory is affected. Such a modification necessarily occurs when a closed theory is to be overcome. The succeeding theory still talks about (roughly) the same objects. But its propositions have a new sense, one that competes with the old: "The most important new result of atomic physics was the recognition of the possibility that completely dissimilar schemata of laws of nature [i.e. closed theories] can be applied to the same physical events without contradicting one another. This is because of the fact that in a certain system of laws, due to the basic concepts on which it is built, only certain types of questions have a sense, and that through this it closes itself off against other systems [what is meant is always closed theories] in which other questions are posed." 13 Even though closed theories do not yet make up the whole of physics, their appearance threatens its final unity. Strictly speaking, these theories cannot be eliminated through any progress. The optimal final state of physics would at best consist of a statement 1) of all the closed theories for the purpose of an account of nature that is as complete as possible and 2) of all relations between these theories for the purpose of the optimization of the unity of our picture of nature. "Thus the edifice of the exact natural sciences can scarcely become a coherent unity in the naive sense previously hoped for, such that starting from one point in it one could get into all the other rooms simply by following the path prescribed. Rather, it consists of individual parts, each of which, although standing in the most manifold relations to the others, containing some of them and being contained in some of them, nevertheless forms a self-enclosed unity. The step from those parts of the edifice that have already been completed to those that are newly discovered or that are to be newly constructed always requires a mental act that cannot be performed by merely developing further what already exists." 14

Here we are not able to give a thorough analysis of the concept of a closed theory. I want to discuss in more detail only one question which will show how far we are from an understanding of our concept. The question is whether the ascription of closedness is properly an empirical or a logical-analytical matter, that is, whether or not, in order to judge its closedness, we must first have gathered experiences with the respective theory. I already mentioned that the manner in which Heisenberg presents his examples of closed theories suggests that he favors the former possibility. But at this point I would like to draw attention to the fact that these examples are themselves quite heterogeneous. This would mean that some should be more and some should be less likely candidates for closedness, if, as can hardly be doubted, closedness is either always an empirical or always a logical matter. Thus classical electrodynamics is a theory of a certain kind of interaction, while classical mechanics and quantum mechanics essentially are not. The laws of the latter are far more general and hence, as we shall see in a moment, they have a much better chance of being closed than electrodynamics. Moreover, their specifications through the choice of certain laws of force or Hamilton-operators are good candidates for theories that are not closed. One can change and correct precisely hypotheses of this kind without touching the basic concepts.

5 Heisenberg 1959, p. 84
6 Heisenberg 1936, p. 91; 1948, p. 333 f.; 1970 in 1971, p. 308; 1973, p. 141
7 Heisenberg 1973, p. 140
8 v. Weizsäcker 1971, p. 193 f., 213 ff., 232 ff. (1980, p. 156, pp. 173 ff., pp. 188 ff.)
9 Heisenberg 1959, p. 86 ff.
10 On the issue of how this fits in with the concept of progress developed by physicists themselves, see Scheibe 1988b
11 Heisenberg 1934, p. 701
12 Heisenberg 1959, p. 84
13 Heisenberg 1934, p. 701 (my emphasis)
Electrodynamics, however, has the same degree of generality as these specifications and not as the original theories. From this point of view, it seems to me that nothing is decided with regard to our question. And at times Heisenberg himself talks in ways that suggest that his thoughts are going into the opposite, logical direction. At one point, for example, he says15 that classical physics "[is based] on a system of axioms formulated with mathematical rigour, the physical content of which is established by the fact that through the choice of the words that appear in the axioms the application of this axiomatic system to nature is definitely traced out". The passage continues: "Thus the claim to truth of classical physics seems - like that of any mathematical proposition - absolute ..." And then we are given the usual formulation of closedness for the case of classical mechanics. I do not want to be more precise than this text. But one is surely not overstating the point to say that what is suggested here is, even
14 Heisenberg 1934, p. 702
15 Heisenberg 1936, p. 91 (my emphasis)
II.9 Heisenberg's Concept of a Closed Theory
if not the analyticity of classical mechanics itself, then at least that of its closedness. In this connection I want to point out that the character of closedness in the sense of Heisenberg's original definition is fulfilled analytically if the theory concerned is in turn a logical-mathematical one. Quine has even expressed this in a form which essentially makes use of von Weizsäcker's version of closedness.16 First he notes that mathematics and logic are true by virtue of the meaning of certain words. But this inner necessity does not make them immune to change. Their possible abrogation, however, does not mean that we suddenly deny that the laws of mathematics and logic are true on the basis of meanings. It is just that because of this, every change of the laws is perceived by us to be an introduction of new meanings for old words. Indeed, if we think of classical propositional logic for instance, we have a field of application consisting of propositions that are either true or false. On this basis we explain the logical operations 'and', 'or', 'not' etc. as fundamental concepts of our logic. And it is on the basis of these definitions that the laws of classical logic have their validity, for example, the tertium non datur and the law of distributivity. Wherever we are able to apply the operations thus defined, that is, where we are dealing with bivalent propositions, these laws apply. This logic is thus closed, and it is out of the question that we bring about its end by falsifying one of its laws while maintaining its concepts. Of course we are able to replace this logic with another logic, whose logical operations then have a different meaning than the classical operations. Intuitionism, for example, does not regard the principle of bivalence as an appropriate starting point for mathematics. Accordingly, it does not use this principle for the definition of the logical operations, but gives them a different, an intuitionist sense.
In quantum theory - another example - we encounter a situation which some observers have interpreted as the existence of certain propositions about individual systems that are not bivalent for reasons other than intuitionist ones. Again, new logical operations were introduced in order to do justice to this situation, and thus, as in the case of intuitionism, there emerged a new corpus of logical laws that is closed precisely in the sense of Heisenberg's concept. From the point of view of these considerations, familiar to logicians, it would seem to be almost the point of Heisenberg's concept of closedness that its establishment for a theory is analytic even though it concerns an empirical theory. We are familiar with such phenomena from certain similar, albeit not equal, cases. The concepts of major axis and period of revolution, for example, with which we articulate Kepler's third law, would not be applicable, if Kepler's two other laws were not valid. For the latter are presuppositions for the definability of those concepts. When we speak of the period of revolution, we imply that there is a law that guarantees its unique existence. Thus, in a presuppositional sense, the validity of certain laws follows from the
16 Quine 1969, p. 21 f.
applicability of certain concepts. But here - and this is what makes this case different - we are dealing with defined, rather than with elementary concepts; and in every theory (as a rule) these depend on the validity of laws of the theory in the role of presuppositions. Nevertheless, in our context we must also pay attention to such situations. Thus, for example, the classical concept of a path has no general meaning in quantum mechanics precisely because there we do not have a law that would make it possible for us to define it. This circumstance constitutes part of the incommensurability of quantum mechanics with classical mechanics. Another case that must be considered concerns indeed the elementary concepts of mass and force in mechanics. There is no application of the concept of force in which we do not already presuppose the validity of Newton's second law. At least this is the case, when the application includes, this time not the definition, but rather the determination of forces. The status of Heisenberg's implication, "if the concepts are applicable, then the respective laws are valid", however, is quite unclear in this case. First of all, it is an empirical fact of science that so far no one has yet determined forces without the help of Newton's second law. But even if we abstract from this fact, we would not be dealing with an analytical inference in the narrower sense, but rather once more with a presuppositional relation. I come to my conclusion. First we saw what Heisenberg himself understood by a closed theory. This report necessarily left us with many questions. One of these I then explored on my own initiative: What is the epistemological status of a proposition with which we ascribe closedness to a theory? The discussion of this question revealed that here we are treading on unfamiliar ground. The familiar dichotomy of the logical and the empirical proves to be too rough to be able adequately to describe the situation at hand. 
Behind all the difficulties we have encountered there is the direct question: What is a closed theory? I have not dealt with this here. This question quickly reduces to the problem: What do we mean when we say that concepts are applicable? Even without being able to answer this question, we all believe that the applicability of concepts is a presupposition for the validity of laws that are formulated in terms of these concepts. From this point of view the matter is quite simple. Yet in the concept of closedness the applicability of concepts and the validity of respective laws reverse their roles: the former becomes the premise and the latter the conclusion. From this point of view, what suddenly becomes important is the question concerning the sense of the premise. Intuitively we feel that we would have solved the secret of closedness, if we could only answer that question. Now you will ask: Are we that stupid that we don't even know what it means for concepts to be applicable? To this I say: This is the stupidity of philosophers, and we owe it to the physicist Heisenberg that he was not afraid to share it with us.
II.10 The Origin of Scientific Realism: Boltzmann, Planck, Einstein*
I
Scientific realism today is an issue much debated by philosophers of science.1 However, to the best of my knowledge it was invented by physicists, and this is a fact that seems to have fallen into oblivion. Moreover, scientific realism emerged in a period when there was a general turn in the attitude of the physicists in matters of the philosophical foundations of their science. In obvious connection with the establishment of theoretical physics as a new discipline at the end of the 19th century we see the new theoretical physicists becoming engaged in a lively debate on various philosophical questions concerning physics. And not only did this happen but it was also immediately noticed - from without and within. In the years before the First World War the theologian Adolf v. Harnack is said to have said on occasion: "People complain that our generation has no philosophers. Quite unjustly: it is merely that today's philosophers sit in another department, their names are Planck and Einstein."2 And as early as 1901 Wilhelm Ostwald in his lectures on natural philosophy testified to the change: "The mental operations by which scientific work is organized ... are not essentially different from those that are investigated in philosophy. The awareness of this situation has indeed been obscured for some time during the second half of the 19th century; but in our days it has awakened to a most vivid efficacy, and everywhere the spirits are aroused in the scientists' camp to make their contribution to the whole of philosophical knowledge."3 Even more extensive commentaries on the entire process can be found without going outside of physics. At the end of the period in question, in a lecture of 1948,4 Arnold Sommerfeld summarizes the development in the following words: "During the 19th century the relation between physics and philosophy was strained.
First philosophy dominated and wanted to prescribe physics its way .... Later the physicists had become suspicious and rejected any philosophy ..." The quarrel between physics and philosophy that Sommerfeld refers to had evolved from that unhappy marriage that some natural scientists had contracted with the natural philosophy of Schelling and Hegel in the early 19th century.5 The divorce that followed was so thorough that it led to the definite methodical emancipation of the Geisteswissenschaften as well as to a lasting puristic and partially positivistic attitude of the physicists
* First published as Scheibe 1995b
1 Leplin 1984
2 Seelig 1952, p.45; Sommerfeld 1955, p.37 (engl. transl. in 1949, p.99)
3 Ostwald 1902, p.3
4 Sommerfeld 1948, pp.640ff
5 Helmholtz 1862; Jungnickel/McCormmach 1986, vol. I, pp.23f
towards their discipline. However, Sommerfeld goes on to tell us: "In the 20th century the relation between physics and philosophy changed fundamentally. Right at the beginning in 1900 Planck discovered the quantum of action ... He thus gave philosophy a hard nut to crack which it will have to deal with quite a while .... The decisive step towards a philosophically deepened physics was taken by Einstein in 1905." Here Sommerfeld alludes to the two new physical theories of our century, relativity theory and quantum theory, that introduced profound conceptual revisions and set physics in such a contrast to its past and to common sense that philosophical reflection became imperative. He could even have referred to the earlier kinetic theory of heat that had raised basic questions concerning atomism. At all events Sommerfeld can crown his review with the words: "Since Einstein there is no longer any alienation between physicists and philosophers. The physicists became philosophers, and the philosophers are on their guard not to become engaged in a conflict with physics." That the physicists became philosophers did, of course, not mean that all of a sudden philosophical articles in a professional sense flowed from their pens. Gilson has coined the malignant saying: "Nothing equals the ignorance of modern philosophers in matters of science, except the ignorance of modern scientists in matters of philosophy."6 In fact one can observe how uneasy the physicists of the first generation of our period felt whenever they were forced by external reasons to philosophize before the public.
Ostwald tells us that he had been urged by friends and students to give his lectures on natural philosophy (= Naturphilosophie), and he confesses right at their beginning that "[he] may not call philosophy a subject that [he] had studied in the normal sense of the word."7 Similarly, it delights us to hear what Boltzmann says at the beginning of a series of lectures on natural philosophy8 that the Vienna ministry had demanded of him. He comments on the large number of people attending the inaugural lecture by saying that he could explain this only by the fact that "[his] present lectures be indeed a curiosity in academic life in a certain respect." And this respect then turns out to be that he as a philosophical layman now has to give these lectures on natural philosophy. He seeks consolation in the most far-fetched explanations why the ministry had imposed this burden on him of all persons, and he assures the astonished audience that his objections were settled by the ministry with the remark "that any other person would do no better". However, we shall see that it really was not that bad what Boltzmann and his followers did when they started philosophizing. There was no doubt a new and creative philosophical spirit among the physicists at the turn of the century.
6 Quoted from Jaki 1966, p.341
7 Ostwald 1902, p.1
8 Boltzmann 1990, pp.12ff, 152ff (engl. transl. in 1974, pp.153-8); see also Einstein 1934, p.113 (engl. transl. in 1954, p.270f)
This being the situation in general I now turn - still by way of introduction - to the more special question: What is scientific realism? Questions about human access to reality have been asked since the beginning of western philosophy. Already in Plato's "Theaetetus" the equating of knowledge and perception is denied, and Protagoras' doctrine that man be the measure of all things is attacked.9 In modern times Descartes was the first to give a reformulation of the problem.10 Starting out from sense illusions he makes it clear that we can be deceived by almost everything. The epistemic immediacy concerning external things as it characterizes naive realism gets lost, and the monstrous viewpoint of solipsism - an invention of Descartes - occurs. Henceforth all philosophers, including Descartes, try to find proofs of the independent existence of bodily things. But one and a half centuries later Kant11 has still to note that it remains "a scandal of philosophy and general human reason to have to accept the existence of external things as a matter of faith ..." Kant added a new proof which, however, did not find general acceptance either. How desperate the situation finally became is certified after a further one and a half centuries in a lecture by G.E. Moore, having the title "Proof of an External World".12 At the end of his lecture, after laborious considerations including Kant, Moore gives the proof in question by proving that there exist, for instance, two human hands. And answering the question: How? Moore gives his audience even the following details: "By holding up my two hands, and saying, as I make a certain gesture with the right hand, 'Here is one hand', and adding, as I make a certain gesture with the left, 'and here is another'." I don't mention this proof of an external world in order to demolish the reputation of Moore whom one cannot but appreciate in his way.
The point is not the philosophical importance of all these proofs that is different from case to case and certainly different in the cases of Kant and Moore. The point is that all proofs are restricted to the kind of experience as we make it in daily life whether this is evident as in Moore or not so evident as in Kant. It is only in recent times that philosophy deals with the problem in question under explicit consideration of the fact that for more than three hundred years there has been a continuously progressing scientific experience. And this fact can be made the crucial point of an argument in favour of realism. Among the physicists Helmholtz still argued in favour of realism by physiological investigations: "Our sensations - he says - are effects produced in our organs by external causes."13 By contrast, Planck is the first who clearly bases realism typically on a development of physics that leads away from all questions of sensation or perception in the usual sense. He supports realism
9 Plato, Theaetetus 151d-186e
10 Descartes, Meditations, esp. Med.1 and 6
11 Kant, Critique of Pure Reason, B XXXIX, B 274ff
12 Moore 1939, pp.127ff, 146
13 Helmholtz 1879, pp.18f
by the continuous empirical success of physical theories that are about more and more remote objects, divested of the subjectivity of human perception, but at the same time fundamental for the construction of physics. It is the realism backed up by this kind of argument from scientific progress that may justifiably be called 'scientific' realism. And if we ask whence this scientific realism comes, at least one line of development leads us to the physicists around the turn of the century: It leads us to Boltzmann, Planck and Einstein. It leads us, however, as well to the great anti-realist Ernst Mach whose authority none of the aforementioned could ignore.
II
Ludwig Boltzmann whom I am going to address first passed through a mental development that did not lack a tragic stamp. He stood at the threshold of the final establishment of modern atomism. As a young man he was a pure physicist who had enlarged the kinetic theory of gases, initiated by Clausius and Maxwell, by adding the famous equation that bears his name. The kinetic theory, however, was an atomistic theory, and the atoms remained a speculative object during the whole of the 19th century. Many physicists, disappointed by the influences from German idealism, adopted a decidedly positivistic or, as it was called in those times, phenomenological attitude that was not in favour of atomism.14 Besides empirical successes there were also physical difficulties for the kinetic theory, and Boltzmann who was convinced of the atomistic approach felt himself more and more bound to propose philosophical arguments to support the theory. Indeed, during the last ten to fifteen years of his life Boltzmann did mainly philosophical work. The corresponding publications, however, did not have the intended effect. Searching for philosophical support of atomism Boltzmann saw himself more and more driven into the camp of his adversaries. It would be an exaggeration, though, to call this process a conversion, as has recently been done.15 Boltzmann believed in atoms in the sense that "[the theory of gases] agrees in so many respects with the facts that we can hardly doubt that in gases certain entities, the number and size of which can roughly be determined, fly about pell-mell."16 Yet Boltzmann was an atomist of sorts. He took a position somewhere between the naive belief in atoms and a methodical phenomenalism. This position is perhaps most adequately conveyed by looking at the concept of physical theory that Boltzmann developed. On this matter views were circulating at that time which Boltzmann could by no means be in agreement with.
A laconic formulation of Kirchhoff's had risen to fame according to which it be the task of mechanics "to describe the motions occurring in nature" where this was meant in the restricted sense
14 There were also, of course, adherents of the corpuscular philosophical tradition, see the discussion in Du Bois-Reymond 1872
15 Blackmore 1982; see also Elkana 1971 and Brush 1990, pp.53ff
16 Boltzmann 1974, p.202
"that the issue can only be to point out which are the phenomena occurring, not, however, to discover their causes."17 It goes without saying that the kinetic theory did hardly satisfy Kirchhoff's condition. Already here we have reached a point where an unambiguous statement on Boltzmann's position can be made: Whether or not he was a realist in matters of the atoms, he pleaded for a concept of theory liberal enough to guarantee their possible existence. What concept of theory had this been? Nowadays one would like to say that Boltzmann suggested the use of theoretical terms. In the language of his time the crucial relevant metatheoretic concept was that of a picture (Bild), sometimes also model. It is a ceterum censeo in Boltzmann's papers that theoretical physics strictly speaking does not deal with things themselves but with certain pictures instead that we take of them. Boltzmann gives Maxwell the priority to have introduced the idea of a picture, and he repeatedly mentions Heinrich Hertz as the one "[who] makes physicists properly aware of something philosophers had no doubt long since stated, namely that no theory can be objective, actually coinciding with nature, but rather that each theory is only a mental picture of phenomena, related to them as sign is to designatum."18 If Boltzmann here quotes that in the theory we make ourselves pictures of the phenomena this conceals a little his proper point that in theory we try to use pictures precisely where the phenomena are missing. Such was the case with the kinetic theory: It outlined a picture of something that had not appeared to anybody by that time. It is in this sense that Boltzmann says in the same article quite unambiguously: "Phenomenology believed that it could represent nature without in any way going beyond experience, but I think this is an illusion .... [Every] equation ... idealizes [the processes] ... thus going beyond experience."
This transcendence belongs to the nature of the mental operation "consisting as it does in adding something to experience and creating something that is not experience and therefore can represent many experiences".19 Accordingly, Boltzmann's presumably strongest argument against the phenomenological position was that it, too, goes beyond the phenomena, for instance, by assuming matter to be a continuum.20 Precisely as a representative of this liberal view of theories Boltzmann was seen by his contemporaries. At the annual meeting of the "Gesellschaft Deutscher Naturforscher und Ärzte" in 1895 in Lübeck Ostwald21 calls out to Boltzmann (in unintended prophecy): "We have .... finally to give up all hope to give a pictorial (anschaulich) interpretation of the physical world by reducing the phenomena to the mechanics of atoms." And to the question which means be still available "to make us a picture of reality" Ostwald
17 Kirchhoff 1872, Vorrede
18 Boltzmann 1905, pp.137f (engl. transl. in 1974, pp.90f)
19 ibid. p.144 (engl. transl. in 1974, p.96)
20 ibid. pp.78ff, 145 (engl. transl. in 1974, pp.41ff and 97)
21 Ostwald 1895, p.162
answers in the presence of the man whose gas lectures appeared under the motto "Alles Vergängliche ist nur ein Gleichnis"22: "In view of such questions I want to call out to you: You shouldn't make yourselves any picture or simile (Gleichnis)! Our task is ... to [look at] the world ... as directly as the constitution of our mind will allow. To relate realities ... is the task of science, and we cannot solve it by the underlaying of any hypothetical picture ..." Evidently, speaking of "realities" Ostwald here means "appearances", and these he sees as rivals of the theoretical pictures of Boltzmann and Hertz. Ernst Mach has seen the situation essentially in the same way, though he was even prepared to make an important concession. He would accept an atomistic theory as a means if it leads to useful results on the level of phenomena and no realistic consequences are drawn from it.23 He appraises the kinetic theory in view of its successes, and he is not against "the liberty that one takes by assuming invisible hidden motions." But this method has a decisive instrumentalistic proviso. Despite the admissibility of arbitrary ideas as means of research it is imperative from time to time "to purify the representation of the results of research from the superfluous and inessential ingredients which intervene by the operations with hypotheses." It is precisely because our research work ends with this elimination that beforehand everything is permitted for the uninterpretable parts of a theory.24 Just in case the atoms don't belong to the world of phenomena we are free to relate God knows what mathematical ideas to them. If the atoms are not perceivable why then make a picture of them as if they were perceivable or would become so one day?25 If we now come back to Boltzmann we must observe that his statements on the point in question are more ambiguous. He oscillates between realism and instrumentalism.
Unambiguous - as we have heard - is his advocacy of a theory concept that leaves room for the introduction of as yet empirically uncertain entities. But if we ask how he intends to fill this place his statements become increasingly cautious. As a physicist he simply believed in atoms and could be very drastic about this. Mach reports that Boltzmann criticized his analysis of sensations by the remark "no sooner could one analyse the sensations than the paths of the atoms in the brain were not known."26 Again, in a commemorative address on the occasion of Loschmidt's death Boltzmann remarked that Loschmidt's body had now decayed into its atoms and adds the comment that the deceased himself had put us in the position to know into how many - Loschmidt's number being on the blackboard.27 This directness of the pragmatically minded physicist contrasts markedly with the philosophical scepsis in the man. Boltzmann is most anxious to
22 Boltzmann 1896, vol. I, p.4
23 Mach 1900, p.362f
24 Mach 1912, p.467 and Mach 1872, pp.17ff
25 Instrumentalistic views were common at that time, cf. Pearson 1892, pp.114f
26 Mach 1922, p.256
27 Boltzmann 1905, p.157
transform inadequately posed into adequately posed questions. On the occasion of his report28 on Mach's intrusion into a discussion on the value of atomistic theories by the words "I do not believe that atoms exist" Boltzmann starts an argument that here, i.e. in the case of space, time, atoms etc., as distinct from things like tables, dogs and human beings, one might not even know "what is meant by asking whether these things exist." In the address on Loschmidt he considers the question of the constitution of matter as being one of the most important questions of the time.29 It is only that one puts it somewhat differently today as compared with earlier times. "While formerly one was looking for the ultimate elements of .... matter itself, nowadays it is asked from which simple elements the mental pictures have to be constructed in order to achieve the best possible agreement with the phenomena." By adding that both ways of speaking presumably have the same meaning Boltzmann seems to suggest: One ought to mean the same by the question whether atoms exist as by the question whether the theory in which we have introduced atoms by means of such and such a picture is empirically successful. Mach has passed judgement on our subject by the impressive words: "If one day the now living physicists will have made their exit from the scene a future historian .... will easily .... disclose how fearfully serious and terribly naive the mechanical, particularly atomistic ideas have been conceived by a large majority of outstanding scientists of our times, and how few scholars of a peculiar way of thinking belonged to the party in opposition."30 In the sense of this distinction we certainly find Mach and Boltzmann on the same side. On the other hand, Einstein once said that physicists have to be judged, not on account of their words, but their deeds.31 And if we take this seriously then the facts are that Boltzmann's major work was a book on the atomistic theory of gases and Mach's one on the analysis of sensations. This opposition of their deeds they have somewhat obscured by their words however well chosen they might have been.
III
Boltzmann had second thoughts also on the development of physics. He can even be seen as the founder of an important tradition of thinking in matters of theory progress.32 In the preceding section we have seen that Boltzmann viewed the empirical success of a theory as a criterion of the reality of theoretical entities introduced into that theory. It was left to Max Planck to link this idea to the idea of progress in physics. In his earlier days Planck was an opponent of atomism who did not believe in Boltzmann's statistical
28 Boltzmann 1990, pp.152f (engl. transl. in Boltzmann 1974, pp.153f)
29 Boltzmann 1905, p.152 (italics mine)
30 Mach 1900, pp.363f
31 Einstein 1989, p.113 (engl. transl. in 1954, p.270)
32 Scheibe 1988b (this vol. II.6)
foundation of the second law of thermodynamics.33 But he changed his mind at the beginning of our century in connection with his own epoch-making contribution to the theory of heat radiation. Henceforth the existence of the atoms was no real problem for Planck: "The atoms - he writes in 1908 -, however little we know of their properties in detail, are no more and no less real than the celestial bodies or the terrestrial objects of our environment."34 Right after his conversion Planck sees the problem of realism, as far as it concerns physics, in a wider scope within which atomism played an important but not an all-important role. This view transpires already from his first philosophical paper of 1908 coming from a lecture given at the university of Leiden.35 The question Planck seeks to clarify in this paper is the question "[whether our physical world view] is merely a convenient but basically arbitrary creation of our mind or .... mirrors real natural processes independent of ourselves." Neither his own position nor that of his opponents is adequately described in this first formulation. But one thing Planck makes entirely clear right at the beginning. He wants to link this question with the other, in what sense and in what direction physics has made progress and whether this direction can be determined as a development towards the unity of physics. The main thesis then is that this particular development towards unity does not only actually occur but is also an impeccable sign that physics is concerned with a real external world and steadily increases our knowledge of it as being an entity independent of the human mind. In closer detail the development is described as a twofold process characterized by losses and gains. The losing business in this process36 is the de-anthropomorphization of our primary world - "the conspicuous elimination of the human-historical elements from all physical definitions."
Planck admits outright that this abstraction "is a heavy drawback for the exploitation [of the emerging, purely physical world view] in the reality [of our life]." He speaks of "invaluable advantages that are worth such a self-renunciation" and asks: "What is the peculiar moment that in spite of these obvious disadvantages provides the future world view with such a decisive precedence?" The answer is given by the other part of the development. This part consists of the amalgamation of an originally extremely disparate phenomenal world into a unitary system. "... the signature of the entire former development of theoretical physics is a unification of the system which is obtained by a certain emancipation of the anthropomorphic elements and the specific sense impressions in particular ...." In another place Planck uses an old criterion of wholeness: "... the old system of physics did not equal a unique
33 Cf. Jost 1979
34 Planck 1949, p.48
35 "Die Einheit des physikalischen Weltbildes", in Planck 1949, 28-51. The following quotations are from pp. 29-31, 45f and 49. See also the reprint in Heilbron 1988, 301-14, and Planck 1910a, pp.1-9
36 Concerning this process see Wiener 1900
picture but rather a picture collection; .... one could remove [each picture] without affecting the others. This will be impossible in the future physical world view. There will be not a single feature of it that could be omitted, every one is rather an indispensable constituent of the whole." Physics attempts intentionally at "the complete detachment of the physical world view from the individuality of the creative mind." And in doing this it shows itself that physics develops into a conceptual system of ever increasing simplicity and unity. The argument in favour of realism thus springs from a synopsis of that twofold process: The process cannot be explained but by the assumption that physics is about an external world independent of the mind. This self-understanding of physics was certainly not new at the beginning of our century. But it seems fair to say that in Planck's paper it has found a formulation obligatory on the whole century. Scientific realism is the conviction that the fundamental epistemological problem can be solved by pointing out the success of modern physics.37 Towards the end of Planck's paper something strange happens.38 Planck supplements the argumentation presented so far by contrasting it with the anti-realistic program of Mach's. And in doing this he cannot avoid becoming polemic. Mach's philosophy of science rests on two pillars: with respect to the subject on his phenomenalism and with respect to method on his principle of economy. In his critique Planck attacks both parts of Mach's view. The attack on the principle of economy being more revealing, I confine my presentation to it. Not without pathos Planck conjures the heroes of physics from Copernicus to Faraday to assist him in his assault. It was a battle indeed "when the great masters of natural philosophy threw their ideas into science .... Economical viewpoints were the very last ones that these men fortified in their struggle against traditional views and eminent authorities.
No - it was their unshaken ... faith in the reality of their world view." On this "incontestable fact" Planck then bases the conjecture that by the principle of economy "progress of science would perhaps be unfortunately obstructed." And he concludes this argument, and with it his paper, with a quotation from the Bible that is to separate the false from the true prophets: By their fruits ye shall know them! It goes without saying that Mach could not leave this attack unanswered. The ensuing controversy was sterile as regards the subject matter but instructive in a psychological respect. In his reply Mach considers himself largely misunderstood and tries to play down the differences between his and Planck's views. 39 But at times his pen ran away with him: "After having exhorted us with christian indulgence to respect the opponent Planck finally stigmatizes me with the well known verse of the Bible as a false prophet. So we see the physicists are on their shortest way to become a church .... "
37 Planck's view has been elaborated in Bavink 1947.
38 Planck 1949, pp. 47-51. For the Planck-Mach controversy see also Heilbron 1986, Ch. II.1.
39 Mach 1910; see also Adler 1909.
Planck's first reaction to Mach's article foreshadows his eventual reply. In a letter to v. Laue we can read: "Mach's provocation in the current issue of the Physikalische Zeitschrift concerning my Leiden lecture I cannot leave without reply. Until October I wish him all happiness about his article; afterwards he might rather wish not to have written it ...." 40 In fact, the ensuing reply can hardly be called fair. 41 After some argumentation strictly confined to the subject in question Planck suddenly interrupts the discussion in the middle of his paper and turns to the task of comparing Mach's considerations in his books on the theory of heat and the history of mechanics, as physical achievements, with his own or at any rate those of orthodox theoretical physics, the verse of the Bible mentioned being the measure of comparison. The result, of course, is not open to question. Much later Sommerfeld covered this somewhat embarrassing controversy with the cloak of charity by saying: "The discussion between Planck and Mach showed the contrast between a creative physicist like Planck and a reflecting physicist like Mach." 42 In fact there was a good deal of elitism in Planck's attitude. He did not see himself in a position to recognize the importance of Mach's physical achievements. Consequently, he was irritated by the fact that Mach proclaimed his philosophy of physics as a physicist. The real core of the matter, however, is that Planck's behaviour makes it obvious how important the matter was for him. It was an important philosophical concern for him to establish beyond doubt that physics is about a kind of reality that does not lie on the surface of sense impressions. Of course, this concern by itself does not explain the faux pas he was guilty of.
One has to add that Planck's emotional certainty in the matter went further than he felt able to support intellectually, and that he knew himself to be less well placed in this respect than the positivistic position represented by Mach. But at that time he had the impression that Mach's way of thinking was beginning to establish itself in the heads of many physicists and could in this way endanger the new generation. Mach denied that at this time his influence had been large or even worth mentioning. Wherever Planck got his impression from, once he had it, it seemed to him that the time had come to raise his voice and to use his own influence. 43
40 Thiele 1968, p.90
41 Planck 1910b
42 Sommerfeld 1936, p.610
43 Planck never gave up his position. See his "Positivismus und reale Außenwelt" of 1930 in Planck 1949, pp.228-45, and a later controversy in Muller 1940 and Planck 1940
IV
Although such utterances are without any statistical foundation one would like to say today that the controversy between Planck and Mach has been decided in favour of Planck. 44 Some will then ask: But what about quantum theory? Though it was Planck and not Mach who made an original and highly important contribution to this theory, in retrospect it rather looks as if from a philosophical point of view Mach was better prepared for the theory than Planck. If anything in physics was a serious challenge to the classical view of reality as being independent of the observer, it was quantum theory, and Einstein, for one, seemed inclined to view the orthodox interpretation of quantum theory as a new kind of phenomenalism. It is precisely for this reason that he rejected this interpretation. To him it appeared mistaken "to permit theoretical description directly depend upon acts of empirical assertions, as it seems to me intended ... in Bohr's principle of complementarity." 45 By contrast, Einstein held that "there is such a thing as the 'real' state of a physical system existing independently of any observation or measurement and being describable in principle with the expressive means of physics." Ironically he added: "All men, inclusive of the quantum theorists, stick to this thesis of reality as long as they don't discuss the foundations of quantum theory." 46 It is in this sense, i.e. leaving out of consideration the questions that came up with quantum theory, that the concluding remarks on Einstein are to be understood. 47 It is better known of Einstein than of Planck that he, too, started out as a Machian. Subsequently Einstein passed through a mental development that led him from Mach's phenomenalism to a further variety of scientific realism. He gratefully acknowledged his origin in Mach's thought, and equally he did not deny his later turning away from him. In his 'Autobiography' we read: "[Mach's 'Mechanics'] exercised a profound influence upon me ... while I was a student .... [His] epistemological position also influenced me greatly, a position which today appears to me to be essentially untenable ...
For he did not place in the correct light the essentially constructive and speculative nature of ... scientific thought." 48 Here as elsewhere Einstein likes to emphasize his later rationalistic position. And this is related to a parallel development of his view of reality. The latter's final version could perhaps best be called a "programmatic or constructive realism". Already Planck had summarized his view in the sentence: "[The real external world] does not appear at the beginning but at the end of physical research." 49 Similarly, we read in Einstein 50 that "the 'real' in physics is to be taken as a type of program", or "'being' is always something which is mentally constructed by us, that is, something which we freely posit."
44 Sommerfeld 1929, p.1
45 Einstein 1955b, p.500 (engl. transl. in 1949b, p.674)
46 Einstein 1955c, p.14
47 Cf. Scheibe 1992b (this vol. II.8)
48 Einstein 1955a, p.8 (engl. transl. in 1949a, p.21)
49 Planck 1949, p.VI
50 The following three quotations are from Einstein 1955b, pp. 500, 496 and 505 (engl. transl. in 1949b, pp. 674, 668 and 680)
Einstein even thinks he may appeal to Kant in this matter. For he points out: "I ... came to understand the truly valuable which is to be found in his doctrine ... only quite late. It is contained in the sentence: 'The real is not given to us, but put to us (aufgegeben) (by way of a riddle).'" It may be left open here whether the appeal to Kant is justified. 51 At any rate it elucidates Einstein's opinion. And this was - to borrow a formulation of Brand Blanshard 52 - that "reality is a system, completely ordered and fully intelligible, with which thought in its advance is more and more identifying itself." We here meet already with the most conspicuous feature of Einstein's view on reality. At face value we seem to be confronted with a quite robust realism. "I tell you straight out - Einstein writes in 1930 to Schlick -: Physics is the attempt at the conceptual construction of a model of the real world and of its lawlike structure." 53 However, as soon as the question is one of the criteria of reality Einstein becomes more cautious, and thoughts arise that we otherwise encounter in the camp of philosophical idealism. In a book review 54 Einstein asks the (neo-Kantian) author rhetorically: "Are ... the realists and with them all scientists (in their non-philosophical moments) not right when, by the highly stunning possibility of the integration of our experiences into a system of (time-space-causal) concepts, they are led to believe in real things independently of their thinking and being?" Here Einstein still distinguishes between a reality hypothesis itself and a reality criterion. But this criterion belongs to the coherence theory of reality, not to the correspondence theory. That Einstein was a hidden coherence theorist can be seen even more clearly from his discussion of a quite concrete scientific question: When are we to expect the next decay of a single radioactive atom?
Einstein's answer: "One may not merely ask: 'Does a definite time instant for the transformation of a single atom exist?' but rather: 'Is it, within the framework of our theoretical total construction, reasonable to posit the existence of a definite point of time for the transformation of a single atom?'" 55 The refusal to ask the first question reminds us of the typically idealistic rejection of isolated facts, and the suggestion to replace the first by the second question amounts to the typically idealistic postulate to take the integrability of an event into the already existing (and confirmed) theoretical system as a criterion of its reality. 56
51 Einstein here paraphrases a statement of Kant's in his Critique of Pure Reason, B 526f. It says that, given something conditioned, in the domain of appearances the regress in the series of its conditions is only put to us (aufgegeben) whereas in the domain of things-in-themselves it would already be given (gegeben).
52 Blanshard 1939, vol. II, p.264
53 Quoted from Holton 1968, p.660 (german original in Holton 1981, p.233)
54 Einstein 1924, p.1685f
55 Einstein 1955b, p.496 (engl. transl. in 1949b, p.669)
56 See for this Blanshard 1939, vol. II, pp.225ff
But this is not the end of the story. We find Einstein on coherence theorists' ground not only when it comes to the question of reality criteria. Beyond this, Einstein is quite prepared to identify reality with (complete) coherence. In the book review mentioned he continues the quoted passage with the question: "Is there really a difference between the assumption that the totality of [our] .... experiences admits of a logical, conceptual system connecting them, and the reality hypothesis?" From the rhetorical character of this question, too, it is clear what Einstein's answer is. Moreover, Einstein did not shrink even from the last step that coherence theorists take in this direction: If coherence is not only a criterion of reality but reality itself then there can be only one coherent system. In this sense Einstein said already in 1919 of his general theory of relativity: "The chief attraction of the theory lies in its logical completeness. If a single one of the conclusions drawn from it proves wrong, it must be given up; to modify it without destroying the whole structure seems to be impossible." 57 And thirty years later he said of his unified field theory: "In favour of this theory are ... its logical simplicity and its 'rigidity'. Rigidity means that the theory is either true or false, but not modifiable." 58 What is here called 'rigidity' of a theory Einstein usually calls its 'logical unity', and the postulate of logical unity is one of the most often repeated requirements of a theory that we find in Einstein's writings. On occasion he can become enthusiastic about the matter. In a paper of 1929 he formulates the postulate in greater detail 59 by saying "that we do not only want to know what nature is like .... but also .... want to achieve the utopian ... goal to know why nature is as it is."
As a paradigm Einstein refers to the kinetic theory, and of theories like this one he says, almost paradoxically, that one there succeeds "in conceiving of the empirical lawlikeness as logical necessity .... Even God could not have determined those connections differently from what they in fact are. This is - Einstein concludes - the promethean element of scientific experience as we try to catch it by the term 'logical unity' .... it is so to speak the religious basis of our scientific endeavours." It is in connection with the decisive rational element of logical unity that also in Einstein's thinking the development of physics comes into play. Our search for the greatest possible logical unity of the world view is mirrored in the level structure of physics. 60 From level to level the system displays a greater logical simplicity, and this is the way in which physics makes progress. At the same time there is a complementary removal of the anthropomorphic elements of our experience, just as we have found it so much emphasized by Planck: " ... it must be conceded - says Einstein 61 - that a theory has an important advantage if its basic concepts and fundamental hypotheses are 'close to experience' .... Yet more and more, as the depth of our knowledge increases, we must give up this advantage in our quest for logical simplicity and uniformity in the foundations of physical theory."
57 Einstein 1989, p.131 (italics mine) (engl. transl. in 1954, p.232)
58 Einstein 1950, p.15 (italics mine)
59 Einstein 1929, pp.126f
60 Einstein 1984, pp.67ff (engl. transl. in 1954, pp.293ff)
61 Einstein 1950, p.15
There is a second sort of complementarity at work here. In principle we are completely free to choose the concepts at the various levels. But as a matter of fact, despite all freedom of reason, there is essentially only one route open to us. "One might suppose that there were any number of possible systems of theoretical physics all equally well justified; and this opinion is no doubt correct, theoretically. But the development of physics has shown that at any given moment, out of all conceivable constructions, a single one has always proved itself decidedly superior to all the rest. Nobody who has really gone deeply into the matter will deny that in practice the world of phenomena uniquely determines the theoretical system, in spite of the fact that there is no logical bridge between phenomena and their theoretical principles." 62 In conclusion, I think Einstein as well as his forerunners Boltzmann and Planck, in spite of obvious differences in their views on reality, would agree that unless one holds an essentially realistic position the success of science would remain a miracle. 63 And yet there remains the miracle that the world is conceivable at all. Einstein has given it the wording: "The eternal mystery of the world is its comprehensibility." 64 And again in greater detail: "I believe that every true theorist is a tamed metaphysicist, no matter how pure a 'positivist' he may fancy himself. The metaphysicist believes that the logically simple is also the real. The tamed metaphysicist believes that not all that is logically simple is embodied in experienced reality, but that the totality of all sensory experience can be 'comprehended' on the basis of a conceptual system built on premisses of great simplicity.
The skeptic will say that this is a 'miracle creed'. Admittedly so, but it is a miracle creed which has been borne out to an amazing extent by the development of science."65
62 Einstein 1989, p.109 (engl. transl. in 1954, p.226)
63 Cf. Putnam 1984, pp.140f
64 Einstein 1984, p.65 (engl. transl. in 1954, p.292)
65 Einstein 1950, p.13
III. Reconstruction
Reconstructionism is a methodology of logical empiricism according to which in epistemology and philosophy of science "one should not describe the real process of obtaining knowledge in its concrete constitution but rather give a rational reconstruction of its formal structure" (Carnap, see the beginning of [13]). The reconstruction is meant to be a translation of a primary scientific text into a logically impeccable language such that "the new determinations ..... are superior to the old ones with respect to clarity and precision" (Carnap, ibid.). In the sixties this methodology came under fire from two sides. From the side of constructive philosophy of science, its advocates were blamed for keeping their reconstructions much too close to the actual procedure of the scientists without ever giving them a critical touch. By contrast, the representatives of the historically oriented philosophy of science deplored the lack of real life in the reconstructions, these being "generally unrecognizable as science to either historians of science or scientists themselves" (Kuhn, see §I of [13]). In other words, what looked too descriptive for the constructivists appeared too normative for the historians. Reconstructionism thus cornered from two sides is the subject of papers [13] and [14] and is defended against its opponents, mainly those in the Kuhn/Feyerabend camp. 1 For a systematic theory of rational reconstruction the reader is referred to the main text ([13], §II). The importance of the enterprise lies in its antiabsolutist tendency. Precisely those historians who do not possess such a theory in the background run the danger of believing that there is such a thing as the rendering of a historical event - the most faithful rendering, telling us "how it really happened". In order not to succumb to this danger, the reconstructionist, armed with an explicit theory of rational reconstruction, is clear from the very beginning about the relativity of his enterprise.
For him the most important question is not: is this an adequate reconstruction of science? Instead the claim connected with a given reconstruction is always only the relative one: if we choose such and such a reconstruction frame then such and such a piece of science assumes such and such shape. Interest need not be devoted exclusively to the subject; it may also be directed to the means of representing the subject under given circumstances.
1 See also Scheibe 1997b, Ch. 1.3; 1986c and 1988g
E. Scheibe, Between Rationalism and Empiricism © Springer-Verlag New York, Inc. 2001
Perhaps the most important and most frequently used theoretical means in the reconstruction business is logic. No wonder, therefore, that logic also has been the favorite target of Feyerabend's attacks. He speaks, for instance, of the "host of bewildered philosophers of science who have read a few logic books but have never seen science from nearby". 2 But there is no justification for the argument that a logical reconstruction, by the very fact of invoking logic, is insufficiently based upon historical reality. In principle, a logical reconstruction is not exempt from the requirement of being fair to the facts any more than a historical reconstruction is exempt from following the elementary rules of logic and language. It is, of course, true that historical research and logical analysis may diverge. A historian may legitimately become interested in the development of concepts whose very vagueness would defy all attempts at logical analysis. Conversely, a modern reformulation of mechanics may throw no light whatever on Newton's 'Principia'. But whereas no logician would dream of blaming the historian for not having brought in logic, it has become fashionable lately to blame logical reconstructionists for having forgotten historical reality. But the very advocates of the importance of history for appraising science are likely to forget their message as soon as logic enters the scene. However, logic is itself a historical subject and not isolated from science, mathematics and their history. The justification of the critique of the historians can best be discussed in the light of examples. In [13] the general considerations are followed by a presentation of examples that have been suggested (or could have been suggested) from the side of reconstructionists. In [14] the attacks of the opponents are rejected. The cases favoured by Carnap and his followers, namely explications of concepts, had been treated already by Kant in a similar manner. 
In particular Kant pleaded for the determination of adequacy conditions before one tackles the problem of definitions. That reductions can also be seen as reconstructions is shown by the reduction of Aristotelian syllogistic to modern predicate logic (with quantifiers). Finally, descriptive reconstructions are shown to be successful mainly as reconstructions of the various theories of physics. Their 'relativity' finds expression in the well-known procedures of idealization, simplification, neglect etc. without which useful results could not be achieved. Descriptive reconstructions thus prepare the ground for the investigation of intertheoretic relations, especially those playing a central role in the development of physics. It turns out that the crucial intertheoretic relation of incommensurability is by far less dangerous for an understanding of the development than Kuhn and Feyerabend would have us believe. 3 The problem treated in [11] concerns the reconstruction of an adequate concept of a physical theory. Proceeding from the assumption that physical theories are to be reconstructed as so-called species of structures (in the sense of Bourbaki), it is investigated what it is that makes a species of structures the formal framework of a physical theory and not of something else.
2 Feyerabend 1981c, p. 237
3 See also Scheibe 1999
Insofar as a physicist is acquainted with examples of species of structures, he knows them from mathematics, i.e. he knows something about groups, rings, fields, vector spaces, topological spaces, fibre bundles etc. But when applied in physics, these structures appear as deduced from the original physical theories and not as being themselves these theories. In Newton's mechanics, for instance, euclidean space can be found and from this space its topology can be deduced. Likewise, a group can be derived, namely the group of all euclidean transformations. Configuration space and phase space can be deduced, and from the latter we obtain the commutative algebra of quantities etc. This apparently unlimited deducibility suggests again that their common starting point - Newton's theory - is itself a species of structures and that deducibility is to be understood in the usual sense in which, starting with one species of structures, many others are deducible. The main question then is: Are there any features common to those and only those species of structures that are the mathematical framework of some physical theory? Since physical theories are about physical systems and, therefore, physical systems are structures, one can also give the question the following wording: Is there a feature of structures common to those and only those structures that are isomorphic to some physical system? It is this question that is pursued in [11] without, of course, being solved. Paper [12] would not properly belong to a chapter on reconstructions were it not for illustrating from the viewpoint of reconstructionism how a comparison of two concepts of physical theory is made possible if the concepts have been presented already in an incontestable form. Both concepts had been proposed independently in the seventies. They were meant to be concepts of the same subject matter but they looked rather different.
This situation demanded a comparison. On account of the very abstract nature of the subject and its treatment, the comparison came out rather abstract, and the paper is in this respect the most difficult text in this collection. 4
4 A simpler version of [12] is Scheibe 1983
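The remark above, that from euclidean space "a group can be derived, namely the group of all euclidean transformations", can be imitated in a finite toy setting. The following Python sketch is only an illustration under strong simplifying assumptions and is not taken from the papers: euclidean space is replaced by the four corners of a unit square, a "transformation" is a permutation of these corners that preserves all squared distances, and the names POINTS, sq_dist, is_isometry, compose and forms_group are hypothetical helpers introduced for this example.

```python
from itertools import permutations

# Toy stand-in for euclidean space (an assumption of this sketch):
# the four corners of a unit square, with the usual squared distance.
POINTS = [(0, 0), (1, 0), (1, 1), (0, 1)]


def sq_dist(p, q):
    """Squared euclidean distance between two points of the plane."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def is_isometry(perm):
    """perm[i] is the index of the image of POINTS[i]; check that all distances are kept."""
    n = len(POINTS)
    return all(
        sq_dist(POINTS[i], POINTS[j]) == sq_dist(POINTS[perm[i]], POINTS[perm[j]])
        for i in range(n)
        for j in range(n)
    )


def compose(f, g):
    """Composition of two permutations: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))


# The structure "deduced" from the space: all distance-preserving permutations.
ISOMETRIES = {p for p in permutations(range(len(POINTS))) if is_isometry(p)}
IDENTITY = tuple(range(len(POINTS)))


def forms_group(elements):
    """Closure, identity and inverses; associativity holds for composition of maps."""
    closed = all(compose(f, g) in elements for f in elements for g in elements)
    has_identity = IDENTITY in elements
    has_inverses = all(
        any(compose(f, g) == IDENTITY for g in elements) for f in elements
    )
    return closed and has_identity and has_inverses


if __name__ == "__main__":
    print(len(ISOMETRIES), forms_group(ISOMETRIES))
```

The point of the sketch is only that the group is not postulated separately: it is computed from the space and its distance function, which is the sense of "deducible" used in the text.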
III.11 On the Structure of Physical Theories *
In this paper I want to suggest a partial characterization of physical theories by means of the concept of a species of structures in the sense of Bourbaki. 1 Before going into the details it will perhaps be useful to indicate where my approach comes to stand with respect to the existing views on the structure of scientific theories in general. This can most readily be done by reference to the very comprehensive review of the matter given by Suppe. 2 In his paper Suppe first goes through several stages of the empiricist view on scientific theories which he calls - following Putnam - the 'received view'. After a thorough-going criticism of it he comes to several alternatives to this view that have been developed of late. Some of them he groups together under the heading 'semantic approaches'. They comprise work done by himself as well as by Beth, Suppes, van Fraassen, Sneed and others. Von Neumann is mentioned as the ancestor of this kind of analysis. It is, therefore, of some importance to remember that when in the late twenties von Neumann started work on the mathematical foundations of quantum mechanics resulting in his famous book 3 he did not set about developing an alternative to any received view on scientific theories. At a time when the empiricist view did not even exist von Neumann was rather directly faced with the problem of casting the newly invented mechanics into a form that could satisfy him as a mathematician. He thereby did a piece of work that no subsequent empiricist philosopher of science - perhaps excepting Reichenbach - even attempted to do: the reconstruction of a highly abstract but fundamental physical theory without any quarrels about the theoretical-observational dichotomy and related problems. Accordingly von Neumann's result was nearer to the actual practice at least of the theoretical physicists than any of the various versions of the empiricist views on scientific theories that were to come.
Having been interested in the foundation of quantum mechanics I myself have published two books in this field that were written in the spirit of the von Neumann line of analysis. 4 Since Suppe is quite right in placing the 'semantic approaches' alongside von Neumann's pioneering work I would, therefore, have to say in retrospect that my own view has to be classified among the 'semantic approaches'. But since it is not identical with any of them I may be allowed to present a new version of it in this paper. As a matter of fact it will only be one particular aspect of the whole problem of the structure of physical theories that I shall be discussing in some detail. But in order to show the context of this particular aspect I am going to premise some remarks on physical theories in general. As I see it, there are at least three main aspects under which a physical theory can be viewed, namely
a) its mathematical structure
b) its physical interpretation
c) its intended and actual applications.
* First published as Scheibe 1979
1 Bourbaki 1968
2 Suppe 1974
3 v. Neumann 1932
4 Scheibe 1964 and 1973c
That there is something like the mathematical structure of a physical theory is perhaps obvious from any modern textbook in theoretical physics. In opening such a book the overwhelming impression of what is going on there is that of more or less complicated mathematical reasoning. Moreover, it can easily be recognized that according as the physical subject matter varies from chapter to chapter or from book to book different kinds of mathematical structures determine the reasonings. Thus already from a superficial look at the relevant literature it may be surmised that physical theories can be characterized by kinds of mathematical structures. It goes without saying that such a characterization is of necessity incomplete since a physical theory that deserves its name has empirical implications that simply transcend the possibilities of pure mathematics. Here we meet the second aspect mentioned in my threefold division: the physical interpretation. Even in a book on theoretical physics the language used is not entirely mathematical. Rather it contains such general physical terms as 'particle', 'field', 'state', 'quantity', 'probability' etc. and very often also more special terms such as 'time', 'space', 'position', 'energy', 'temperature' etc. The use of such terms indicates that the book is not just about mathematical entities but also and mainly about the real world. In analyzing the peculiar way in which the physical terms are used in connection with the mathematical structures involved we would get an impression of how a physical interpretation of the latter is accomplished. This analysis is perhaps the most difficult part in an attempt to give an adequate reconstruction of the nature of physical theories. In particular, there is no general agreement on the amount of interpretation that a theory must be given in order to lead to empirically significant results.
In this respect my own use of the term 'physical interpretation' would be such that it does not yet include the actual or possible referents of the theory. Rather it is confined to whatever is necessary in order to identify the intended referents by physical means, e.g. measuring instruments. The intended and actual applications of a physical theory are better viewed under a separate aspect: As compared with a physical interpretation in the sense indicated the announcement of a range of intended or actual referents of a theory is a new logical step. Its distinction from a mere physical interpretation approximately mirrors the general distinction made between meaning and fact. Of course, just as there was no sharp borderline between the purely mathematical and the interpretative aspect of a physical theory, neither is there any between the latter and the aspect of a theory's applications. These distinctions will always be blurred by the existence of theoretical quantities (or concepts) and by the necessity of constructing our concepts of kinds of physical objects with the help of the very theory for which they are the objects under investigation. Correspondingly there is no consistent use of the term 'theory' separating the three aspects under discussion. Thus, if we speak of Kepler's theory then perhaps all three aspects are present, including well-determined referents of the theory. In speaking of Newton's gravitational theory or general Newtonian mechanics only the first two aspects a) and b) come to mind. Finally, modern presentations of general quantum theory mostly are restricted to aspect a). In spite of these circumstances the division suggested will be useful for a first orientation and this is what was intended by it. I am now in the position to say that the following investigation will be confined to aspect a) although on several occasions I shall make comments also on the two other aspects. In thus approaching the mathematical part of a physical theory I first want to draw attention to the nowadays widely recognized view that modern mathematics is a science or even the science of abstract structures. The view has found its most conspicuous expression in the monumental work on the elements of mathematics composed by a group of French mathematicians under the pseudonym of Bourbaki. 5 It has been accepted by many theoretical physicists in writing books and papers in the more general and abstract parts of physics such as quantum theory, especially quantum field theory, and the general theory of relativity. There are even books written under the aspect of applying this or that particular kind of mathematical structure to physical problems. Owing to the classics of Weyl and Wigner this is well known in the case of groups. In the meantime more and more kinds of structures were included in this kind of publications. 6 But if we turn from physics to philosophy of science the situation is different: With the exception of some authors belonging to the 'semantic approaches' camp nobody has taken advantage of the 'structuralistic' view of mathematics in the analysis of scientific theories. Now, it may be argued that as long as nothing but the mere occurrence of this or that kind of mathematical structure in theories of physics is pointed out nothing of any philosophical interest has been shown to occur.
This is true enough, but it ceases to be true as soon as a general research program is connected with the phenomenon under discussion, and such a program is readily at hand: the really interesting task that poses itself in view of a systematic use of mathematical structures in physical theories is a detailed analysis that would display the specific role played by each kind of structures in a physical theory. In view of the aspects b) and c) indicated above this task quickly turns out to be of considerable complexity and cannot be described even superficially in this paper. Rather, one further restriction of the following considerations is necessary. My main problem will be the question whether kinds of mathematical structures can be used to characterize physical theories as such. It is therefore the logical part of the program just indicated that I am going to investigate in more detail. As I see it the problem has the following two main aspects: 1) When a particular physical theory is pointed out, e.g. the usual non-relativistic quantum theory of the hydrogen atom or classical Hamiltonian mechanics or
⁵ Bourbaki 1968
⁶ See e.g. Hermann 1970 for vector bundles and Choquet-Bruhat et al. 1977 for manifolds
the general theory of relativity, can we then point to a kind of mathematical structures and claim that this kind of structures is characteristic for the given theory in the sense that another physical theory would have another kind of structures inequivalent to the first as being characteristic for it and a third physical theory would have some third such thing as characterizing it etc.? 2) Are there one or two or three (or how many?) very general but still interesting kinds of mathematical structures a) such that in each kind of structures characteristic of a physical theory in the sense of question 1) there is involved (exactly) one of these general kinds of structures and b) such that it is this involvement that is responsible for the underlying physical theory to be a theory at all? In other words, in question 1) it is asked how physical theories can be distinguished from each other and in 2) what they have in common as being physical theories, and with regard to both questions kinds of mathematical structures are supposed to be the main tools in answering them. Neither of the questions admits of a straightforward answer since none of the kinds of structures most frequently found in mathematical or physical books is able to characterize a physical theory in the sense of question 1) or 2). E.g. the kind of structures known as groups can neither be used to distinguish one physical theory, say, classical mechanics from another one, say, quantum mechanics nor is it the hallmark of a physical theory as such. That is not to say that the kind of structures called 'groups' and all the other kinds of mathematical structures well known by current names may not occur as part of the kinds we are looking for. Rather it was the very fact of this occurrence that called my attention to the kind of analysis I am going to carry out and to the problems 1) and 2) in particular. 
But they themselves, the structures known as groups, rings, fields, topological spaces, manifolds etc., do not yet solve these problems. On the other hand, there is every reason to suppose that certain other kinds of structures, not different in principle from those most frequently considered, do solve them. The solution will then require a concept of kinds of structures such that the well known kinds of mathematical structures as well as the kinds we are looking for can be subsumed under it. In the relevant literature on the foundations of the empirical sciences it is the concept of a set-theoretical predicate that has been offered and used to clear up matters that are not far off the ideas I am going to develop. Its application to problems in the philosophy of science was probably first suggested by Suppes, and recent actual applications to the formation of concepts related to physics are to be found in books by Sneed and Stegmüller⁷. On the other hand, there is an independent approach in this field by Ludwig⁸ using the concept of a species of structures in the sense of Bourbaki. Both concepts, that of a set-theoretical predicate and that of a species of structures, belong to the metatheory of set theory and are set-theoretical versions of the concept of an axiomatized theory. But for the
⁷ Suppes 1957, Sneed 1971, Stegmüller 1976
⁸ Ludwig 1978
first concept only naive set theory is presupposed and, as far as I can see, a set-theoretical predicate is understood to be just any predicate with sets as arguments. In contrast with this conception the Bourbakian concept is a precise syntactical concept founded on a strictly formal approach to set theory. At the same time a species of structures is not an arbitrary set-theoretical predicate but is subject to certain rather important conditions. Together with a gain in precision we therefore have a greater specificity in this concept as compared with the other one, and these are the main reasons why my further investigations will be based on it. Since it seems not to be well known in the circles that could take advantage of it, a very brief introduction may be in order.

Because of the somewhat idiosyncratic way in which the Bourbaki group presents the foundations of mathematics I am inclined to modify their concept of a species of structures by adapting it to a standard formulation of (first order) set theory, e.g. to the system of Zermelo-Fraenkel (ZF). With this modification in mind let $T$ be any extension by definitions of ZF. A species of structures (in $T$) is a formula (some indices suppressed)

$$
\left.
\begin{aligned}
& x_1 \neq \emptyset \wedge \dots \wedge x_m \neq \emptyset \\
& \wedge\; s_1 \in \sigma_1[x, a] \wedge \dots \wedge s_p \in \sigma_p[x, a] \\
& \wedge\; \alpha(x, \xi, s)
\end{aligned}
\right\} \tag{1}
$$
where the $x_\mu$ and $s_\pi$ are variables, the $a_\kappa$ are terms of $T$ possibly depending on the variables $\xi_\gamma$ different from the $x_\mu$ and $s_\pi$, the $\sigma_\pi[x, a]$ are terms constructed from the $x_\mu$ and $a_\kappa$ by successively applying one of the operations yielding a power set or a Cartesian product, and $\alpha(x, \xi, s)$ is a formula in the variables $x_\mu$, $\xi_\gamma$ and $s_\pi$ satisfying the invariance condition

$$\vdash_T \alpha(x', \xi, s') \leftrightarrow \alpha(x, \xi, s). \tag{2}$$
Here the $x'_\mu$ result from the $x_\mu$ by any bijections and the $s'_\pi$ from the $s_\pi$ by corresponding bijections canonically determined by the former on account of the second line in (1) and (consequently) satisfying

$$\vdash_T s'_\pi \in \sigma_\pi[x', a]. \tag{3}$$
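The invariance condition (2) can be illustrated by a small computational sketch. It is not part of the formal development: the restriction to finite sets, the choice of the group axioms as the formula playing the role of the axiom, and the names `is_group`, `transport` and `mentions_zero` are assumptions made here for illustration only. The sketch carries a structure along a bijection of its principal base set and checks that an invariant axiom holds for the transported structure exactly when it holds for the original one, whereas a formula referring to a particular element is not invariant.

```python
from itertools import product


def is_group(x, s):
    """Axiom (illustrative): the binary operation s, a dict on x * x, makes x a group."""
    elems = list(x)
    # Typification: s must be a total map from x * x into x.
    if set(s) != set(product(elems, repeat=2)) or not set(s.values()) <= set(elems):
        return False
    # Associativity.
    if any(s[s[a, b], c] != s[a, s[b, c]] for a, b, c in product(elems, repeat=3)):
        return False
    # Existence of a two-sided identity and of inverses.
    identities = [e for e in elems if all(s[e, a] == a == s[a, e] for a in elems)]
    if not identities:
        return False
    e = identities[0]
    return all(any(s[a, b] == e == s[b, a] for b in elems) for a in elems)


def mentions_zero(x, s):
    """A formula that is NOT invariant: it refers to a particular element of x."""
    return 0 in x


def transport(x, s, phi):
    """Carry (x, s) along the bijection phi; the map on s is determined by phi."""
    x_new = {phi[a] for a in x}
    s_new = {(phi[a], phi[b]): phi[c] for (a, b), c in s.items()}
    return x_new, s_new


# The integers modulo 3 under addition, and a relabelling of its elements.
x = {0, 1, 2}
s = {(a, b): (a + b) % 3 for a in x for b in x}
phi = {0: "e", 1: "r", 2: "rr"}
x_prime, s_prime = transport(x, s, phi)

print(is_group(x, s), is_group(x_prime, s_prime))
print(mentions_zero(x, s), mentions_zero(x_prime, s_prime))
```

In this sketch `is_group` would be admissible as the axiom of a species of structures, while `mentions_zero` would not, since its truth value changes under the relabelling.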
The second and the third line of (1) are respectively called the typification and the axiom of the species of structures determined by (1). The concept of a species of structures thus defined is obviously a syntactical concept. But it has a semantical counterpart in the following sense: if we introduce a model $M$ of $T$ (expanding a model of ZF) then to every species of structures (1) a class of structures with the principal base sets $x_\mu$, the auxiliary base sets $a_\kappa$ and the typified sets $s_\pi$ is uniquely assigned: they are the systems of sets in $M$ satisfying (1). The classes of structures called groups, rings, vector spaces, manifolds etc. in ordinary mathematics are classes in this sense with the only difference that in ordinary mathematical thinking
these classes are not relativized to a model of a formalized set theory: with naive set theory as the background they are rather taken in an absolute sense. As we shall see later on, the sort of applications that I want to make of the concept of a species of structures can easily lead into the well known set-theoretical difficulties if they are based only on the naive standpoint. One way out of this dangerous perspective would be to keep the whole consideration on a purely formal level. This is indeed the way chosen by Bourbaki. But the very same applications just mentioned also suggest thinking of set theory as being about some sufficiently well defined subject matter: it is the kind of thinking in terms of sets of real possibilities that requires this contentional view of mathematics. A real possibility is an abstract entity realizable in the physical world, and a realization is brought about by a physical interpretation or an actual application of a theory in the sense indicated above. In Newton's gravitational theory, to give but one example in advance, the points of Euclidean space (as a mathematical structure) are real possibilities. In every experiment or observation relying on a spatial reference frame physicists try to realize some of the points of Euclidean space by parts of rigid bodies, thereby giving a physical interpretation to them. Again, a system of $n$ functions of mathematical time with values in Euclidean space is a real possibility. With respect to a suitable reference frame it may be realized (approximately) by a system of $n$ gravitating bodies moving in space and thus leading to an application of Newton's theory. In cases like these - the points of space, the orbital functions and many others - I think it at least helpful for the understanding that not only a term in a formal language is present to the mind but also the idea of a set of different possibilities telling us what may be realized in nature.
It is for this reason that I suggest the introduction of a model of set theory in which the structures of a given species of structures may be 'visualized'. To be sure, the totality of real possibilities related to a physical theory will never be given by the class of structures defined by the species that characterizes the theory. If this were the situation then, on account of the invariance condition (2), with every structure representing a real possibility every structure isomorphic with it would also represent a real possibility. This will never happen in theories about the real world. A physical theory that leads immediately to the description of real possibilities will rather be characterized by a species of structures having no two nonisomorphic structures belonging to it, and the sets of one single structure out of these will comprise the real possibilities provided by the theory.

The concept of a species of structures together with its semantical counterpart may now be compared with related concepts that are to be found in the literature. One of them is the concept of a set-theoretical predicate that has already been mentioned. Its explication with respect to our foundations - the Zermelo-Fraenkel system and a model of it - obviously consists of an arbitrary formula $P(y)$ of the extension $T$ of ZF, $y$ being the only free variable in $P$, and the corresponding class of sets $y$ in the model $M$ of $T$ satisfying
$P(y)$. It is likewise obvious that this is a more general conception than that of a species of structures: the requirement that a set-theoretical predicate should correspond to a species of structures would mean that it has to be of the form

$$\exists x\, \xi\, s.\;\; y = (x, a(\xi), s) \wedge \Sigma(x, \xi, s), \tag{4}$$
where $\Sigma$ is the formula (1) subject to the conditions connected with this formula. I do not want to justify the restrictions imposed on the concept of a species of structures as compared with the concept of a set-theoretical predicate other than by pointing out that the solution of problems 1) and 2), if possible by means of one of these concepts, is possible by means of the first. There is, however, the problem whether certain infinities, not covered by either of the concepts, must be taken into account. This leads to one other related conception developed in modern mathematical logic (in the sense of model theory): the concept of a formal theory and its corresponding class of models⁹. Model theory is usually developed on the basis of naive set theory only, and it is mostly confined to first order and even to one-sorted theories. For the sake of comparison the first circumstance can easily be overcome by relativization of the concept of a model to the one selected model $M$ of set theory: given an arbitrary theory, only its models in $M$ are admitted. As regards the second point - restriction to first order and one-sorted theories - the concept of a species of structures would be much more general since no restrictions are imposed on the typification in (1). But from a purely conceptual point of view the introduction of higher order and many-sorted theories and their models presents no difficulties, and we can assume that the model-theoretical approach is brought into line with ours in this respect. On the other hand, theories in the sense of mathematical logic are usually not assumed to have a finite vocabulary or a finite axiom system. Accordingly, structures with infinitely many typified sets are admitted, and classes of structures that are not finitely axiomatizable may occur. On the face of it, these assumptions are too general to be digested by our concept of a species of structures.
For finitely axiomatized theories with a finite vocabulary a semantically adequate translation of such a theory into a formula (1) is easily achieved. But there seem to be no systematic and comprehensive investigations of the connection in the general case. Even in the finite case the translation of all the secondary concepts and the main results of mathematical logic into the Bourbaki scheme quickly leads to open problems.

Before we can come back to physics it is necessary to introduce two further concepts based on the concept of a species of structures and yielding constructions of such species from given species. To arrive at the first concept, let $\Sigma$ be a given species of structures (1), and let $d_\nu(x, s)$ and $D_\nu[x, a]$
⁹ Shoenfield 1967, and Mal'cev 1971
be terms, the latter being constructed from the $x_\mu$ and $a_\kappa$ in the same manner as the $\sigma_\pi[x, a]$ in (1) were assumed to be formed. Finally, assume that

(5)

where the dashes in the second line refer to arbitrarily given bijections performed on the $x_\mu$, the corresponding bijections performed on the $s_\pi$ according to the typification in (1), and the corresponding bijections performed on the $d_\nu(x, s)$ according to the typification (first line) in (5), respectively. Given these data we now add to $\Sigma$ 1) new variables $y_\nu$ indicating new typified sets, 2) their typification

$$y_\nu \in D_\nu[x, a] \tag{6}$$

and 3) the new axioms

$$y_\nu = d_\nu(x, s). \tag{7}$$
The result is a new species of structures $\Sigma'$ that could be called an extension by definitions of $\Sigma$ because it corresponds to the equally named procedure concerning theories in the sense of mathematical logic¹⁰. The procedure has an obvious semantical counterpart: owing to (6) and (7) every structure belonging to $\Sigma$ is uniquely expanded by definitions to a structure belonging to $\Sigma'$¹¹.

To arrive at the second concept let us once more start from a species of structures $\Sigma$ given by (1). Besides terms $d_\nu$ and $D_\nu$ satisfying (5), another family $\delta_\rho$ and $\Delta_\rho$ likewise satisfying (5) shall be given, and the $\delta(x, s)$ shall be typified by the $d(x, s)$: $\delta(x, s) \in \tau[d(x, s), b]$. Then, with respect to a second species of structures $\Theta$, it may happen that

$$\vdash_T \Sigma(x, a, s, \sigma, \alpha) \rightarrow \Theta\bigl(d(x, s),\, b,\, \delta(x, s),\, \tau[d(x, s), b],\, \beta(d(x, s), \delta(x, s))\bigr). \tag{8}$$
In this case we could speak of a deduction of $\Theta$ from $\Sigma$ by means of the deducing terms $d_\nu$ and $\delta_\rho$, for this procedure generalizes the deduction of a theory from another theory as it is known in mathematical logic. Its semantical counterpart with respect to fixed interpretations of the $a_\kappa$ and $b_\lambda$ is a mapping that sends every structure $(x, s)$ over $a$ of $\Sigma$ into the structure $(d(x, s), \delta(x, s))$ over $b$ of $\Theta$ (isomorphic structures being sent into isomorphic ones). The simplest case would be that in $\Sigma$ only the axiom is weakened (identity mapping). A more complicated case is the previously introduced extension by definitions with $\Sigma'$ as $\Theta$ (expansion mapping keeping the principal base sets but introducing new typified sets). In the general case
¹⁰ Shoenfield 1967, Ch. 4.6
¹¹ Shoenfield 1967, Ch. 6.9
also the principal base sets may change: think of a group and the lattice of its subgroups. This lattice is constructed by a deduction. Its principal base set would be the set of subgroups of the given group. The meet and join operations - its two typified sets - would be deduced as the intersection of two subgroups and the subgroup generated by two subgroups.

Every deduction can be represented as the product of an extension by definitions and another simple kind of deduction: from the data for an arbitrary deduction as in the preceding paragraph we first extend $\Sigma$ by definitions according to (6) and (7):
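$$
\left.
\begin{aligned}
y_\nu \in D_\nu[x, a], &\quad t_\rho \in \Delta_\rho[x, a] \\
y_\nu = d_\nu(x, s), &\quad t_\rho = \delta_\rho(x, s)
\end{aligned}
\right\} \tag{9}
$$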
From the resulting $\Sigma'$ we then deduce $\Theta$ by deducing terms that simply pick out the $y_\nu$ and $t_\rho$ as principal base sets and typified sets respectively. $\Sigma'$ can be viewed as the incorporation of $\Theta$ into $\Sigma$. It is not difficult to prove that this incorporation can be iterated in an obvious way in the case that a third species of structures $\Gamma$ can be deduced from $\Theta$ etc. In this way we can exhibit within a given species of structures whatever species of structures are deducible from it.

The last remark leads back to the problems 1) and 2) posed above. For my general strategy in solving these problems will be the following: given a physical theory, find a species of structures $\Sigma$ - called basic for the theory - such that a) $\Sigma$ cannot be represented as already constructed from other species of structures by the procedure just indicated, i.e. it is irreducible in this sense, whereas b) all the physical concepts occurring in the given theory can be represented within species of structures deducible from $\Sigma$ and, consequently, can be incorporated into $\Sigma$ by means of the aforementioned procedure. The eventually resulting species of structures will then be characteristic for the given theory in the sense of problem 1), and it can be expected that among the species of structures by which $\Sigma$ is extended in the sense of b) there is one which characterizes the theory as being a physical theory of some very general kind and in this way leads to a solution of problem 2). I am fully aware of the fact that the strategy given by a) and b) is less well determined than its wording may suggest it to be. In particular, a) makes implicit use of a concept of equivalence between species of structures not yet defined, and in b) it is unclear what is meant by a 'physical concept occurring in the given theory'. A fortiori we will not know when the process of extending $\Sigma$ will be completed. Yet I hope that the following example will gradually clear up my intentions.
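The mathematical example of a deduction given earlier, a group and the lattice of its subgroups, can be made concrete in a short sketch. It is an illustration under stated assumptions and not part of the formal development: the group is taken to be finite, and the function names `subgroups`, `generated` and `deduce_lattice` are hypothetical. The deducing terms produce a new principal base set (the set of subgroups) and two new typified sets (meet and join).

```python
from itertools import combinations


def subgroups(x, op):
    """All subsets of the finite group (x, op) that contain the identity and are closed under op."""
    elems = sorted(x)
    identity = next(e for e in elems if all(op[e, a] == a for a in elems))
    found = set()
    for size in range(1, len(elems) + 1):
        for subset in combinations(elems, size):
            h = frozenset(subset)
            # In a finite group, closure under op suffices for being a subgroup.
            if identity in h and all(op[a, b] in h for a in h for b in h):
                found.add(h)
    return found


def generated(op, gens):
    """Smallest subset containing gens that is closed under op."""
    h = set(gens)
    while True:
        bigger = h | {op[a, b] for a in h for b in h}
        if bigger == h:
            return frozenset(h)
        h = bigger


def deduce_lattice(x, op):
    """Deducing terms: a new principal base set y and the typified sets meet and join."""
    y = subgroups(x, op)
    meet = {(h, k): h & k for h in y for k in y}
    join = {(h, k): generated(op, h | k) for h in y for k in y}
    return y, meet, join


# The integers modulo 6 under addition.
x = set(range(6))
op = {(a, b): (a + b) % 6 for a in x for b in x}
y, meet, join = deduce_lattice(x, op)

for h in sorted(y, key=len):
    print(sorted(h))
```

Here the principal base set of the deduced structure differs from that of the given one, which is the point of the example: the elements of `y` are subsets of `x`, not elements of it.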
Let us assume that the given physical theory is Newton's theory of gravitating bodies idealized as mass points. The following species of structures $\Sigma_{gr}$ can serve as our starting point in the sense of a), i.e. it will be basic for Newton's gravitational theory. Since $\Sigma_{gr}$ is already rather complicated I can describe it only roughly in the material mode of speech and in an (unnecessarily) simplified version according to which time and space are absolute. To
begin with, the structures in $\Sigma_{gr}$ would have two principal base sets $T$ and $\mathfrak{R}$ for time and space. Using suitable auxiliary base sets, above all the set $\mathbb{R}$ of real numbers, the ordinary metrics for $T$ and $\mathfrak{R}$ together with a distinguished set $F_{new}$ of Newtonian reference frames and a corresponding automorphism group $G_{new}$ can be introduced by so many typified sets and the appropriate axioms. Up to this point, our species of structures characterizes the structure of time and space and is, actually, a species that determines its structures uniquely up to an isomorphism. When it comes to the description of the $n$ gravitating bodies it seems necessary to introduce one further principal base set $M_0$ for the possible masses of the bodies. $M_0$ would have to be identified with $\mathbb{R}$ independently of the Newtonian frames. Finally, $2n$ further typified sets
(10) would describe the masses and motions of the $n$ bodies. Apart from axioms of minor importance they have to satisfy Newton's gravitational equations. (For simplicity the gravitational constant is assumed to be 1.) Brushing aside all questions of a purely academic nature, we are thus led to a unique species of structures $\Sigma_{gr}$ satisfying the condition a).

Evidently $\Sigma_{gr}$ is not one of the species of structures usually considered in mathematics. It is composed of such species and, accordingly, the latter can be deduced from it, e.g. the species of groups and that of metric spaces. But again, although this may become a matter of controversy, we are not interested primarily in deductions of these species of structures since the corresponding extensions of $\Sigma_{gr}$ according to b) would not approach a species of structures characterizing Newton's theory as a physical theory. On the other hand, the following series of deductions will lead to extensions approaching such a characterization. In each step a species of structures will be deduced that is basic with respect to some other, more general theory, i.e. it plays the role with respect to this theory that $\Sigma_{gr}$ plays with respect to Newton's gravitational theory.

In the first step a species of structures $\Sigma_{new}$ basic for Newton's general mechanics is deduced. $\Sigma_{new}$ itself is obtained from $\Sigma_{gr}$ by deleting the gravitational equations of the latter and adding force functions (as typified sets) (11) for every Newtonian frame $f \in F_{new}$ satisfying the usual transformation rules for forces and - together with the $m_\nu$ and $r_\nu$ of (10) - Newton's second law. For the sake of simplicity the forces are assumed to depend only on the positions of the particles, and for the same reason we will assume that they can be derived from a potential. Apart from trivial terms the deduction of $\Sigma_{new}$ from $\Sigma_{gr}$ is performed by defining the forces (11) to be the gravitational forces. Then Newton's second law follows immediately from the gravitational equations.
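The last step can be written out explicitly. The following is a sketch in standard textbook notation, relative to one fixed Newtonian frame; the symbols $K_\nu$ for the force on the $\nu$-th body and $V$ for the potential are assumptions introduced here for illustration, and the gravitational constant is set to 1 as in the text.

```latex
% Gravitational equations of \Sigma_{gr} (gravitational constant = 1):
m_\nu \ddot{r}_\nu
  = \sum_{\mu \neq \nu} m_\nu m_\mu \,
    \frac{r_\mu - r_\nu}{|r_\mu - r_\nu|^{3}},
  \qquad \nu = 1, \dots, n.

% Deducing term: the forces are defined to be the gravitational forces,
K_\nu(r_1, \dots, r_n)
  := \sum_{\mu \neq \nu} m_\nu m_\mu \,
     \frac{r_\mu - r_\nu}{|r_\mu - r_\nu|^{3}}.

% Newton's second law then holds by substitution:
m_\nu \ddot{r}_\nu = K_\nu(r_1, \dots, r_n).

% The forces depend only on the positions and derive from a potential:
V(r_1, \dots, r_n) = - \sum_{\mu < \nu} \frac{m_\mu m_\nu}{|r_\mu - r_\nu|},
  \qquad K_\nu = - \nabla_{r_\nu} V.
```

In this form the deduction consists of nothing more than a definition followed by a substitution, which is why the second law can be said to follow immediately.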
The extension of $\Sigma_{gr}$ corresponding to the deduction of $\Sigma_{new}$ leads to a species of structures $\Sigma_{(gr,new)}$. As compared with $\Sigma_{gr}$ the new $\Sigma_{(gr,new)}$
is enriched by the concept of force. It is, so to speak, the answer to somebody who, in view of $\Sigma_{gr}$ being associated with the gravitational theory, would ask: But what about forces? In our next step further questions of this kind will be answered. A species of structures $\Sigma_{ham}$ will be deduced from $\Sigma_{new}$, and $\Sigma_{ham}$ will be basic for Hamiltonian mechanics. Following modern presentations of this theory $\Sigma_{ham}$ has to be reconstructed in the following way:¹² besides the time structure $T$ we have an $m$-dimensional manifold $C$ as configuration space, a function $H$ on the cotangent bundle of $C$ as the Hamiltonian and a solution $(q, p)$ of the Hamilton equations describing the motion of the system. The species of structures thus defined is easily deduced from $\Sigma_{new}$: $C$ is $\mathfrak{R}^n$, $H$ is determined by the potential of the forces (11) and the masses $m_\nu$, and $(q, p)$ by the latter and the kinematical functions $r_\nu$ in the usual way. This deduction immediately leads to $\Sigma_{(new,ham)}$ and together with our first step to $\Sigma_{(gr,new,ham)}$. In these two species of structures, in contrast with $\Sigma_{gr}$, we have the possibility to speak of the instantaneous configurations of our system of $n$ bodies, and $\Sigma_{(gr,new,ham)}$ contains the result that the gravitational equations are a special case of the Hamilton equations.

Although our original species of structures $\Sigma_{gr}$ has thus been enriched once more it does not yet contain the concept of a state: in Hamiltonian mechanics the states are the points of the phase space. But the phase space is the cotangent bundle of the configuration space and, therefore, occurs neither as a base set nor as a typified set in $\Sigma_{ham}$. In the next step we introduce the concept of state following the idea that the time development of a physical system may be determined by its state at any time. The following is a species of structures $\Sigma_{det}$ basic for what may be called the state version of a theory of deterministic systems.
Besides the time structure $T$ our new species $\Sigma_{det}$ consists of a topological space $S$, a continuous representation $U$ of the automorphism group of $T$, i.e. the group of time translations, as transformations of $S$, a sufficiently large set $F$ of continuous functions from $T$ into $S$, and a 'solution' of $U$, i.e. an $f \in F$ with

$$f(t) = U_{t - t_0} f(t_0). \tag{12}$$
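A minimal numerical sketch may help to fix the roles of the state space, the dynamics and the distinguished solution in (12). It is an illustration under assumptions that go beyond the text: the system is taken to be the one-dimensional harmonic oscillator with unit mass and unit frequency, for which the Hamilton equations can be integrated in closed form, and the names `U` and `solution` are chosen here only to mirror the symbols in (12).

```python
import math


def U(tau, state):
    """Time translation by tau for the oscillator with H = (p**2 + q**2) / 2."""
    q, p = state
    c, s = math.cos(tau), math.sin(tau)
    # Closed-form integral of dq/dt = p, dp/dt = -q.
    return (q * c + p * s, -q * s + p * c)


def solution(t0, state0):
    """The function f determined by f(t) = U_{t - t0} f(t0), as in (12)."""
    return lambda t: U(t - t0, state0)


# The system 'under investigation': the state (q, p) = (1, 0) at time 0.
f = solution(0.0, (1.0, 0.0))

# Representation property: translating by a + b equals translating by b, then by a.
a, b = 0.7, 1.9
x0 = (0.3, -1.2)
print(U(a + b, x0))
print(U(a, U(b, x0)))

# Determinism: the state at time 0.5 fixes the state at time 2.0.
print(f(2.0))
print(U(1.5, f(0.5)))
```

The two pairs of printed values agree up to rounding. Since `U(-tau, ...)` undoes `U(tau, ...)`, the sketch also exhibits the reversibility of the time development.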
In the intended interpretation the points of $S$ are the possible states of a system, $U$ is the dynamics, the elements of $F$ are the possible physical systems - possible before the dynamics comes into play -, and the distinguished $f \in F$ is, so to speak, the system under investigation. Given $\Sigma_{ham}$ the new species of structures $\Sigma_{det}$ is deduced by letting $S$ be the cotangent bundle of the configuration space $C$, $U$ the transformation group that (under favorable conditions only!) is associated with the Hamiltonian $H$ by integration of the Hamilton equations, $F$ the set of all suitably smooth functions from $T$ into $S$, and the distinguished $f \in F$ the solution $(q, p)$ of the Hamilton equations
¹² Mackey 1963, Ch. 1; Hermann 1970, vol. 2, Ch. 11; Choquet-Bruhat et al. 1977, Ch. IV, C.9
distinguished in $\Sigma_{ham}$. We are thus again in a position to construct the extensions $\Sigma_{(ham,det)}$, $\Sigma_{(new,ham,det)}$, ... etc. In this way for every theory introduced so far the concept of state would explicitly be introduced and we would recognize that there is a deterministic and reversible time development of the physical systems that are the objects of the theories¹³. This, obviously, would be a particularly important insight into the nature of the gravitational theory which was our starting point.

Although my general idea for solving the problems 1) and 2) may already have become sufficiently clear I want to go one or two steps ahead in order to introduce still more abstract species of structures involved in our original theory. The species of structures $\Sigma_{det}$ introduced in the last step is obviously already a very general one. Accordingly, the step that leads to it is perhaps the greatest step that we shall have made in the whole course of our deductions. However, in $\Sigma_{det}$ the concept of a deterministic system is still represented by a typified set (the set $F$ above). Now it seems to me that the analysis of a given theory about a certain kind of possible objects is not completed until this kind has been represented by a principal base set. It was this very idea of abstraction that led to all the various species of structures nowadays investigated in mathematics. Therefore we should realize it also in the case before us, and this could even signal that in some respect this will be the final step in our analysis. (In another respect it certainly will not, because, as we shall see presently, further concepts, such as that of a property or a quantity of an object, still have to be introduced.)

The species of structures $\Sigma_{ob}$ characterizing the object version of a theory of deterministic systems may be roughly described as follows. The time structure is as before, $T$ being its principal base set, i.e. the set of time points.
There is one further principal base set $F$ the elements of which are possible physical systems of some kind. A set $S$ for the possible states of these systems is typified by

$$S \in \mathrm{Pow}(\mathrm{Pow}(F \times T)).$$

$S$ is a complete division of $F \times T$, and two $f, g \in F$ for which

$$(f, t) \in s \leftrightarrow (g, t) \in s$$

for all $t \in T$ and $s \in S$ are identical. Since the meaning of $(f, t) \in s$ is that at time $t$ the system $f$ is in state $s$, the foregoing assumptions can be rephrased by saying that at every time every system is in exactly one state, and every system is uniquely characterized by the time development of its state. Further axioms may guarantee that $F$ is large enough with respect to $S$. Also $S$ may be assumed to be a topological space (one further typified set being introduced). Furthermore, let $F_0$, typified by

$$F_0 \in \mathrm{Pow}(F),$$
¹³ Mackey 1963, Ch. 1.1
be such that, given $t \in T$ and $s \in S$, there is exactly one $f \in F_0$ such that $(f, t) \in s$, i.e. such that at time $t$ the object $f$ is in state $s$. Obviously, by $F_0$ we have introduced a 'dynamics' that makes the systems in $F$ deterministic (and reversible). If $S$ is assumed to be topological further axioms may guarantee the adequate continuity conditions. In a final step we introduce - as always - the system $f \in F_0$ as the system 'under investigation' and have thus completed the construction of $\Sigma_{ob}$. There is an obvious deduction of $\Sigma_{ob}$ from $\Sigma_{det}$: the new $F$ is taken to be the old $F$. The new $S$ has as its elements all sets
$$\{(f, t) \mid f \in F \wedge t \in T \wedge f(t) = s\}$$

with the old $F$ and an $s$ from the old $S$. $F_0$ is the set of all those $f$ from the old $F$ that satisfy (12). Finally, the new distinguished 'solution' $f$ is taken to be the old one. Actually, $\Sigma_{ob}$ is not essentially weaker than $\Sigma_{det}$. Thus we have even been successful in saying everything that was said in $\Sigma_{det}$, this time with the set of possible objects being a principal base set.

Having arrived at this stage of abstraction I want to insert a criticism of the reformulation of particle mechanics that can be found in Sneed's book¹⁴. Actually, other authors have given birth to this formulation. But in the context of my paper it can best be criticized in Sneed's version since he explicitly tries to subsume it under a very general kind of structures - just as I have done with my reformulations in the foregoing paragraphs. If in my own series of deductions and extensions we go back to $\Sigma_{new}$ and ask ourselves which set the set $F$ of $\Sigma_{(new, \dots, ob)}$ actually turns out to be, then the answer is that it is the set of all sufficiently smooth functions from time $T$ into the cotangent bundle of $\mathfrak{R}^n$ where $n$ is given in advance. Making the concept of a 'smooth function' precise, our $F$ would turn out to be a perfectly good set. Let us now look at how Sneed arrives at an analogous collection. In his book, Ch. VI, particle mechanics is reconstructed by introducing, among other things, an arbitrary finite set $P$, thereby generalizing the idea of a finite set of particles. Accordingly, the possible motions of the particles are described by functions in $P \times T \to \mathfrak{R}$, the possible masses by functions in $P \to \mathbb{R}$ etc. Also various axiom systems are suggested that gradually restrict the systems of particles to more and more special theories of mechanics. Now, what is done so far can very well be rephrased as establishing so many species of structures (in my sense), $P$ always being one of their principal base sets. But in the next Ch.
VII, beginning with (D26), Sneed sets about introducing a series of very general notions about the idea of a physical theory. With minor corrections these notions, too, can be rephrased as species of structures in the formal sense. Indeed, everybody acquainted with the latter concept and reading these passages in Sneed's book would read them in terms of species of structures. However, he would soon have to realize that this reconstruction is not possible in the semantical sense because in the most important
¹⁴ Sneed 1971
example that Sneed has to offer, classical particle mechanics, already the principal base sets of these species of structures turn out to be no ordinary sets at all: on the one side, these principal base sets are what Sneed calls the collections of possible models and possible partial models. But according to his aforementioned reconstruction of classical particle mechanics these collections obviously would involve the totality of all finite sets and thereby the totality of all sets whatsoever of a model of set theory. For in any such model and for any set in it there is the set consisting of just this set as its only element. This exorbitancy would further infect the typified 'sets' belonging to a theory in the sense of Sneed, especially the class representing the constraints. This being a second order class, one may get into trouble by taking care of it even in a set theory that systematically distinguishes between sets and classes, since in such a theory classes are not possible elements.

Apart from these set-theoretical difficulties, there is another reason to reject the reconstruction of classical particle mechanics of Sneed and his forerunners. Leaving aside the (otherwise important) distinction between theoretical and non-theoretical entities as being irrelevant for the present discussion, one idea that Sneed wants to grasp with his reconstruction of the concept of a physical theory in general, namely the idea of the totality of possible objects that a theory is about, is a perfectly good idea. Moreover, it is this very idea that I tried to characterize by the species of structures $\Sigma_{ob}$ for the case in which the objects show a deterministic behavior in time. It may even be argued that an incontestable deduction of $\Sigma_{ob}$ from another species of structures $\Sigma$ characterizing a relevant physical theory is a negative touchstone that the characterization has really been achieved. Such a deduction has been indicated above with $\Sigma_{gr}$, $\Sigma_{new}$, ... etc.
as possible starting points. But there are many other existing physical theories for which the deduction of $\Sigma_{ob}$ would be a straightforward matter. To mention but two large classes: deterministic statistical theories and classical field theories. As regards the former there is an elegant characterization of them given by Mackey [15]. The characterization can easily be rephrased in terms of a certain species of structures, and it is already at such a general level as to include classical statistical mechanics and quantum mechanics. A subsequent deduction of our $\Sigma_{det}$ (and a fortiori $\Sigma_{ob}$) is readily obtained. The critical set F of 'possible objects' turns out to be a set of probability functions depending on time, the states and the proper observables of the objects. Of course, whether in this case the probability functions represent real objects depends on the interpretation of the theory. But at any rate this totality of possibilities is an ordinary set. The same is true for the classical field theories where F comes out as this or that function space. Neither here nor anywhere else can we find an analogue of the unfortunate duplication of possibilities in the reconstruction of classical particle mechanics under discussion. Even if we have a field theory with two or more fields interacting with each other, in theoretical physics we would never establish this theory as being about possible objects represented by arbitrary finite sets with fields (instead of orbits etc.) assigned to its elements. The object of the theory is rather a finite set of fields, just as in particle mechanics it is a finite set of trajectories, masses etc. It is only in the applications that two objects having the same mathematical descriptions in a theory must be distinguished individually.

Looking back to the problems 1) and 2) posed above I would like to emphasize in conclusion that their solution could only be touched upon in this paper. The example that was intended to serve as an illustration for the solution in general has led to the species of structures $\Sigma_{ob}$. Its function as one of several candidates characterizing different types of physical theories on a most general level could be made still more evident by deducing the field of Borel sets of the state space S representing the possible contingent properties of a physical system or - equivalently - by deducing another structure representing the observables or quantities. In the first case we would get a species of structures $\Sigma_{log}$ that is sometimes called the 'logic' of the kind of physical systems treated in $\Sigma_{ob}$ [16]. It can be used to distinguish in a most general way between classical and quantum theories. Whereas $\Sigma_{log}$ will be involved in every physical theory whatsoever there remains the question whether there are true alternatives on the level of $\Sigma_{ob}$. This will certainly be the case for relativistic theories where the time structure T will have to be replaced by a space-time structure [17]. Probabilistic theories of physics may be already included in $\Sigma_{ob}$ because S may be a set of probability functions: There are probabilistic and at the same time deterministic theories in physics; in a sense even quantum mechanics is such a theory. On the other hand, it is questionable whether we can adequately appraise a probabilistic theory without making the probabilities explicit. And at any rate the dynamical part of $\Sigma_{ob}$ would have to be generalized in order to include arbitrary stochastic processes.

[15] Mackey 1963, Chs. 2.2 and 2.3 (restricted to axioms I-VI)
[16] Mackey 1963, Ch. 2.2; Varadarajan 1968, Chs. I, VI and VII; Scheibe 1964, and 1973c, Chs. II, III and V
[17] Anderson 1967; Künzle 1973
III.12 A Comparison of Two Recent Views on Theories*

In the following paper I will make a partial comparison of two recent proposals for the concept of a physical theory. The first proposal is due to Ludwig, and its original version is a by-product of an attempt to give a physically satisfactory axiomatization of quantum mechanics [1]. Meanwhile Ludwig has further developed his concept, and has given a self-contained presentation of it [2]. Using earlier approaches to the axiomatization of classical mechanics, the second proposal was made by Sneed in connection with the so-called problem of theoretical terms [3]. In contrast to the Ludwig approach which has remained isolated up to this very day, Sneed's conception has received much attention and was even presented as the way out of the difficulties that beset the orthodox view of scientific theories [4]. In my own opinion, this situation is nothing but an historical accident and does not in the least mirror the respective merits of the two approaches. Consequently, calling attention also to Ludwig's work is certainly one main purpose of my paper. But I have to admit from the outset that this purpose will be connected with a rather special interest in a foundational problem concerning the concepts in question. Although the solution of this problem will lead us to confront the two views, on account of the limited aspect chosen, the resulting comparison will be very selective, and a thoroughgoing comparative appreciation, however welcome it would be, is beyond the scope of this paper.

The foundational problem I have in mind originates in a difference between the two programs according to which the corresponding concepts of a physical theory are to be reconstructed. According to Ludwig's program (the L-program) one of the things that have to be made explicit in the reconstruction of a theory is its language, and the way in which this has to be done is, as usual, formalization. On the other hand, according to Sneed's program (the S-program, where, for that matter, the 'S' may rather remind us of Suppes) the explication of a formal language as one of the elements of a theory is deliberately avoided.
It is, of course, tempting to refer this situation to the logical empiricist's or - as it has been called - the received view of scientific theories. For it would then turn out that whereas the S-program was a deliberate move away from this view, the accordance of the L-program with the latter (which, by the way, is not confined to the present point) may rather have been a piece of pre-established harmony than a deliberate succession. But the details of these relations need not concern us here. What really matters is the fact that Ludwig's concept of a theory (the L-concept) is syntactical in the usual sense mentioned before whereas Sneed's concept (the S-concept) is not. Rather, as Sneed has put it, "the way of talking about scientific theories I am going to describe invites us to look at sets of 'models' for these theories rather than the linguistic entities employed to characterize these models" [5]. It is in this sense that his approach has been classified among the semantical approaches to the concept of a scientific theory [6].

* First published as Scheibe 1982b
[1] Ludwig 1970
[2] Ludwig 1978; 2nd ed. 1990
[3] Sneed 1971
[4] Stegmüller 1979

At this point one would perhaps like to know the reasons that have been given in the S-program for abandoning the usual explication of the linguistic part of a theory. But I shall refrain from any direct comment on this matter. Although it will become clear in the course of the following consideration that these reasons cannot really be compelling, this result will remain a side-issue of the paper. The main line of my argument will rather concern a problem that is caused by the difference between the L-concept and the S-concept as it was outlined a moment ago, namely the problem how the two concepts can be compared in view of this difference. This is the foundational problem to be solved in this paper, and although it certainly is not a very deep one and its solution will only be a first step towards a more complete comparison, it will readily be admitted that something must be said about how to cope with the difference in question when undertaking a comparison of the L- and S-concepts. In fact, the basic idea for a solution is simple enough and consists in just removing or perhaps rather in bridging the difference by either of two procedures of mutual adaptation: given the (syntactical) L-concept we may ask for a syntactical counterpart of the (semantical) S-concept and, vice versa, given the latter, one may look for a semantical counterpart of the former. Granted that these counterparts exist, a common basis of comparison for the two concepts on the syntactical as well as on the semantical level will be prepared.
In order to realize this idea we shall have to recall (1) that theory elements typically distinguished by the L-concept are sentence forms (as physical axioms) in a formalized set theory; (2) that theory elements typically distinguished by the S-concept are classes of structures, e.g., the classes of all (physically) possible or (physically) possible partial models; and (3) that as far as the latter can be characterized by linguistic means at all they appear as the classes of all structures satisfying certain set-theoretical sentence forms or - as the official wording runs - set-theoretical predicates. Therefore, fixing the set-theoretical basis for the L-concept (roughly) as the system of Zermelo-Fraenkel (ZF), it will be very promising to look for the syntactical counterpart of the S-concept on this basis. On the other hand, there are some difficulties in getting at the corresponding result on the semantical level. For one thing, in developing their S-concept the advocates of the S-program not only abjured the linguistic method in theory explication, but also did not take the trouble, indeed refused, to specify any formal framework for the presentation of their concept. As was evidenced by several misunderstandings, this has obscured their attempt to a considerable degree, and, accordingly, a more formal account of the matter would be desirable. This, however, means that we have to look for a formal set theory comprehensive enough to have the classes of structures mentioned in (2) and (3) as its possible objects. Here a second difficulty comes up: As we shall see later on these classes are not sets in the sense of ZF. Therefore we have to look for a more comprehensive theory in which sets and genuine classes are distinguished. From the various extensions of ZF that are possible candidates for solving the problem, the system of von Neumann and Bernays (VNB) will be suggested as the basis for a precise formulation of the (semantical) S-concept and the semantical counterpart of the L-concept.

[5] Sneed 1976, p. 144, no. 2
[6] Suppe 1974, p. 223, no. 558

II
In order to make this paper understandable for readers not familiar with the set-theoretical systems ZF and VNB just mentioned, the following introductory remarks may be helpful [7]. Historically the system ZF was the first rigorous axiomatization of 'naive' set theory as it was developed by Cantor at the end of the last century. In naive set theory the objects of our thought are sets and the elementary statements made about them are (1) statements saying that a set x is an element of a set y and (2) statements saying that x equals y. Now, in formalizing naive set theory the first thing to do is to fix the language in which we are allowed to talk about sets. The system ZF is based on a first-order language, i.e., besides the elementary statements, formalized by the sentences of the form $x \in y$ (elementhood) and $x = y$ (equality), we are allowed to make more complex statements by means of sentences that are built from the elementary sentences with the help of the usual logical connectives 'not' ($\neg$), 'and' ($\wedge$), 'if-then' ($\rightarrow$), 'or' ($\vee$), etc., and - most importantly - the quantifiers 'for every set x' ($\forall x$) and 'there is a set x' ($\exists x$).

Secondly, we have to choose a logic, telling us which inferences of statements from given statements we are allowed to draw. More precisely, for a definite formalization we have to choose a logic and a definite axiomatization of it. The system ZF can be founded on any axiomatization of classical first-order logic. Intuitively this means the following. With the usual understanding of the logical constants there are statements in our language that turn out to be true no matter whether they are about sets or something else. Likewise we can draw inferences that turn out to be valid irrespective of the content of the statements involved in these inferences. Any statement of the form $\alpha \rightarrow \alpha$ would be an example of the first kind - a logically true statement - and any inference of the form $\alpha, \alpha \rightarrow \beta \vdash \beta$ would be one of the second: a logically valid inference. A deep-going metatheorem, known as the completeness theorem, then tells us that first-order logic can be (effectively) axiomatized in the sense that we can pick out some of the logically true statements as our logical axioms and some of the logically valid inferences as our deduction rules such that a statement is logically true if and only if it can formally be deduced from the axioms with the help of the deduction rules.

[7] For a detailed exposition the reader is referred to Fraenkel et al. 1973, Ch. II

Assuming then that a definite axiomatization of first-order logic has been chosen we can make the third step that - finally - leads to the axioms specific for ZF, i.e., to axioms which, although not logically true, are true for arbitrary sets. Thus to give a first example, the axiom of extensionality
\[ \forall y \forall z [\forall x (x \in y \leftrightarrow x \in z) \rightarrow y = z] \tag{*} \]
is certainly not logically true: If our variables, instead of running over sets, are interpreted as indicating human beings and if $x \in y$ means that x is an ancestor of y then (*) turns out to be false since human beings x and y having the same ancestors may very well be different: they may be siblings. On the other hand, (*) is true for sets according to our intuition that a set is completely characterized by its elements. A second axiom of ZF is the power set axiom saying that to any given set y there exists a set z, the power set of y, consisting of all the subsets of y as its elements:
\[ \forall y \exists z \forall x [x \in z \leftrightarrow \forall u (u \in x \rightarrow u \in y)] \tag{**} \]
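The contrast between the two readings of (*) discussed above can be checked mechanically on small finite domains. The following is only an illustrative sketch: the function name `extensionality_holds`, the three finite sets and the four-person family are assumptions made for this example and do not occur in the text, and a finite check of this kind says nothing about (*) as an axiom for arbitrary sets.

```python
def extensionality_holds(domain, related):
    """Finite check of (*): whenever y and z have exactly the same
    'members' within the domain, y and z must be the same object."""
    for y in domain:
        for z in domain:
            same_members = all(related(x, y) == related(x, z) for x in domain)
            if same_members and y != z:
                return False
    return True

# Reading 1: 'x E y' is set membership among three finite sets.
empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
sets = [empty, one, two]

def membership(x, y):
    return x in y

# Reading 2: 'x E y' means "x is an ancestor of y" (hypothetical family).
people = ["mother", "father", "alice", "bob"]
parents = {"alice": {"mother", "father"}, "bob": {"mother", "father"}}

def ancestor(x, y):
    return x in parents.get(y, set())

sets_ok = extensionality_holds(sets, membership)   # distinct sets differ in some member
people_ok = extensionality_holds(people, ancestor) # the siblings share all ancestors
print(sets_ok, people_ok)
```

Under the membership reading the check succeeds, whereas under the ancestor reading it fails because the two siblings are distinct but have the same ancestors.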
Again, statement (**) is not true in general. But in naive set theory we think of the collection of subsets of a set as being itself a set, and this is what is formalized in (**). A third axiom, guaranteeing that set theory does not remain trivial, requires that there is at least one infinite set. Without giving the formalized version it may be mentioned that the infinity of the set y in question is expressed by saying that the empty set is in y and that if $x \in y$ then the set whose elements are x and the elements of x is also in y.

With the three axioms mentioned so far the list of axioms making up ZF is not completed. But since the present exposition has only the purpose of giving the reader some idea of what formal set theory is like, a completion will not be necessary. Rather I now want to raise the question whether as a matter of principle a complete axiomatization of naive set theory is possible. Intuitively this goal would have been achieved if our axiom system were strong enough to deduce every true statement about sets from it and the logical axioms and rules laid down previously. However, although our logical apparatus was legitimately assumed to be complete in the sense mentioned above, a corresponding completion of a set-theoretical axiom system is demonstrably impossible. It therefore remains a matter of deductive experience whether a given axiomatization is strong enough in order to cover this or that theorem that was supposed to be true. But there is still another sense in which ZF and even its simple extensions, i.e., its extensions by simply adding further axioms, are incomplete: They not only do not allow us to prove but rather allow us to disprove the existence of certain sets that intuitively we would think to be within the 'totality' of sets. An almost tragic case in point was a fundamental assumption made by Frege in his attempt to reduce arithmetic to logic. Reformulated within the present context Frege's assumption was that given any sentence form Px (not containing y) the axiom of comprehension
\[ \exists y \forall x (x \in y \leftrightarrow Px) \tag{AC} \]
would be true. Indeed it sounds very plausible that once we succeed in forming a predicate P possibly applying to a set x there will be a set y such that its elements are precisely the sets x for which Px. If it exists it is unique because of (*) and is usually denoted by $\{x \mid Px\}$. But, as was first recognized by Russell, a trivial deduction refutes (AC) in the case where Px is $x \notin x$, and later on many more sentence forms Px were found for which the same thing happens. A very important class of such sentence forms is made up of cases in which Px says that x is 'a structure of kind P', and these will be the cases relevant for the following investigations. For instance, each single class consisting of all groups, of all rings, of all topological spaces, etc., will already be too big to be a set and therefore cannot be talked about within the system ZF.

Several proposals of extending or even modifying ZF have been made in order to overcome this difficulty. In the following we shall use the system VNB of von Neumann and Bernays [8]. In a certain sense VNB is the least extravagant extension of ZF that is comprehensive enough to include every class $\{x \mid Px\}$ suggested by (AC). To obtain VNB the first thing to do is to extend our language. This can be done by either introducing a new predicate distinguishing between sets and proper classes or by introducing a new sort of variables A, B, C, ... indicating classes, i.e., sets or proper classes. Taking the latter option and extending the elementary statements of equality and membership also to the cases $A = B$ and $x \in A$ we can mutatis mutandis keep the linguistic formation rules, and also our logic will be essentially the same as before.
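The 'trivial deduction' by which Russell's sentence form refutes (AC), mentioned above, can be written out in three lines. The following is the standard argument; the letter r for the alleged set is introduced only for this illustration.

```latex
\begin{align*}
&\text{Suppose (AC) holds for } Px :\equiv x \notin x: && \exists y\,\forall x\,(x \in y \leftrightarrow x \notin x).\\
&\text{Let } r \text{ be such a } y: && \forall x\,(x \in r \leftrightarrow x \notin x).\\
&\text{Instantiate } x \text{ by } r: && r \in r \leftrightarrow r \notin r.
\end{align*}
```

The last line is contradictory in classical logic, so the negation of this instance of (AC) is provable by first-order logic alone, without any set-theoretical axiom.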
Finally, the axioms specific for VNB are essentially the old axioms of ZF together with an axiom of extensionality (*) for classes and the following axiom of predicative comprehension for classes: With Px being a sentence form not containing quantifiers over class variables, the axiom says
\[ \exists A \forall x (x \in A \leftrightarrow Px) \tag{AC$'$} \]
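To see how (AC') escapes Russell's refutation of (AC), one can apply it to the same sentence form $x \notin x$, which contains no quantifiers over class variables. The short derivation below is only an illustration; the class name R and the set variable z are chosen here and are not part of the text's notation.

```latex
\begin{align*}
&\text{By (AC$'$) with } Px :\equiv x \notin x: && \exists A\,\forall x\,(x \in A \leftrightarrow x \notin x).\\
&\text{Let } R \text{ be such a class:} && \forall x\,(x \in R \leftrightarrow x \notin x).\\
&\text{Suppose some set } z \text{ has the same members as } R: && \forall x\,(x \in z \leftrightarrow x \in R).\\
&\text{Instantiate both lines with } x := z: && z \in R \leftrightarrow z \notin z, \qquad z \in z \leftrightarrow z \in R.\\
&\text{Hence:} && z \in z \leftrightarrow z \notin z.
\end{align*}
```

The last line is contradictory, so no set has the same members as R. Since the quantifier $\forall x$ ranges over sets only, R itself is never substituted for x; the argument merely shows that R is a proper class, and no contradiction arises.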
It is obvious that this axiom transcends the possibilities of ZF to the extent that was requested: It guarantees the existence, not of sets, but of classes $\{x \mid Px\}$ for every P of the kind described. On the other hand, if we define:

$z = A$ (and $A = z$) if z and A have the same members,
$A \in B$ if, for some $z \in B$, $z = A$,
$A \in y$ if, for some $z \in y$, $z = A$,
A is a set if, for some z, $z = A$,

then A turns out to be a proper class, i.e., not a set, if and only if it is not a member of any class. Therefore essentially no classes other than those provided by (AC') will come up.

Having prepared the set-theoretical ground we can now start the intended reconstruction of the L- and S-concept of a physical theory. Whereas ZF will be used already in the next section, it is only in the last section that VNB will be set to work.

[8] For details see Fraenkel et al. 1973, Ch. II.7

III
The common basis of comparison for the syntactical version of the L- and S-concept is a suitable extension ZF' of ZF: In order to include all the mathematics that is used in physics (or at least in the physical theory under consideration) ZF is first extended by definitions of all the terms and predicates needed in that part of mathematics. Secondly, in order to make physical interpretation possible an infinite series of new constants $c_i$ for sets is added. On the basis of ZF' thus defined, I shall now give a reformulation of the L-concept introducing some minor changes for their own sake, and then suggest the syntactical counterpart of the S-concept, thereby making some adjustments for the sake of its comparison with the former. For both cases it should be borne in mind that no complete characterization of either concept is intended. In particular, to keep the presentation as simple as possible one very important feature of the L-concept, the uniform structures introduced in order to match the inaccuracies of measurement, will be omitted altogether [9].

To begin with, let T be a physical theory to be specified according to the L-concept. In order to obtain what is usually called the axioms or - as Ludwig puts it - the mathematical theory of T [10], we first select a double series from the additional constants $c_i$, abbreviating the former by the vector notation X and s, respectively. They determine the primary language of T consisting of all structures from ZF' containing no other additional constants than the X and s. Secondly, a series of scale terms (DL$_1'$), abbreviated by $\sigma(X)$, is chosen, i.e., terms constructed from their arguments (which, besides the X, may include defined terms of ZF') by successively applying one of the operations that yield a power set or a Cartesian product. The first axiom of T is then given by
\[ s \in \sigma(X). \tag{L$_1'$} \]

[9] Ludwig 1978, Sect. 6
[10] Ludwig 1978, Sections 2, 4 and 7
This typification renders the constants s structures of types $\sigma$ over the basic constants X. The remarkable thing about the typification is that according to the choice of the terms $\sigma(X)$, it provides counterparts of all the predicates and terms of arbitrary arity and order as they appear in the various (even many-sorted) independent logical calculi. Axiom systems formulated in terms of these calculi as they are frequently used in presentations of the received view of theories are, therefore, easily translatable into the present framework. Finally, a sentence
\[ \alpha(X, s) \tag{L$_1''$} \]
of the primary language is directly introduced as the second axiom, the axiom proper, of T. It has to fulfill the following condition of canonical invariance (which is automatically satisfied by the typification (L$_1'$)): Defining the relation $\mathrm{iso}_\sigma(X, s; X', s'; f)$ to hold if the bijections canonically (and obviously) determined by the bijections f of the X onto the X' map the structures s of types $\sigma$ over the X onto structures s' of the same type over the X', it is required that
\[ \mathrm{iso}_\sigma(X, s; X', s'; f) \rightarrow [\alpha(X, s) \leftrightarrow \alpha(X', s')] \tag{1} \]
can be proved from ZF'. Combining $s \in \sigma(X)$ and $\alpha(X, s)$ in one sentence defined by
\[ \Sigma(X, s) \equiv s \in \sigma(X) \wedge \alpha(X, s) \tag{2} \]
we can sum up our requirements by saying that
\[ \Sigma(X, s) \tag{L$_1$} \]
is admitted as an axiom of T if it is a species of structures in the sense of Bourbaki [11].

Turning now to a second part, the physically effective part, of T, I am going to propose what seems to me a little improvement of the L-concept as it is presented by Ludwig. As will be seen in our third step, certain empirical interpretation rules are provided for T. However, these rules will not in general give a physical meaning directly to the primitive constants X and s of the primary language but rather to certain terms dependent on them. Now, in a section about the physically effective part of a theory, Ludwig describes the transition from a theory including such uninterpreted terms to another, physically equivalent theory in which all terms are interpreted: Here the interpretation rules are immediately applicable to the primitive constants of the primary language [12]. Therefore, the general situation will perhaps more adequately be described by explicitly introducing a secondary language of T from the outset. Let this language be determined by a new series Y, t from our additional constants. The main idea of connecting the Y and t with our primary language will be that of extending the primary species of structures (L$_1$) by definitions. In order to extend the typification (L$_1'$), scale terms $\tau(X)$ and $\rho(X)$ are introduced and lead to the new typification
\[ Y \in \tau(X) \wedge t \in \rho(X). \tag{L$_2'$} \]

[11] Bourbaki 1968, Ch. IV. As regards the physical significance of the invariance property of $\alpha$ the reader is referred to some relevant remarks in Scheibe 1982c (this vol. VII.31)
[12] Ludwig 1978, sect. 7.3
With the help of further terms $P(X, s)$ and $q(X, s)$ (DL$_2'$) entering into the definitions
\[ Y = P(X, s) \wedge t = q(X, s), \tag{L$_2$} \]
the extension of the axiom proper (L$_1''$) is obtained. Finally, if we require that the terms P and q are intrinsic with respect to (L$_1$) [13], it can easily be shown that (L$_2$) is canonically invariant and, consequently, the extension of (L$_1$) by (L$_2$) is again a species of structures. Moreover, it turns out that under this assumption (L$_2'$) is already a consequence of (L$_1$) and (L$_2$) in ZF'.

[13] Bourbaki 1968, Ch. IV, sect. 1.6

Up to this point the secondary language has been considered only in so far as it is connected with the primary language and as this connection leads to an extension of the primary species of structures (L$_1$). We are now going to consider the secondary language also in its own right. As I mentioned before, it is this language to which the empirical interpretation rules will directly be applied. This suggests that we should look at it as the empirical language of our theory T and ask for empirical consequences of T in the sense of consequences of (L$_1$) and (L$_2$) in ZF' that are expressed in the secondary language. We may even ask whether there is such a thing as a strongest empirical consequence that as such is representative for the empirical content of T. It turns out that this question can very well be answered in the affirmative if we extend our previous data by a series of scale terms $\theta(Y)$ (DL$_2''$) and require that the typification
\[ t \in \theta(Y) \tag{L$_2''$} \]
can be proved from (L$_1$) and (L$_2$) in ZF'. Given the terms $\theta$, we ask for a species of structures
\[ \Theta(Y, t) \equiv t \in \theta(Y) \wedge \beta(Y, t) \tag{L$_{12}$} \]
in the secondary language which is the strongest consequence of (L$_1$) and (L$_2$) in the sense that

(A) $\Sigma(X, s) \wedge Y = P(X, s) \wedge t = q(X, s) \vdash_{ZF'} \Theta(Y, t)$,
(B) for all $\theta$-invariant $\gamma$: if $\Sigma(X, s) \wedge Y = P(X, s) \wedge t = q(X, s) \vdash_{ZF'} \gamma(Y, t)$, then $\Theta(Y, t) \vdash_{ZF'} \gamma(Y, t)$.

It can be proved that taking $\beta(Y, t)$ to be the sentence
\[ \exists \xi, \eta\, [\Sigma(\xi, \eta) \wedge \exists f\, [\mathrm{iso}_\theta(Y, t; P(\xi, \eta), q(\xi, \eta); f)]], \tag{L$_{12}'$} \]
$\Theta(Y, t)$ defined by (L$_{12}$) is a species of structures satisfying (A) and (B) and that any two $\Theta$, $\Theta_1$ satisfying (A) and (B) are equivalent in the sense that
\[ \Sigma(X, s) \wedge Y = P(X, s) \wedge t = q(X, s) \vdash_{ZF'} \Theta(Y, t) \leftrightarrow \Theta_1(Y, t). \tag{3} \]
Obviously, (L$_{12}'$) is very much like the Ramsey sentence of our axioms (L$_1$) and (L$_2$), the only difference being that, for reasons of invariance, the equalities in (L$_2$) are replaced by isomorphisms.

Concluding the sketch of the L-concept, a third theory element has to be introduced consisting of the empirical interpretation rules (Ludwig's 'Abbildungsprinzipien') [14] that were already mentioned before. As usual, these rules serve the purpose of connecting the secondary (empirical) language of a theory T with results of observations, experiments or measurements that are obtained in a certain domain of application of T: Knowing the interpretation rules is equivalent to knowing how our experimental findings are to be written into the empirical language of our theory. Now, in general, the procedure that is to be followed in getting at observational statements in the empirical language of T from the original meter readings may be very indirect in the sense that other physical theories, different from T, must be invoked as auxiliary theories (Ludwig's 'Vortheorien') [15]. But the only stage in this procedure that is wholly contained in T itself is the final output consisting of statements that can be made in the secondary language of T and that, accordingly, are considered to be directly given as far as T is concerned.

[14] Ludwig 1978, sect. 5
[15] Ludwig 1978, p. 10

In order to give a formal account of these statements, a third series (DL$_3$) of our additional constants $c_i$ must be distinguished as possible names for objects in the domain $\Delta$. An observational report (Ludwig's 'Abbildungsaxiome'), i.e., a final output of applying the interpretation rules to certain experimental findings, is a conjunction of sentences of the form
\[ \begin{array}{l} x \in \theta^{0}(Y) \\ (\ldots, x, \ldots) \in y \ \text{ resp. } \ (\ldots, x, \ldots) \notin y \end{array} \tag{L$_3$} \]
Here the x are constants (DL$_3$), and the y are constants (DL$_3$) or t. The $\theta^{0}(Y)$ are scale terms occurring in the composition of the $\theta(Y)$ with the exception of the latter. Every constant (DL$_3$) occurring in the second line of (L$_3$) has to be typified according to the first line. If in the second line a constant y would have to be typified by a $\theta(Y)$ according to the typification of the corresponding x then y has to be one of the constants t. On the basis of this concept of an observational report our theory T is provided with empirical content in the minimal sense that an extension of (L$_1$) and (L$_2$) or - equivalently - of (L$_{12}$) by (L$_3$) may be inconsistent in ZF'. In Ludwig's original conception of an observational report (Abbildungsaxiom) every constant y in (L$_3$) has to be one of the t. Since this restriction leads to a rather weak concept of a theory having empirical content, the above generalization is suggested. The brief outline of the (syntactical) L-concept of a physical theory given thus far will be sufficient for the present purpose.

I come now to the syntactical (version of the) S-concept. It will be developed in as complete an analogy to the L-concept as possible [16]. At the same time to elucidate the connection with the semantical S-concept, the standard notation used in more recent presentations of the latter [17] will be applied with the only difference that the usual symbols will be primed in order to remind the reader that they stand (not for semantical but) for syntactical entities. In the first step again some properties of the axioms of T have to be specified. This will be somewhat more involved than it was in the case of the L-concept. For whereas (L$_1$) is a statement about a single structure, our new axioms will be statements about a set of structures of a given type. Moreover, an additional feature is introduced into the new axiomatics by the so-called constraint of the S-concept.

[16] See Sneed 1971, pp. 161 ff, for the original presentation of the semantical S-concept
[17] Balzer/Sneed 1977

To begin with, let $I'_p$ (DS$_1$) be one of the additional constants. $I'_p$ creates the primary language of T in the same sense as did the X and s. Furthermore let (in vector notation as before) $M'_p(\xi, \eta)$ (DS$_1'$) be a typification of the $\eta$ with respect to the $\xi$. Thus $M'_p(\xi, \eta)$ is of the form (L$_1'$) but it will not be necessary to make this explicit. We now want to dispose of two properties of sentences of the primary language that correspond to the typification and invariance characterizing a species of structures. With the general abbreviation
\[ \beta^{*}(y) :\equiv \forall x\, [x \in y \rightarrow \exists \xi, \eta\, (x = (\xi, \eta) \wedge \beta'(\xi, \eta))] \tag{4} \]
the first property can be expressed by
(5) i.e., a sentence 'Y(I;) has this property if it has the consequence in ZF' that is a set of structures of the type given by The second property depends on a natural extension of isomorphisms of structures as they were introduced in connection with (1) to sets of structures of a given type: Let ISO M p, (x, Xl; {loXh-) mean that x and Xl are sets of structures with typification that the !.x constitute a family of isomorphisms between all structures of X and all those of Xl, and that they induce a bijection between the unions of all the base sets of structures in X and Xl, respectively. Then our new invariance condition for 'Y is given by
I;
M;.
M;,
(6)

With these definitions at our disposal we can now approach the two axioms of T. The first one is obtained with the help of a species of structures (DS~) with $M'_p$ as its typification. It reads

$$M^*(I'_p)$$

and, as can easily be verified, the properties (5) and (6) are automatically satisfied for this axiom. It is different with our second axiom

$$C'(I'_p)$$

where $C'$ is a unary formula of ZF' of which it is required that (5), (6) and, moreover,

(7)

be fulfilled. In this sense $C'$ is called a (syntactical) constraint for $M'_p$. The conjunction (S$_1$) of our two axioms again has the properties (5) and (6). (S$_1$) exactly corresponds to the axiom (L$_1$) of T according to the L-concept. Although the systematic connection between the syntactical and the semantical
version of the S-concept will be rigorously established later on, we should perhaps stop for a moment and anticipate this connection in an informal way. As regards its first part, our theory T is about a set $I'_p$ of (physical) structures. The semantical counterpart of $I'_p$ does not appear in the original S-concept. Rather, $I'_p$ is here introduced for obvious reasons of analogy with the single structure (X, s) of the L-concept. In the terminology of the S-program, $I'_p$ would have to be called a set of intended theoretical (!) applications. As a consequence of (S$_1$), viz. (5), $I'_p$ is a subset of the class of (physically) possible models, this class being the extension of $M'_p$. By submitting $M'_p$ to the condition of being a typification, the concept of a class of possible models is slightly more general than the original and slightly more special than the modified concept of a theory-matrix or, for that matter, of a class of possible models according to the S-program.¹⁸ The extension of our $M'$ is the class of all models in the sense of the S-concept, i.e., the subclass of all possible models satisfying the central law of T (here represented by $\alpha$ in (DS~)). The requirement that $M'$ be a species of structures imposes on $\alpha$ a condition of invariance that is not foreseen by the S-concept. Finally, the extension of the foregoing constraint $C'$ is the constraint of the original S-concept. The invariance property (6) does not appear in this concept. In our approach, it is conditioned by requiring $M'$ to be a species of structures. On the other hand, new conditions have recently been imposed on $C'$ to be a constraint.¹⁹ Their (obvious) transcription into the syntactical framework may be left to the reader.

Taking now the second step in the development of the syntactical S-concept, the analogy to the procedure that was followed for the L-concept suggests the introduction of a secondary language of T. It will be created by a new constant
$$I'_{pp}$$

corresponding to the Y and t of the L-concept. Just as the latter were defined by (L$_2$) through the X and s, so $I'_{pp}$ will be defined through $I'_p$ by the definition

$$I'_{pp} = R'(I'_p) \qquad (\mathrm{S}_2)$$

where the term $R'$ is now obtained in the following way: We start with terms (in vector notation) (DS~) that are intrinsic with respect to the scale terms (DS~) and the species of structures $M'$, and define $R'$ to be the term

$$R'(y) = \{\,x \mid \exists \xi, \eta\,[\,x = \langle r'_1(\xi, \eta), r'_2(\xi, \eta) \rangle \wedge \langle \xi, \eta \rangle \in y\,]\,\}. \qquad (8)$$

18 Cf. Sneed 1976, p. 162; Balzer/Sneed 1977, p. 197.
19 Balzer/Sneed 1977, p. 196.
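The way the term $R'$ in (8) acts on a set of structures can be illustrated with a small finite sketch. It is only an illustration under stated assumptions, not part of the formal development in ZF': structures are modelled as Python pairs `(xi, eta)`, the names `restrict`, `r1`, `r2`, `I_p` and `I_pp` are hypothetical, and `r1`, `r2` are chosen as simple projections (the special case that the text attributes to the S-program), with `r2` keeping only the first component of `eta`.

```python
def restrict(y, r1, r2):
    """Finite analogue of R'(y): the set of all pairs <r1(xi, eta), r2(xi, eta)>
    for structures <xi, eta> belonging to y."""
    return frozenset((r1(xi, eta), r2(xi, eta)) for (xi, eta) in y)


def r1(xi, eta):
    # Hypothetical choice: keep the base set unchanged.
    return xi


def r2(xi, eta):
    # Hypothetical choice: keep only the first component of eta
    # (a projection dropping the remaining components).
    return eta[:1]


# A toy set of structures; every entry is invented purely for illustration.
I_p = frozenset({
    (frozenset({"a", "b"}), ("pos", "mass1")),
    (frozenset({"a", "b"}), ("pos", "mass2")),
    (frozenset({"c"}), ("vel", "mass1")),
})

# Analogue of the definition I'_pp = R'(I'_p).
I_pp = restrict(I_p, r1, r2)
```

In this toy case the first two structures differ only in the dropped component, so they yield the same element of `I_pp`; the mapping from structures to their images need not be injective.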
As in the L-program, the secondary language of the S-concept is considered to be the empirical language of T. And as before, we may ask for empirical consequences of the axiom (S$_1$) that can be obtained in the empirical language if the definition (S$_2$) is added. To answer this question, a new typification corresponding to (L$'_2$) (DS~') is introduced. It is related to $M'_p$ by the requirement that

(9)

Given $M'_{pp}$, we now have to look for a sentence that, besides satisfying (5) and (6) with respect to $M'_{pp}$, is the strongest consequence of T in the sense that

(a) $M^*(I'_p) \wedge C'(I'_p) \wedge I'_{pp} = R'(I'_p) \vdash_{ZF'} A'(I'_{pp})$

(b) for all $M'_{pp}$-invariant $\gamma$: if $M^*(I'_p) \wedge C'(I'_p) \wedge I'_{pp} = R'(I'_p) \vdash_{ZF'} \gamma(I'_{pp})$ then $A'(I'_{pp}) \vdash_{ZF'} \gamma(I'_{pp})$.

The solution of this problem is again a somewhat modified Ramsey sentence eliminating $I'_p$ in the premise of (a), namely

(S$'_{12}$)

and it is unique up to the equivalence

(10)

In the terminology of the S-program, the set $I'_{pp}$ is the set of intended applications of T. On account of (S$_2$) and (9), $I'_{pp}$ is a subset of the extension of $M'_{pp}$, i.e., of the class of all possible partial models. (S$'_{12}$), the strongest empirical consequence of our axioms, is a precise syntactical formulation of the so-called 'empirical claim' of T in the sense of the S-program. There is an obvious modification in so far as (S$'_{12}$) has been made invariant in the sense of (6) by replacing the equality of $I'_{pp}$ and $R'(I'_p)$ by an isomorphism statement. The most decisive modification, however, that has been made in view of the original S-concept concerns the generality of the transition (9) from $M'_p$ to $M'_{pp}$ and, consequently, the generality of $M'_p$. In the S-program only the special case is considered where the $r'_i$ are (normed) projections: They, as well as the Pi, are chosen to be
(11)

where for some $m_1 \le m$, $\eta'$ is $\langle \eta_1, \ldots, \eta_{m_1} \rangle$ and $\eta$ is $\langle \eta_1, \ldots, \eta_m \rangle$; and Pow(x) is the power set of x. Evidently, our generalization corresponds exactly to the situation as we met it in the L-program. It is partly conditioned by a different attitude towards theoretical quantities. But this is a matter that must be dealt with on another occasion.

Coming finally to the third step, the empirical interpretation of T, the first thing that has to be observed is that, whereas the foregoing development of the syntactical S-concept could almost immediately be read from the original concept once the general idea of a syntactical version was formed, the S-program does not contain any hints whatsoever as regards the formation of empirical interpretation rules. The reason is that the advocates of the S-program, after having abandoned the linguistic view of theories, acquiesced in the idea that the theory elements they had introduced directly referred to physical objects and that no explicit interpretation of any language was necessary. Leaving it undiscussed whether this was a justifiable strategy, the re-introduction of the linguistic aspect certainly reopens the question of interpretation. The general situation being the same as for the L-concept, I shall confine myself to the following brief suggestion for the analogue of the concept of an observational report as it was formulated in (L$_3$) for the L-program. Using the material mode of speech and speaking very roughly, we think of the elements of $I'_{pp}$ as being physical systems which in turn may be composed of objects such that for these objects our experimental findings can be expressed in observational statements. Accordingly we have to introduce additional constants
and the observational reports assume the form
Thus the t, are typified with respect to the 1';. (vector notation!) by $M'_{pp}$ and then the av are classified and submitted to empirical relations exactly as it was assumed in (L$_3$).

IV

In the previous section only the first part of our main task has been fulfilled: Taking over the (original) L-concept of a theory with some minor changes and suggesting a syntactical version of the S-concept we have obtained a basis for comparing the two concepts on the syntactical level. We
have now to tackle the second part and lay the foundations for comparison also on the semantical level. One way of doing this could be, this time, to leave the (original) S-concept essentially untouched and to develop a semantical counterpart of the L-concept. However, as was already announced in the introduction, I want to go beyond such a result: Just as in the previous section the S-program was violated by re-introducing the linguistic aspect into it, so in this section I want to challenge the strategy of an informal presentation of the S-concept by giving a formal account of it. This means that the following considerations will not be about any semantical entities in the usual sense. To call the S-program 'semantical' is a misnomer anyway (for which I do not want to charge its advocates). Strictly speaking, a metatheory can only be called 'semantical' if it contains concepts typical for semantical relations. Since the S-program deliberately excludes a formal language from the theory elements that are to be made explicit, the original S-concept cannot strictly be called 'semantical'. This concept is semantical only in the derivative sense that metalinguistic expressions used to enumerate the theory elements directly refer to the entities that would be the referents of an object language of the theory if such a language had been made explicit. Having done just this in the foregoing section, our modified S-concept could indeed be rendered semantical in the strict sense by introducing a model of ZF' and trying to find a concept of interpretation according to which referents in the model are assigned to all the syntactical theory elements $M'_p$, $M'$, ..., etc., introduced in the previous section. Apart from the fact that this program could not be realized with respect to ZF (see below), I shall refrain from entering the semantical domain altogether.
Even the S-program had as one of its goals the clarification of the relations in which the various theory elements distinguished by that program stand to each other; and although this was actually done only in a naive way that eventually was called an 'informal axiomatics'²⁰, such an enterprise by its very nature is a formal or - for that matter - a syntactical one. Therefore it is wise not to mingle it with an aspect that, however valuable it may be in a different context, may easily lead to misunderstandings in matters of an essentially formal nature. The following example will perhaps be helpful in understanding the alternative that I am about to suggest. Suppose that we were not concerned with the S-concept of theories but with the better known mathematical concept of groups. Then, by analogy, our enterprise would consist in (1) producing certain kinds of syntactical entities ... φ ... (of ZF) defined by certain properties; (2) defining terms G(... φ ...) and op(... φ ...) depending on the foregoing entities; and (3) showing that if the ... φ ... have their defining properties then the terms G(... φ ...) and op(... φ ...) satisfy the axioms for a group with G(... φ ...) as its base set and op(... φ ...) as its operation. The following is an example of such a procedure: (1') φ is a variable of ZF indicating a set;

20 Stegmüller 1979, Sections 1 and 2
(2') G(φ) is the term for the set of all bijections of φ onto itself, and op(φ) is the term for the relation in which any three elements x, y and z of G(φ) stand if z is the product (in the usual sense) of x and y; (3') is the proof that indeed a group has been obtained, namely the group of all transformations of φ. In general, we would perhaps like to say that what is presented by (1)-(3) is a syntactical, more or less general procedure of constructing groups. Indeed, if we take the concept of groups as defined by the usual axioms as our starting point, then the question 'are there any groups?' can be given a purely syntactical answer by pointing to the procedure (1)-(3) and presenting instances of it in the manner just illustrated.

Looking now at the S-concept of a theory in the light of the foregoing consideration, it must be said that only part (1) of the construction procedure has been settled and some hints for (2) have been given in the previous section. The systematic exposition of (2) as well as that of (3) and, above all, the formalization of the S-concept that is presupposed by the whole procedure are still waiting for their presentation. However, as was already indicated in the introduction, the execution of our program will not be possible without extending our formal framework. Although the S-concept, taken by itself, allows a formalization within ZF and can even be realized by syntactical models in the way that was outlined a moment ago, the intended realizations cannot be obtained within ZF. For the intended syntactical realizations of the theory elements $M_p$, $M$, etc., distinguished by the S-concept, i.e., the terms to be defined in part (2) of the construction procedure, are the formal extensions of corresponding predicates, viz. the $M'_p$, $M'$ etc., of Section III. Now, as is well known, in order that the extension $\{x \mid Qx\}$ of a predicate Q exists, a statement of comprehension

$$\exists y\,\forall x\,(x \in y \leftrightarrow Qx) \qquad (\mathrm{AC})$$
must be provable in ZF. But precisely this is not possible for the predicates entering our syntactical version of the S-concept. It has to be emphasized that this is not a shortcoming of our syntactical reconstruction but affects the original S-concept as regards its intended physical applications: There is not a single physical theory appearing among the physical examples given by the S-movement for which, say, the class $M_p$ of potential models or the class $M_{pp}$ of potential partial models is not a genuine class in the sense that for its defining predicate Q formula (AC) can be disproved in ZF. (The same is, of course, true for the L-concept for which, however, no 'semantical' version has been claimed to exist.) We therefore are in need of a formal framework that allows for the distinction between sets and proper classes. In Section II the system VNB has been introduced as such a framework, and it is this system that we shall now invoke for a formal reconstruction of the semantical L- and S-concept. To begin with, it will be recalled from Section III that species of structures in ZF play a distinguished role in the presentation of the syntactical version of our
two concepts of theories. It turns out that a generalized concept of species of structures, adapted to our new framework VNB, can be favorably employed also for the formalization of the semantical concepts. Let us therefore briefly look for a natural generalization. As before, we conceive of a species of class structures - as it may now be called - as being a formula

(12)

where $F_{typ}$ is the typification, $F$ the axiom proper, and $\varphi$ and $\psi$ are vectors for class or set terms. The principal modification that is forced upon us in view of the new situation in VNB concerns the typification: There is no problem in forming Cartesian products of classes with the help of (AC') in Section II, the product of A and B being just the class of pairs (x, y) with $x \in A$ and $y \in B$. But the power class Pow(A) of a class A cannot be the class of its subclasses (which would violate (AC')) but only the class of the subsets of A. Defining scale terms as before with respect to the new concepts of Cartesian product and power class, we have still to generalize the typification $\psi \in \sigma(\varphi)$ itself, since this formula would restrict the $\psi$ to be sets. Although this is not to be excluded, genuine classes must be allowed as typified classes, and this is achieved by allowing the typification also to be of the form $\psi \subseteq \sigma(\varphi)$. As regards the axiom proper $F$, it is easily seen that the invariance condition connected with (1) in Section III can be taken over almost verbatim. We shall presently come to see that there may be some reasons to drop the invariance condition as part of the concept of a species of class structures if it is used in the definition of our two concepts of a physical theory. Having laid the new foundations we can again attend to these concepts, and first the L-concept.
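The group example (1')-(3') given earlier in this section can be made concrete for a finite set. The following is a minimal sketch under stated assumptions, not a formalization in ZF: φ is taken to be a small finite Python set with sortable elements, a bijection is encoded as a tuple of (argument, value) pairs, and the helper names `compose` and `is_group` are hypothetical. `G` and `op` play the role of the terms G(φ) and op(φ), and `is_group` stands in for the proof required in step (3') by simply checking the group axioms exhaustively.

```python
from itertools import permutations, product


def G(phi):
    """All bijections of the finite set phi onto itself,
    each encoded as a tuple of (argument, value) pairs."""
    elems = sorted(phi)
    return {tuple(zip(elems, image)) for image in permutations(elems)}


def compose(x, y):
    """Product of two bijections: apply y first, then x."""
    fx, fy = dict(x), dict(y)
    return tuple((a, fx[fy[a]]) for a in sorted(fy))


def op(phi):
    """The relation holding between x, y, z in G(phi) when z is the product of x and y."""
    g = G(phi)
    return {(x, y, compose(x, y)) for x, y in product(g, repeat=2)}


def is_group(base, relation):
    """Exhaustive check of the group axioms for a finite base set and a ternary relation."""
    table = {(x, y): z for (x, y, z) in relation}
    # Closure: every pair has a product, and that product lies in the base set.
    if any((x, y) not in table or table[(x, y)] not in base
           for x, y in product(base, repeat=2)):
        return False
    associative = all(
        table[(table[(x, y)], z)] == table[(x, table[(y, z)])]
        for x, y, z in product(base, repeat=3)
    )
    identities = [e for e in base
                  if all(table[(e, x)] == x and table[(x, e)] == x for x in base)]
    if not associative or len(identities) != 1:
        return False
    e = identities[0]
    # Every element needs a two-sided inverse.
    return all(any(table[(x, y)] == e and table[(y, x)] == e for y in base)
               for x in base)


phi = {1, 2, 3}
group_found = is_group(G(phi), op(phi))
```

The check is feasible only because φ is finite here; the point made in the text, that the intended extensions are proper classes, is precisely what such a finite illustration cannot capture.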
According to our preparations, what was called the semantical version of the L-concept will now be defined by an axiom system in VNB or rather - similar to the situation in Section III - in a suitable extension by definitions VNB' of VNB. As already indicated, the axiom system can be given the form of a species of structures (12) where the arguments will be taken to be new class or set constants added to VNB: There will be two basic class constants $\Sigma^0_p$ and $\Sigma^0_{pp}$, three typified class constants $\Sigma^0$, $U$ and $\Theta^0$, as well as two typified set constants $x^0$ and $y^0$ with the typifications
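$$\Sigma^0 \subseteq \Sigma^0_p, \quad U \subseteq \Sigma^0_p \times \Sigma^0_{pp}, \quad \Theta^0 \subseteq \Sigma^0_{pp}, \quad x^0 \in \Sigma^0_p, \quad y^0 \in \Sigma^0_{pp}. \qquad (13)$$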
The axioms proper are given by
(14)

where the first member means that $U$ is a mapping from $\Sigma^0$ into $\Theta^0$. Up to this point we would have an invariant species of class structures and a very simple one at that. Since $\Sigma^0_p$ and $\Sigma^0_{pp}$ are meant to be the extensions of the typifications (L$'_1$) and (L$'_2$), we could go on in our list of axioms by requiring
(14') where S is the class of all sets. These axioms, however, would no longer be invariant. At the same time, they would make the semantical L-concept dependent on additional parameters (here: numbers). Whether this is a desirable consequence remains to be seen. The L-concept developed in the previous section can now be shown to be a syntactical model of the axiom system (13) and (14) in the sense of the procedure described at the beginning of this section. Using the definition schema
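$$Q^0 = \{\,x \mid Q(x)\,\} \qquad (15)$$

the constants $\Sigma^0_p$, $\Sigma^0_{pp}$, $\Sigma^0$, $U$, $\Theta^0$, $x^0$ and $y^0$ are defined by substituting for $Q(x)$

$$
\begin{aligned}
&\exists \xi, \eta\,[\,x = \langle \xi, \eta \rangle \wedge \eta \in \sigma(\xi)\,] \\
&\exists \varphi, \psi\,[\,x = \langle \varphi, \psi \rangle \wedge \psi \in \theta(\varphi)\,] \\
&\exists \xi, \eta\,[\,x = \langle \xi, \eta \rangle \wedge \Sigma(\xi, \eta)\,] \\
&\exists \xi, \eta, \varphi, \psi\,[\,x = \langle \langle \xi, \eta \rangle, \langle \varphi, \psi \rangle \rangle \wedge \varphi = P(\xi, \eta) \wedge \psi = q(\xi, \eta) \wedge \eta \in \sigma(\xi) \wedge \psi \in \theta(\varphi)\,] \\
&\exists \varphi, \psi\,[\,x = \langle \varphi, \psi \rangle \wedge \Theta(\varphi, \psi)\,] \\
&x = \langle X, s \rangle \\
&y = \langle Y, t \rangle
\end{aligned}
\qquad (16)
$$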
in that order. Here the last two definitions are adapted to the schema (15) for reasons of uniformity and could instead be given directly in the simpler form $x^0 = \langle X, s \rangle$ and $y^0 = \langle Y, t \rangle$. It is now very easy to show that (13) and (14) are consequences of these definitions and the assumptions made in Section III about the syntactical entities entering the definiens of (15). On the other hand, it is also obvious that not all of these assumptions are actually needed to obtain (13) and (14), e.g., the assumption (B) for $\Theta$ is not. A consequence that would correspond to it would seem to say that among the classes into which $\Sigma^0$ is mapped by $U$, the class $\Theta^0$ in some sense is the smallest. But it does not seem possible to define this sense without reference to the type appearing in $\Theta$, which again would make the L-concept itself dependent on external parameters.

As regards the S-concept, we first develop the precise semantical analogue of our syntactical S-concept of Section III. Since we have taken the liberty of a few modifications, it is to be expected that the result will deviate from the original S-concept in some respects. The species of structures representing the desired axiomatics has two basic class constants, $M_p$ and $M_{pp}$, as well as the typified class constants $M$, $C$, $r$, $A$ and the set constants $I_p$ and $I_{pp}$ with the typifications
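$$M \subseteq M_p, \quad C \subseteq \mathrm{Pow}(M_p), \quad r \subseteq M_p \times M_{pp}, \quad A \subseteq \mathrm{Pow}(M_{pp}), \quad I_p \in \mathrm{Pow}(M_p), \quad I_{pp} \in \mathrm{Pow}(M_{pp}). \qquad (17)$$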
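Introducing the term

(18)

the axioms proper read

$$
\begin{aligned}
&\forall x\,(x \in M_p \rightarrow \{x\} \in C) \\
&r : M_p \rightarrow M_{pp} \\
&\forall y\,[\,y \in \mathrm{Pow}(M) \cap C \rightarrow R(y) \in A\,] \\
&I_p \in \mathrm{Pow}(M) \cap C \\
&I_{pp} \in R(I_p).
\end{aligned}
\qquad (19)
$$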
Bringing into play the syntactical data of Section III and using again the definition schema (15), a syntactical model of the species of class structures (17)-(19) is obtained if the constants $M_p$, $M_{pp}$, $M$, $C$, $r$, $A$, $I_p$, and $I_{pp}$ are defined by substituting for $Q(x)$
$$
\begin{aligned}
&\exists \xi, \eta\,[\,x = \langle \xi, \eta \rangle \wedge M'_p(\xi, \eta)\,] \\
&\exists \zeta_1, \zeta_2\,[\,x = \langle \zeta_1, \zeta_2 \rangle \wedge M'_{pp}(\zeta_1, \zeta_2)\,] \\
&\exists \xi, \eta\,[\,x = \langle \xi, \eta \rangle \wedge M'(\xi, \eta)\,] \\
&C'(x) \\
&\exists \xi, \eta, \zeta_1, \zeta_2\,[\,x = \langle \langle \xi, \eta \rangle, \langle \zeta_1, \zeta_2 \rangle \rangle \wedge \zeta_1 = r'_1(\xi, \eta) \wedge \zeta_2 = r'_2(\xi, \eta) \wedge M'_p(\xi, \eta) \wedge M'_{pp}(\zeta_1, \zeta_2)\,] \\
&A'(x) \\
&x \in I'_p \\
&x \in I'_{pp}
\end{aligned}
\qquad (20)
$$

in that order. Relating (18) to (8) in Section III in an obvious way, it is again easy to verify that (17) and (19) are consequences of these definitions and the assumptions about the syntactical S-concept.

Let us finally review the essential modifications that we have made with respect to the original S-concept. There is first the generalization of the mapping $r$: In the original S-concept, $r$ is a projection and as such it is a mapping from $M_p$ onto $M_{pp}$. The latter property could, of course, be required in (19), and it could be proved from the syntactical concept if we would require the reversal of (9) with an existential quantification over $\xi$ and $\eta$. Whether this restriction would be acceptable also in the general case that we have considered remains to be seen. Secondly, the invariance conditions of our syntactical S-concept are alien to the S-concept. If they are dropped in (S$'_{12}$) and, consequently, the ISO-formula in (S$'_{12}$) is replaced by an equality, then the third axiom of (19) can be replaced by the equality

$$A = \{\,x \mid \exists y\,[\,y \in \mathrm{Pow}(M) \cap C \wedge x = R(y)\,]\,\} \qquad (19\mathrm{a})$$
which, obviously, is stronger than the former. In proving (19a) from the modified (S$'_{12}$), the condition (b) must be used. Thirdly, our $I_p$ does not occur as
a theory element according to the original S-concept. In dropping it, the last axiom of (19) has to be replaced by

$$I_{pp} \in A \qquad (19\mathrm{b})$$
Furthermore, we have omitted axioms corresponding to (14') of the L-concept and expressing the matrix character of $M_p$ and $M_{pp}$. They could be included only at the expense of the invariance property of the species of class structures representing the S-concept. Apart from these modifications of a rather technical nature there was, finally, the basic methodological distinction between a syntactical and a semantical version of the S-concept and the formalization of the latter. Insofar as this, too, is a deviation from the original S-program, it was dictated by the desire to have a basis of comparison of the S-concept with the L-concept of a physical theory. If it has the side effect of provoking a stricter articulation of the S-program on the part of its advocates, so much the better. Even the comparison with the L-program, for which the foundations have been laid in this paper but which remains to be done, might be a useful contribution to this end.
III.13 Towards a Rehabilitation of Reconstructionism*

I
The topic of my paper - towards a rehabilitation of reconstructionism - might lead you to expect that I would begin my exposition by establishing a connection with a certain philosophical doctrine called 'reconstructionism' that was once successful and is still well-known today. Then, you might expect, I would go on to remind you that this doctrine - like all philosophical doctrines - eventually fell into disrepute, upon which I would attempt to say something in favor of its rehabilitation. Although the situation to which I want to draw your attention is not quite as simple, I shall proceed more or less in the way the topic would lead you to expect. Only one point I want to clarify right away. It concerns the word 'reconstructionism'. For you might rightfully claim that you do not have any idea of what I am intending to refer to with this expression. After all, in this case we are not dealing with one of those familiar 'ism'-words with which, in the shortest of abbreviations, we refer to historically significant philosophical doctrines. As far as I know, our word occurs only in Gustav Bergmann's writings, where it refers to the thesis regarding the possibility of a reconstruction of traditional metaphysics in terms of an ideal language - a thesis which has probably not been advocated by anyone except by Bergmann himself. 1 Far from wanting to defend or even consider this thesis myself, with this reference to Bergmann I am nevertheless entering the context within which I intend to move in this paper. It concerns the well-known metaphilosophical view, sometimes attributed to the early Wittgenstein, which holds that essentially philosophy can only be a critique of language. In particular, it concerns the scientistic version of this view associated with the notion of an ideal language. 
According to this view, it is (1) the task of philosophy to provide a so-called rational reconstruction of the formal and empirical sciences, and it is (2) the method to be followed in this regard to provide a logical analysis of the material at hand. This version of the linguistically oriented philosophy was above all developed by logical empiricism, and due to the procedure just mentioned it is today generally grouped with analytical philosophy. On account of its proper task, however, it could have just as well or perhaps more appropriately been termed a reconstructionism. And since I want to deal with the latter aspect of the matter - that is, with the idea of a rational reconstruction - I have taken the liberty of choosing the term in question for the reconstructionist program of logical empiricism. The grammatical justification, by the way, is provided by the fact

* Originally published as Scheibe 1984a, translated for this volume by Hans-Jakob Wilhelm
1 Bergmann ²1967, p. 32
that the expression 'rational reconstruction' has actually been used, even if only occasionally and without a thorough definition of the concept. For if we now look back, after this preliminary clarification, to those thinkers who introduced or adopted this expression and who explicitly commented on it, we find first in Carnap a characterization of his constitutional system of 1928 as "a rational reconstruction [Translator's note: Carnap uses the word 'Nachkonstruktion' rather than 'Rekonstruktion'] of the entire structure of reality which in cognition is for the most part built up intuitively"². The obviously psychologizing and, from a rational perspective, at the same time disqualifying use of the word 'intuitive' already indicates what is more precisely expressed in another of Carnap's formulations which states "that the constitution is not supposed to represent the actual process of cognition in its concrete characteristics, but is supposed to reconstruct [Translator's note: 'nachkonstruieren'] it rationally in its formal structure"³. Only a few years later, Popper, appropriating Carnap's coinage, expressly limited the domain of that which is rationally reconstructible to the final stage of a process of cognition, that is, to the examination of a researcher's idea: "In so far as the scientist critically judges, alters or rejects his own inspiration we may ... regard the methodological analysis undertaken here as a kind of 'rational reconstruction' of the corresponding thought-processes. But this reconstruction would not describe these processes as they actually happen: it can give only a logical skeleton of the procedure of testing. Still, this is perhaps all that is meant by those who speak of a 'rational reconstruction' of the ways in which we gain knowledge."⁴
Fundamentally along the same lines, again a few years later, Reichenbach writes: "Epistemology does not regard the processes of thinking in their actual occurrence; this task is entirely left to psychology ... Epistemology ... considers a logical substitute rather than real processes. For this logical substitute the term rational reconstruction has been introduced ... It is ..., in a certain sense, a better way of thinking than actual thinking. In being set before the rational reconstruction, we have the feeling that only now do we understand what we think ..."⁵ And still decades later Carnap explicated his old concept by saying that a rational reconstruction [Translator's note: 'Nachkonstruktion'] is "the search for new determinations for old concepts. The old concepts usually did not arise ... through reflective formation, but through a spontaneous development. The new determinations are supposed to be superior to the old ones in terms of clarity and exactness ... Such a clarification of concepts ... still seems to me to be one of the most important tasks of philosophy ..."⁶.

2 Carnap ²1961a, p. 139
3 ibid. p. 191. See also Carnap ²1961b, p. 300ff.
4 Popper ²1973, p. 6f. (1959, p. 31f)
5 Reichenbach 1938, p. 5f (1983, p. 3)
6 Carnap ²1961ab, p. IX
It cannot be said of these explications that they provide a clear idea of the concept of a rational reconstruction, and this situation cannot be improved upon by searching for further explicit statements. For there are scarcely more than I have cited. From these few remarks, however, we can gather the following with sufficient clarity: A rational reconstruction is the result of a process in which something that is to be reconstructed in a given case is replaced by something else, that is, by the reconstruction. And that which is to be reconstructed is the truly primary cognitive reality. It is primary in the sense that the associated cognitions, thoughts, ideas, etc. arise spontaneously, develop further in an uncontrolled manner, and are as such the first thing one encounters when one seeks to undertake a reflection, an analysis, or something of that kind. The view is that this primary cognitive reality - which is often classified as psychological (when it pertains to the individual) or as social (when it pertains to more than the individual) - is not a legitimate or even a possible object of philosophical epistemology or of the philosophy of science. Hence it is replaced by a kind of logical idealization in the widest sense of the word, an idealization which counts as its rational reconstruction. And this is a better world than the primary and still confused psycho-social tangle. As I have already stated, this leading idea of orthodox reconstructionism which we have thus recalled leaves many issues in obscurity. In particular, we shall have to ask, to what extent the psycho-social facts, from which these advocates of reconstructionism wish to disassociate themselves, are not merely replaced with the logical fictions, but are reconstructed by means of them. It is true that one must not judge reconstructionism merely on the basis of its program. Rather, one must evaluate above all the body of work that has been accomplished in its execution. 
For the purposes of my historical introduction, however, it may suffice to recall the program, in order also to recall the fact that in the last two decades reconstructionism has had to fend off heavy criticism from the most diverse directions. Within the confines of this paper it is impossible to provide an overview of this criticism. For this reason, I wish to single out only a partial aspect which seems to me to be particularly representative. In a rough preliminary formulation, I am referring to the peculiarity that on the one hand reconstructionism is charged with being too descriptive, while on the other hand it is claimed that it is not descriptive enough. Critics who pursue the latter direction refer above all to the history of the acquisition of knowledge, in particular to the history of the natural sciences, and attempt to show that the decisive steps in the development of physics, for example, do not occur in the rational reconstructions of the theoreticians of science at all, while conversely these reconstructions are nowhere to be found in the reality of science. With respect to this reality, the reconstructions are misleading distortions and at best irrelevant. Thus Toulmin writes⁷ that "... Carnap's system of inductive logic was expounded not in terms of real

7 Toulmin 1972, p. 62
life scientific examples but in a formalized logical symbolism whose relevance to actual scientific languages was always assumed, never demonstrated." In Kuhn's judgment as well⁸, the reconstructively oriented philosophy of science misses what is essential. Focussed primarily on textbook accounts and historically at best on a few classics of science - Galilei, Newton, etc. - "the philosopher's reconstruction is generally unrecognizable as science to either historians of science or to scientists themselves". Yet Kuhn's criticism is not directed against rational reconstructions as such: "Both historians and scientists can claim to discard as much detail as the philosopher, to be as concerned with essentials, to be engaged in rational reconstruction. Instead the difficulty is the identification of the essentials. To the philosophically minded historian, the philosopher of science often seems to have mistaken a few selected elements for the whole and then forced them to serve functions for which they may be unsuited in principle and which they surely do not perform in practice ...". Kuhn thus takes historically relevant rational reconstructions of science to be possible and he claims - quite rightly in my view - the business of reconstruction also on behalf of the scientific specialist. Nevertheless, for his own reconstructions he claims precisely what orthodox reconstructionism expressly wants to eliminate with its reconstructions: "The explanation [of scientific progress]", it says in one passage⁹, "must, in the final analysis, be psychological or sociological. It must, that is, be a description of a value system, an ideology ... Knowing what scientists value, we may hope to understand what problems they will undertake ... I doubt that there is another sort of answer to be found".
Besides the criticism which reproaches orthodox reconstructionism for being unrealistic, there is also, as I said earlier, the opposite reproach that it clings too anxiously to the factual state of science. This criticism is voiced especially by the constructive theory of science which seeks to differentiate itself as a normative theory from a purely descriptive theory of science, as which it classifies our reconstructionism. Thus, from this side, it is a reason for complaint - I am quoting from a book by Janich, Kambartel, and Mittelstraß¹⁰ - that "descriptive theory of science ... [chooses] the actual practice of science as the starting point of its reflections", that "theories from the specialized sciences are adopted almost ready-made", and that the "question regarding the acceptance and testing of the theories ... [is] always presupposed as having been answered positively". All this is objected to against the background of the possibility which the normative theory of science now wants to seize: Rather than "presupposing the validity of [scientific] theories, [it] ... first of all wants to make it comprehensible". While maintaining a critical stance, it wants to discover theories "in which [a closed methodical] structure does not exist or for certain intelligible reasons cannot be supplied".
8 Kuhn 1977a, p. 14 (1977b, p. 65)
9 Kuhn 1970, p. 21
10 Janich et al. 1974, Ch. II.1
And for this type of normative critique of existing science, the theory takes as its starting point the question concerning the purpose of science which is completely neglected by descriptive theory of science. In a recently published paper, Mittelstraß¹¹ has traced the development of orthodox reconstructionism somewhat more thoroughly and has admitted that Reichenbach "expressly [undertakes] an extension of the descriptive content of rational reconstructions by means of a critical content, i.e. a critical analysis of science". He could have added that Reichenbach grants the theory of science also an advisory voice aside from its descriptive and critical one. Having arrived at Stegmüller¹², however, Mittelstraß can quote: "The theoretician of science does not question the existing sciences" and "the question, whether 'there are' physical sciences in the sense that these disciplines do not merely exist historically ..., but are justified in their existence ... is no longer a meaningful question". Thus Mittelstraß summarizes: "Reichenbach's program of a critical analysis of science is again withdrawn (in favor of the fact of science)". Hence he gives the relevant section of his work the humorous title "From Carnap to Carnap". Thus there remains the reproach that rational reconstructions are simply set up in such a way that the result is just what one wants and what one wants is just the existing sciences.
II
So far I have sketched the orthodox reconstructionism of the 1930's as well as the critique that it has received lately, and now, in accordance with my announcement, I would have to proceed towards its rehabilitation. Such a rehabilitation might seem tempting if only for the reason that the critique mentioned arrives at two such opposing and almost contradictory assessments. This circumstance alone seems to indicate that something is wrong and that there should be a return route to the original idea of a rational reconstruction. I shall not pursue this as my main route, however, but at most by way of an excursion. For I am not concerned literally to rehabilitate old ideas, but rather to carry them further by adopting those features that seem to me to be capable of further development and by taking up those criticisms that I have found to be valid. During the last few decades philosophy has not remained without extravagance in places where it dealt with natural science, or where - explicitly or not - it measured itself against it. Some have wanted to end philosophy altogether, others tried to justify its exaggerations. Anti-orthodox currents have been described as revolts, attempts have been made at a logical reconstruction of the metaphorical usage of the concept of revolution, and epistemology has produced an anarchist variation. Remarkably, all this has occurred - as mentioned earlier - precisely in close proximity to the most stable sciences that we have. Yet it is just here, more than anywhere, that one could have learnt how progress is made in science. In dwelling on the idea of a rational reconstruction, I shall expressly follow the principle of viscosity that Kuhn has so convincingly worked out for the natural sciences. Applying his well-known terminology to our subject matter, we could say that the following reflections are normal theory of science or perhaps better: normal metatheory of science.
11 Mittelstraß 1981, pp. 90ff
12 Stegmüller 1973, p. 23f
In ordinary language, the word 'reconstruction' often refers to the restoration of a former state of affairs or to the result of such a restoration. If that which is to be reconstructed is a thing, for example, a Roman camp, then it concerns the reconstruction of its original state either in reality or in the form of a model. In the case of a corrupted text, we are also dealing with a reconstruction of its original state. Similarly, one can try to reconstruct a past conversation from memory. But at times we say even of events, of a battle or a robbery, that they are reconstructed, be it in terms of a model or merely linguistically. Even though such historical reconstructions - as one could call them - are in a way only borderline cases of rational reconstructions in the sense sought after here, we can nevertheless already discern in them a basic structure of the latter: Among other things, we find in them about half a dozen essential components which I now want to sketch with the help of the initial example and in the form of the intended generalization. Fundamentally there are two reconstructional partners: on the one hand, we have that which is to be reconstructed in a given case - the original to be reconstructed -, and, on the other, its reconstruction in the sense of the result of a process which in turn is itself sometimes called a reconstruction.
Where possible, I want to avoid the latter way of talking and speak of a reconstruction only in the narrower sense of the result of the process of reconstruction, such as, in the case of deciphering a mutilated text, the new text which is offered as a reconstruction. While a historical reconstruction, if it is successful, has in a certain sense already existed, albeit not as a reconstruction, this is of course not necessarily the case in general. A painting depicting a landscape, for example, is a reconstruction of the landscape according to certain artistic principles - a rebuilding of it according to different rules. If such a painting is an original as a painting, that is, if it is not a copy, then it is of course a reconstruction but not a mere repetition of the landscape. In this example, I have already spoken of a third and - between the lines - of a fourth component of a reconstruction. A reconstruction is based on a certain guiding idea or a principle which determines how it is to be prepared and in what relation it stands to its original. In the case of a historical reconstruction, the principle states that the reconstruction is to be rendered so as to resemble the original as closely as possible, and the relation that matters here - according to which one judges a historical reconstruction - is the relation of similarity in the sense of an accordance with the original. In the more general case, we are dealing with different principles and different relations. To be sure, one will always expect a reconstruction to bear a certain resemblance to that which is reconstructed. But this is in general not the relation that matters, that is, the relation which together with the principle under which it falls makes for the respective 'rationality' of the reconstruction. Thus, in the already cited example of the landscape painting, what matters is not that the resemblance with the represented landscape be as great as possible, even though with respect to what matters the resemblance with a landscape as such is essential. To give another example which shall be considered in more detail later, for the reconstruction of a superseded physical theory within a new theory that replaces it, the resemblance with the former is of such little importance that concerning the progress made with the new theory what matters are rather the deviations of the reconstruction from its original. Yet, in another regard, even here it is only in conjunction with the agreements that these differences amount to what is essential in this case, that is, the idea of progress. With this example again I have had to borrow from a further component of a reconstruction: the respective context or frame into which it is set. And just as important - finally - is the context or frame from which it was taken. Regarded in isolation, the previously given examples of historical reconstruction - the model of a camp, the emended text, the reproduction of a conversation etc. - may resemble their originals as much as possible; nevertheless they no longer belong to the context in which the originals were found. They have been removed from their natural environment, so to speak, and now appear as reconstructions in an artificial environment that was explicitly intended for them: in a museum, in a historical-critical edition, in memoirs.
Historical reconstructions obviously have limits and they transplant their originals into environments in which, despite all the similarities, they are also always exposed to the wrong kind of light. But even here matters can be quite different when other types of reconstruction are concerned. What in terms of the intention pursued in a historical reconstruction appears as a necessary failure, in other cases can be just what one intends to achieve. In the example from physics, the relevant new theory is not something one has to accept for lack of anything better. Rather, it is that, in light of which one wants to see the theory that has been superseded. This function of the relevant new context can perhaps be seen most clearly in the type of reconstructions known as conceptual explications. For with these the intended clarification of concepts is achieved precisely by the fact that the frame into which the explication is set has a sharp demarcation towards the outside as well as a high degree of order in the interior and that one is familiar with both. It is like taking a flower from the wilderness and planting it in a well-determined spot in one's garden. I began with a consideration of historical reconstructions because of a common meaning of the word. But for our purposes these historical reconstructions are relatively unimportant as long as this meaning is simply left in its original state and is not metaphorically or otherwise extended. Historical reconstructions in the most narrow sense are not already rational reconstructions of history or even of one of its moments. Indeed, extensions of this concept have been attempted, for example, in Schleiermacher's well-known hermeneutical formula which states that understanding is "the historical and the divinatory, the objective and the subjective reconstruction [Translator's note: 'Nachkonstruieren'] of a given discourse" with the aim "of first understanding the discourse just as well as and later better than its author". Here an improving reconstruction is recommended via an intermediary historical one, and thus the latter gains considerable weight. This raises the question, whether reconstructions do not generally have a historical character. Inasmuch as this is a theoretical question at all, I am not able to answer it. Inasmuch as it concerns the institution of a new concept of a rational reconstruction, I want to emphasize that with the subsumability of historical situations the most important, albeit, as we shall see, not the only, extension of the orthodox concept of reconstruction has been achieved. The concept of a reconstruction, introduced by means of the six essential characteristics just reviewed and explicated with the help of rather simple-minded examples, is still very general and does not yet name a characteristic of a reconstruction which would make it a rational one in a narrower sense. Yet this extension is quite intentional and its purpose is to exercise an integrating effect on the numerous actually or only apparently diverging efforts in the theory of science of recent times. Far from wanting to assume the particular reconstructionism of the logical empiricists as the absolute standard, I nevertheless want to defend its reconstructionist basic tendency and show that it does not stand as isolated as it has seemed to some critics.
Although for this purpose it is useful to maintain a high level of generality and also a certain vagueness of the concept of reconstruction, since it allows us to bridge great distances and bring out family resemblances, I also want to gain determinacy by expressly limiting the domain of application of the concept of a reconstruction. As far as science is concerned, the domain of application shall comprise logic, mathematics, and the natural sciences, and of these possibly only physics and chemistry. From the start, however, these sciences shall be comprehended also in their historical dimension. Given the overall sense of my discussion, it goes without saying that the periphery of this domain of application will have to be rounded off with the efforts hitherto made by philosophers and theoreticians of science, regarding the disciplines mentioned, as well as with our relevant pre-scientific intellectual equipment and ordinary life experience.
III
With this domain of application in mind I now want to review some of the main types of reconstructions, beginning with what is probably the best known type, the so-called conceptual explications. As was already cited, this concerns "the search for new determinations for old concepts" where "the new determinations [are supposed to] surpass the old ones in clarity and exactness and above all [are supposed to] fit better into systematic conceptual structures".¹³ According to this characterization, the general idea of reconstruction is the clarification of given concepts, and the distinguishing feature of the frame into which a given concept is to be reconstructed is the system of a conceptual structure. Details of the procedure of conceptual explication were described long before the empiricists, and it is not with the intention of detracting from their accomplishments, but rather in order to counteract the view suggested by Feyerabend that the new theory of science is "a hitherto unknown form of madness"¹⁴ that I now briefly want to refer to Kant. In his Preisschrift (of 1763) entitled "On the Distinctness of the Principles of Natural Theology and Morality"¹⁵, Kant compares the "manner of achieving certainty in mathematical cognition with how it is achieved in philosophical cognition". The crucial point for us is the position which Kant accords to the definition of a concept in the philosophical method as contrasted with its position in the mathematical method. He explains:
    In mathematics I begin with the explanation of my object, e.g. a triangle, a circle etc. In metaphysics I must never begin with it, and it is so far from being the case that the definition is what I first discover about a thing, that it is rather almost always the last. For in mathematics I do not have a concept of my object at all before the definition provides it; in metaphysics I have a concept that has already been given to me, albeit in a confused way, and I must discover the clear, detailed, and determinate concept of it.
The fact that in philosophy, for example, one does not have the definition of time as readily available as the definition of a circle in mathematics prompts Kant to look for a kind of intermediary state, something which today we call the adequacy conditions of an explication. He says:
    In philosophy ... one can often know much about an object with clarity and certainty ... before one has the definition of it, even when one does not undertake to give the definition at all. For I can have immediate certainty regarding several predicates of every thing, even though I do not know enough predicates in order ... to give the definition.
The methodological part of Kant's Preisschrift ends in a specification of two rules, the first of which states that "one should not begin from explanations [= definitions]", while the second recommends in a positive sense to grant a special status to the "immediate judgments of the object" mentioned in the previous quotation and "thus to [premise] them like the axioms of geometry as the basis of all inferences". In light of Kant's text, we are reminded of the recent attempts at conceptual explication beginning with Tarski's definition and T-scheme for the concept of truth, Hempel's and Oppenheim's efforts regarding the concept of explanation, where as a rule one prefers to provide the HO-scheme rather than a definition, and finally the most recent and so far not very successful attempts at defining the concept of truthlikeness, where - conversely - the conditions of adequacy are lacking.¹⁶ In light of these attempts, it should be emphasized that Kant himself says of his two rules that they are "quite different from those which were hitherto followed, and, if applied, they promise such a successful outcome as could never have been expected in following another route". It is of no consequence to us that Kant claims his new method of conceptual explication only on behalf of philosophy and that as an analytical method he expressly distinguishes it from the allegedly synthetical definitions of mathematics. This opposition springs from a view of mathematics, already outdated at Kant's time, which was too narrowly focussed on geometry while ignoring the situation in arithmetic, algebra, and analysis. Had Kant paid more attention to these areas of mathematics, he could have noticed, as Berkeley already had before him¹⁷, that there existed a conceptual chaos comparable to the chaos known in philosophy. Many areas of the mathematics of the 19th and the early 20th century were dominated by efforts to establish a conceptual order, and the axiomatic method triumphed in the final stage of this development.¹⁸
13 no. 6. For further discussion of this topic see Carnap ²1962, Ch. 1.
14 Feyerabend 1973
15 Kant 1764, p. 283f and 285
In any event, it is only if one regards the domain of application, demarcated earlier, as a unity that one will be able to speak, if not of a successful outcome, then at least of successful progress in the matter even (and particularly) in questions of the philosophy of science. It is only if one ignores this that the impression of scientific irrelevance can arise with which - as was cited at the beginning - the critics have reproached the more recent explicatory attempts of the logical empiricists. For in the said domain there exists a systematic continuity, in particular with respect to the degree of explicitness with which the whole is worked out, a continuity which does not allow one to separate without considerable arbitrariness certain parts as irrelevant for the rest. One can imagine a continuous path beginning from the concreteness of our sense-impressions and extending all the way to the abstractness of logical inference, a path which may be cut off at any point with just as much or as little justification as at any other point. If we start, for example, from our sensation of heat, then in a first step this sensation finds its explication in instruments for the measurement of temperature. The logical empiricists said about this that the ordinary language concepts for warm and cold are explicated in terms of the concept of temperature. This, of course, has nothing to do with philosophy. Rather, here we are only at the stage of experimental physics, and this explication belongs to the latter. But then there is theoretical physics, and it tells us that temperature is the mean kinetic energy of the molecules. With this the concept of temperature is introduced into a complex mesh of theories and it becomes possible to determine even remote temperatures: by means of measurement and calculation. At this stage of the explication of the concept of temperature a lot of mathematics is already involved, and with it - in the final step - logic. In particular, there is no indirect determination of temperature or of any physical quantity, where the relevant value is not eventually calculated and hence inferred. Where would one non-arbitrarily end this path before one has at least reached logic? Humanity managed to live long enough without thermometers. Even today there are experimental physicists who face theories only with great skepticism. Again and again one encounters theoretical physicists who are prepared to make only the most sparing use of mathematics, and many mathematicians from Descartes until this day wanted and still want nothing to do with logic.
16 Tarski 1936. Here we are given definitions (for several object languages) as well as conditions of adequacy for the concept of truth. - Hempel/Oppenheim 1948. In this work on explanation we are given, besides conditions of adequacy, a definition of the concept of explanation ((7.6) in conjunction with (7.8)). Due to a great number of difficulties a second attempt of this kind has never been seriously undertaken in the extensive literature that followed. - We find the opposite situation in the attempts at explicating the concept of truthlikeness. For an overview see Niiniluoto 1978.
17 See Berkeley 1951 and the subsequent articles in the same volume.
18 See, for example, Kline 1980.
From this we can see that nothing really unheard-of is happening when now the attempt is made, on the part of the theory of science, to give certain meta-physical concepts a precise logical status by means of explication, nothing that would justify demanding a legitimation which would not be appropriate anywhere else. It is not as if, for the theory of science that is concerned with it, natural science were a foreign land with different laws. Rather, a large part of the explicative work in particular occurs within these sciences, and here one should always remind oneself of the mediating role of mathematics.
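The theoretical step in the temperature example of this section can be made concrete by recalling the standard kinetic-theory relation. What follows is only a sketch under a stated assumption, namely an ideal monatomic gas; the symbols m (molecular mass), v (molecular speed) and k_B (Boltzmann's constant) are introduced here for illustration. Strictly speaking, temperature is then proportional to the mean kinetic energy of the molecules rather than identical with it:

```latex
\[
  \langle E_{\mathrm{kin}} \rangle
    = \tfrac{1}{2}\, m \,\langle v^{2} \rangle
    = \tfrac{3}{2}\, k_{B} T
  \qquad\Longrightarrow\qquad
  T = \frac{m \,\langle v^{2} \rangle}{3\, k_{B}} .
\]
```

Read from right to left, the relation shows in miniature how a temperature value is calculated, and hence inferred, from other quantities once the concept has been embedded in a theory.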
IV
The situation is similar with respect to the second main type of reconstruction to be briefly presented here: the reductive reconstructions, as I want to call them. This type is not sharply marked off from the explicatory type of reconstruction. Its principle, however, is not one of clarification and specification, but rather of the reduction of, for example, one concept to another concept or of one theory to another theory. For if, for example, a theory is reduced to another theory, there exists always as a third element the reconstruction of the reduced theory in the reducing theory. In terms of how I put it earlier, the reduced theory would be the original, the reducing theory would be the frame of reconstruction, and the reconstruction would be the form in which the reduced theory lives on in the reducing theory. Reductions often have a historical interpretation, and here we find ourselves on the already mentioned territory interpolated between historical and rational reconstructions of the history of science. Examples of historically important reductions from our domain of application are the reductions of Kepler's laws to Newton's gravitational theory, of chemistry to physics, of geometry to arithmetic, and of arithmetic to set theory. Physicalism is the philosophical anticipation of a reduction of psychology to physics. A miniature example that is especially transparent is the reconstruction of the Aristotelian assertoric syllogistic in the predicate or quantificational logic established by Frege. The germ of this reconstruction is the re-interpretation of the concepts that A belongs to all or to some B through the concepts that for all x to which B applies A applies as well or that there exists an x such that A and B apply to it. This is not properly speaking a conceptual explication as might initially be supposed. It is in the final analysis not any less clear to say that A belongs to all B than it is to say that for all x to which B applies A applies as well. Rather, the real achievement of reconstruction was the demonstration that and how Aristotelian logic, which Kant still held to be incapable of extension, is incorporated into the new, significantly extended logic - how it reappears in it. We know that in this reconstruction Aristotelian logic does not quite remain what it was. Even leaving the deeper interpretative questions aside, the fact remains, for example, that the inference from 'A belongs to all B' to 'A belongs to some B' is lost.
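The loss of this inference can be checked mechanically on a small scale. The following is a minimal sketch and not part of the original argument: it reads the two Aristotelian forms in the quantificational way just described, over a finite domain in which the concepts A and B are modelled as sets. The set-theoretic modelling and all function names are assumptions made purely for illustration.

```python
from itertools import product


def all_belongs(A, B):
    """'A belongs to all B', read as: for all x, if B applies to x then A applies to x."""
    return all(x in A for x in B)


def some_belongs(A, B):
    """'A belongs to some B', read as: there exists an x to which both A and B apply."""
    return any(x in A for x in B)


def subsets(domain):
    """Yield every subset of a finite domain as a frozenset."""
    items = sorted(domain)
    for mask in product([False, True], repeat=len(items)):
        yield frozenset(x for x, keep in zip(items, mask) if keep)


def counterexamples(domain):
    """Pairs (A, B) for which 'A belongs to all B' holds but 'A belongs to some B' fails."""
    subs = list(subsets(domain))
    return [(A, B) for A in subs for B in subs
            if all_belongs(A, B) and not some_belongs(A, B)]


def barbara_holds(domain):
    """Check the syllogism Barbara: A to all B, B to all C, hence A to all C."""
    subs = list(subsets(domain))
    return all(all_belongs(A, C)
               for A in subs for B in subs for C in subs
               if all_belongs(A, B) and all_belongs(B, C))


if __name__ == "__main__":
    for A, B in counterexamples({1, 2}):
        print("all-but-not-some:", set(A), set(B))
    print("Barbara valid on this domain:", barbara_holds({1, 2}))
```

On a two-element domain the only counterexamples are those in which nothing falls under B, which is exactly the case the quantificational reading admits and the traditional reading excludes, whereas a syllogism such as Barbara survives the translation unchanged.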
Yet there is no doubt that with this reconstruction we find before us in the new logic that which if anything at all corresponds to Aristotelian logic. Since through Fregean logic we have also become acquainted with completely new logical laws, the transition to it appears as a particularly clear case of a linear succession of theories which represents a progress because it essentially preserves the old while adding something new.¹⁹ In other cases the situation is not at all that simple, even if we restrict ourselves to mathematics and logic. The mentioned reconstruction of Euclidean geometry on the basis of arithmetic, for example, deprives geometry of the basis for its existence as a theory of space - namely of space. Thus the continuing existence of geometry on a purely arithmetic basis can be understood only against the background of a significant change of contexts: geometry as a theory of space is no longer the paradigm of a mathematical theory and space is transferred to the purview of physics. This is not a linear progress, but rather the splitting up of what had hitherto been regarded as a unified theory into two essentially different disciplines: empirical physics and a mathematics that has become more abstract. And yet the latter for its part has not simply dissolved into the new logic: The logicist program was a typical reconstructive program of the reductive variety. Its failure showed, however, that the new logic could not even be conceived as a successor theory to the de-geometrized mathematics. From the point of view of academic mathematics, if anything, it is the axiomatized set theory which could count as such a successor theory, provided that one keeps in mind Gödel's theorems of limitation. The failure of the logicist program, the reductive successes in set theory, but even the limits of the latter demonstrated by the theorem of incompleteness are the most impressive examples of reconstructive ideas with a very precisely defined frame of reconstruction and a fairly precisely defined reconstruendum. The degree to which the relevant task of reconstruction is binding is underlined precisely by the negative results.²⁰
19 The first comprehensive reconstruction of Aristotelian logic within the framework of the new logic is found in Łukasiewicz 1951. For the historical reconstruction of Aristotelian logic see Patzig 1959. For the idea that the mentioned deviations can also have repercussions for modern logic see Lambert 1967.
Proceeding from the formal sciences to the natural sciences, we are entering the proper playground of the most recent confrontations with logical empiricism and critical rationalism. Aside from the reconstructionist method, what has been attacked in substance is especially the idea of a unified science on the basis of a universal criterion for the empirical success or failure of the efforts of the individual sciences. It is impossible to characterize in brief the phalanx of counterpositions any more closely than through catchphrases such as scientific revolution, semantic change, incommensurability, Kuhn-losses, rationality gaps, theoretical pluralism, anything goes and so forth.²¹ I believe that with everything that has been presented in this context the philosophy of science thus attacked has been dealt a serious blow particularly in its monistic tendency and that in this respect one must really moderate one's demands.
Yet it is possible to moderate one's demands, if one withdraws to a somewhat more liberal reconstructionism which is especially well represented by the reductive reconstructions. If for clarity's sake we imagine for a moment the two levels of science and of the theory of science as separate and first step onto the level of the theory of science, then we would find as a monistic postulate in an objective regard, for example, Carnap's (early) demand to continue the practice of philosophy only as a syntax of the language of science. A consequence of this, one that was not made very explicit, is the idea that a rational reconstruction is really only the establishment of a language of science through the explication of given concepts. Yet, as I am about to show, there are very different types of reconstructions which all have their different respective theoretical function. The function of the reductive reconstruction is just a kind of indemnification for the failed monisms at the scientific level - at least in an objective regard, that is, in light of the plurality of scientific theories. For it is just the reductive reconstructions which at least ensure the historical continuity of physics, for example, and beyond that also a certain convergence of its efforts. It is true that some of the catchphrases just listed suggest that in so-called scientific revolutions certain reconstructions of the superseded view do not succeed in the framework of the new view. At this time, that is, not even half a century after the first attempts at reductive reconstructions were presented, the value of such a claim including the accompanying arguments is no more than a challenge to work out the relevant reconstructions more precisely.²²
20 For the development sketched here see the book cited in no. 18. Particularly useful for the purpose of a systematic comparison is Fraenkel et al. ²1973.
21 For an overview see Suppe 1974.
V
Without claiming that my list is complete or that it represents a well thought-out system, I still want to mention a third main type of reconstruction: the descriptive reconstructions. A descriptive reconstruction is a description in the ordinary sense in which we describe to someone a route he wants to travel, or a city which we have just seen, but also in the more sophisticated sense in which we say that classical celestial mechanics describes the planetary system and quantum mechanics describes the behavior of atoms in the Stern-Gerlach experiment, and finally also in the sense in which we or some of us say that the theory of science describes the natural sciences. I derive the right to call descriptions of this kind reconstructions above all from the fact that at least some descriptions bear those six characteristics which I established as the basis of the concept of a reconstruction. The fact that other descriptions lack some of these characteristics does not trouble me, since for these cases it can be demonstrated that just because of this lack we know much too little about what we are really doing when we give such descriptions. Not least because of this we speak of descriptions in these cases, since we have grown accustomed not to demand too much of them. Indeed, a well-known development, unfortunate to my mind, has devalued the word 'description' in the sense that when we hear that someone is giving a description we all too readily associate with this the idea that he is giving merely a description and not also an explanation, let alone an understanding. Regarding this development, I can only say that I would count myself happy to know what it means merely to give a description of physics. Hence I take the liberty to suggest that, in those cases in which, for example, the frame in which a description is supposed to take place has not yet been sufficiently established, one should go out and find such a frame.
I recommend, in other words, that in such cases descriptions be completed as reconstructions. Regarding the so-called description of nature offered today by physics and chemistry, it is not difficult to recognize its reconstructive character. Already if we abstract from what I established earlier, the word 'reconstruction' alone indicates that even in a description not everything remains as it was. Even if one accepts, as a regulative principle of the natural sciences, that they describe nature as it is, it quickly becomes apparent that in concrete research 22
A sketch of the situation and references are given in Scheibe 1982a.
this is not easily achieved. One of the operative principles of actual practice - perhaps the most important one - is, for example, the idealization in the sense of a conscious simplification of the actually obtaining conditions. In an experiment the object is isolated and prepared, and these two steps are just the extraction from a primary and inaccessible environment and the transfer into a precisely known and controllable context on the basis of well-formulated questions. The physical theories in turn are adapted to these questions and they often contain additional simplifications which take into account the mathematical possibilities available in a given case. And finally, theories are subject to the requirement that they make possible explanation and prediction: A theory is not just any kind of description, but one that makes this possible. This emphasizes once more the reconstructive character of the description of nature: Nature, as it is described in physics, is something other than mere nature; it is more, but also less than the latter. The view of the physical description of nature as a complex system of reconstructions serves to make intelligible the difficulties with which the theory of science is forced to struggle, even if it considers science merely under the aspect of delivering descriptions of nature. These difficulties have prompted some to drop the descriptive aspect altogether - at least to the extent that it is tied to a correspondence theory of truth. The same view (of descriptions as reconstructions) also throws some light on the situation we encounter when we move to a higher level of reflection and ask ourselves what it would mean to say that the theory of science gives a description - merely a description - of the natural sciences. 
The constructive theory of science has held the view that the analytic theoretician of science approaches historically given theories with the intention of "providing an elucidating description of them, an intention that bears a resemblance, by no means accidental, to the attitude of modern natural scientists who - at least according to their own conception - also approach nature with a descriptive intention".23 As we saw at the beginning, this purely descriptive attitude is criticized in favor of a normative orientation of the theory of science. A reason to do so, however, exists only as long as one takes the word 'description' in its naive sense. Yet, as I have noted, this is not an option even for the kinds of descriptions which - at a lower level - are given by the natural sciences. I think that with this presupposition we may very well orient ourselves to some extent along the lines of the procedure of the natural sciences when considering the question regarding the nature of descriptions in the theory of science. Aside from the question regarding global objectives which I do not consider here, a theoretical description of science as a reconstruction is not in danger of becoming uncritical just because it brings along with it certain reconstructive principles and ready-made frames of reconstruction. Even for the theory of science it is true what Kant said about the natural sciences: It is true that it allows itself to be instructed by its object, but it does not
23 See p. 24 of the book cited in no. 10.
allow itself "to be kept in leading-strings, as it were". Rather, it confronts the object as "the appointed judge", who compels it to answer questions which he himself has formulated. 24 Lakatos has identified the two poles of the tension resulting from this as the theory of science and the history of science. Modifying another Kantian slogan, Lakatos has characterized the situation as follows: 25 "Philosophy of science without history of science is empty; history of science without philosophy of science is blind." The capacity of vision of the historian of science rests on the fact that "philosophy of science provides normative methodologies in terms of which the historian reconstructs 'internal history' and thereby provides a rational explanation of the growth of objective knowledge". Conversely, the theory of science has empirical content because "two competing methodologies can be evaluated with the help of (normatively interpreted) history". Because of these normative guidelines, descriptions as reconstructions are rather more exposed to the danger, also mentioned at the beginning, of becoming not descriptive enough. With the general concept of a reconstruction I have so far only introduced one virtue which might contribute to the constitution of a rationality: The constituent parts of a reconstruction presented include, if one knows them all, a reflection with which one approaches the old Socratic virtue of knowing what one knows and what one does not know. But there is also another constituent part which was claimed at the beginning in the quotations concerning orthodox reconstructionism and on behalf of which Lakatos has argued under the formula that a rational reconstruction of the history of science must not deal with the persons who bring about the science, but rather only with the scientific products which these persons deliver. 
26 In my view, the mistake inherent in that claim of the logical empiricists and in the formula of Lakatos was not so much the thereby attempted delimitation of the theory of science as the name that was given to it. For neither are the elaborations of the orthodox theory of science reconstructions of what scientists are actually doing, nor is it irrational to produce reconstructions of that. Carnap wanted just as little to reconstruct what physicists are actually doing as Frege wanted to reconstruct what mathematicians are actually doing. Kuhn, on the other hand, has provided just such reconstructions for the natural scientists. He will now have to acknowledge that his complaint, of not being able to recognize science as he sees it with the eyes of a historian in the reconstructions of the theory of science, could with equal justification be handed back to him. For his claim that physicists, for example, as a rule accept a theory long before its standard tests are known, is of no consequence as far as the theory of science is concerned, as long as he does not also say why this is so. As long as we do not know that, we are like the fooled spectators who are left to ask themselves how the magician 'really did' his trick. 24
Kant ²1787, B XIII
25 Lakatos 1978, vol. 1, p. 102
26 ibid. vol. 2, p. 108ff
In order to feel dissatisfied with this state of affairs, one does not literally have to prohibit scientific magic, as the constructive theory of science would perhaps like to see it.
III.14 Paul Feyerabend and Rational Reconstructions*
The fact that this lecture is presented in a series with the general topic "Why philosophy of science?" requires that I explain to what extent I intend to answer this general question in the following remarks. In the present case, it is only fair that I say a word about this right at the outset. For I do not intend to answer or even take up the question, "Why philosophy of science?", at the level at which it is posed. Yet I hope that my discussion may count at least as a contribution - a contribution at a lower level, as it were - towards an answer to the general question. My aversion to face this question directly is probably based on the fear that were I to pursue it as a matter of course or even assiduously I might give you the impression of subscribing to the view that my publicly funded work in the philosophy of science demands a special justification - a justification which I would not be required to give, were I a professor of medicine, or a secretary of state, or a general. It seems likely that this impression would arise primarily among those who for their part are not required to answer this question. The majority of those for whom this question is really intended, by contrast, will simply not hold the view that they are obligated to legitimate their activity as philosophers of science. Some of them would even find it rather unpleasant if this misunderstanding were to arise. And it was in this sense that I just spoke of a fear. All this is directed only towards the outside: towards the so-called public. And it is vis-a-vis this public, to the extent to which it is represented here, that I have wanted to say right at the outset: I do not stand here because I think that my discipline unlike other comparable disciplines requires a justification towards the outside. Rather, I take this opportunity to make a few remarks on what is merely an internal debate, albeit a debate of a special kind. 
Every scientist knows that it is not only debates on matters of truth that play a role in science. We are not merely dealing with the rather harmless kind of dissent in which one person says A, while the other says not-A. It is essentially also a question of interests, not in the sense of an opposition of interest and cognition, but rather in the sense of an absolutism of interests in certain cognitions as opposed to others. And, of course, the question of interests is tied to the question of values. If certain interests are given absolute priority, so are certain values. In the case of competing interests, what is uninteresting for me is all too easily made out to be what is uninteresting in general, and what I consider to be unimportant is made out to be what is unimportant in general or even worthless. And this can become a matter of dispute. A dispute of this kind - an intellectual struggle of competing interests - has taken place over the last thirty years in the philosophy of science. This
* First published as Scheibe 1988f. Translated for this volume by Hans-Jakob Wilhelm
dispute involved an attack on the method of logical analysis and its aim of providing a rational reconstruction of science, as represented by the program of the logical empiricism of the 1930's and 1940's. The critics of this program went back to the history of the acquisition of knowledge, particularly to the history of the natural sciences, and attempted to show that decisive steps in the development of physics, for example, do not occur at all in the reconstructions of the theoreticians of science, while, conversely, substantial parts of these reconstructions are not to be found in the proper reality of science. As far as this reality is concerned, the reconstructions are said to be misleading distortions and at best irrelevant. Thus Toulmin writes 1 that "... Carnap's system of inductive logic was expounded not in terms of real-life scientific examples but in a formalized logical symbolism whose relevance to actual scientific languages was always assumed, never demonstrated." In Kuhn's judgment as well, 2 the reconstructively oriented philosophy of science misses what is essential. Focussed primarily on textbook accounts and a few classics of science, "the philosopher's reconstruction is generally unrecognizable as science to either historians of science or to scientists themselves". Yet Kuhn's criticism is not directed against rational reconstructions in the broader sense of the word. Even historians as well as scientific specialists provide such rational reconstructions, according to Kuhn. "Instead the difficulty is the identification of essentials. To the philosophically minded historian, the philosopher of science often seems to have mistaken a few selected elements for the whole and then forced them to serve functions for which they may be unsuited in principle and which they surely do not perform in practice". 
In this dispute, in which the theory of science has to choose its path between the conflicting ideals of logical reconstruction and a historically accurate representation of reality, Paul Feyerabend has played a prominent role. He also belongs to the critics of the orthodox theory of science, a theory which - in his words 3 - tells a "fairy-tale" about the genesis of scientific achievements. For well-known reasons, it is difficult to ascribe a definite position to Feyerabend and to give a consistent presentation of it. Hence I shall not even attempt to do this here. These same reasons give us the right simply to single out a few of the approaches developed by him (and other critics) and either proceed to a counter-critique or appropriate them for the purposes of further development. In this sense I now want to sketch an attempt at a mediation between, on the one hand, logical reconstructionism and the goal of the so-called unity of science (as two determinants of the method and content of logical empiricism) and, on the other hand, the anti-logicism - and to that extent anti-reconstructionism - of the said critical movement in conjunction with Feyerabend's so-called theoretical pluralism (understood as an antithesis to the unity of science). The following considerations represent a contribution
1 Toulmin 1972, p. 62
2 Kuhn 1977a, p. 14 (1977b, p. 65)
3 Feyerabend 1975, pp. 300ff (1976, p. 399ff)
to the question regarding the purpose of the philosophy of science inasmuch as the opposition which I take up is, as mentioned, not one between truth and falsity, but rather - or so it seems - one between what is to count as interesting and what is not, what is essential and what is not, what is relevant and what is not, and what goals philosophy can set for science.
II. Logical reconstructionism or - more precisely - the method of logical analysis for the purpose of a rational reconstruction of science has never been described by its positivist representatives in any greater detail. Even when the critique of Feyerabend and others had become known, this did not initiate a deeper reflection on the principles of logical reconstructionism. The only reaction was a more or less successful assimilation of existing reconstructions to one or the other of the counter-positions developed by the critics. In any event, in this lecture I can only call to mind what is probably the main antagonism. Let us thus first of all recall an attempted characterization of reconstructionism by Reichenbach: 4 Epistemology does not regard the processes of thinking in their actual occurrence; this task is entirely left to psychology ... Epistemology ... considers a logical substitute rather than real processes. For this logical substitute the term rational reconstruction has been introduced ... It is ..., in a certain sense, a better way of thinking than actual thinking. In being set before the rational reconstruction, we have the feeling that only now do we understand what we think. One typical reaction of the critics to the thus postulated separation of the logical order of justification and of the psycho-social order of discovery was to place both orders into the actual development of a science, of physics, for example. Once this step has been taken, then of course one can rightfully argue, as Feyerabend does, 5 that, for example, the principles of justification ... often prohibit steps in the history of science which today are ascribed to the order of discovery. Yet science only exists because one insists on these steps. Hence there is an interaction between the two realms which makes a strict distinction illusory. Or - almost the other way around - one can, again according to Feyerabend, 6 "understand 'logic' as ... 
the investigation of a certain system separated from history, [as] most logicians and theoreticians of science [seem] to think of it, when they use the word 'logic'." And hence the attempt at a reconstruction of science on the basis of this isolated logic may seem like a futile undertaking.
4 Reichenbach 1938, p. 5f (1983, p. 2f)
5 Feyerabend 1975, pp. 165ff (1976, p. 230)
6 ibid. p. 254 (p. 349)
With the help of two considerations, I want to defend logical reconstructionism against this kind of critique, that is, against both reproaches. 7 First of all, reconstructionism stands opposed to any kind of absolutist interpretation of science. It assumes that it would be an illusion to think that one could say with respect to the history of science that "this is how it really was". Of course, its opponents and every historian of science will say that they know that too. Yet they are much more in danger of falling prey to this illusion. For they argue against rational reconstructions: This at any rate is not what science looks like, and then they state what in their view was historically 'really' important. For the reconstructionist, by contrast, the decisive question is not the absolute one: Is this an adequate reconstruction of science? Rather, it is the relative question: Is this a reconstruction of a part of science in this particular frame of reconstruction (in a certain logic, for example)? The frame into which the reconstruction is set together with the principles according to which it is carried out are to be made as explicit as possible. And in the end, the only claim that is made when a reconstruction is presented is a conditional one: If we choose these and these frames and principles, then the part of science which served as our original takes on this and this shape. The reconstructionist is thus not interested exclusively in the relevant original, but rather in its relation to a means of representation which is to be made as explicit as possible, and his reconstructions are in a certain sense artificial products. The reproach that such an artificial product does not resemble its original would be just as inept as the reproach against a painter that the landscape does not at all look like it does on his painting. 
It goes without saying that reconstructionism has many questions to answer, questions about whether a suggested reconstruction can be regarded as successful, whether perhaps one should have chosen a different frame of reconstruction, and indeed whether reconstructions of science are really required. There is no doubt that reconstructions in this sense result in a more or less de-historicized picture of science. An order of justification that appears in a reconstructive frame is as such not an object of historical analysis. Rather, it is a constitutive part of this frame, and together these are suggested for a certain employment in science. This does not mean at all, however, that the chosen frames of reconstruction would have to be - or indeed could be - ahistorical products. Surely, logic and mathematics are also part of the history of science. And yet logic has become an important framework theory for mathematics as the latter has for physics. Here there are still relations open to systematic investigation, and it is completely incomprehensible how particularly Feyerabend's vision of the whole of science in which everything is to be given a voice can suddenly allow him to exclude certain tasks. Why don't we make the weak reconstructionist philosophy of science stronger than, say,
7 On this issue compare the more extended treatment in Scheibe 1984a (this vol. III.13).
chemistry and invite the former to join the general competition and do its part towards the enrichment of our culture!8 I want to state my point more precisely by means of my second consideration. In all our sciences taken as a whole there exists a systematic continuity which does not allow one to separate without considerable arbitrariness certain parts as irrelevant for the rest. One can imagine a continuous path beginning from the concreteness of our sense-impressions and extending all the way to the abstractness of logical inference, a path which may be cut off at any point with just as much or as little justification as at any other point. If we start, for example, from our sensation of heat, then in a first step this sensation finds its explication in instruments for the measurement of temperature. The logical empiricists said about this that the ordinary language concepts for warm and cold are explicated in terms of the concept of temperature. This, of course, has nothing to do with philosophy. Rather, here we are only at the stage of experimental physics, and this explication belongs to the latter. But then the kinetic theory of gases appeared, and it tells us that temperature is the mean kinetic energy of the molecules. With this the concept of temperature is introduced into a complex mesh of theories and it becomes possible to determine even remote temperatures: by means of measurement and calculation. At this stage of the explication of the concept of temperature, a lot of mathematics is already involved, and with it - in the final step - logic. In particular, there is no indirect determination of temperature or of any physical quantity, where the relevant value is not eventually calculated and hence inferred. Where would one non-arbitrarily end this path before one has at least reached logic? Humanity managed to live long enough without thermometers. Even today there are experimental physicists who face theories only with great skepticism. 
Again and again one encounters theoretical physicists who are prepared to make only the most sparing use of mathematics, and many mathematicians from Descartes until this day wanted and still want nothing to do with logic. From this we can see that nothing really unheard-of is happening when the attempt is made, on the part of the theory of science, to give certain meta-physical concepts a precise logical status by means of explication, nothing that would justify demanding a legitimation which would not be appropriate anywhere else. The reconstructive theory of science and the logic that it employs are as little or as much "separated from history" as one wants to have it. And properly understood, it is by no means an abstruse item to be displayed in an exhibit of philosophical and scientific curiosities.
III. Yet, with regard to reconstructionism, we are dealing only with a methodological problem. We must now proceed to the other main problem, a problem
8 Feyerabend 1975, pp. 47ff (1976, p. 48f)
concerning content. In contrast with the methodological problem, this problem was essentially created by Feyerabend alone. While in the former case we were concerned with the opposition of logical reconstruction and historical accuracy, from now on we are dealing with the (partly) ontological problem of the opposition of unity and plurality. We are familiar with this problem since the time of ancient philosophy, but in what follows we shall only be dealing with it in its most recent version: with Feyerabend's radical theoretical pluralism as a counter-model to the empiricists' program of the unity of science. The goal of my reflections on this issue will be to show that this pluralism too must be held together by something, if it is not to degenerate into a triviality. And this coherence will also give it a unity, albeit not the kind of unity against which it was set up. Any conception of a unity of human knowledge must account for the historical fact that such a unity does not yet exist. In past and present we are confronted with an overwhelming multiplicity of concepts, theories, conceptions, problems, etc. Feyerabend 9 in particular has emphasized that in almost no case do we have a definite grasp of this intellectual material. Its plurality, however, gives us a certain room to develop it further. In particular, we have two opposite possibilities: assimilation and alienation. We can try to reconcile theories or we can try to reinforce existing differences. These are two opposite tendencies which do not yet tell us anything regarding the direction in which we shall find success as far as the subject matter is concerned. Those who strive for unity in the conventional sense will, of course, take the path of assimilation. The concept of a reduction has become the key concept of this school of thought. 
One tries to reduce concepts or theories to one another, and one tries to understand the development of the sciences as a development towards a unity at least in the sense that older concepts or theories can be reduced to later improvements of these, if possible, to such an extent that their total number decreases. Significant reductions of this type would be, for example, the reduction of chemistry to physics or of biology to chemistry. Today, the reductions of astronomy to physics and of mathematics to set theory may count as examples of successful reductions. For two theories to be reducible to a third, they must be, as some have put it, commensurable or - in light of the mentioned room for development - they must be capable of being made commensurable. Although this is in the final analysis not a satisfactory definition, we can regard two theories as commensurable, if there exists a third theory in which they can be (in the broadest sense of the word) embedded and if correspondingly their two domains of application can be conceived as parts of the domain of application of the third theory. For this process of reduction, of course, the respective unity of the reducing theory as regards subject matter and language must be secured. But on this conception, the unity of any theory must be guaranteed in this regard. While this does not really present a problem within the 9
ibid. p. 252f (1976, p. 348)
framework of logical reconstructionism, the authors of the last work in the logical-empiricist tradition of the unity program 10 deplored the fact that we really have no concept of what in a positive sense, that is, in a sense that goes at least far beyond the mere demand of consistency, the unity of the propositional content of a theory would be. Even today - thirty years later - we do not know much more about this issue. Thus, from a reconstructionist standpoint, the situation regarding the concept of a unified science does not look very promising, not to mention the historical realization of such a science. It is only fair that this be acknowledged in light of various charges that there exists no sufficiently clear idea of what the concept of a theoretical pluralism is supposed to consist in: Neither is there a clear view of the concept of the unity of science. Feyerabend now argued against the historical possibility as well as the theoretical desirability of certain reductions in the orthodox sense. And moreover, he challenged the idea of unity in the usual sense with his call for a theoretical pluralism in the genuine sense. 11 The plurality of theories must not be regarded as a preliminary stage of knowledge that will at some time in the future be replaced by the 'one true theory'. Theoretical pluralism is assumed to be an essential feature of all knowledge that claims to be objective. The theoretical pluralism thus sketched is of methodical interest - as I now want to emphasize - precisely for the logical reconstructionist, since for him it presents a challenge, the acceptance of which should end in either triumph or failure. But the position is of interest also as far as our subject matter is concerned, and this primarily for two reasons. We begin by observing that in Feyerabend's characterization of pluralism the concept of a theory is not avoided, but is rather expressly made use of. 
It is then said, however, that it is necessarily the case that our total picture of the world cannot be represented by means of a theory, but rather only by means of a conglomerate of theories. Yet, the context of the quoted passage shows that this situation is necessary in the sense that while some theories which we can have exclude one another in a particularly intricate way and as such cannot be incorporated into a higher theory, this does not mean that we could do with only one of these theories, while dispensing with the others, that is, perform a kind of "downward" reduction. Rather, these theories at the same time mutually complement one another in such a way that without this complementation our picture of the world would be incomplete. The key concept of theoretical pluralism is thus the concept of the incommensurability of theories which is supposed to comprehend the relation in which irreducible theories necessarily stand within a picture of the world. I am using the expression "picture of the world" in a non-standard way to refer 10 11
Oppenheim/Putnam 1958, p. 4 (1970, p. 340) Feyerabend 1965c, p. 149
to the supra-theoretical unity of several incommensurable theories. For in my view it is important to focus precisely on these unities and to have a name for them. And at this point, I already want to emphasize that by incommensurability I mean the dual relation of exclusion and complementation and not merely the exclusion, as the word itself and its attempted definitions suggest. One cannot understand the positive role of a theoretical pluralism as long as one does not also think of this aspect of completeness. For only in this way can it become comprehensible why after all one theory does not suffice.
IV. But first we want to look at the negative and official aspect of incommensurability and ask whether it is really as terrifying as some have presented it. 12 We must distinguish the attempted definitions of the concept of incommensurability from the hitherto demonstrated incommensurabilities themselves. And as far as the concept is concerned, we must distinguish between the radical tendency of Feyerabend and the much more moderate tendency of his partner in incommensurability, Thomas Kuhn. The kind of irreconcilability of two theories with which we are familiar is their logical incompatibility. Inasmuch as the question of truth is meaningful in such a case, it follows that both theories cannot be true. Thus they are in competition with respect to truth. It seems clear - and it can be quite generally proven - that in this situation the theories must share some part of language with the same interpretation. Just on this point, however, Feyerabend's extreme incommensurability goes one step further in that it excludes even the sharing of possible interpretations. Two incommensurable theories do not dispute each other's right to the truth. Rather, they dispute each other's right to a share in their meaning. And in a certain sense they do so through and through. One definition states: 13 Two theories are said to be incommensurable if the meaning of their essential descriptive terms rests on principles that contradict each other. It must be noted that according to this definition it is not the theories themselves that contradict each other, but rather their underlying semantic principles. We shall see, however, that in more harmless cases these principles can also simply be the axioms of the theories concerned. In somewhat greater detail, Feyerabend describes his extreme case as follows: 14
There are theories of which one would say intuitively that they 'talk about the same things' and which yet do not have a single proposition in common. This is not simply because the theories describe different domains ... , but because the employment of the conceptual apparatus of the one theory posits conditions which thwart the
12 Besides Feyerabend's works cited above see especially Kuhn ²1970.
13 Feyerabend 1965c, p. 227, no. 19
14 Feyerabend 1973, p. 98
employment of the conceptual apparatus of the other theory (the theories are incommensurable). What makes it particularly clear that here we are dealing with an extreme version of incommensurability is the demand that incommensurable theories, even though they are in some sense supposed to be theories of the same reality, have no proposition in common. Why do they not share some propositions, even if they do not share others? It is a priori unlikely that besides commensurable theories science offers us at the same time extremely incommensurable theories or even exclusively extremely incommensurable theories. Because of his strong normative tendency, Feyerabend falls into the same danger in which he sees logical reconstructionism, that is, the danger of concocting cases which do not exist in science or which at least are not typical cases. Kuhn, the cautious historian, describes a much weaker concept of incommensurability in a more recent paper: 15 Most of the terms common to the two theories function the same way in both; their meanings, whatever those may be, are preserved; their translation is simply homophonic. Only for a small subgroup of . .. terms and for sentences containing them do problems of translatability arise. The claim that two theories are incommensurable is more modest than many of its critics have supposed. Kuhn suggests the term "local incommensurability" for the situation thus described. But it is immediately clear that a description in terms of dichotomies would only be misleading, since the whole matter is now more a question of degree such that at the extreme points we have total commensurability on the one hand, and total incommensurability on the other. I now want to show in more detail what happens when we think in terms of this dimension. I shall do this by means of an exemplary consideration for which I could not think of any better characterization than to say that it consists of a logical analysis. 
15 Kuhn 1983, p. 670f
Two sentences prior to the description of local incommensurability just cited, Kuhn gives the following definition: The claim that two theories are incommensurable is ... the claim that there is no language ... into which both theories, conceived as sets of sentences, can be translated without residue or loss. I now want to show that the local incommensurability of which Kuhn subsequently speaks can already appear in what, according to this definition, are commensurable cases. And this will be revealed by demonstrating that one of the cases from physical theory which both Kuhn and Feyerabend classify as incommensurable - the relation of the Minkowskian to the Galilean theory of space-time - is in fact commensurable according to Kuhn's definition. Such demonstrations, of course, can only be furnished if Kuhn's still vague definition is interpreted in a certain way. And such an interpretation (or: reconstruction) could possibly misunderstand the intention of the original proposal. There is no other way to proceed, however,16 and to me it seems appropriate to determine Kuhn's concept of incommensurability in the following way: We begin with two theories S' and T' which are irreducibly formulated in the concepts α and β (we allow several concepts in each case). There is commensurability if there exist concepts γ which are respectively interdefinable with α and β, and theories S and T which are both irreducibly formulated in γ, such that the following equivalences hold:

\[
\begin{aligned}
S'[\alpha] &\cong S[\gamma], & \gamma &= P[\alpha], & \alpha &= P^{-1}[\gamma] \\
T'[\beta] &\cong T[\gamma], & \gamma &= Q[\beta], & \beta &= Q^{-1}[\gamma]
\end{aligned}
\tag{1}
\]
The interdefinability (or: translatability) must be understood in the broad sense that the definitions employed may presuppose not only logic and set theory but also the respective physical theories. This assumption will prove to be decisive, and it seems to me to represent one of the safest scientific practices at the conceptual level. Let us now look at the Minkowskian geometry of space-time (as part of the special theory of relativity) in its relation to Galilean geometry (as part of pre-relativistic physics). With respect to this relation, Feyerabend says (essentially) the following: 17 The new laws will not only read differently, they will also conflict in content with the preceding classical laws. ... Not a single primitive descriptive term [of the latter] can be incorporated into [the former]. ... This we may express by saying ... that the meaning of all descriptive terms of the two theories, primitive as well as defined, will be different: [the two theories] are incommensurable theories ... This is, of course, greatly exaggerated. In fact, we have here a case of commensurability in the sense of the definition given, even though the characterization of what is thus defined as a commensurability is unfortunate. But this characterization is due to Kuhn. In our example there are several common conceptual bases γ. Both theories can be formulated as theories about space-time with a distinct class of inertial systems. In this formulation, the two theories on the whole simply make different and even contradictory statements about this class. 18 In another way, both theories can be formulated as theories about two (constant) tensor fields (one doubly covariant and one doubly contravariant) as well as one (constant) scalar. 19 In this case their difference can be localized in the numerical value of this scalar:

\[
S[\gamma] \equiv R[\gamma] \wedge \gamma = 0, \qquad T[\gamma] \equiv R[\gamma] \wedge \gamma = c^{-2}
\tag{2}
\]

where c is the speed of light. Thus the difference amounts approximately to the difference between Euclidean and hyperbolic geometry.
16 For the following see Scheibe 1986c.
17 Feyerabend 1981c, p. 114f (German in Feyerabend 1981a, p. 141)
18 Scheibe 1982c (this vol. VII.31)
19 Ehlers 1986
The question is, where, if anywhere, do we now find the incommensurability? Inasmuch as it is present, we can find it again if we consider that from a common conceptual basis the two theories make contradictory assumptions which in each of the two theories allow us to define concepts the definition of which is impossible in the other theory. In Galilean geometry, time can be introduced as independent of the inertial system, whereas in Minkowskian geometry it cannot. In the latter, by contrast, it is possible to define an absolute (finite) velocity which does not exist in Galilean geometry. Indeed, locally, that is, in certain individual cases, what happens is exactly what Feyerabend speaks about when he says that contradictory semantic principles reciprocally cancel the concepts of incommensurable theories. In the example discussed, these principles are simply the theoretical assumptions themselves. One can even introduce the conceptual bases α and β into the two theories (in the sense of (1)) such that the meanings of all α as well as those of all β do not exist in the sense of the other theory. To this extent, the competition with regard to truth (in the case of a common γ) can be transformed into a competition of interpretations. But this semantic catastrophe is by no means a total catastrophe. And why should it be? This is why I said we should come around to the idea that we are rather dealing with a whole spectrum of incompatibilities which reveals Feyerabend's extreme case only to a limited extent. Total commensurability, in which not even the phenomena just mentioned can occur, exists when S and T in (1) are compatible or even identical, that is, when S' and T' are equivalent. The case just exemplified, in which S and T contradict each other, is already quite difficult to handle.
For while the translation into a common language is still possible, the embedding into a common theory is not easily achieved. Even if we admit, as is in any case necessary with a view to the examples from physics, approximative embeddings, then (1) is possibly only a special case of the already employed formulation (in the previous section)
\[
\begin{aligned}
S'[\alpha] &\precsim U[\gamma], & \alpha &= P'[\gamma] \\
T'[\beta] &\precsim U[\gamma], & \beta &= Q'[\gamma]
\end{aligned}
\tag{3}
\]
where U is the embedding theory and we have the possibility of approximations in all four relations. Only if this case cannot be attained would we have to say that there are more serious incommensurabilities at play, even though not (yet) necessarily Feyerabend's extreme case. I do not want to conclude this section without having stated that I am by no means satisfied with this last analysis. Taken by themselves, the local incommensurabilities just demonstrated are trivial, and it all depends what is affected by this state of affairs. If a law tells us that an object moves in an ellipse or a hyperbola, then, for the purpose of formulating further
propositions, we can introduce the concepts of the axes (of the ellipse) or of the asymptotes (of the hyperbola). Each of these concepts is meaningless in the other theory, and no one would be alarmed by this fact. The situation is changed only when it concerns such fundamental concepts as space and time. But the fact that these concepts are fundamental, that is, fundamental in regard to the whole of the description of nature, cannot be expressed as a feature of those theories in which we speak about space and time in isolation.
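The claim that the difference between the Galilean and the Minkowskian theory can be localized in the numerical value of a single scalar, as in (2), can be illustrated by a small numerical sketch. It is not taken from the text: instead of the tensor-field formulation cited above, it assumes the familiar one-parameter family of boosts in one space dimension, with a parameter `kappa` that is 0 in the Galilean and c^{-2} in the Minkowskian case. All function and variable names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost(t, x, v, kappa):
    """Map event (t, x) to an inertial system moving with velocity v.

    kappa = 0 yields the Galilean boost, kappa = 1 / C**2 the Lorentz boost.
    """
    g = 1.0 / math.sqrt(1.0 - kappa * v * v)
    return g * (t - kappa * v * x), g * (x - v * t)


def compose_velocities(u, v, kappa):
    """Velocity composition law belonging to the same one-parameter family."""
    return (u + v) / (1.0 + kappa * u * v)


GALILEAN = 0.0
MINKOWSKIAN = 1.0 / C**2

v = 0.5 * C
event = (0.0, 1000.0)  # t = 0 s, x = 1000 m

t_gal, _ = boost(*event, v, GALILEAN)
t_min, _ = boost(*event, v, MINKOWSKIAN)
print(t_gal)  # time coordinate is unchanged by the Galilean boost
print(t_min)  # time coordinate depends on the inertial system

# Only for kappa > 0 is there a finite velocity that every boost leaves fixed.
print(compose_velocities(C, v, MINKOWSKIAN) / C)
print(compose_velocities(C, v, GALILEAN) / C)
```

Under these assumptions the two concepts named in the text appear on opposite sides: a time coordinate independent of the inertial system is available only for `kappa = 0`, an absolute finite velocity only for `kappa > 0`, although both theories are written over the same conceptual basis.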
V. But perhaps there still exists a possibility of connecting the fundamental status of a physical theory with the appearance of genuine incommensurabilities. With this question in mind I want to - or rather, I must - speak in a final section about what seems to me to be Feyerabend's somewhat ambivalent relationship towards the Copenhagen interpretation of quantum mechanics and towards its authors, Bohr and Heisenberg. 20 First of all, it must be recognized that it was Bohr and Heisenberg - and not Feyerabend and Kuhn - who first invented the incommensurability of theories in connection with the establishment of quantum mechanics in the late 1920s. 21 Feyerabend seems to be aware of this fact in the case of Bohr, but not in the case of Heisenberg. With regard to our question, however, this would have been more important, as will be apparent in a moment. Yet in his favorable remarks about the Copenhagen interpretation, Feyerabend emphasized other aspects, and the object he singled out for criticism seems virtually to contradict the idea of incommensurability: considerations of the finality of quantum theory and the demand of a classical description of the instruments of measurement. That is to say, he chose altogether conservative features of the Copenhagen interpretation. Yet a second point is even more remarkable: Feyerabend's relationship towards Bohr's concept of complementarity. 22 On this issue as well, we find a detailed discussion spanning several works. But in this discussion Feyerabend does not notice a conspicuous structural similarity between this concept of complementarity and the concept of the incommensurability of theories. And yet this similarity is especially useful for the purpose of a rational explication of theoretical pluralism. As far as the first point is concerned - the question with which we began this section - we shall first direct our attention to a particularly transparent parallel case.
20 See esp. Chs. 16 and 17 (resp. 17 and 18) of the works quoted in no. 17
21 Scheibe 1988b (this vol. II.6)
22 In addition to the papers cited in no. 20 see also Scheibe 1989b.
The relation of the contradiction of two theories has already been cited several times. This time we ask ourselves: Does there exist a property of theories such that two theories which both have this property and which are distinct from each other already contradict each other? The formal completeness of a theory is certainly one such property. For if each one of two theories either proves or refutes every (relevant) proposition and if a
proposition A is provable in the one but not in the other theory, then the proposition is refutable in the latter, and we have the contradiction. Accordingly, the question which really interests us is: Does there exist a property of theories such that two theories which have this property and which are distinct from each other are already incommensurable? In the parallel case of the contradiction it was a property of completeness which fulfilled our condition. Can we expect something corresponding to hold for incommensurability as well? This seems to be precisely Feyerabend's view when he says23 that incommensurabilities in his sense would most likely appear in the case of universal theories, by which he means theories that "contain the means for the description of every process possible in their respective domain and that allow us to define the operations of measurement that we use to test them." This is obviously a demand for completeness, and the automatic appearance of incommensurabilities in connection with them can hardly mean anything other than that for universal theories their difference turns into an incommensurability. In a much more direct way, Heisenberg attempted to capture the same idea with his concept of a closed theory. First, it must be emphasized that for Heisenberg the transition from Newtonian mechanics to quantum mechanics represented a revolutionary step which he repeatedly described as a "radical restructuring of the conceptual foundations".24 Not much is taken away from this view if one accounts for Bohr's demand of a classical description of the instruments of measurement, as Heisenberg certainly, even if somewhat half-heartedly, did: the abstract basic concepts are nevertheless subject to a radical restructuring. 25 For him this fact was final, and it was something the like of which had never before occurred in physics. What circumstances could have finally brought it about? 
23 Feyerabend 1973, p. 101
24 For the relevant passages see no. 21.
25 On the relationship between Bohr and Heisenberg see Folse 1985, Ch. 3.7 and 8.
26 Heisenberg 1969, p. 135. The original formulation refers to Newtonian mechanics.
As a physicist, of course, Heisenberg was primarily concerned with the laws of physics, and any decisive change would essentially concern these. But if it is more appropriate to describe the relevant change primarily as a change of concepts rather than a change of laws, this may simply be due to the fact that with some theories the applicability of their basic concepts already determines which laws are valid in the respective domain. In such a case, the laws cannot be improved in any other way than through a change of the concepts in which they are formulated. Two equivalent formulations of the concept of a closed theory at once suggest themselves. The formulation originally chosen by Heisenberg goes as follows: 26 To the extent to which one can describe any given appearances with the concepts of [the closed theory T], the [laws of T] also hold with
strict validity. .. More precisely. .. perhaps ... : The [laws of T] are valid with the same degree of accuracy with which the appearances are describable using the [concepts of T]. C. F. von Weizsacker has given an essentially equivalent formulation in terms of the idea of theory change. 27 According to Weizsacker, a theory is closed, if it cannot be improved upon by means of small changes. Large changes are those that involve changes of the basic concepts of a theory rather than merely the introduction of corrective terms. The resemblance to Feyerabend's incommensurability is obvious, except that in one case the emphasis is on the one-place concept (closedness), whereas in the other case it is on the two-place concept (incommensurability). It seems to me that Heisenberg's version involves a formulation of completeness which so far has been even less understood than the (probably) quite different completeness involved in Feyerabend's concept of a universal theory. In any case, we are here dealing with model cases of concepts that have as yet not been analyzed, even though they occupy a central position in our understanding of physics and its development. Incidentally, I think that with a case of this type, the question, "Why philosophy of science?", is answered quite easily. And with regard to Feyerabend, I am especially grateful that this case would exist even without him, and perhaps even without philosophy. Finally, let us now look at Feyerabend's relationship to Bohr's complementarity. As I have already said, theoretical pluralism is trivial, as long as it does not also tell us, what keeps a plurality of theories together: We can always have a multiplicity of opinions that remain unintelligible to us as far as their mutual relations are concerned. That is no great feat. The specific mastery of scientific research and of intellectual pursuits in general cannot be sufficiently described without recourse to a unifying element. 
27 Weizsacker 1971, p. 193f (1980, p. 156)
That does not mean that it is obvious what the unity to be established in a given case consists in or that in every such unity all plurality disappears. On the contrary: Here we are facing the non-trivial aspect of pluralist thinking, and it was a great realization that the unity of our picture of the world could not be achieved without the price of incommensurabilities. It seems to me, however, that this idea was already developed by Bohr, only that he used the word "complementarity" for it. Compared to its rival "incommensurability", the word "complementarity" accentuates more the positive aspect of the relation: We are dealing not only with a relation of exclusion but also with one of completion. This emphasis in fact takes into account the main fear of theoretical pluralism, the fear that a monist attitude would all too easily run into the danger of simply overlooking remote aspects of reality which are important nonetheless. The most famous example which quantum theory has contributed towards illustrating incommensurability and complementarity is the so-called wave-particle dualism, that is, the discovery that electrons have the properties of
waves whereas light has the properties of particles. Unfortunately, most presentations (including that of Feyerabend!28) immediately make the mistake of characterizing the relation between the classical picture of particles and the classical picture of waves as a contradiction: It is said that these pictures contradict each other. This view is not only false, but one thereby gives away one of the clearest illustrative examples of incommensurability in its extreme form. No one has ever deduced a proposition A from a logically flawless formulation of particle theory and at the same time the proposition not-A from a flawless formulation of wave theory. Yet only in this way would the relation of contradiction in the usual logical sense be established. There is in fact no contradiction, but rather the incommensurability of two formalisms whose usual rules of interpretation do not allow for a common interpretation. But to this purely negative assessment it must be added that we have a theory, that is, quantum theory, in which these two incommensurable classical representations are united in a complementary way. To be sure, the price to be paid for this unification is that quantum theory is in essential respects a theory of a different kind than the classical theories whose complementary unification it makes possible. This is especially well illustrated in the simpler example of the incommensurability of certain observables that occurs within ordinary quantum mechanics, in the incommensurability of position and momentum for instance. First of all, incommensurability of quantum-mechanical observables means precisely what the word says: the impossibility of common measurement. An analysis of what this means in quantum mechanics, however, immediately reveals that here we are also dealing with a case of incommensurability in Feyerabend's sense.
28 Feyerabend 1981a, p. 446
In quantum mechanics, the ontology of propositions with which we ascribe properties to individual objects has become problematic. In the Copenhagen interpretation this is expressed by the fact that such propositions are not granted a meaning independently of a measurement actually performed. The existence of incommensurable observables shows that this renunciation is not voluntary. Once one has accepted it, however, one can describe the resulting situation precisely in the sense of Feyerabend's incommensurability: Two contingent propositions such as "particle E has the position x" and "particle E has the momentum p" are incommensurable in that the presuppositions that give them meaning exclude each other (physically). For these presuppositions state that the relevant propositions are decided by measurement. Yet a common decision (in the present case) is precisely excluded by quantum mechanics. This exclusion is just what the incommensurability of quantum mechanics expresses. The two mentioned propositions do not have a common interpretation. Nevertheless, both had to be included in the quantum-mechanical description of an object. Indeed, this step gave us a theory in which infinitely many incommensurable partial languages are
in a certain sense united under one roof. So far, we have only managed to understand a handful of them. In recent years, Hans Primas 29 has used the type of complementary relations that we have become familiar with through quantum mechanics as a model for a more general program of complementary-pluralist thinking in the natural sciences. As far as it is feasible, Primas makes use of the fact that today these complementary relations are thoroughly formalized in the theory of the ortho-complementary lattices. For this reason alone, I would be the last not to welcome such an attempt. Yet it must be emphasized that the complex of problems of incommensurability, complementarity, and pluralism dealt with in this lecture constitutes a very rich topic which cannot easily be cast into a simple formal scheme. Bound up in it are such basic and yet heterogeneous concepts as progress, reduction, logic, theory, and language. And it is neither a confusion in the subject matter, nor a formal clarity, but rather a combination of both, which lends our topic its appeal.
29 Primas 1981
IV. Laws of Nature
For a long time the major task in the field of the laws of nature was seen in finding necessary and sufficient criteria for the lawlikeness of a statement occurring in physics. 1 But all attempts of this kind have failed, and the papers of this chapter, full of mistakes as they might be, do not repeat the mistake of adding one more proposal of what it is for a statement to be lawlike. Instead the major topic of the chapter is an astonishingly unnoticed phenomenon that may be called the polarity (or complementarity or reciprocity) of generality and coherence. It is treated in all papers (except [15]), with special care in [18] and [19], whereas in [16] and [17] the two side issues of predication and substances in modern physics are added. In [15] the concept of coherence is confronted with that of contingency.
1 Cf. Scheibe 1973d
The task of theoretical physics is the description of physical systems of all kinds - the exact and complete description as far as possible and, most importantly, the lawlike description. The task can be divided in two parts: 1) A single system S is described by making a statement

\[
\Sigma(S)
\tag{1a}
\]

i.e. S has the property Σ. If, for instance, S is a gas, (1a) could say that pressure, volume and temperature of S satisfy the van der Waals equation. 2) (1a) is raised to the dignity of generality by a statement

\[
\text{for all systems } S\colon \text{ if } S \in K \text{ then } \Sigma(S)
\tag{1b}
\]

where K is a set of physical systems - the domain of intended applications - characterized in a pre-theoretical way (cf. [18], §II, [19], §II). The proper physical task is the finding of Σ, and the kind of this finding is different for different kinds of physical systems. By contrast, the step from (1a) to (1b) does not add anything to the content of the law and, apart from K, is the same in all cases. Normally the generality in (1b) is not the only one occurring in this statement. Already in Σ and, therefore, in (1a) generalities may occur. If, for instance, Σ is a particular theory within classical mechanics or quantum mechanics then Σ contains the axioms of Euclidean geometry of (ordinary) space and, accordingly, ever so many generalities (= propositional forms containing quantifiers in the axioms and their consequences). Not by themselves but in their function within (1a) these generalities are different in kind from the generality explicitly occurring in (1b). Whereas the latter is system-transcending the former is system-constituting. In (1b) the instances Σ(S₁), Σ(S₂), ... are independent of each other, and they are rivals. Their independence is a necessary condition of their being appropriate instances of an empirical test of (1b). And the various solutions of the equation of motion of, for instance, a harmonic oscillator are competitors because a given oscillator can be described only by one of them. Other solutions cannot add to this description but represent other oscillators. By contrast, instances of the generalities occurring in Σ co-exist and co-operate in the formation of Σ, bringing about the coherence that is achieved in the law. As system-constituting elements they lead to the unbounded interconnectedness and to the various dependences of the structural elements of the respective system. That these dependences in (1a) run counter to the independence of the instances in (1b) can be seen particularly clearly in the case of Kepler's laws vis-à-vis Newton's gravitational theory. In the former the interaction between the planets is ignored. The resulting independence makes it possible to formulate Kepler's laws already as statements about any single planet (and the sun): high generality but low coherence. By contrast, in Newton's case the behaviour of every body depends on the behaviour of all other bodies, and we have a maximum of coherence together with vanishing generality: no two Newtonian systems can exist independently of each other in one and the same world - strictly speaking. The question then is: how can such a thing happen at all?
To find an answer it is convenient to introduce three aspects under which a law of the form (1ab) can be seen. According to the first aspect the physical systems that occur as instances (or counter-instances) in (1ab) are possible worlds existing completely independently of each other. Consequently, the question whether two or more instances could exist side by side in one and the same world - our world - simply is not raised. This question arises only in connection with the question how a law like (1ab) can be empirically tested. As a rule this requires the existence of several instances of the law. The aspect of possible worlds, however, does not tell us at all how we could get hold of these instances. To this end we have to extend the first aspect by the second: the laboratory aspect. One has to require that in principle arbitrarily many instances of the law can be realized in our real world at least by approximation, idealization and other procedures of the kind. The aspect of possible worlds can then be maintained in the sense that after empirical verification by sufficiently many real instances every further possible instance, if it, too, could be realized, would satisfy the law. (Newton's law of gravitation would be valid in a world where the earth had no moon.) According to the third, the cosmological viewpoint, in physics we try to give a description of the
whole universe so that in the last analysis we have to deal with one single system. All other systems would have to be described innerworldly as subsystems of the one all-embracing system. In particular, a law of the form (1ab) could not be accepted unless it were reconstructed according to the aspect of innerworldliness (cf. [18], §III). In [15] the coherence of a law of nature is confronted with its contingency. 2 In this investigation the basic concepts had to be left rather vague, and the whole treatment is badly in need of more precision. According to the tradition a proposition in an axiomatic theory is contingent if it neither follows from nor contradicts the axioms. If the axioms are identified with the laws of the theory, the contingent propositions and only they are not lawlike, and such propositions are to be met with in almost all physical theories: the theories of physics are essentially incomplete (in this sense). The best known examples of contingent physical propositions are the initial and boundary conditions. But also laws may be contingent in a relative sense. They are then explainable by or reducible to theories of a higher degree of coherence. Kepler's laws are certainly contingent with respect to Newton's gravitational theory: They follow from the latter together with certain 'facts' as additional premises. In [15] this situation is generalized to other theories and explanations. Unfortunately it remains still a matter of intuition to see that in such a step of explanation it is the growth of coherence that brings about the degradation to contingency of the explained theory.
2 See also: [4], §III, Scheibe 1973d, 1986b, 1987a and for examples of reduction: Scheibe 1997b and 1999.
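The remark that Kepler's laws follow from Newton's theory only together with certain 'facts' as additional premises can be made concrete in a minimal sketch. The premises assumed here are not spelled out in the text and are chosen for illustration: circular orbits, a planetary mass negligible against that of the sun, and no interaction between the planets. The constants are rounded, and the function name is hypothetical.

```python
import math

G = 6.674e-11     # gravitational constant in m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30  # solar mass in kg (rounded)


def period_from_newton(a):
    """Orbital period in seconds for a circular orbit of radius a (metres).

    Premises added to Newton's law: the planet's mass is negligible
    compared with the sun's, and all other planets are ignored.
    """
    return 2.0 * math.pi * math.sqrt(a**3 / (G * M_SUN))


# Rounded mean distances from the sun in metres (illustrative values).
ORBITS = {"Earth": 1.496e11, "Mars": 2.279e11, "Jupiter": 7.785e11}

for name, a in ORBITS.items():
    T = period_from_newton(a)
    # Kepler's third law: T**2 / a**3 comes out the same for every planet.
    print(name, round(T / 86400.0, 1), T**2 / a**3)
```

Each planet is treated here as a system of its own, which is the independence of instances spoken of above. Dropping the added premises would tie every orbit to all the others and the simple law would hold only approximately.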
IV.15 Coherence and Contingency. Two Neglected Aspects of Theory Succession* By calling coherence and contingency two neglected aspects of theory succession, as I do it in the title of this paper, I do not mean to imply that philosophy in general has neglected these concepts. Even if they are put into the context of the development of human knowledge in general we can find them treated now and then in the history of philosophy. What I am missing is their re-evaluation under the particular viewpoint, taken by modern philosophy of science, of scientific development as a complex network of theory successions. From time to time we have to recall the work of our ancestors and put it into perspective to our own endeavors. In the case before us we would have to ask what the impact would be of the traditional views on coherence and contingency on the present views on theory succession if somebody took the trouble to bring these views together. In this paper I will not do this in a way that would pass judgement of a historian. My ambition is more of a systematic kind. I want to look at the development of science as being characterized by an increase of both coherence and contingency. At face value this suggestion may not come as a surprise to a philosopher of science of our days. The striving for unity in science may very well be expressed by saying that science becomes more and more coherent. And is not the increase of contingency but a mirror image of getting at ever more universal laws? Now, as I said, I do not want to claim any originality in matters of principle. But as to the first point, the unity of science movement of logical empiricism has not resulted in any useful suggestion for an explication of coherence or - for that matter - unity. Who nowadays wants to start work on induction is still not badly advised by being referred to Carnap's "Foundations of Probability". Who wants the same with coherence has just to start from scratch. 
As to the second point the change from growing universality to increasing contingency is perhaps mainly a matter of emphasis as regards the philosophical outlook. It sounds great to have the final, all-inclusive law of nature. But what if it will leave us with a world in which almost everything happens by chance? Would this be the maximum amount of coherence that could be obtained? Here comes in sight what my major interest will be in giving this paper: It will not be the increase of coherence or contingency, each taken separately, but the simultaneous appearance of both. It seems that in a sense the development of science is characterized by the appearance of both. But if you tell this to a coherence theorist then he will not only be surprised, he will blow up in your face. So here we have a problem, and far from giving a solution I will try to give a more detailed and more precise description of it.
* First published as Scheibe 1989c
Coherence and Contingency: an Introduction

In a first section of my paper I may be allowed to introduce the aspects of coherence and contingency by reminding us of two philosophical positions that are representative of them. The first, standing for coherence, is philosophical rationalism. It is characterized by the belief that the world can be understood or, more modestly, that understanding the world is a supreme goal of human endeavor. The time-honored tradition of rationalism can here be recalled only by mentioning the work of one of our contemporaries who has most vigorously defended the sovereignty of reason: in his books "The Nature of Thought" and "Reason and Analysis" Brand Blanshard¹ has given a formulation and a defense of a rationalistic epistemology that in some parts is at least of considerable heuristic value for present philosophy of science as I would like to see it. Perhaps the following passage² from Blanshard's "The Nature of Thought" can give an impression of his major concern and at the same time introduce the concept of coherence:

... reality is a system, completely ordered and fully intelligible, with which thought in its advance is more and more identifying itself. We may look at the growth of knowledge ... as an attempt by our mind to return to union with things as they are in their ordered wholeness ... And if we take this view, our notion of truth is marked out for us. Truth is the approximation of thought to reality. ... Its measure is the distance thought has travelled ... toward that intelligible system ... The degree of truth of a particular proposition is to be judged in the first instance by its coherence with experience as a whole, ultimately by its coherence with that further whole, all-comprehensive and fully articulated, in which thought can come to rest.

What shall we do about these breath-taking sentences? Is there any chance of getting help from them for understanding the development of science?
Our first reaction will perhaps be mitigated if, by contrast, we look at the other extreme. The other extreme is a position that has had great influence since the time of Hume and is today best known in its most recent version as logical atomism. Logical atomism is an ontology copied from the language form of modern logic. Wittgenstein has couched it in the pithy statements:³

1.2 The world divides into facts.
1.21 Each item can be the case or not the case while everything else remains the same.
6.3 The exploration of logic means the exploration of everything that is subject to law. And outside logic everything is accidental.

¹ Blanshard 1939 and 1961.
² Blanshard 1939, vol. II, p. 264.
³ Wittgenstein 1922.
Obviously, this is a position of total contingency of the world. In an ontological interpretation we would have to say: not only can we imagine the world to be other than it is; to the full extent of logical possibility it could be different from what it is. You will not be surprised to hear that Blanshard has called logical atomism "the most formidable attack ever made on reason as an independent source of knowledge."⁴ And even if the matter is seen from the viewpoint of modern science, this other extreme could not be accepted either. You cannot meet the efforts of physics, all this struggle for law and order, by simply pointing out that before the throne of logic all are equal - from Newton's laws of mechanics down to the most trivial statements about my present sense impressions. In fact it is hard to believe that science, not owing its successes to the extravagance of its theories, will be happy with either of those opposing philosophical positions. One would rather expect that for the description of science and its development both conceptions - coherence and contingency - will be useful if combined in an appropriate manner.

Moreover, it is not difficult to find a starting point for such a combination. We have only to look at the basic structure of physical theory as the most systematized outcome of scientific theorizing. The starting point then is the following dualistic structure: on the one hand, a physical theory has its laws, for which an at least local validity is claimed. On the other hand, a class of so-called initial and boundary conditions is specified. For them the theory leaves it open whether they are valid or not. The paradigm of this basic structure has been Newtonian celestial mechanics, where the unconditional validity of the gravitational law is set against the complete arbitrariness of the positions and momenta of the bodies involved at a given time.
But the new theories of our century, too, are constructed quite similarly as regards the point in question: Einstein's field equations have many solutions that are restricted only by initial and boundary conditions, and in quantum mechanics the Schrödinger equation does not by itself determine a single probability. A physical theory - so it seems to be in general - qua theory does not answer every question that it allows to be raised. To that extent an element of contingency is present in every theory. On the other hand, given certain initial and boundary conditions, the laws do allow us to draw from them many contingent conclusions that without the laws would stand completely disconnected from those given conditions. There is, therefore, also some element of coherence introduced into the theory. How does this come about?

⁴ Blanshard 1961, p. 92.

The following analysis replies to this question in two steps. The first step is the decision to reconstruct a physical theory in the sense of Aristotelian axiomatics, i.e. as a system of primitive concepts and basic axioms grounded on a logic that allows us to give definitions and draw inferences. According to the present state of the art there is a great number of logics available. But they have one feature in common: they use atomic languages, i.e. languages built from certain elementary parts like a house is built from bricks. More precisely, every sentence of such a language is constructed from atomic sentences and sentence forms by means of logical connectives and quantifiers. There is, therefore, a sharp division of the language into logically simple and logically complex sentences.

This division now leads to our second step. If we want to describe some piece of reality in an atomic language then two extreme possibilities suggest themselves. We could try to give the description by using atomic sentences and their negations only, or, secondly, we could try to avoid atomic sentences altogether and use instead pure sentences of unbounded logical complexity. We could, of course, also do both. And this is what is actually done. But the two possibilities are given quite different functions. The logically complex propositions are used as possible axioms of a theory. In other words, they are used to express the laws of physics. The atomic descriptions, on the other hand, are used to formulate the initial and boundary conditions and, with them, the possible observations.

Previously, I had associated the aspect of contingency with the initial and boundary conditions and the aspect of coherence with the laws of physics. With the reconstruction just given at hand we can now describe this assignment in a more general fashion. An atomic diagram - a complete atomic description - taken by itself is totally contingent precisely in Wittgenstein's sense: each item can be the case or not, while everything else remains the same. Or - equivalently - if from any subset of an atomic diagram any atomic proposition or its negation can be inferred, then this proposition is already an element of that subset. In the presence of laws, however, the situation may change. The laws induce dependencies in the diagram, enabling us to draw non-trivial inferences from one part of it to other parts.
In other words, laws introduce an element of coherence into an otherwise entirely incoherent aggregate of atomic propositions. In searching for laws, if only in this sense, physics thus did make a decision in favor of coherence although it could not avoid introducing contingencies either. You simply must have something to go on.
The Increase of Contingency

In the previous section we have seen that, although philosophers tend to maximize the importance of either coherence or contingency up to their mutual exclusion, in science they may be found together and even have a correlative existence. We now come to the question of what happens to them as science develops. In this section I will try to understand the development of science under the aspect that, again and again, a typical step in the development is the recognition that what so far had been taken to be a lawlike affair really
is a matter of contingency, while the reverse of this step never occurs. The progress of science thus includes the increase of contingencies.⁵

By way of introduction I shall first try to illustrate the feature in question by two or three examples taken from the history of science. In these examples my paradigm of a contingent entity is some part of an object that may be different from what it in fact is just because it may undergo a change in time. As opposed to this, a lawlike entity would be a timeless structure - timeless in the strong sense that here a change in time is excluded not merely as a matter of fact but as a matter of principle. I do not hesitate to remind us of a passage in Plato's Phaedo (78 b3 ff.) where such a pair of opposites is introduced and characterized in two ways. Trying to prove the immortality of the soul, in this part of the dialogue Socrates starts his investigation with the question: "For what kind of thing should we fear that it may be dispersed, and for what kind should we not?" A first answer is suggested by the following characterization: "Isn't it most probable that the incomposite things are those that are always constant and unchanging (ἅπερ ἀεὶ κατὰ ταὐτὰ καὶ ὡσαύτως ἔχει), while the composite ones are those that are different at different times and never constant (τὰ δὲ ἄλλοτ᾽ ἄλλως καὶ μηδέποτε κατὰ ταὐτά)?" But there is also a second answer using a different characterization. Immediately after the passage quoted Socrates goes on to ask: "What of that very reality of whose existence we give an account when we question and answer each other? ... Can the equal itself, the beautiful itself, the being itself whatever it may be, ever admit any sort of change? Or does each of these real beings ... remain unchanging and constant, never admitting any sort of alteration whatever?" As opposed to these timeless entities Socrates then refers to "the many beautiful things, beautiful human beings etc. ...
What about all the things that are called by the same name as those real beings? Are they constant, or in contrast to those is it too much to say that they are never identical with themselves ...?" Putting aside questions of interpretation concerning this text, Plato's distinction may very well serve as a first approximation of the kind of distinction that will be used in formulating the following examples. Moreover, whereas Plato's first characterization seems to be more appropriate for application to the earlier stages of science, the second suggests itself for the description of more recent developments.

This can perhaps best be seen from my first example: the development of our insight into the structure of matter. This development has passed through the four levels of state transformations of substances, of chemical reactions, of radioactivity as the spontaneous decay of heavy atomic nuclei, and of the decay of elementary particles first observed in cosmic rays. In each of these cases it was recognized that what at first had been conceived to be an unalterable structure - a state, a chemical compound, an atomic nucleus, an elementary particle - finally turned out to be changeable in time. In each case a deeper structure was discovered that does not change during the respective transformations: the chemical constitution is not changed in a state transformation, the atomic nuclei are essentially stable during a chemical reaction. Processes within nuclei are usually accompanied by transformations of elementary particles. But at least some quantities characteristic of this level are conserved. It is here that it became clear that what remains constant during a process need no longer be a constituent of the object undergoing it. In terms of Plato's two characterizations this would mean a shift from the first to the second, more abstract one as being appropriate to describe the situation. The behavior of the elementary particles has confirmed this view.⁶ Altogether, we here have a succession of theories - classical mechanics, thermodynamics, chemistry, nuclear physics and elementary particle physics - having had their floruit in this order and each explaining a new kind of process that was veiled by assumptions made by the preceding theory.

⁵ Scheibe 1987a.

My second example is the history of cosmology. Apart from some singular movements, the predominant world view of antiquity and its Christian version renewed during the Middle Ages were essentially static. The earth was thought to be at rest in the center of the universe, the celestial sphere revolving in uniform circular motion with the stars fixed to it. There were the planets exhibiting their rather irregular motions. However, to save the phenomena these motions were explained by reducing them again to constant circular motion. The development starting with Copernicus is a gradual destruction of these simple structures in favor of more and more changes in time and other contingencies. The earth moves, the stars move. Theories about the genesis of the solar system come up, the stars are declared to be alterable products with birth and death.
Finally, gravitation, the new static quantity, and Euclidean space, the time-honored structure, are merged into one time-dependent metric in general relativity. And this theory tells a story about the universe according to which its original state was toto coelo different from what it is now.

Certainly, stories like these could be multiplied, and not the least among them would be the evolution of organisms, where again the seemingly timeless structures of living species were recognized to have a history. For the present the examples may suffice to show that there is a unidirectional shift of the borderline between what is still assumed to be a timeless structure and what is already recognized as being capable of change. Hypotheses and discoveries resolving timeless structures into processes are often of incisive importance because they often lead to the assumption of new and more basic structures. More generally, the frequent concomitant of the replacement of one theory by another is the emergence of a new contingency in the sense that some part of the old theory can be seen to correspond to some part of the new theory which, according to this theory, is for the first time recognized and explicitly admitted to have genuine alternatives, not only in the sense of possible change, but also in the more general sense of logical alternatives.

⁶ In his later writings Heisenberg liked to give this situation in elementary particle physics an interpretation in Platonic terms; see Heisenberg 1984 and 1985.

A general description of the process of increasing contingency suggests itself by making use of the idea that theory succession is accompanied by explanations of the earlier theories by the later ones. That science develops no one will deny. But there has been much opposition to the view that science develops in such a way that its earlier stages are always explained by the later ones. A more specific formulation of this view has been given, for instance, by Popper when he says that it is even the aim of science "to explain what so far has been taken to be an explicans, such as a law of nature." "Thus," continues Popper, "the task of empirical science constantly renews itself. We may go on forever, proceeding to explanations of a higher and higher level of universality ..."⁷ In the following I accept this view of the development of science, especially physics, conceding to the opposition mentioned that by now no satisfactory general concept of explanation suitable for the description of the development in question has been elaborated.⁸ But for my present purpose I need no more than some rather general features of this concept. The most important one is that whatever part A of an earlier theory T is recovered from its successor theory T' will be recovered by means of (absolutely) contingent propositions c specifying the particular conditions under which, even according to T', that part A of the earlier theory T holds as far as it does hold at all. We may symbolize this by
$$T',\ c \;\longmapsto\; A \tag{Ex}$$
The precise definition of this relation does not matter too much as long as it implies that within the new theory T' alternatives to A have become known for the first time. In this way the earlier theory, formerly the Last Word in the field, becomes contingent with respect to its successor, and it is in this general sense that an increase of contingency, displayed by the conditions c, takes place as physics develops.

⁷ Popper 1958, p. 26.
⁸ More details in Scheibe 1984b and 1986c.

Besides the examples mentioned at the beginning of this section there are many other cases that can be subsumed under this conception. The step from Kepler to Newton is a somewhat outworn but still instructive case in point. For Kepler, who, although he was a Copernican, still believed in the old cosmology of the celestial sphere, the sun and the planets known at his time had a quite unique and exceptional position in the universe. Kepler found the beautiful regularities expressed in the laws named after him and, although he already entertained the notion of a force exercised by the sun on the planets, he still tried to understand the relative distances of the planets from the sun. In his view the solar system was an essential constituent of the structure of the universe that was to be understood precisely as it is given to us. Accordingly, it would not have made much sense for Kepler to have entertained any alternative or to have asked for the particular conditions under which the planets showed the regularities that were discovered by him. This was left to Newton and his followers in the 18th century. They came to realize that neither the solar system itself nor Kepler's laws about it are the kind of thing that could not be different from what it actually is. In their view the former became a brute fact that could be understood only by asking for its genesis, and the latter were explained within Newton's theory of gravitation by pointing out those particular conditions under which Kepler's laws are approximately true.

The history of science provides us with an abundance of cases where - as in the Kepler-Newton case - basic lawlike assumptions lose their privileged status of being the Last Word in the field and thereby become contingent with respect to the new Last Word. There were cases of minor importance such as those where only correction terms are attached to some law. There were cases of fundamental importance, as was the replacement of classical mechanics by quantum mechanics and the transition from pre-relativistic to relativistic physics. Sometimes the development of a radical change led to a series of steps following each other in rapid succession. Such was the case with the treatment of the electron by the Schrödinger equation, Pauli equation, Dirac equation and quantum electrodynamics. As I already admitted, our understanding of the relationship between succeeding theories is far from satisfactory. But whatever the details may be, I am pretty sure that this relationship can be reconstructed in such a way that the increase of contingencies will be among its outstanding features.
The Increase of Coherence

Leaving now the matter of contingency and turning to coherence, in a third section I want to show that the development of physics can be characterized by its increase, too. This being the purpose, it would be good to know beforehand what coherence is. However, as was said in the introduction, in matters of coherence the difficulties begin already at this point. In their article on the "unity of science as a working hypothesis" Oppenheim and Putnam, after having mentioned the numerical reduction of languages and theories to one of each, go on to say that "unity of science in the strongest sense is realized if the laws of science are not only reduced to laws of some one discipline, but the laws of that discipline are in some intuitive sense 'unified' or 'connected'". Obviously, the authors could also have said "coherent". They then continue: "It is difficult to see how this last requirement can be made precise; and it will not be imposed here". This was in 1958, and I am afraid that the situation has not essentially changed in the meantime. Therefore, I cannot presuppose any ideas about coherence that would go beyond our common understanding of the term in philosophy. And I do not commit myself to the following suggestions, which take up the one already made in the introduction. They are not meant to have any interest by themselves but shall only put something
definite before our eyes in order to facilitate grasping the general idea of the increase of coherence.

Coherence in the sense of the introduction was a relative property of a possible axiom system. It was a measure of the amount of dependencies induced by the axioms in an atomic diagram. (Instead of atomic sentences one could also use some other basis of absolutely contingent statements, e.g. Hintikka's constituents. But atomic statements are certainly the simplest choice.) To have a concrete idea of the degree of coherence that may be obtained in this case, think of the theory of linear order, which can be defined by three very simple axioms. Given the length N of a sequence, a complete atomic description of it consists of N² statements. Using the axioms this number is reduced to N - 1. Thus if we had to describe a macromolecule consisting of 1000 molecules ordered in a sequence we could do this by means of about 1000 statements from which, together with the laws, the 999,000 other atomic statements would follow. If we define the degree of coherence to be the quotient of the number of statements saved by the laws and the number of statements of a complete atomic description, in the case before us this quotient converges to unity as N grows. You will believe me that the reducing effect of the differential equations of dynamical theories is even much stronger than the one shown by this childish example. Differential equations can reduce an infinity of contingent statements to a finite subset. Accordingly, the step from a description of a physical system using atomic statements only to one applying laws governing the behavior of the system as a whole is accompanied by a considerable gain in coherence in the sense under discussion.

Related concepts of coherence come to mind if we ask ourselves what direct properties of a theory will bring about those reducing effects in contingent descriptions. This is a difficult question that can hardly be answered in general.
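The count in the linear-order example above can be checked mechanically. What follows is only a minimal sketch under stated assumptions: the text does not list the three axioms, so the sketch assumes a strict order that is irreflexive, transitive and total, and it uses only the first two (asymmetry follows from them). The names `derive_diagram` and `degree_of_coherence` are illustrative and not taken from the text. Starting from the N - 1 statements that each element precedes its successor, the laws settle all N² atomic sentences.

```python
def derive_diagram(n):
    """Settle every atomic sentence 'i < j' on n elements from n - 1 given ones.

    Assumed laws (not listed in the text): '<' is irreflexive and transitive;
    asymmetry follows from these two.
    """
    # The n - 1 contingent statements: element i precedes element i + 1.
    given = {(i, i + 1) for i in range(n - 1)}

    # Transitivity: add (a, c) whenever (a, b) and (b, c) already hold.
    positive = set(given)
    changed = True
    while changed:
        changed = False
        for a, b in list(positive):
            for c in range(n):
                if (b, c) in positive and (a, c) not in positive:
                    positive.add((a, c))
                    changed = True

    # Irreflexivity and asymmetry yield the negated atomic sentences.
    negative = {(i, i) for i in range(n)} | {(b, a) for a, b in positive}
    return given, positive, negative


def degree_of_coherence(n):
    """Quotient of statements saved by the laws and all n**2 atomic statements."""
    total = n * n
    needed = n - 1
    return (total - needed) / total


if __name__ == "__main__":
    given, positive, negative = derive_diagram(6)
    # Number of given statements, number of decided sentences, and overlap.
    print(len(given), len(positive | negative), positive & negative)
    for n in (10, 1000, 100000):
        print(n, degree_of_coherence(n))
```

For a sequence of length 1000 the quotient is roughly 0.999, and it approaches 1 as the length grows, which is the sense in which the quotient converges to unity.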
But there can be no doubt that physics avoids decomposable or factorizable theories. It was, by the way, an idealist, the British Hegelian Bosanquet, who once asked: "Is there any man of science who in his daily work, and apart from philosophic controversy, will accept a bare given conjunction as conceivably ultimate truth?"⁹ But what is it that we have to avoid according to this rhetorical question? I think it is something like this: given an axiom system there could be a reaxiomatization splitting the new axiom system into two parts using disjoint languages:
$$C[\alpha, \beta] \;\sim\; A[\alpha] \cup B[\beta], \qquad \alpha \cap \beta = \emptyset \tag{1}$$

This can be rewritten as

$$C[a, b, \gamma] \;\sim\; A[a, \gamma] \cup B[b, \gamma], \qquad a \neq b \tag{2}$$

with constants a and b, if it is understood that not all of the γ must actually occur in A or B. So (1) is contained in (2) but, obviously, (2) is more general than (1). To obtain coherence we could exclude (1) or even (2) in the sense that, given any two disjoint α and β or different a and b, there should be no reaxiomatization (1) or (2) respectively.

⁹ Oppenheim/Putnam 1958, p. 3f.

Coherence conceptions like these can be illustrated most impressively by the way in which the interaction between physical systems is treated in classical physics. Part of what classical physics says about a system consisting of two subsystems is even of the form (1). But it is the trivial part as compared with the interaction. That the matter nevertheless has non-trivial aspects when we come to quantum physics I shall have occasion to discuss at the end of the paper. At any rate, in classical physics the non-trivial part is the interaction introduced by the dynamical law of the theory. And it is this law that makes the theory coherent in the sense of avoiding (1) or even (2). A famous example is Newton's theory of universal gravity and the step from Kepler's theory of the solar system to Newton's. According to Kepler's theory any planet moves independently of any other. The statement of how all planets move is the bare conjunction of the statements concerning the movements of each individual planet. By contrast, the theory of universal gravity, introducing an interaction also between any two planets, is an indecomposable theory representing a considerable gain in coherence as compared with Kepler's theory. An outstanding example has been the discovery of the planet Neptune. It was grounded on a forecast from data pertaining exclusively to two other planets. Such a forecast is impossible on the assumption of Kepler's theory. In general, the coherence of Newton's theory verifies and even makes intelligible many sayings of philosophical coherence theorists. What that theory has to say about one body as being a gravitating body cannot be said other than by relating it to every other body in the universe.
Moreover, if we were to find a system of bodies moving exactly according to Newton's theory, this very same theory would permit us to conclude that the system is all-inclusive. In other words, the part can only be understood by referring to the whole, and a completely coherent system must be the whole.

The development from decomposable to irreducible theories can have the peculiar feature that the entities connected with the decomposable theory lose their independent existence and are somehow absorbed in a larger whole. The step from quantum mechanics to quantum field theory displays examples such as the various transformations of elementary particles. The unification of static electric and magnetic fields in electrodynamics is an earlier example. Its foundation is probably the most amazing case in point: the development from Newton's view on space and time to Einstein's special relativity. Newton's theory of absolute space and absolute time is a paradigm of an incoherent theory - a bare conjunction of two theories referring to two quite different subjects. In modern terms: Newton's spacetime is just the direct Cartesian product of space and time. In the time after Newton, Galilean spacetime was developed. In it the concept of space no longer occurs as an independent entity. Consequently, the corresponding theory is no longer decomposable into two independent subtheories. However, the new theory still contains a theory of absolute time as a subtheory built on a proper sublanguage. From the special relativistic spacetime time, too, has been extirpated. In 1908 Minkowski could describe the situation not unjustly by his famous saying: "Henceforth space by itself and time by itself shall become degraded to mere shadows and only some kind of union of them shall remain independent."¹⁰
Coherence and Contingency: an Outlook

The result of our considerations up to this point is that as physics develops the network of its theories becomes less coarse by an increase of coherence, whereas at the same time, in some sense of the word, the contingency woven into that network also increases. In the concluding section we must try to understand how this is possible. The best strategy to do this is to try to understand why it is even necessary.

Let me first make it quite clear that an increase of coherence necessarily leads to a decrease of contingency in the sense in which the two concepts were envisaged in the introduction. Coherence there meant the amount of connections that are introduced into an atomic diagram by axiom systems consisting of 'lawlike' sentences, e.g. pure sentences containing no constants of the type of variables quantified over in the very same sentence. And contingency meant just disconnectedness within a set of statements, as it is most impressively illustrated by those atomic descriptions. Consequently and trivially, an increase of either of them means a decrease of the other. And this is the case not only at the lowest level, defined by atomic statements. To be sure, in physics to have some theory at all presumably is just this: to have lawlike connections between atomic statements. But once this stage is reached we can go up and enter higher levels. On them, too, that disconnectedness will occur, although it becomes more difficult to grasp. And it will be reduced by even higher-level theories. The reduction of many experimental laws by general electrodynamics is a well-known case in point.

So there is, no doubt, this complementary pair of coherence and contingency. But there is also contingency in another sense.¹¹ Although this is not directly related to coherence itself, its change is related to a change of coherence, and indeed in such a way that an increase of the latter is necessarily accompanied by an increase of the former.
As an epistemological concept, contingent in this sense is what is known to have alternatives. This contingency, therefore, will increase whenever something that up to a certain point in time had been considered to be unique comes to be viewed as having alternatives. Now precisely this happens as part of an increase of coherence. If a couple of hitherto uncorrelated physical statements of whatever level become correlated by a new coherent theory then they will be explained by this theory in the sense indicated in the previous section, i.e. absolutely contingent conditions will become known under which those statements hold. The corresponding increase of contingency in the sense under discussion is here reinforced by the appearance of those conditions called absolutely contingent.

¹⁰ Minkowski 1909, p. 54.
¹¹ K.J. Lambert suggested to me not to use the term "contingency" in this (major) sense because it could easily lead to misunderstandings. I feel that he is right. But in spite of honest efforts during our discussions we could not find a suitable substitute.

I will try to make clear in two ways what I mean here by absolutely contingent propositions. First, as a matter of logical fact, contingency in the first sense, complementary to coherence, cannot be reduced to zero. Even for categorical theories a model, although uniquely determined up to isomorphism, cannot just be derived from the theory. However advanced our theories may be, there will remain a residue of statements that together with the theory have to be assumed in order to construct a model. It is hard to see how this situation could be changed by however great an increase of coherence. And it is for this reason that an increase of coherence can lead to explanations of the kind described. It is perhaps but another way to put the same consideration if we imagine a list of all explanations in question that have ever been given in the history of physics. Then in the premises of these explanations we can distinguish the fundamental assumptions of the respective theories from the contingent assumptions added to them for the sake of explanation. Let us say that the former appear in an L-position and the latter in a C-position.
Then, although many of the premises of our explanations will also occur as explananda of other explanations in the list, it will never happen that a proposition occurring in an L-position in one explanation will occur in a C-position in another explanation and vice versa. In spite of this final emphasis on absolutely contingent propositions I would try to express the relative weight that the two extreme philosophical positions, from which I started out, still have, if they are recovered from the most advanced science in existence, by saying: of course there is increase of coherence, there is unification and perhaps even the mark of an eventual unity of physics. But there is also this apparently inexhaustible reservoir of contingencies more and more of which become known as such. And it is only at the price of its actual increase that we can have growing coherence. Thought will therefore not come to rest in the sense of any absolute understanding, although there is local progress. By way of an outlook I may be allowed to briefly touch upon an aspect of coherence and contingency that, although also being of first importance, seems to be completely different from all that has been said so far. What has been said so far exclusively concerned the coherence and contingency status of the fundamental assumptions of a physical theory and their change. Now, perhaps the greatest advance that physics has made in our century
was the step from classical to quantum mechanics. The importance of this step, no doubt, could at least partly be made visible by giving that balance of coherence and contingency also for this case which so far was our major subject. However, for this case that would be only half of the story. For in this case there is also an increase of coherence of an entirely different kind concerning not the fundamental assumptions but rather the contingent descriptions provided by a theory. And perhaps the most striking fact is that here, too, we find a corresponding increase of contingency. In classical physics a system consisting of subsystems is described as the direct Cartesian product of the subsystems. This means that a complete contingent description of the total system simply is the conjunction of the complete contingent descriptions of its subsystems. Consequently, there are no inferences from data pertaining to one subsystem at time t to any properties of another subsystem at t. Inferences from one subsystem to another one can only be grounded on an interaction between the systems and will then involve at least two different time points. This situation is entirely changed when we come to quantum physics. The quantum theoretical mode of description of a system consisting of subsystems has the remarkable feature that a complete contingent description of the total system at time t does not generally imply complete descriptions also of the subsystems at t. In fact the overwhelming majority of total states lead to incomplete descriptions, the missing information having drifted away into contingent so-called EPR-correlations between the subsystems. Thus instead of having fairly definite information of what the result of a measurement of observable A of subsystem I will be we are quite definitely informed about what this result would be if we were to measure a certain observable B of the other subsystem II. 
Thus we here do have some coherence between subsystems at a given time, and although this coherence may be brought about in a controlled way by means of an interaction, its nature seems to have nothing to do with the latter and can be described independently. This contingent coherence, as it may be called somewhat paradoxically, is characteristic of quantum physics and completely foreign to classical thinking. It can be used to illustrate traditional philosophical ideas on coherence even more impressively than we have seen for the other type of coherence. For the quantum theoretical coherence allows the thesis that the domain of validity of quantum theory - and that is according to some authors the entire universe - strictly speaking admits of no isolated object but is rather an undivided whole. However, as we found in the foregoing case, the step from classical incoherence to quantum mechanical coherence of subsystems also seems to necessitate a simultaneous increase of contingency. As seen from classical mechanics, the process of its quantization consists in first destroying the independence of position and momentum of a particle and making them complementary observables. This new relation, however, cannot exist without the introduction of an infinity of quantum mechanical observables that
have no classical counterpart whatsoever. They also come in complementary pairs, and as a consequence have entirely independent empirical interpretations. Thus here again, before it comes to those new coherences, we have to accept this wealth of new contingencies. They make such a seemingly simple object as an electron as complicated as any many-particle system can be.
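A minimal numerical sketch may illustrate the contrast drawn above between the classical and the quantum description of a composite system. It assumes two two-level systems and, as an example of an entangled total state, the vector (|0>|1> - |1>|0>)/sqrt(2); the function names are hypothetical and chosen for illustration only. The purity Tr(rho^2) of the reduced state of subsystem I equals 1 exactly when the total state yields a complete description of I, and is smaller otherwise.

```python
import math

def reduced_state_I(psi):
    """Partial trace over subsystem II: rho_I[i][k] = sum_j psi[i][j] * conj(psi[k][j])."""
    n = len(psi)
    m = len(psi[0])
    return [[sum(psi[i][j] * psi[k][j].conjugate() for j in range(m))
             for k in range(n)] for i in range(n)]

def purity(rho):
    """Tr(rho^2): 1 for a complete (pure) description, less than 1 for an incomplete one."""
    n = len(rho)
    return sum((rho[i][k] * rho[k][i]).real for i in range(n) for k in range(n))

def conditional_I_given_II(psi, j):
    """Outcome probabilities for subsystem I, given outcome j on subsystem II (same basis)."""
    weights = [abs(psi[i][j]) ** 2 for i in range(len(psi))]
    total = sum(weights)
    return [w / total for w in weights]

s = 1 / math.sqrt(2)
# psi[i][j] is the amplitude for "subsystem I in state i, subsystem II in state j".
product_state = [[1 + 0j, 0j], [0j, 0j]]          # |0>|0>
entangled_state = [[0j, s + 0j], [-s + 0j, 0j]]   # (|0>|1> - |1>|0>)/sqrt(2)

purity_product = purity(reduced_state_I(product_state))
purity_entangled = purity(reduced_state_I(entangled_state))
given_II_is_1 = conditional_I_given_II(entangled_state, 1)
```

For the product state the description of subsystem I is complete (purity 1). For the entangled state it is not (purity 1/2), and yet the outcome found on subsystem II fixes the outcome on subsystem I. This is the sense in which the missing information resides in the correlations.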
IV.16 Predication and Physical Law*

I
Traditionally the two propositional forms

$$ s \text{ is } Q \tag{1a} $$

in the sense of "Socrates is mortal" and

$$ \text{all } P \text{ are } Q \tag{1b} $$

in the sense of "all men are mortal" were distinguished as being the only logically correct forms of (affirmative) statements. In Aristotle form (1a) of a particular statement actually was "some P is Q", and this together with (1b) prevailed also in traditional syllogistics. But gradually the truly singular statements of form (1a) were smuggled in,¹ and a common illustration of Barbara was given by the chimera

all men are mortal
Socrates is a man
Therefore: Socrates is mortal

For reasons that will become clear in the course of my considerations I take (1a) and (1b) as my starting point, and of them I said that they were traditionally distinguished as being essentially the only logically correct forms of (affirmative) statements. Today we no longer believe in this postulate, and the tradition from Aristotle to the middle of the 19th century is blamed for having used only a small fragment of logic and a distorted one at that. Russell² describes the logic of Leibniz, whom he highly respected as a logician, in the words:

Every true proposition is either general, like "All men are mortal", in which case it states that one predicate implies another, or particular like "Socrates is mortal", in which case the predicate is contained in the subject.

* First published as Scheibe 1991b.
¹ On Aristotle's usage and its gradual distortion see Ch. I of Patzig ³1969.
² Russell 1946, pp. 614 ff.
With respect to this feature traditional logic is then criticized as a "defective logic". Russell continues:

The subject-predicate logic, which Leibniz and other a priori philosophers in the past assumed, either ignores relations altogether, or produces fallacious arguments to prove that relations are unreal.

Like Leibniz, Kant³ was blamed for building up his Critique of Pure Reason on the traditional subject-predicate logic of which he even said that it is "to all appearance a closed and completed body of doctrine." It is with respect to such an opinion, seemingly widespread at those times, that the work of Frege⁴ can really be called revolutionary. The new logic created by him and completed in Whitehead and Russell's Principia⁵ was an advance in at least three respects:

1. the generalization of (1a) to predications with n-termed predicates for any n ≥ 1;
2. the freeing of the quantifiers from the bounds of (1b) and its particular counterpart to become iteratively applied operators;
3. the introduction of higher order predicates and predication.

In this logic the natural successor of (1a) was

$$ P(s) \tag{2a} $$

and that of (1b) was

$$ \forall z.\; P(z) \rightarrow Q(z), \tag{2b} $$

where P and Q are arbitrary 1-termed predicates - elementary or defined. If we ask again what the distinction of these propositional forms is, this time with respect to the new logic, then, I think, most of us would answer:
(2a) along with its many-termed siblings is distinguished, in case P is an elementary predicate, as being an elementary predication. On the other hand, (2b) - from a purely logical point of view - is no longer distinguished at all. At any rate the former kind of distinction has completely disappeared.

This is not to say that we are left with no problems. For one thing, we now have the problem of the nature of elementary predication which immediately leads to the further question what kind of terms - subjects and predicates - would be involved if we were able to point out genuinely elementary predications. More globally we could try to use classical logic (including a theory of types) as a key to ontology and ask for the ontological meaning of our whole logical apparatus with special emphasis on predication. Thus (2a) was and still is associated with ontology. Quantum logic is a case in point. (2b), although not distinguished from a logical point of view, could remind a philosopher of science of the thesis that this formula gives us the general form of physical law - and if not the general then at least the typical form. Since we are convinced that the mere form (2b) is not sufficient for lawfulness we have to inquire about the nature of P and Q in (2b) and are thus led back to the first complex of questions. Thus if we are not too narrow-minded in the question of a distinguished role played by the classical successors of the subject-predicate forms (1) we can easily find interesting attempts to characterize them.

³ Kant ²1787, B VIII.
⁴ Frege 1879.
⁵ Whitehead/Russell 1910.

In this paper - as I have to confess right at the beginning - I shall not attempt to answer any of the aforementioned questions in a direct way. I feel, however, and I hope that what I shall be doing has a bearing upon each of them. What I actually shall be doing - and there you have an explanation for my starting point - is a sort of re-evaluation of the subject-predicate forms (1) in the light of recent attempts to reconstruct theories of physics. In doing this my emphasis is on the systematic, not the historical aspect of the subject. For understanding the following considerations it may, however, be of some help to remember the way in which in the 17th century people like Leibniz and Locke talked about substances in general and the theory or, as they expressed it, the essence of a substance in particular. It then comes to mind that there is at least one further interpretation of the subject-predicate statements (1). This interpretation we have to recall anyway because from a systematic point of view my review given so far of the fate of those statements is incomplete. In the time following the Frege-Russell period an important subject-predicate relation different from (2a) was established: the model relation,⁶ i.e., the relation between a structure S and a formal theory E holding if S is a model of E:

$$ S \models E \tag{3a} $$
Intuitively, this relation is a predication, the theory E being the predicate and S the subject about which E is predicated. Therefore, as seen from outside, (3a) even is a singular statement. In spite of this the richness of internal structure that the predicate E as an arbitrary formal theory can have guarantees a wide range of application of the new predications. Indeed from the standpoint of mathematical logic (or: model theory) the predications (3a) even appear to be the most general statements that can be made. If, for instance, the reconstruction of physical theories could not be attained by means of the predications in question then, so it seems, one would just have to go beyond the bounds of mathematical logic to make one's fortune.

⁶ The beginning of modern semantics was Tarski 1936. Today the model relation is defined in every textbook of logic.

There is, however, one immediate objection to this reasoning, and this will lead us to the general counterpart of the singular predication (3a). One way to reconstruct a physical theory as a predication (3a) is to assume that a physical theory is about one single physical system and that this system can be conceived of as being a structure in the technical sense of model theory. I think that these two assumptions are basically sound, and I shall make them in the following. However, just if we do this we are exposed to the question: what about the universality of physical law? It is of the essence of physics - so it will be objected - that its theories are universal, and it would be of no help to counterargue that any amount of generality be guaranteed by means of quantifications within the theory E (which, of course, are permitted). For the difficulty here is that, if our theory says that the physical system S satisfies theory E, the universality wanted is that this should be true not only of S but of a whole class of physical systems. Therefore, besides (3a) we should include a universal implication

$$ \forall \xi.\; \xi \in K \rightarrow \xi \models E, \tag{3b} $$
where K is the class in question, in our store of statements necessary for reconstructing physical theory. We shall have to investigate the meaning of this formula later on. But for the moment it suffices to conclude that (3b) is a development of (2b) and therefore remains within the bounds of my re-evaluation program.

The first part of my paper cannot be concluded without mentioning the set theoretical versions of (2) and (3). Originally axiomatic set theory⁷ was developed as an alternative to Russell's theory of types. At a later stage set theories and type theories merged or were linked by model theory. Thus, for instance, with respect to a model M of Zermelo-Fraenkel set theory, to be kept fixed in the following,

$$ s \in y \tag{4a} $$

and

$$ \forall z.\; z \in x \rightarrow z \in y \tag{4b} $$
correspond to (2a) and (2b) respectively if the sets x and y are interpretations of P and Q in a given structure S from M and quantification in (4b) is restricted to S. Whereas this is a quite close connection of (4) with (2), the set theoretical version of (3) seems to be a considerable generalization of it. We have

$$ E(S) \tag{5a} $$

as an analogue of (3a) where E now is any set theoretical predicate and S a structure from M. The analogue of (3b) then obviously is

$$ \forall \xi.\; \xi \in K \rightarrow E(\xi), \tag{5b} $$

where K is a class of structures in M. Thus, whereas the statements (4) are made within a given structure S, the statements (5) are made about structures.⁸ The connection of (5) with (3) is that - as the notation indicates - every formal theory E, if it is finitely axiomatized, has an obvious reformulation as a set theoretical predicate. Consequently, whenever a physical theory can be reconstructed by (3) there is also a reconstruction by (5). At present the latter reconstruction even is preferred for the ease of its handling. On the other hand, the converse does not hold, if only because quantification in M may not be restricted to the structure S (as it is in the reformulation of a formal theory). In the set theoretical reconstruction of some physical theories, e.g. quantum mechanics, use is made of unrestricted quantification, and it is not known whether, in all cases occurring in physics, this procedure can be eliminated in favor of a reconstruction of type (3).⁹

⁷ An introduction to this tradition is Fraenkel et al. ²1973.

Summarizing this exposition I would say that I have offered two modern explications of traditional predication (1): Predication (2) and (3) of the 1st and 2nd kind, as I will call them henceforth. Each has a set theoretical counterpart: (4) of (2) and (5) of (3), which I shall also call predications of the 1st and 2nd kind respectively. Alongside this distinction we also have two kinds of generality, one structure-internal as in (2b) and (4b), and one where quantification is over structures as in (3b) and (5b). No definite claim is made as to which explication of (1) - (2) or (3) - is more adequate. To decide this is a historical matter, and as I said, I do not want to follow up this road. Suffice it to say that to my mind the sentence "Socrates is mortal" is quite near to "the air in this room obeys van der Waals' equation" and therefore has to be classified as a predication of the 2nd kind.
This goes against the frequent use of the first sentence as illustrating first order predication. However, apart from moments when I become dogmatic about reconstructions, I would not find anybody guilty of just making a mistake when he so uses the sentence. At any rate, as we shall see, elementary predications (2a) seem to occur in physics only where the subject already is a higher-order predicate. Finally, I do want to claim that theory predication (3) or (5) is sufficient to reconstruct any given physical theory: The only statements made in physics are subject-predicate statements. And it may even be asked whether universal predication is not reducible to singular predication - the b-case to the a-case in (3) or (5), or more generally - whether we can do without universal theory predication.

II
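The two kinds of predication and generality distinguished in Part I can be sketched on a small finite example. The sketch is only an illustration under stated assumptions: the group axioms stand in for a set theoretical predicate E, finite cyclic groups stand in for the structures S, and all names (`is_group`, `cyclic`, `K`) are hypothetical. Statements of the forms (4a) and (4b) are made inside one fixed structure, whereas (5a) and (5b) are made about whole structures.

```python
from itertools import product

def is_group(carrier, op):
    """A set theoretical predicate E(S) on a finite structure S = (carrier, op)."""
    elems = list(carrier)
    # Closure: the operation must not lead out of the carrier set.
    if any(op(a, b) not in carrier for a, b in product(elems, repeat=2)):
        return False
    # Associativity.
    if any(op(op(a, b), c) != op(a, op(b, c)) for a, b, c in product(elems, repeat=3)):
        return False
    # Existence of a two-sided identity element.
    identities = [e for e in elems if all(op(e, a) == a and op(a, e) == a for a in elems)]
    if not identities:
        return False
    e = identities[0]
    # Existence of two-sided inverses.
    return all(any(op(a, b) == e and op(b, a) == e for b in elems) for a in elems)

def cyclic(n):
    """The structure Z_n: the integers 0..n-1 with addition modulo n."""
    return frozenset(range(n)), (lambda a, b, n=n: (a + b) % n)

# Predication of the 1st kind: statements within the fixed structure S = Z_6.
carrier, add = cyclic(6)
x = {a for a in carrier if a % 4 == 0}   # interpretation of P in S: {0, 4}
y = {a for a in carrier if a % 2 == 0}   # interpretation of Q in S: {0, 2, 4}
singular_first_kind = 4 in y             # form (4a)
universal_first_kind = x <= y            # form (4b), quantification restricted to S

# Predication of the 2nd kind: statements about structures.
singular_second_kind = is_group(carrier, add)                  # form (5a)
K = [cyclic(n) for n in range(1, 6)]                           # a class of structures
universal_second_kind = all(is_group(c, op) for c, op in K)    # form (5b)

# A structure of which the predicate does not hold: {1, 2, 3} under multiplication mod 4.
non_example = is_group(frozenset({1, 2, 3}), lambda a, b: (a * b) % 4)
```

In the first pair of statements the quantifier ranges over the elements of one structure; in the second pair it ranges over structures, each of which is judged as a whole.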
In the second part of this paper I shall illustrate the two kinds of predication and generality by examples taken from physics. The main burden of a proof of my theses lies in this illustration. Moreover, by looking at examples the theses themselves are to be sharpened and clarified. For, as I already indicated, as long as we do not want to go beyond classical logic and set theory - and I don't want to do this - the argument given so far seems quite inevitable: If all physical theories can be reconstructed as so many formal theories or set theoretical predicates plus interpretation then their common propositional form necessarily will be (3a) or (5a) respectively. Thus apart from instantiation the only question that is left seems to be the question what justification there is to classify the statements having those forms as subject-predicate statements.

⁸ The heterogeneous formulation of antecedent and consequent in (5b) and (3b) will become clear at the end of section II.
⁹ Scheibe 1986d (this vol. ch. VIII.36); Ludwig 1985.

However, reconstructions of physical theory have been produced that make already the first part of our thesis questionable. They do this by the occurrence of so-called constraints in the sense of Sneed, and indeed the concept of constraints is one of the pillars on which a whole reconstructionist approach rests: the structuralist approach.¹⁰ The structuralists argue that in physics there are statements besides the laws (3b) or (5b) that - as statements about the class K of physical systems - cannot be formulated by saying something about the elements of K, as is done in (3) and (5). In addition to the law of the lever, for instance, we would have to say something about any class of levers, something not reducible to a law about systems: We would have to say that, if some of the weights used in preparing any two levers of our class happened to be identical, then also their masses are equal. Now, on occasion of this argument I am anxious to emphasize that it may justifiably be asked - and I did ask it already - whether even the b-case of (3) and (5) really must occur. It is evident why they must occur in predication of the 1st kind, i.e. in (2) and (4): There the a-case may be elementary, i.e. they may be irreducibly singular. Therefore, as long as we want to make any general statements at all we have to insist on the corresponding b-cases (and even further general statements).
Not so in the case of theory predication: Here the question of reduction must be asked, and I shall come back to it in the last part of the paper. However, the further extension to statements formulating constraints in the sense of Sneed seems to me a point of minor importance from a physical point of view, and I have mentioned it only because it nicely illustrates further 2nd level statements in physics.

¹⁰ Sneed 1971, Ch. IV; Balzer et al. 1987, Ch. II.2.

Predication and generality of the 2nd kind is most conspicuous in general frame theories of physics such as Hamiltonian mechanics or quantum mechanics. The theory of general relativity is also a case in point. These theories have variable universes of discourse, and the difference in question can be shown on the basis of this feature. For reasons of greater familiarity I may take as a purely geometrical example Riemannian geometry, i.e. the theory of Riemannian manifolds. Its axioms have the invariance property that any structure isomorphic to a Riemannian manifold is also a Riemannian manifold. This property is shared by all so-called species of structures known from mathematics such as groups, rings, topological spaces, manifolds, Banach spaces etc. The widespread use that is made of these species of structures to define physical theories makes it plausible that the property in question can be a feature of physical theories, too, though there the structures are somewhat richer than the simpler examples from pure mathematics. Even the usual specializations of Riemannian geometry, e.g. to manifolds of constant curvature or even Euclidean manifolds, have that generality that the model classes are invariant under arbitrary isomorphisms. Now this feature, interesting also on its own account,¹¹ is pointed out here only to show what theory predication and, correspondingly, generality of the 2nd kind is like: it concerns, so to speak, whole universes or physical systems and is absolute in an obvious sense. The predication is weak in the sense indicated and, therefore, the corresponding generality is fantastic and even unpleasant from the viewpoint of physical theory. By contrast predication and generality of the 1st kind are restricted to a given structure and relative in this sense. We may, for instance, be interested in a particular Riemannian space. For more than two thousand years geometry has not been a theory of spaces but rather of points, curves and other figures in space - the only space that was envisaged during that time. The axioms and theorems of geometry being formulated with the help of quantifiers then illustrate generality of the 1st kind. Correspondingly, it is here that we find predication of the 1st kind - statements of the form that point $P_1$ has distance r from point $P_2$, that curve C is a geodesic, that the Riemann tensor is asymptotically 0, and so on. To be sure, statements like these, being contingent with respect to the axioms of Riemannian geometry and possibly using an extended language, can be used to make statements about the manifold itself, too. We can, for instance, specify two triangles in a space that turn out to be similar. This certainly is a genuine restriction of the class of Riemannian spaces, and insofar it is a predication of the 2nd kind.
However, these contingent statements usually are viewed as statements restricted to a given manifold: other manifolds would then be excluded from being alternatives, and only a different choice of geometrical objects in the given manifold could lead to such.

¹¹ Scheibe 1982c (this vol. VII.31).

That there are physical theories deserving the name whose axioms really are species of structures and therefore predications of the 2nd kind is evidenced by the frame theories mentioned a moment ago. However, there are also theories in which the universes of discourse are assumed to be numerically fixed or at least fixed up to an automorphism belonging to the symmetry group of the theory. The theory of a particle moving in a central field of force according to the laws of Newtonian mechanics is a case in point. Here the structure used to describe the behavior of the particle in the field consists of four parts: Absolute space, absolute time, a field of force as well as the orbit and mass of the particle. Correspondingly, the theory is made up of Euclidean geometry, a corresponding (degenerate) geometry of time, general Newtonian mechanics and a special force law. The universes of discourse are the base sets of space, time, a 3-dimensional vector space of possible forces and a 1-dimensional mass spectrum. Though we here touch upon deep questions you will understand what I mean by saying that these universes of discourse are intended to be unique. Whereas in the case of Hamiltonian mechanics we want to have different phase spaces - non-isomorphic as well as isomorphic ones - in the case before us the space, time, etc. are meant to be uniquely identified up to a Euclidean transformation. The question is whether in these cases the difference between predications of the 1st and 2nd kind disappears. I think that this is not the case. It is easy to find features of this difference other than the one considered so far. One is that predication of and quantification over physical systems should be kept apart from these operations as applied to the structures used to describe those systems. The reason is that our descriptions may never be complete and in many cases are known to be incomplete. As a consequence different physical systems may have the same description. The distinction must, therefore, be maintained if only to avoid a conceptual muddle. Let me illustrate this by presenting the philosophical folk view of a law in the light of the present approach. According to this view (2b) is the typical or even general form of a law, and its universe of discourse is the class of objects indicated by the variable z. It is perhaps not more than a matter of emphasis, but I, for one, find it misleading to look at our subject in this way because it suggests that quantification in (2b) is structure-internal quantification of the 1st kind. It suggests a universe of discourse irreducibly structured by P and Q and possibly further predicates of the theory to which our law belongs. We are then tempted to see the relation between the theory and its universe of discourse as we see it in Euclidean geometry where no point of space can be omitted without violating one of the axioms.
However, just in the case where a universal implication is assumed to be a physical law it is not properly placed in category (2). Rather it belongs to (3) or - equivalently - (5) and is to be reconstructed as follows. The matter is perhaps best understood if we first have a brief look at the set theoretical reconstruction of that type of physical law that is nearest to the philosophical folk version of a law. Laws of this type state a relation between finitely many given physical quantities. Any gas law, for instance, relates pressure, volume and temperature of a gas, Kepler's third law relates the length of the main axis of a planetary orbit to its period, etc. Let us take the Boyle-Mariotte law of a gas for a closer look. It relates pressure and volume for a given temperature. The unique universe of discourse of the describing structure here is a scaled set of all theoretically possible values for pressure and the same for volume. More properly stated it is the Cartesian product of the two sets of possible values. Add to this one value for pressure and one for volume, and you have the structure S uniquely assigned to a given gas $\xi$ for its description. The Boyle-Mariotte law then requires

$$ p(\xi)\,v(\xi) = c_0 \tag{6a} $$
where $p(\xi)$ and $v(\xi)$ - the essential parts of $S(\xi)$ - are pressure and volume of the system $\xi$, and $c_0$ is a constant given in advance. This is what our law says about one single system, and it is with respect to what it says in this sense that our law differs from other laws. The essential information, therefore, concerns the single system, and the universal form (5b), in our case

$$ \forall \xi.\; \xi \in K \rightarrow p(\xi)\,v(\xi) = c_0, \tag{6b} $$

where K is now any class of gases, is common to all laws.

The philosophical folk version of a law now appears as a degenerate case of the type of physical law exemplified by Boyle-Mariotte's law. In the latter case, if the law relates n quantities the unique universe of discourse of the describing structures is an n-dimensional Cartesian space, and the law picks out an (n-1)-dimensional hyper-surface as the space of physically admissible states of a system. In the former case n = 2, and the two "dimensions" are given by the alternatives {P, not-P} and {Q, not-Q}. Then instead of a continuous spectrum of possible values we have only two. And of the four possible "states" resulting in this way the law excludes one and admits the other three. With functions

$$ p: K \mapsto \{P, \text{not-}P\}, \qquad q: K \mapsto \{Q, \text{not-}Q\} $$

we could write our law (suppressing $S(\xi)$) as

$$ p(\xi) = P \rightarrow q(\xi) = Q \tag{7a} $$

in the singular and

$$ \forall \xi.\; \xi \in K \rightarrow \bigl( p(\xi) = P \rightarrow q(\xi) = Q \bigr) \tag{7b} $$

in the universal version. Here the only quantification occurring appears as a quantification of the 2nd kind. Mind that here as well as in (6) many physical systems can have the same structural description.

The double occurrence of material implication in (7b) gives the occasion of at least briefly mentioning an important problem connected with the premise in universal predication (3b) (or (5b)) of the 2nd kind as compared with that of the 1st kind. Some people think it essential for the statement of a law that it has the form of a universal implication. This is shown by the attempt to understand the essence of law by reformulating ordinary universal implication as a subjunctive or counterfactual conditional. This attempt in turn is often made in cases where the premise expresses that something is done to the object in question while the conclusion then says what the result of the action is: "If this piece of butter had been heated it would have melted". This is all right for a restricted domain of disposition predicates, empirical laws and the like. But it is doubtful that the general form of law is conditional in this
usual sense. The form of Maxwell's equations, as far as I can see, is not. However, if what I assumed is true, namely that a physical theory primarily is a statement (3a) about one single physical system, then the problem arises what the nature is of the premise in the corresponding generalization (3b): The demarcation of the domain K cannot be produced by an ostensive act as it could be done in the case of (3a) where only one individual system had to be pointed out. The only alternative then seems to be a conceptual description. But this cannot be given in the language in which E is formulated and defines a certain range of structures as its models. We just had the case of gases described by their pressure, volume and temperature. If we now want to use a gas equation (as E) in a universal statement (3b) then, even if we take the risk to claim the equation for all gases, we still would have to say what we mean by a gas. It would not suffice, as is usual, to restrict generality by restricting our parameters to certain intervals, e.g. to low pressure. In the last analysis the characterization of a gas in the premise of (3b) has to be given in a language different from the one used in the conclusion: we have somehow to describe the way a system is given to us or is produced or something of the sort. Thus here we have another point where predication of the 2nd kind is sharply to be distinguished from that of the first kind - this time with regard to their universal versions. In particular, there is a genuine asymmetry between premise and conclusion in universal predication.
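The reconstruction of the Boyle-Mariotte law in (6) and of its degenerate two-valued counterpart in (7) can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive rendering: the class `GasDescription`, the sample labels, the numerical values and the tolerance are all hypothetical, and units are left unspecified.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GasDescription:
    """The structure describing one gas: one pressure value and one volume value."""
    pressure: float
    volume: float

def boyle_mariotte(s, c0, tol=1e-9):
    """Singular form (6a): pressure times volume equals c0, up to a tolerance."""
    return abs(s.pressure * s.volume - c0) <= tol

def law_holds_for_class(descriptions, c0):
    """Universal form (6b): every system of the class K satisfies (6a)."""
    return all(boyle_mariotte(s, c0) for s in descriptions)

def degenerate_law(has_p, has_q):
    """Singular form (7a) for two-valued quantities: P materially implies Q."""
    return (not has_p) or has_q

# A hypothetical class K of gases, each system labelled and mapped to its description.
# Two different systems may share one and the same structural description.
K = {
    "sample_a": GasDescription(2.0, 3.0),
    "sample_b": GasDescription(2.0, 3.0),
    "sample_c": GasDescription(1.5, 4.0),
}
holds = law_holds_for_class(K.values(), c0=6.0)
violated = law_holds_for_class([*K.values(), GasDescription(2.0, 2.0)], c0=6.0)

# Of the four possible "states" of the degenerate case the law admits three.
admitted = [(p, q) for p in (True, False) for q in (True, False) if degenerate_law(p, q)]
```

The predicate `boyle_mariotte` carries the content specific to the law, while the quantification in `law_holds_for_class` has the same shape for any law. How the class K itself is demarcated is not expressed in the language of pressure and volume at all; here it is simply given by the labels.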
III One aspect indicated already in the formulation of (6) and (7) may now lead us to the last part of my paper in which I wish to discuss the question of an eventual reduction of universal to singular predication (3) or (5). The point is that the degenerate form of universal implication (7b) makes it particularly clear that we should make a distinction between a physical system on the one hand and a structure - a mathematical structure as one might be tempted to say at this moment - by which the system is described on the other hand. In the first part of the paper no sharp distinction was made between physical systems and structures, and if henceforce the distinction is made then this is no denial of that identification in principle. Because of the incompleteness of our descriptions it is, however, wise to distinguish between a system and the structure by which it is described according to a particular theory. (5) would then be rewritten in the form
\[ E(S) \tag{8a} \]

\[ \forall \xi.\; \xi \in K \rightarrow E(S(\xi)). \tag{8b} \]
where K now is a class of physical systems and S(ξ) is the structure describing the system ξ ∈ K. Under the new aspect we can bring out one more difference between quantification of the 1st and the 2nd kind. If system ξ is described by structure S(ξ) then we cannot, without leaving the theory, add to this
structure in order to get a more complete description. In other words, within one theory full descriptions are incompatible: A structure describing a system is the generalized state of the system. Just as a state of a system at a given time does not allow the system to be in a different state at the same time, so it is with structures in general. So far this makes explicit only what was already implied by our notation 'S(ξ)' as denoting the structure describing ξ. However, when it now comes to quantification over systems, and with it over structures describing those systems, the presence of so many competing entities in one statement seems to contrast with the usual system-internal quantification of the 1st kind as in (2b). Within a statement saying, for instance, that an equation is true for all time points there is no competition between the various time points. Rather they are, so to speak, cooperative in constituting time, and it is similar with all the other elements of the sets making up a structure. But once we have reached the level of a structure itself the situation seems to change: at least with respect to one given theory there seem to be no super-structures built up from simple ones.

Now you will realize at once that this argument, as it stands, is not sound. For the various structures describing systems of a class K, though they cannot add up to describe one of the systems of K, may very well be used for the construction of a system corresponding to K itself. And more than that is not required anyway. The various product operations known from mathematics illustrate the principle of such constructions. The Cartesian product of a family of sets is perhaps the simplest example. Another example which is certainly relevant to physical theory is the Boolean product of a family of Boolean algebras. This product is again a Boolean algebra, and if the originally given algebras are fields of sets then the product is also a field of sets.
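The product constructions just mentioned can be tried out on small finite cases. The following Python sketch takes two fields of sets and builds the field on the Cartesian product of their universes that is generated by the "rectangles" A1 × A2, then checks that the result is again a field of sets. Reading the Boolean product in this way is an assumption made for illustration only, and the names `powerset`, `is_field_of_sets` and `product_field` are hypothetical helpers, not notation from the text.

```python
from itertools import combinations, product


def powerset(universe):
    """All subsets of a finite universe, as frozensets (a field of sets)."""
    items = list(universe)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


def is_field_of_sets(universe, family):
    """Check that the family contains the universe and is closed under
    complement and binary union (hence also under intersection)."""
    universe = frozenset(universe)
    if universe not in family:
        return False
    closed_complement = all(universe - a in family for a in family)
    closed_union = all(a | b in family for a in family for b in family)
    return closed_complement and closed_union


def product_field(x1, f1, x2, f2):
    """Field on x1 × x2 generated by the rectangles a1 × a2 (assumed reading
    of the product of two fields of sets)."""
    universe = frozenset(product(x1, x2))
    family = {frozenset(product(a1, a2)) for a1 in f1 for a2 in f2}
    changed = True
    while changed:
        # add complements and unions until nothing new appears
        new = {universe - a for a in family} | {a | b for a in family for b in family}
        changed = not (new <= family)
        family |= new
    return universe, family


# Two small component fields: the power sets of {0, 1} and of {"a", "b"}.
x1, x2 = {0, 1}, {"a", "b"}
universe, family = product_field(x1, powerset(x1), x2, powerset(x2))
print(is_field_of_sets(universe, family), len(family))
```

In this toy case the product is the full power set of a four-element universe, so the same axioms hold for the product as for its components.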
We thus have the situation that
(9α)
where Prod ... is the respective product formation and, as S(ξ) describes ξ, so Prod_{ξ∈K}(S(ξ)) describes K.
(9β)
In such cases, therefore, we do have a structure describing the whole class K if we have structures describing its elements, and the structure describing K even satisfies the same theory as do its elements. Moreover, if we could justify the respective product as leading to the "smallest" structure containing all the structures S(ξ) with ξ ∈ K as components, would we not then have succeeded in transforming our universal statement into a singular subject-predicate statement? I do not wish to follow up this question. For although in a purely logico-mathematical sense the answer may be affirmative, there is the further question what use can be made of this answer in physical theory. And the answer to this question certainly is not very encouraging. First of all, from the
viewpoint of physics the idea behind the question of reducing (8b) to (8a) is the following.12 On the one hand there is the fact that all events, processes, objects etc. that have ever been or will ever be made the subject of an empirical investigation are events, processes, objects etc. in one and the same, namely our, universe. In particular this holds for the various systems entering a lawful statement (8b). On the other hand, this very statement, although it lends itself to a possible-worlds interpretation, does not express that fact of innerworldliness and does not, by itself, give any hint for finding an innerworldly interpretation for it. On the contrary, we have seen that there is a certain competition between descriptions of systems quantified over in the standard form of a universal law, which thus may even be unfavorable to their co-existence in a common world. Although everybody believing in a universal law tacitly implies that the systems talked about in the law do belong to our universe, the prevailing possible-worlds interpretation almost seems to contradict this implication.

The somewhat unrealistic system-external generality of a law in a possible-worlds interpretation has a realistic approximation that may be called the laboratory view of lawlike generality. It is the view that we are able to produce (or: reproduce) in our various laboratories mutually independent systems with different descriptions but all obeying the same law according to (3b) or (5b). On this view we can practically realize different independent "worlds" within our universe. However, we should take our problem also as a matter of principle: It is not an approximation but a matter of principle that we do our physics in one and the same universe. Therefore, the laboratory view of laws has to be confronted with a cosmological view - as it might be called.
According to this view we do not satisfy ourselves with pragmatic excuses but insist on a strict innerworldly reformulation of a universal implication like (3b) or (5b), if it is to express a law. Since, however, the cosmological view is naturally attached to a singular subject-predicate statement (3a) or (5a), our problem is that of a reduction of the b-case to the a-case. It is in order to find such a reduction that the transformation indicated by (9) may be of some help. For it assimilates the status of the class K of physical systems to that of any one of its elements, or the status of the universal to that of the singular statement (5). And for these systems or statements an innerworldly interpretation is always assumed as a matter of course. However, as I said, I don't want to make the idea connected with (9) the center of the following argument.

12 A more detailed presentation of this viewpoint is given in Scheibe 1991c (this vol. ch. IV.18)

The problem of a reduction of universal to singular predication of the 2nd kind is naturally divided into two parts. In the first part we have to ask for a physical justification of universal theory predication. For in those cases where a physical justification is not possible there would be nothing to be reduced. Only the justifiable cases would then have to be analyzed and possibly reformulated as a singular predication. Now the interesting thing is that already the first part of our task may lead to the result that we end up with singular predication to the exclusion of universal predication. And it is this line of thought with which I want to conclude this paper.

There are some important restrictions imposed on the class K of physical systems as it appears in the universal predication (8b) if we take seriously the aforementioned aspect of the uniqueness of the universe, i.e. if we take the cosmological view on laws. The most obvious assumption that we make in the innerworldly description of several physical systems is that all systems are to be met with in one and the same spacetime and that therefore spacetime must be a common element in the description of all the systems belonging to K. This uniqueness of spacetime has consequences for the presentation of its material content. At the dawn of modern physics Kepler's three laws did not yet allow this to be recognized. On the contrary, they became the paradigm case of a positive solution of a reformulation (9). The reason is that these laws can be spelled out for every planet without taking into account the existence of the other planets. If somebody should prefer to express the laws as so many statements about the set of planets we could easily show him how this formulation could be reduced to a (finite) conjunction of identical statements about each planet and vice versa. However, as we have known since Newton, this reduction is only an approximation that eventually becomes grossly false in appropriate cases. The essential insight was that, since all celestial bodies exist in the same universe, they may interact with each other such that only their totality makes up a closed system whose behavior as a whole follows a law.
In fact, the matter stands even worse: The mutual gravitation in a system of bodies, according to Newton's theory, strictly speaking leads to a totally irreducible system of equations of motion: If a system of bodies moves according to these equations, no subsystem does. Within one and the same spacetime at most one gravitational system could be realized. The consequence is that there can be no question of a decomposition of Newton's theory into statements about the behavior of single bodies as in the Kepler case. This is not to say that no part of the theory can have such a decomposition. The Boolean algebra of contingent properties of a mechanical system is the Boolean product of the algebras of contingent properties of its subsystems, e.g. of particles. We can even have product situations that include the time development of a system and its subsystems if there is no interaction between the latter. In Hamiltonian mechanics the total Hamiltonian then is just the sum of the Hamiltonians of the subsystems. But as soon as interaction between the subsystems comes into play universal predication is out of the question: Interaction terms almost by definition prevent the independence required. Moreover, we have seen that there are cases for which the assumption that more than one instance of a universal law exists in the same universe is self-contradictory. There is, therefore, a genuine competition between universal
and singular subject-predicate statements in their physical applications. The competition illustrates a general reciprocity of lawfulness and interconnectedness in nature. 13 Lawfulness in the standard form (3b) or (5b) demands strictly independent instances of the law. In searching for laws the point just is to find such independences. To be sure, these independences go together with internal dependences as they constitute the contents of the respective laws. At the same time they mark the limits of the latter. As long as we have reason to assume that laws in this sense are realized in nature - strictly realized - there is no total interconnectedness in the universe. On the other hand, the realization in one and the same universe, as it will be required even by a modest empiricism, constantly draws our attention to the possibility of having missed some dependence. And the discovery of one in the context of an accepted law inevitably will destroy the law. The increase of discovered dependences cannot but lead to a decrease of laws in the usual sense as something fundamental. Interconnectedness is an aspect at least as important as lawfulness.

Since we have known quantum mechanics we have difficulties not only with the interactions but already with the product formation of physical systems. Let us consider, for instance, the quantum mechanics of free electrons. The state of an electron is given by its ψ-function that determines for every observable its expectation value in the given state. According to the theory there is a whole Hilbert space of states. Now let φ and ψ be any two of them. According to the laboratory view the pair (φ, ψ) again determines a possible description of the situation. In a concrete case we would say that we have prepared both states independently of each other.
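The description by a pair can be made concrete in a finite-dimensional toy case. The sketch below assumes two distinguishable two-level systems in place of electrons, so neither the infinite-dimensional Hilbert space nor the antisymmetrization required for identical fermions is modeled, and the function names are hypothetical. It forms the joint coefficient matrix belonging to a pair of states and tests whether a given joint state can be obtained from some pair in this way; for a nonzero 2 × 2 coefficient matrix this is the case exactly when its determinant vanishes.

```python
import math


def product_state(phi, psi):
    """Joint coefficient matrix c[i][j] = phi[i] * psi[j] of the pair (phi, psi)."""
    return [[a * b for b in psi] for a in phi]


def is_separable(c, tol=1e-12):
    """A nonzero 2x2 coefficient matrix factorizes into a pair of one-particle
    states exactly when its determinant vanishes (rank one)."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol


s = 1.0 / math.sqrt(2.0)
phi = (1.0, 0.0)   # first system prepared in its basis state 0
psi = (s, s)       # second system prepared in an equal superposition

pair = product_state(phi, psi)
# A joint state that is a superposition of "both in 0" and "both in 1":
superposition = [[s, 0.0], [0.0, s]]

print(is_separable(pair))           # the pair description applies
print(is_separable(superposition))  # no pair of one-particle states reproduces it
```

The second joint state is a legitimate state of the two-system whole although it corresponds to no pair of independently prepared one-system states.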
According to the cosmological view, however, this conjunction is by no means the most general description of the situation: If we take seriously that the particles belong to the same world we have to treat the situation as a 2-particle system. We must pass from two 1-particle ensembles to one 2-particle ensemble. For the latter, however, the pair (φ, ψ) is a correct description only in exceptional cases - the so-called separable cases. In general the two subsystems are inseparable, and our knowledge about them is not maximal. Rather the information about the total system concerns many correlations between observables of the two subsystems. Again the possibility of an innerworldly reformulation of (5b) is paralyzed from the outset.

13 From a slightly different viewpoint this subject is treated more fully in Scheibe 1989c (this vol. ch. IV.15)

Against this argument it may be objected that the difficulties for a cosmological interpretation of (5b) in connection with interaction and inseparability do not have any practical importance. All fundamental interactions have a finite range allowing for practically independent and yet internally interacting systems. Similarly, we can prepare practically separable quantum mechanical systems showing all the features of inseparability internally. And both possibilities follow from the respective theories. All this has to be admitted, let alone the overwhelming number of cases where we find the independence in question not by inference from the theories but simply by experience. On the other hand, we have to remember that we are investigating a matter of principle. Theories about gravitation and the mechanism of compound systems are of a fundamental character. Such theories are not proposed only to say afterwards that they may be taken cum grano salis anyway. And if such theories show us that we run into trouble with the usual formulation of a physical law then this deserves to be recognized and understood. The question whether there are any universal laws of the form (5b) that are not only approximations but strictly valid is a matter of principle. If the question should be answered in the negative this would mean that the theories from which we can derive the laws in question as approximations cannot themselves be of this kind. And we would then be faced with the question of what kind they are after all.

Let me summarize. In physical theory two kinds of predication are involved. Predication of the 1st kind belongs to that part of the theory by which the statements made about a physical system receive their meaning. Predication of the 2nd kind is about the respective system; apart from being meaningful it is also referential. Correspondingly, predication of the 1st kind is relative to a structure to be used for the description of our system; predication of the 2nd kind, by being about the system (and the describing structure), is absolute. Predication of the 1st kind is the primitive basis for system-internal statements of arbitrary complexity with respect to language in general and the quantifiers in particular. Generality of the 1st kind, therefore, is arbitrary but it is system-internal and restricted to a given structure. By contrast, the kind of generality that is required for a physical law - the generality of the 2nd kind - is rather special: it is a universal predication.
However, it refers not to a single system but to a whole class of systems. From a physical point of view the distinction between two kinds of generality is more important than the corresponding distinction between two kinds of predication. But the former is brought about by the latter which, therefore, is also in need of being better understood. Finally, the simultaneous existence of singular and universal predication about physical systems is a challenge to reduce the latter to the former. For, on the one hand, according to the general method of physics a physical system is a world substitute. On the other hand, all physical systems are part of our universe and therefore of one and the same world. We have seen that in some cases universal interaction and inseparability indeed do not allow for the independence of systems required by a lawful universal predication. In other cases the reduction may be possible by a reformulation using a suitable product formation.
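The contrast between a product situation and interaction, as mentioned above for Hamiltonian mechanics, can be written out for two subsystems. The symbols $H_1$, $H_2$, $V_{12}$, $q_i$ and $p_i$ are notation introduced here for illustration and are not taken from the text.

```latex
\begin{align*}
\text{no interaction:}\quad
  & H(q_1,p_1,q_2,p_2) = H_1(q_1,p_1) + H_2(q_2,p_2), \\
  & \dot q_i = \frac{\partial H_i}{\partial p_i}, \qquad
    \dot p_i = -\frac{\partial H_i}{\partial q_i} \qquad (i = 1,2), \\[4pt]
\text{interaction:}\quad
  & H = H_1 + H_2 + V_{12}(q_1,q_2), \\
  & \dot p_1 = -\frac{\partial H_1}{\partial q_1}
               - \frac{\partial V_{12}}{\partial q_1}(q_1,q_2).
\end{align*}
```

In the first case the equations of motion of each subsystem contain only its own variables, so each subsystem is by itself an instance of the law. In the second case the equation for subsystem 1 depends on $q_2$, and only the total system obeys Hamilton's equations.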
IV.17 Substances, Physical Systems, and Quantum Mechanics*

In many of his papers Paul Weingartner has proved himself to be a master in connecting present-day problems of philosophy with their forerunners in the philosophical tradition 1. In this paper I try to emulate him by approaching the question of how modern physics would have to be rephrased if it had to be fitted into the traditional thinking in terms of substances. It goes without saying that I can here deal with only a small fraction of all aspects relevant to this important notion. Also, emphasis will be more on a systematic development of my main thesis than on producing historical evidence. The thesis is that it is the modern concept of a physical system that comes nearest to the traditional notion of substance. On the part of the philosophical tradition a clue to this result is mainly given by men like Locke and Leibniz. On the part of physics quantum theory is the main obstacle to a complete identification. Four major aspects connected with the traditional notion of substance are taken into account: independent existence, monadic predication, completeness and individuality. Using recent investigations 2 I try to show that each of them can be elucidated most naturally by viewing physical systems as being the substances of modern physics.
* First published as Scheibe 1991g
1 Weingartner 1971
2 Scheibe 1991b (this vol. ch. IV.16) and 1991c (this vol. ch. IV.18)

1. Ontological Independence

Let me begin my considerations with a point on method. It is a commonplace that since the time of Galileo the method of physics is twofold: it is mathematization and experimentation. Physical theories have to be couched in mathematical language and they have to be tested by experiment. The point to which I want to draw attention, though related to this analysis, is more specific. It, too, is twofold. First, in physical theory, although we are never concerned with the universe in toto, we always conceive of the actual system of interest as if it were the whole universe. Even in physical cosmology we never make the whole universe the object of our theoretical investigation. The conception of the universe as the unrestricted totality of everything existing may be an interesting conception from a philosophical point of view. In physics it would be of no use whatever. There a drastic selection takes place in every case, and the amount of what is selected usually is negligibly small when compared with what we omit. The selection is made under various viewpoints: we idealize, we neglect, we isolate, we approximate, we simplify, we abstract. In every case this means that we pass from a larger whole that still is a real piece
of nature to some fictitious fraction of it, and it is only this fraction about which we theorize. It is usual to call it a physical system. Furthermore, it is important to realize that what is omitted in this way - what is not taken into account in our theory - is so radically wiped out that we cannot but view the product of our selection as being a world of its own - a complete substitute for the actual universe. It may be that the latter still plays a role in the background and that it is re-introduced in part when we apply the theory. The theory taken by itself does not know about this. The object of electrodynamics as defined by Maxwell's equations is a field and a portion of charged matter and nothing else. Quantum mechanics of the hydrogen atom has as its object one hydrogen atom (or an ensemble of such) and nothing else, and so on. In each of these cases we act as if the object of our theory were the total universe, although we know that this is not the case and sometimes mitigate the situation by introducing more complex systems. The fact that this "method of the as if", as it might be called, works is a highly non-trivial fact about the universe: We can successfully investigate parts of the universe without considering everything. By this kind of success we are even entitled to assume that the portion of the universe isolated in our theory could exist by itself - is a physically possible world that might have been the actually existing universe without anything else being there. Keeping in mind that physical systems as actually conceived by physicists in fact always are proper parts of the universe, their relative independence as indicated is a first hint that they, if anything, are good candidates for being identified with substances in at least one sense of the philosophical tradition. At any rate, if S is a physical system and E is what a theory says about S then the claim that

\[ E(S) \tag{1a} \]
i.e. that E is true of S, is the kind of claim which we end up with in pursuing the method in question and which in many cases has been found to be in excellent agreement with the facts. However - and here comes the second part of my methodological remark - statements of the form (1a) cannot be the whole story on physical method. It seems a generally acknowledged view that physics is confined to the investigation of events or situations that can be reproduced. "The natural scientist is concerned with a particular kind of phenomena [...] he has to confine himself to that which is reproducible [...] I do not claim that the reproducible by itself is more important than the unique. But I do claim that the unique exceeds the treatment by scientific method. Indeed it is the aim of this method to find and to test natural laws [...]" 3. Now, reproducibility in experimentation is not yet universality of a theory or law. It is a special case at best, and it is remarkable that Pauli jumps from one to the other. But even in its narrow sense, i.e. as reproducibility under the same initial and boundary conditions, reproducibility does not mean that we could reproduce a singular event or situation. It is only with respect to some kind of events or situations that we can speak of their reproduction or repetition, and since the choice of such a kind is to some extent arbitrary, we may even speak of different individual systems S underlying a theory E as in (1a) as being so many reproductions of systems of a certain kind defined by that theory. But if this is sound reasoning, we see that besides (1a) there is an even better candidate for fulfilling the basic idea of physical method: the universal implication

\[ \text{for all } S:\ \text{if } S \in A \text{ then } E(S) \tag{1b} \]

where A is any domain of application of the theory E. Formula (1b), if correctly interpreted, gives us the general form of a physical law, and the requirement of physical method to look for such laws reinforces our tentative identification of physical systems with substances. Indeed the fulfillment of this requirement in any case of a law provides us not only with one but with many systems each having an independent existence and showing a lawlike behavior. We are here not dealing with a repetition in the system-internal sense; it is not a question of a periodic motion - no two swings of the pendulum. In the context of lawlike behavior, repetition of a first instance of some law means a second independent instance of the law - instance or counterinstance but at any rate a new system with possibly different initial conditions. There would be no laws without ontological independence. As I said, the system-external generality of (1b) sometimes is even raised to the metaphysical level that the systems comprised within A are so many different worlds. This, in an obvious sense, is not realistic. But there is a realistic approximation that I will call the laboratory view of lawlike generality. It is the view that we are able to produce (or: reproduce) in our laboratories approximately independent systems with different descriptions but all obeying the same law according to (1b). In a sense we can, therefore, practically realize many different independent worlds or substances within our universe. At the end of this paper I shall ask whether we are entitled to assume this independence not only in practice but also in principle.

3 Pauli 1961, p. 94
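The difference between the singular claim (1a) and the universal claim (1b) can be illustrated by a toy computation. In the Python sketch below every name is hypothetical: a `System` stands in for a physical system S, the predicate `E` plays the role of the theory (here the harmonic law m x'' = -k x, checked numerically at a few sample times, which is a spot check and not a proof), and the list `A` plays the role of a domain of application containing independently prepared systems with different initial conditions.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class System:
    """Hypothetical stand-in for a physical system S: one particle on a line."""
    mass: float
    stiffness: float
    trajectory: Callable[[float], float]  # position as a function of time


def harmonic(mass, stiffness, x0, v0):
    """A system prepared with initial data (x0, v0) that moves harmonically."""
    omega = math.sqrt(stiffness / mass)

    def x(t):
        return x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)

    return System(mass, stiffness, x)


def E(system, times=(0.0, 0.3, 1.1, 2.5), h=1e-3, tol=1e-4):
    """Singular claim (1a): E is true of S, i.e. m x'' = -k x at the sample times."""
    x = system.trajectory
    for t in times:
        # central second difference approximates the acceleration x''(t)
        acceleration = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
        if abs(system.mass * acceleration + system.stiffness * x(t)) > tol:
            return False
    return True


def law_holds(domain):
    """Universal claim (1b): for all S in A, E(S)."""
    return all(E(s) for s in domain)


# Independent instances of the same law with different initial conditions.
A = [harmonic(1.0, 4.0, 1.0, 0.0),
     harmonic(2.0, 2.0, 0.5, -1.0),
     harmonic(0.5, 8.0, -0.3, 0.7)]

# A counterinstance: a particle that stays at rest away from equilibrium.
counterinstance = System(1.0, 4.0, lambda t: 1.0)

print(law_holds(A), law_holds(A + [counterinstance]))
```

Each element of `A` is tested on its own, without reference to the others, which is the independence that the laboratory view presupposes.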
2. Monadic Predication

In my next argument I want to show that the statements (1) are predications of a peculiar kind - one singular, the other universal - that typically occur whenever we make statements about physical systems or - for that matter - substances. Traditionally the two propositional forms

\[ S \text{ is } Q \tag{2a} \]

usually exemplified by 'Socrates is mortal' and

\[ \text{all } P \text{ is } Q \tag{2b} \]
usually exemplified by 'all men are mortal' were distinguished as being the only logically correct forms of (affirmative) statements: singular in the first and universal in the second case. Today we no longer believe in this postulate, and the tradition from Aristotle to the middle of the 19th century is blamed for having used only a small fragment of logic and a distorted one at that. Russell 4 describes the logic of Leibniz, whom he highly respected as a logician, in the words: "Every true proposition is either general, like 'All men are mortal', in which case it states that one predicate implies another, or particular, like 'Socrates is mortal', in which case the predicate is contained in the subject." With respect to this feature traditional logic is then criticized as a "defective logic". Russell continues: "The subject-predicate logic, which Leibniz and other a priori philosophers in the past assumed, either ignores relations altogether, or produces fallacious arguments to prove that relations are unreal." Now it is certainly the case that there are propositional forms profoundly different from (2). At least if we look at (2a) as typically illustrated by '2 is prime', a singular statement expressing a binary relation like '2 is smaller than 3' is considerably different in kind from the foregoing one. Even more so, if 'all primes larger than 2 are odd' is taken to be a typical instance of (2b), then the statement 'to every number n there is a prime number p that is larger than n' is a universal statement that cannot be analyzed as a universal implication (2b). The insight into these differences is, if anything, the enduring lesson that we have been taught by Frege. However, this does not mean that predications like (1) have lost all their dignity and that there is simply no sense in which they could be distinguished from their new competitors.
In fact it was the very development of logic after Frege that re-installed classical predication in a most natural way.

4 Russell 1946, pp. 614ff

In the time following the Frege-Russell period an important subject-predicate relation was established: the model relation, i.e. the relation between a structure S and a formal theory E holding if S is a model of E. For the moment let us look at formula (1a) as expressing just this. Intuitively, the model relation is a monadic predication, the theory E being the predicate and S the subject about which E is predicated. Therefore, as seen from outside it even is a singular statement. In spite of this, the richness of internal structure that the predicate E as an arbitrary formal theory of mathematical logic can have guarantees - so it seems - a wide range of application of the new predications. Indeed, if we were to succeed in using formal theories E
in the sense of mathematical logic to formulate physical statements in the original sense of (1a), then monadic predication would be the major form of theory statement. Moreover, a model-theoretic counterpart of (1b) would be readily at hand in order to formulate also the universal statements (1b) corresponding to (1a). E being a formal theory, we would only have to choose a class A of structures of the appropriate type to obtain the model-theoretic version of (1b). The question, therefore, is: Can model theory be used to reconstruct theories of physics in the way indicated? I think it can, and, questions of rigor left aside, this is what modern theoretical physics is doing all the time. In physics we attempt to describe physical systems by means of mathematical structures. In this way physical laws, obeyed by those systems, can be expressed by statements about the describing structures. Let me exemplify this procedure by the theory of a particle moving in a central field according to the laws of Newtonian mechanics. In this case the structure being used to describe the behavior of the particle consists of four parts: absolute space, absolute time, a field of force as well as the orbit and mass of the particle. Correspondingly, our theory is made up of Euclidean geometry of space, a corresponding degenerate geometry of time, general Newtonian mechanics and a special force law. And all this is usually formulated in mathematical terms well-known in this case even to the beginner. At present a logical frame for reconstructing physical theory even more convenient than model theory is set theory 5. Then the structure occurring in (1a) is an (m + n)-tuple

\[ \langle X_1, \ldots, X_m;\ s_1, \ldots, s_n \rangle \tag{3} \]

of sets, and E of (1a) usually is a conjunction of statements concerning more and more of the elements X_μ and s_ν of the describing structure. The sets X_μ and s_ν are to be identified with the extensions of the basic physical concepts describing the system in question. We distinguish the universes of discourse X_μ from the typified sets s_ν. The nature of the elements of the former can only be known from outside our system. By contrast, the elements of the latter are known from within. Indeed the s_ν themselves are elements of sets constructed from the universes of discourse by operations exactly corresponding to the formation of many-termed subject-predicate statements in the language of theory E. In our example, X_1 and X_2 would describe space and time, respectively, s_1 and s_2 the distance in space and time, respectively, and so on. The elements of s_1 would then stand for triples consisting of two points in space and one number such that the number is the distance of the two points. On the other hand, the question what a point in space (or time) is could not be answered in this way. And this is the general situation whether we are dealing with point mechanics, continuum mechanics, electrodynamics, quantum mechanics, gravitational theory according to Newton or Einstein or what not.

5 Ludwig 1990 (2nd ed.); Scheibe 1986d (this vol. ch. VIII.36) and 1988c

Assuming, then, that we have succeeded in reconstructing a physical theory according to the foregoing ideas, it can now easily be seen that this reconstruction involves two kinds of predication. The first kind is given by the predicates used for the formulation of E. The second kind of predication is the kind of statements that we make by predicating E of a structure. To be absolutely clear on this point let me illustrate the difference first by a purely mathematical example. Let E be the Peano axioms for arithmetic and S the usual number system. In this context predicates like 'prime' or 'odd' or 'perfect' etc. are predicates of the first kind. Their subjects in predication are numbers in S. By contrast, predicating E or any consequence of E means predicating it of S or any other possible arithmetical model. It would be as meaningless to say of the number 7 (of S) that it satisfies the Peano axioms as it is meaningless to say of the standard or any other number system that it is prime or odd. A case somewhat nearer to physics is geometry. Let E be an axiom system for Euclidean geometry and S a Euclidean space. Then the predicate 'being a circle' can be sensibly used to make statements - true or false - about point sets in S. By contrast, E and its consequences are used to make statements about S itself or any other relevant structure. Again sentences like 'this point set (in S) is Euclidean' or 'S is a circle' would be without any meaning. All this seems pretty evident, and it may be asked whether it is worth our while to make this point. However, coming to physical examples the distinction in question, though still being there with all its logical force, is somewhat blurred by the peculiarity of these cases. Take our previous paradigm case of a Newtonian particle moving in a field.
Here the term 'Newtonian' is meant to extend the term 'Euclidean' including now besides space also time and a law of motion. To say of our system that it is Newtonian is, therefore, as good a predication of the second kind as it was to say of its space that it is Euclidean. But now we can also meaningfully ask: Does our particle move in a circle? The statement that it does now appears to be a predication of the second kind about our system meaning that the points through which its (only) particle moves happen to make up a circle. Does this contradict our previous result that 'being a circle' is a predicate of the first kind? By no means. What happened is just that quite often we use predicates of the first kind directly in predication of the second kind. In fact, we always use them more or less directly in this way. For it is by the predicates of the first kind that our system is described, after all. In the arithmetical example, for instance, a direct use of 'prime' would come up if one number in our number system were distinguished as part of the structure considered. We could then improperly say of the whole structure that it is prime if the distinguished number is. Now in our physical example a point set is distinguished as the orbit of the particle of our system and as a façon de parler we then apply the
first kind predicate 'circular' to the whole system. In neither case, of course, does this alter the difference in question as a matter of principle. Let me, then, take it that the statemental part of a physical theory is a one-termed predication - a subject-predicate statement - although in this predication of the second kind an unbounded number of many-termed and higher order predicates of the first kind may be used. In principle, therefore, and in spite of Russell's judgement we are still in the same boat with Locke and Leibniz, and, concluding this argument, this result may be confirmed by indicating that we have still at least some of the classical difficulties concerning substances. In the eyes of Locke "when we speak of any sort of substance, we say it is a thing having such or such qualities, as body is a thing that is extended, figured, and capable of motion [...]. These, and the like fashions of speaking, intimate that the substance is supposed always something besides the extension, figure, [...], motion, [...] or other observable ideas, though we know not what it is" 6. Leibniz' answer in his Nouveaux Essais sounds amazingly lighthearted when he says: "En distinguant deux choses dans la substance, les [...] predicats et le sujet commun de ces predicats, ce n'est pas merveille, qu'on ne peut rien concevoir de particulier dans ce sujet. Il le faut bien, puisqu'on a déjà séparé tous les attributs, où l'on pourroit concevoir quelque détail" 7. It seems that the disagreement behind these two statements is just the more empiristic and more aprioristic attitude of the two men, respectively. However, Leibniz' admirably clear formulation of the source of the difficulty can still be applied to the attempt to understand the universal predication (1b) in the light of the present analysis.
The crucial difference between (1b) and (1a) is that the demarcation of the domain A in (1b) cannot be produced by an ostensive act as it could be done in the case of (1a) where only one individual system had to be pointed out. The only alternative then seems to be a conceptual description. But on pain of becoming tautological (1b) cannot be formulated in the language in which E is formulated. To give but one example, gases are described by their pressure, volume and temperature. If we now want to use van der Waals' equation (as E) in a universal statement (1b) then, even if we take the risk to claim the equation for all gases, we still would have to say what we mean by a gas. It would not suffice, as is usual, to restrict generality by restricting our parameters to certain intervals, e.g. to low pressure. In the last analysis the characterization of a gas in the premise of (1b) has to be given in a language different from the one used in the conclusion: it has to be characterized by the way a system is given to us or is produced or something of the sort. For in the language of E everything to be said about the system is said by E.
6 Locke 1700, II.XXIII.3
7 Leibniz 1765, II.XXIII.2
3. Completeness
The foregoing introduction of predication of the 2nd kind and its usage in physical science has, I think, reconfirmed our original identification of the traditional substances with physical systems in the modern sense. Only in passing I might touch upon an objection that could be made at this point. It may be objected that features of physical systems, essentially depending on the method of describing physical objects by mathematical structures, cannot, for this very reason, be taken as characteristics of substances. The argument simply is that substances would then share these characteristics with mathematical structures, and we would, for instance, turn the number system into a substance which sounds absurd. Though I don't want to go into any details on this matter, I think it is in order to emphasize that we should not be worried about this objection. Rather we should turn the tables and counterargue that precisely such consequences are the lesson that modern physics has taught us. We need not invoke the authority of physicists like Schrödinger and Heisenberg who on very different grounds have stressed the substantial aspect of (mathematical) form as opposed to matter 8. It suffices to point out that if we came to the conclusion that the "new way of mathematics" in understanding nature had been erroneous then we would have to give up almost all of physics as we now know it. It goes without saying that this situation does not dispense us from restricting the vast area of mathematical theories by looking for further characteristics of the description of nature that are relevant to our major theme. In my third argument I want to point out one such characteristic, and this will allow me - or rather: force me - to introduce the great schism in modern physics: the schism between its classical part and quantum theory. We can approach this matter most aptly by starting out from Leibniz' idea of a complete notion of an individual substance.
In his Discours de Métaphysique Leibniz introduces this notion by saying: "Il est bien vray, que lorsque plusieurs predicats s'attribuent à un même sujet, et que ce sujet ne s'attribue plus à aucun autre, on l'appelle substance individuelle". He then continues: "[...] la nature d'une substance individuelle, ou d'un estre complet, est d'avoir une notion si accomplie, qu'elle soit suffisante à comprendre et à en faire déduire tous les predicats du sujet à qui cette notion est attribuée" 9. In the first quotation we meet with the famous Aristotelian characteristic of a substance as something that cannot be said of a subject. As to the second quotation, what Leibniz here says if translated into modern terms comes out as: To every individual substance there is attached a Boolean algebra of predicates (or properties) together with a maximal, possibly atomic, filter completely characterizing that substance. Let us now look for these structures in our context.
8 Schrödinger 1961, pp. 18f; Heisenberg 1953, 1954, and 1969, Ch. 20
9 Leibniz 1686, Sect. 8
Half-baked specimens we can find already without adding anything to the present setting. Let the theory E and a model S be given. Then the finite theories more special than E form a Boolean algebra with respect to conjunction, disjunction and negation (restricted to E), and among them the theories having S as a model form a maximal filter. However, this filter will never, or at any rate in no case of any interest, uniquely determine S. I cannot enter the details of this matter, interesting as it is. Let me only mention that the species of structures E typically occurring in the formulation of physical theories all have the property that with any model S of E any structure isomorphic to S is also a model of E. This host of models of a formal theory can be reduced by choosing an incomplete model S_0 of E and restricting the model class of E to models having S_0 as fragment. The most prominent case in which this is done (except for general relativity) is space-time. But even in this case the theories are submitted to important invariance conditions, e.g. invariance under the Poincaré group, which again prevents them from uniquely characterizing a physical system. Moreover, theories of physics are meant to have entire generality within a certain class of initial and boundary conditions. It is then only the addition of these conditions to the theory that uniquely determines a single system. The Boolean "logic" of specializations of a given theory or - for that matter - 2nd kind predicate and the failure of its filters to uniquely characterize a physical system is - as I am anxious to emphasize - common to all theorizing within physics whether it concerns classical or quantum systems. The essential difference between the latter comes into sight if we confine ourselves to systems roughly described by structures
(P, S, T; W, D, f)    (4)
of the following kind. Think of our system as being in a definite state at every time. This development is described by f, connecting time T and state space S. Let us further assume that this change of state occurs according to a law D that is deterministic in the usual sense: Given any time point t and any state s there is exactly one "motion" of our system through s satisfying those initial conditions. Thus f, the actual change of the system, is a "solution" of D, and like f also the law D connects time with the state space. But what is a state? f as the system's total development in time is Leibniz' complete notion of it. In Leibniz' own words it "includes all past, present and future predicates of that substance [= system]" 10. Accordingly, a state is the complete notion at any given time. It is the totality of momentary or contingent properties that are somehow possessed by the system at a time. The contingent properties are collected in P, and W tells us which properties and in what manner are involved in any state from S. It is the fragment consisting of P, W and S where quantum theory deviates from classical thinking.
10 Leibniz/Couturat, p. 520
In classical physics, i.e. in classical field theory as well as in classical mechanics, W is a binary relation between properties and states. It holds between a property a and a state s if our system has property a in state s. It is necessary to explain the emphasis that I put on having a property. We are here not concerned, for instance, with problems of secondary as opposed to primary qualities. Likewise no problems of any specific subjectivity will worry us. Rather the problem is that when we speak of a thing having a property we are used to imply that this relation holds (or does not hold) irrespective of whether anybody observes the thing in question. Now even this explication will perhaps not be of much help unless we already know what it could mean to talk about things and their properties under circumstances where it does matter whether the things are observed or not. Knowing this by the advent of quantum theory we are in a position to point out the part of classical physics that allowed us to talk in the usual way and with the usual understanding. For all we know today it seems that the implied independence of observation is guaranteed by the assumptions that 1) P is a Boolean algebra and 2) the set of properties possessed by a system in a given state according to W is a maximal filter on P. It is convenient to make also some further assumptions. But the two mentioned are the crucial ones. What do they mean? Let me first emphasize that they are an entirely natural continuation of the present argument. This was started with the observation that Leibniz' conception of an individual substance can be rephrased by the very two conditions just repeated. Having shown that physical theories are predicates (of the 2nd kind) with physical systems as their subjects, we continued to point out the two features in question by means of the specializations of a given theory and one of its models.
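The two classical assumptions can be made concrete in a finite toy case. The following is only a sketch under loudly stated assumptions: the three-element state space, the identification of properties with sets of states, and all function names are illustrative inventions, not part of the text's formalism. It checks that the properties possessed in a given state form a maximal filter on the Boolean algebra of all properties, and that this filter singles out the state.

```python
from itertools import combinations

def powerset(states):
    """All subsets of a finite state space: a toy Boolean algebra P of properties."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def possessed(properties, s):
    """Toy version of W as a binary relation: property a holds in state s iff s is in a."""
    return {a for a in properties if s in a}

def is_maximal_filter(F, properties, top):
    """Check the filter axioms plus maximality on a finite Boolean algebra of sets."""
    if frozenset() in F or top not in F:
        return False
    closed_under_meet = all((a & b) in F for a in F for b in F)
    upward_closed = all(b in F for a in F for b in properties if a <= b)
    # Maximality: for every property, exactly one of it and its complement is in F.
    decides_everything = all((a in F) != ((top - a) in F) for a in properties)
    return closed_under_meet and upward_closed and decides_everything

# Hypothetical three-state system.
S = frozenset({"s1", "s2", "s3"})
P = powerset(S)
F = possessed(P, "s2")

print(is_maximal_filter(F, P, S))       # filter of all properties holding in s2
print(frozenset.intersection(*F))       # the filter pins down the state
print(is_maximal_filter({S}, P, S))     # the trivial filter is not maximal
```

In this finite setting every maximal filter is generated by a single state, which is the sense in which such a filter "completely characterizes" what it describes.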
If we now confine ourselves to theories of the type (4) under consideration then, given the sets in (4), the natural continuation of our argument is to ask for a Boolean algebra and a maximal filter whose elements are sets of possible motions of our system, these data characterizing the actual motion f. And precisely this is done by our last assumptions: Fixing a time $t_0$ we only need to assign to every property a from P the motions which at $t_0$ are in a state in which the system has property a. They form a Boolean algebra isomorphic to P, and the ones containing f make up a maximal filter completely characterizing f. This situation is realized in principle in classical mechanics and electrodynamics, and we can say with confidence that it was the partial fulfillment of one of Leibniz' many dreams, bringing together important aspects of physics, logic and ontology. To deepen understanding of the situation it is best now to introduce quantum theory in order to see what is going wrong. In quantum theory the foregoing picture is modified essentially in two respects. First, the algebra of contingent properties is no longer Boolean. Rather it is the algebra of linear subspaces of a Hilbert space. Second, the fundamental relation W between properties and states and with it the contingent relation f become probabilistic. W gives us the probability that in a given state we would find a given
property of the system if we were to perform an appropriate measurement. It has to be emphasized that the occurrence of probabilities in quantum theory is conditional on the non-Boolean structure of the algebra of properties. We are here not in the situation of classical statistical mechanics. There probabilities come in because of lack of knowledge on account of the enormous number of particles involved. As a matter of principle they can be eliminated, and it still makes good sense to speak of the particles as having this or that of their contingent properties irrespective of any measurement. By contrast, in quantum mechanics the difficulty concerns already the treatment of one single atom. If, even in the presence of probabilities, we attempt to introduce a probability-free language describing the atom in terms of its contingent properties the linear structure of Hilbert space causes serious difficulties. One of the most obvious anomalies of the new structure P is that the classical (Boolean) implications
$a \leq (a \cap b) \cup (a \cap \bar{b})$,    $b \leq (b \cap a) \cup (b \cap \bar{a})$    (5)
are no longer valid for all contingent properties. A well known example is given by position and momentum of a particle. If a and b are position and momentum respectively, each confined to some interval, then the right-hand sides of formulas (5) become zero thus invalidating them. This result is, so to speak, the negative image of Heisenberg's indeterminacy relation if projected into the probability-free part of the theory. Mind that the non-Boolean behavior of the contingent properties in quantum theory does not mean that we cannot obtain and investigate Boolean algebras of sets of motions f. But any such set, if interpreted as a property of the given system, would be a property of the system as described by probabilities. In an obvious sense, to every set of probabilities attached to certain contingent properties at certain times there is associated a property of this probabilistic kind. But even in the special case that we have probability 1 attached to, say, an eigenspace of the energy this could not be reformulated as meaning that our system has the corresponding energy. If a ∨ b is true it follows that a is true or b is true. But even classically that a ∨ b is known (to be true) does not imply that a is known or b is known. If, then, the non-Boolean behavior of the quantum mechanical contingent properties of a system does not affect the Boolean behavior of probability statements, what else does it mean? It is beyond doubt that these properties are the precise analogue of the corresponding properties in classical physics. If, on account of their non-Boolean "logic", they still cannot be used for an ontic, objectifying description of a system what statements other than probability statements can be made with their help?
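The failure of the implications (5) can be seen already in the smallest non-trivial case. The sketch below is an illustration under assumptions of my own choosing: instead of position and momentum in an infinite-dimensional Hilbert space it uses the subspaces of the real plane (the zero subspace, the lines through the origin, and the whole plane), with intersection as meet, linear span as join and the orthogonal complement as negation. The encoding and the function names are hypothetical.

```python
# Toy lattice of subspaces of the real plane; a line is encoded by its
# angle in whole degrees modulo 180 (an assumption made for this sketch).
ZERO = ("zero",)
PLANE = ("plane",)

def line(angle_deg):
    """One-dimensional subspace at the given angle (degrees, taken mod 180)."""
    return ("line", angle_deg % 180)

def meet(a, b):
    """Intersection of two subspaces."""
    if a == b:
        return a
    if a == PLANE:
        return b
    if b == PLANE:
        return a
    return ZERO  # one argument is zero, or two distinct lines

def join(a, b):
    """Linear span of two subspaces."""
    if a == b:
        return a
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return PLANE  # two distinct lines, or a line together with the plane

def complement(a):
    """Orthogonal complement."""
    if a == ZERO:
        return PLANE
    if a == PLANE:
        return ZERO
    return line(a[1] + 90)

def leq(a, b):
    """a is contained in b."""
    return meet(a, b) == a

def satisfies_5(a, b):
    """Does a <= (a meet b) join (a meet not-b) hold?"""
    return leq(a, join(meet(a, b), meet(a, complement(b))))

print(satisfies_5(line(0), line(90)))   # orthogonal lines: the implication holds
print(satisfies_5(line(0), line(45)))   # skew lines: the right-hand side is zero
```

For two lines that are neither equal nor orthogonal both meets on the right-hand side collapse to the zero subspace, which is the finite-dimensional counterpart of the remark that the right-hand sides of (5) "become zero".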
The orthodox answer to this question is 1) that the existence of each property, taken by itself, can be decided by a measurement, 2) that accordingly we can assert the existence of a property as the result of an actually performed measurement, but 3) that the same does not hold with respect to any two or more properties. Rather, to every property there are others incommensurable with it. Indeed
these are precisely the cases in which (5) is violated. In this way the burden of an explanation of the non-classical character of the theory as well as of well known experiments illustrating it is shifted to the level of observation. However, the physicists of the pre-quantum era, though they did not formulate their theories in terms of observation, would have had no difficulty in doing so. In other words, our common ideas about observation and measurement, as distinct from epistemologically unreflected description, do not imply quantum theory. The most one can say is that the latter allows a formulation in terms of the former because, among other things, statements telling the result of observations are much less committing than the corresponding ontic statements.
4. Individuality
Up to this point I did not take account of one feature of a substance that in the tradition has been viewed as the most important one: its individuality. The association most likely to be aroused by this notion is atomism, i.e. the idea that matter as we know it is composed of parts which again are composed of even smaller parts until we finally reach a stage where no further partition is possible: the stage of individual atoms or - as would be more adequate today - of elementary particles. As seen from the viewpoint of atomism the solar system or the galaxy to which it belongs, though they may be rather well defined independent physical systems, are by no means individual substances. And if during the foregoing considerations we should have tacitly assumed that they were, this would have been a bad mistake. However, there is at least one further aspect of individuality according to which the systems mentioned, each taken by itself, are individual systems. For lack of a better name I shall call this aspect the holistic aspect of individuality. As compared with the atomistic one the holistic aspect has not received the attention that it deserves. Therefore I shall concentrate the following consideration on it. In what sense did I say a moment ago that the solar system is an individual system? Is it not obvious that this system is composed of numerous bodies each of which can be recognized and investigated on its own account? This is true enough except for one important aspect: the gravitational interaction between any two bodies in the system. According to Newton's theory of gravitation the bodies of the solar system, viewed as being an isolated system, move in such a way that - strictly speaking - no subsystem, i.e. no system composed of a proper subset of bodies, also satisfies the theory. In this sense, then, the system is an individual whole.
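The claim that no proper subsystem satisfies the theory can be spelled out in a short worked form. The notation below (masses $m_i$, positions $\mathbf{x}_i$, gravitational constant $G$, index sets $N$ and $B$) is introduced here for illustration only and is not taken from the text.

```latex
% Newton's law for an isolated system of bodies indexed by N:
m_i \ddot{\mathbf{x}}_i
  = -G \sum_{j \in N,\; j \neq i}
      \frac{m_i m_j \,(\mathbf{x}_i - \mathbf{x}_j)}{|\mathbf{x}_i - \mathbf{x}_j|^{3}},
  \qquad i \in N .

% For a proper subset B of N and a body i in B the same motion obeys
m_i \ddot{\mathbf{x}}_i
  = \underbrace{-G \sum_{j \in B,\; j \neq i}
      \frac{m_i m_j \,(\mathbf{x}_i - \mathbf{x}_j)}{|\mathbf{x}_i - \mathbf{x}_j|^{3}}}_{\text{law applied to } B \text{ alone}}
    \;\underbrace{-\,G \sum_{j \in N \setminus B}
      \frac{m_i m_j \,(\mathbf{x}_i - \mathbf{x}_j)}{|\mathbf{x}_i - \mathbf{x}_j|^{3}}}_{\text{pull of the remaining bodies}} .
```

The bodies in $B$ would satisfy the law as an isolated system only if the second sum vanished for every $i \in B$, which in general it does not, however small it may be in practice.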
Moreover, widening the horizon we can even say that Newton's gravitational equations are either false or, if true, are true only of the totality of bodies in the universe. We here see what happens if we revoke the assumption that was made at the beginning of my considerations. Then we allowed ourselves, following the practice of the physicists since Galileo, to isolate, for instance, the solar system from
the rest of our galaxy or the system, consisting of the sun, the earth and the moon, from the rest of the solar system etc. because we know that the mistake we make in doing this is negligibly small. Now we see that if we give up the laboratory view of physics in favor of a strictly cosmological view we are driven to entertain the idea that there simply are no individual substances in the universe except for the universe itself. The foregoing argument can easily be generalized to include also classical field theories. In general relativity, for instance, there is a clear distinction between gravitational and electromagnetic fields and we even know the laws how these fields would develop in time if each could exist independently of the other. On the other hand, it follows from Einstein's equations that an electromagnetic field cannot exist without a gravitational field however weak the latter may be under normal circumstances. In principle, therefore, we are hardly legitimized to consider either of them apart from the other. In general (but confined to the nonrelativistic case) the curious combination of a decomposition of a system into subsystems and, at the same time, the individuality of the former can be described as follows. Exploiting the analysis already given in connection with formula (4) we see that there is a sharp division between a timeless part of our description of a physical system and the description of its behavior in time. Whereas the dynamics D and the motion f are somehow related to time, time does not enter the set P of contingent properties, the state space S and the relation W connecting properties with states. Now the decomposition of our system into two subsystems I and II amounts to a representation (6a) of S as the Cartesian product of the state spaces S_I and S_II of I and II respectively and, correspondingly, of P as the Boolean product of the (Boolean) algebras P_I and P_II of contingent properties of I and II. Obviously, the decomposition of S immediately leads to a decomposition
$f(t) = f_{\mathrm{I}}(t) \times f_{\mathrm{II}}(t)$    (6b)
of the actual motion f of our system. By contrast, nothing can be inferred as regards the dynamics D. Whereas (6a) signals a mutual independence of the subsystems sufficient to distinguish them conceptually as well as in reality, D may still be a genuine interaction in the sense that the development of f_I essentially depends on that of f_II and vice versa. And this suffices to prevent any subsystem from moving according to the same dynamical law D. The foregoing analysis can be summarized by saying that in classical dynamics the individuality of a physical system composed of two subsystems is entirely due to the interaction between the latter. By contrast, in quantum theory individuality is brought about not only by the interaction. It also affects in a most dramatic way the decomposition of a system into independent subsystems. It is true that here, too, we have a decomposition of S and
P as in (6a). But the first product, instead of being Cartesian, is a tensor product. This has the disastrous consequence that the decomposition (6b) of the actual time variations of the three systems breaks down. The story is usually told in form of the following thought experiment 11. Imagine that relation (6b) holds for three (pure) states at some initial time, that then the two subsystems enter into a temporary interaction and finally separate again. Though the total system is still in a state that is known if its initial state and the dynamics of the interaction are known, the same no longer holds for the subsystems. Their (pure) states have been lost and can be regained only by a measurement. Moreover, such a measurement need only be performed on one of the two subsystems. Depending on its result it will immediately reveal the corresponding information for the other system even if the two systems are spatially separated by light years. It is evident that the EPR situation allows an extrapolation resulting in a second, independent argument in favor of a monistic ontology for the universe: If quantum theory is universally valid and if, therefore, we can assume the universe to be in a pure state then it is a priori highly probable that, strictly speaking, no part of it that we recognize as one of its subsystems also is in a pure state. Rather all subsystems are (contingently!) mutually EPR-correlated in a net of unimaginable complexity. The universe is one undivided whole. And this time, as opposed to the classical case, although the situation may be brought about and maintained by continuous mutual interactions its source is not interaction but rather the quantum theoretical rule for the description of composite systems according to which the overwhelming majority of states are not factorizable. It has to be admitted that the arguments given in favor of ontological monism rest on assumptions that are quite unreasonable from a practical point of view.
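What "not factorizable" means can be illustrated in the simplest composite case. The sketch below assumes two two-level systems, a choice made here only for illustration; the function name and the numerical tolerance are hypothetical. A pure state with coefficients c00, c01, c10, c11 in the product basis factorizes into a state of I times a state of II exactly when the 2x2 coefficient matrix has rank one, i.e. when its determinant vanishes.

```python
import math

def is_product_state(c00, c01, c10, c11, tol=1e-12):
    """True if the nonzero state sum_ij c_ij |i>|j> factorizes as
    (a0|0> + a1|1>) (b0|0> + b1|1>), i.e. if the determinant of the
    2x2 coefficient matrix vanishes (up to the tolerance tol)."""
    return abs(c00 * c11 - c01 * c10) < tol

r = 1 / math.sqrt(2)

# |0> (|0> + |1>)/sqrt(2): each subsystem has a pure state of its own.
print(is_product_state(r, r, 0, 0))

# (|0>|1> - |1>|0>)/sqrt(2): an EPR-type state with no pure states for the parts.
print(is_product_state(0, r, -r, 0))
```

Since the product states are singled out by one algebraic condition on the coefficients, a state chosen without that constraint will generically violate it, which is one way of reading the remark that the overwhelming majority of states are not factorizable.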
However, theories of gravitation and the mechanism of compound systems are of a fundamental character. Such theories are not proposed only to say afterwards that they are to be taken cum grano salis anyway. Moreover, they have consequences not only on the high road of philosophical speculation. Rather they are likely to get into conflict with well established methods of physical science. A case in point is physical law. The general proposition (1b) - as a 2nd kind generality - no longer has the innerworldly character of the quasi-singular proposition (1a). Precisely if the latter refers to such a world substitute as a physical system is taken to be, it is unclear within what new world a proposition (1b) is to be understood. It is true that we want to conceive of the systems S ∈ A referred to in a law (1b) as being mutually independent possible worlds. Only this, after all, explains our amazement about the regularity expressed in the law. On the other hand, we know that in any case of a physical theory the systems to which it refers are to be met with, if at all, as parts of one and the same, namely our, universe. This, however, is not expressed in the law (1b) as it is expressed in (1a) for each system taken separately.
11 Einstein/Podolsky/Rosen 1935
And if we try to express it we immediately run into trouble on account of the foregoing argument. We have seen that there are no two independent realizations of Newton's equations in one universe. And we have seen that there is normally a crucial inseparability of the subsystems of a quantum mechanical system. As a matter of principle, therefore, there are no strict realizations of a law (1b) in one and the same universe. We have to replace the one-sided view of the laws of nature as the hallmark of physical science by a certain complementarity or reciprocity of lawfulness and interconnectedness in nature. Lawfulness in the standard form (1b) demands strictly independent instances of the law. In searching for laws the point just is to find such independences. Of course, these independences go together with internal dependences as they constitute the contents of the respective law. At the same time they mark the limits of the latter. As long as we have reason to assume that laws in this sense are realized in nature - strictly realized - there is no total interconnectedness in the universe. On the other hand, the realization in one and the same universe, as it will be required even by a modest empiricism, constantly draws our attention to the possibility of having missed some dependence. And the discovery of any one in the context of an accepted law inevitably will destroy the law. The increase of discovered dependences cannot but lead to a decrease of laws in the usual sense as something fundamental. Causal interconnectedness and strictly regular lawfulness are to be viewed as being complementary aspects of the universe.
IV.18 General Laws of Nature and the Uniqueness of the Universe*
Dedicated to Peter Mittelstaedt on the occasion of his 60th birthday.
It seems a generally acknowledged view that physics is confined to the investigation of events that can be reproduced. "The natural scientist - says Pauli 1 - is concerned with a particular kind of phenomena ... he has to confine himself to that which is reproducible ... I do not claim that the reproducible by itself is more important than the unique. But I do claim that the unique exceeds the treatment by scientific method. Indeed it is the aim of this method to find and to test natural laws ..." Here for Pauli as for everybody else a natural law is a statement expressing a regularity more or less directly related to repeatable events. And one may add that it is not only the possibility of testing that is responsible for our demand of reproducibility. Rather it is the very fact of regularity expressed in it that gives a natural law its dignity and makes it a subject worth studying on its own account. The characterization of physics by the natural law and the reputation that physics thus understood has gained during the last centuries has often been felt to be a difficulty for cosmology, evolutionary biology and other kinds of natural history. For in these disciplines the typically historical element and with it the unique event and the unique development becomes the primary subject of investigation. The opposition is expressed - to give but one quotation - by Friedrich Hund 2 by saying: "One may characterize physics as the doctrine of the repeatable, be it a succession in time or the co-existence in space. The validity of physical theorems is founded on this repeatability ... By contrast 'cosmology' is the doctrine of the unique universe, of its special, perhaps historically, developed features." Now history in a general sense must not be a stumbling block to repetition and reproduction.
Strictly speaking all events are unique, and in the sense in which they can be repeated they are repeated in the course of time. It is only when we come to more and more extended processes such as biological evolution or the recession of the nebulae that we cannot hope to become witnesses of a repetition. And it would be such processes that we had to face if we wanted to save the scientific status of the disciplines mentioned in any direct way. In this paper I shall not take this direct way. Rather I shall investigate the premise of the foregoing argument, i.e. the claim that in physics proper we really have the situation of repeatable events and natural laws expressing regularities between such events. And the particular aspect under which I want to analyze this claim is the fact that all events, processes, objects etc. that have ever been made the subject of an empirical investigation are events,
* First published as Scheibe 1991c.
1 Pauli 1961, p. 94
2 Hund 1972, p. 274; see also Wigner 1979, p. 3, no. 1; Vollmer 1986, pp. 53ff
processes, objects etc. in one and the same universe. Thus I want to challenge the common view of the lawful character of physical theory by taking seriously an aspect of uniqueness that, although it may be very weak, has some obvious relevance to our theorizing in the natural sciences. There are two main reasons to be suspicious about the regularity view (in a wider sense) of physical theory if we introduce the aspect in question. One is that we cannot a priori exclude a thoroughly holistic structure of the universe, and we cannot do this even after having accumulated hundreds and thousands of empirical evidences to the contrary. This is certainly an extreme position but it has been taken even by physicists. Schrödinger, for instance, asking how we can come to make precise predictions about the future behavior of a physical system argues 3 that "it may be, and if we are entirely strict about it, it certainly is the case that we are forced to extend the system considered to the entire universe." The second reason that may raise doubts about our subject is methodological in nature. It is that the usual formulation of a physical theory does anything but invite us to believe in the regularity view in any innerworldly sense. Rather the theories are formulated primarily as statements about a single physical system, and their generalization to universal statements about a whole class of systems, although it lends itself to a possible-worlds interpretation, does in general not give the slightest hint to find an interpretation within one, namely our, universe.
I
In the first section I will explain in greater detail what I mean by generalities of the 1st kind. The main thesis, coming in two parts, is this: First, a physical theory is essentially a theory about one single physical system. What a theory says - what makes us recognize that in a given case we are faced with quantum mechanics and not with electrostatics, with thermodynamics and not with acoustics etc. - these contents of the theory, I say, concern one single physical system. To this extent a theory, if viewed as a statement, is essentially a singular statement. However - and here comes the second part of the thesis - already this statement, singular with respect to the physical system, contains two obvious generalities: the concepts in which our system is described and the quantifiers - the universal and existential quantifiers - applied to the concepts in order to bring about the statement in question. These generalities are the ones I want to call generalities of the 1st kind. Obviously, they are system-internal generalities and are not used to express the eventual universal validity of the theory. This applies even in the case of probabilistic theories, e.g. quantum mechanics. This is again obvious if we advocate the view that probability statements are about single systems. But even if one is not willing to view probability statements as statements about single systems one has to admit that the theory then is about one single ensemble of physical systems in the usual sense. What the theory says is then said about this ensemble and is certainly not a universal statement with respect to the individual systems. For according to the very advocates of the ensemble view a probabilistic theory does not make any statements about single systems. Precisely for this reason it cannot make a universal statement about a single ensemble.
3 Schrödinger 1932, p. 2
Among the evidence for the fact that most people, contrary to what I have just been saying, like to view theories as being universal in the first place there is the fact that they emphasize the exceptional situation occurring in cosmology: In cosmology - it is usually said - we meet with the serious obstacle that our theory is about one system only, simply because the universe as a whole is given to us only once. To me this seems to be the wrong kind of emphasis because it favors one component of theorizing - universality - to the exclusion of another one that, as we shall see, is equally important. I would, therefore, rather begin my analysis with the remark that in physical theory, although we are never concerned with the universe in toto, we always conceive of the actual system of our interest as if it were the whole universe.
Thus, on the one hand I take it for granted that even in physical cosmology we never make the whole universe the object of our theoretical investigation. The conception of the universe as the unrestricted totality of everything existing may be an interesting conception from a philosophical point of view. In physics it would be of no use whatever. There a drastic selection takes place in every case, and the amount of what is selected usually is negligibly small when compared with what we omit. The selection is made under various viewpoints: we idealize, we neglect, we isolate, we simplify, we abstract.
In every case this means that we pass from a larger whole that really is a piece of nature to some fraction of it, and it is only this fraction which we are going to deal with. On the other hand, it is important to realize that what is omitted in this way - what is not taken into account in our theory - is so radically wiped out that we cannot but view the product of our selection as being a world of its own: a complete substitute for the actual universe. It may be that the latter still plays a role in the background and that it is re-introduced in part when we apply the theory. The theory taken by itself does not know about this. The object of electrodynamics as defined by Maxwell's equations is a field and charged matter and nothing else. Quantum mechanics of the hydrogen atom has as its object one hydrogen atom (or an ensemble of such) and nothing else, and so on. In each of these cases we act as if the object of our theory were the total universe, although we know that this is not the case and sometimes mitigate the situation by introducing more complex systems. The method of the "as if" might be called, after its inventor, the Galilean method⁴. The fact that it works is a highly non-trivial fact about the universe that we shall keep an eye on: We can successfully investigate parts of the universe without considering everything. And actually we do so already in our daily life. The Galilean method, however, deserves to be studied with special care. For it involves the far-reaching and intricate concept of a physical theory, and therefore we should now have a brief look at the logical structure of our theory concept⁵.
4 See the quotations in McMullin 1967, pp. 329f and 356f
In physics we attempt to describe physical systems by means of mathematical structures. In this way physical laws, obeyed by those systems, can be expressed by statements about the describing structures. Let me exemplify this procedure by the theory of a particle moving in a central field according to the laws of Newtonian mechanics. In this case the structure being used to describe the behavior of the particle consists of four parts: absolute space, absolute time, a field of force as well as the orbit and mass of the particle. Correspondingly, our theory is made up of Euclidean geometry of space, a corresponding degenerate geometry of time, general Newtonian mechanics and a special force law. And all this is usually formulated in mathematical terms well known in this case even to the beginner. In general the statement of our theory is of the form
$$\Sigma[X; s] \tag{1}$$
where $\Sigma$ usually is a conjunction of statements concerning more and more of the elements $X_\mu$ and $s_\nu$ of the describing structure. These elements are sets later on to be identified with the extensions of the basic concepts describing the system in question. We distinguish the so-called principal base sets $X_\mu$ from the typified sets $s_\nu$. The nature of the elements of the former can only be known from without our system. By contrast, the elements of the latter are known from within in the sense that they are the product of one or the other of a class of universal constructions from the principal base sets. In our example, $X_1$ and $X_2$ would describe space and time respectively, $s_1$ and $s_2$ the distance in space and time, and so on. The elements of $s_1$ would then stand for triples consisting of two points in space and one number such that the number is the distance between the two points. On the other hand, the question of what a point in space (or time) is could not be answered in this way. And this is the general situation whether we are dealing with point mechanics, continuum mechanics, electrodynamics, quantum mechanics, gravitational theory according to Newton or Einstein or what not: In each and every case the theory is given by a statement of the form (1) where the structure $(X; s)$ stands for a physical system and $\Sigma$ for what the theory says about the system.
In this situation we first meet with a system-internal conceptual generality (of the 1st kind): The given physical system is described by means of concepts the extensions of which are the sets $X$ and $s$. Accordingly, for these concepts as well as the ones defined by them it is mandatory that they refer to one well determined system, although in a general discussion such as the present one this determinateness is only assumed. Thus, for instance, the concept that a point $p_1$ has distance $r$ from another point $p_2$ refers to a well determined space, and unless this space is given we do not know what is meant by that concept. Likewise, in our illustrating theory the concept that the particle has velocity $v$ at time $t$ means nothing unless a well defined orbit is given, and so on. Trivial as it is, the matter has to be emphasized in view of the concepts of the 2nd kind to be introduced later on. The unity of these concepts will be given not by one single system but by one theory. They will be concepts, for instance, of a space or of a particle orbit. Accordingly, they will refer always to a whole class of systems and thus will have, so to speak, a generality of a higher order.
5 For the following view on theories see Scheibe 1979 (this vol. III.11)
But plain generality we already find in the concepts describing any single system. For its description we cannot but introduce possibilities that certainly are not realized: There is an infinity of points $p_1$ and $p_2$ not having a given distance $r$, and similarly an infinity of velocities which our particle does not have at a given moment. This conceptual overproduction is not only a fact but a necessity: We simply do not know a method to describe an individual object without introducing more theoretical elements than correspond to what is actually there.
We find a corresponding situation if we now turn to the propositions that are made in physical theory about a single system, i.e. the propositions of which $\Sigma$ in (1) is built up. Being the axioms of a theory, these propositions are not singular statements by which we are informed what the distance between two given points is or what velocity our particle has at a given time. Thus although they are statements about a single system they are not singular in the usual sense. Rather we here meet with a propositional generality (of the 1st kind) already on the level of one physical system and precisely corresponding to the conceptual generality mentioned previously.
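The two generalities of the 1st kind can be made concrete in a small sketch. It is an illustration only, under loud assumptions: the three labelled points, their coordinates on a line, and the helper names `has_distance`, `distance` and `triangle_inequality` are all hypothetical and do not come from the text. The sketch shows a concept that means nothing without a given structure, an axiom (the triangle inequality is chosen here as a familiar example) whose quantifiers range over the points of that one structure, and the "conceptual overproduction" of possibilities that are not realized.

```python
from itertools import product

# Hypothetical toy structure (X; s): a principal base set X1 of "points" and
# a typified set S1 of triples (p, q, r), read as "p has distance r from q".
# Labels and coordinates are illustrative assumptions, not taken from the text.
COORDS = {"a": 0.0, "b": 3.0, "c": 7.0}

X1 = frozenset(COORDS)  # what a point *is* is not said inside the structure
S1 = frozenset((p, q, abs(COORDS[p] - COORDS[q])) for p, q in product(X1, X1))


def has_distance(s1, p, q, r):
    """Concept of the 1st kind: meaningful only relative to a given s1."""
    return (p, q, r) in s1


def distance(s1, p, q):
    """Look up the unique r such that (p, q, r) belongs to s1."""
    (r,) = [r for (p_, q_, r) in s1 if (p_, q_) == (p, q)]
    return r


def triangle_inequality(x1, s1):
    """Propositional generality of the 1st kind: the quantifiers range over
    the points of this one structure, not over a class of systems."""
    return all(
        distance(s1, p, w) <= distance(s1, p, q) + distance(s1, q, w)
        for p, q, w in product(x1, repeat=3)
    )


# Conceptual overproduction: candidate triples that describe nothing actual.
candidate_radii = {0.0, 3.0, 4.0, 7.0, 10.0}
unrealized = [
    (p, q, r)
    for p, q, r in product(X1, X1, candidate_radii)
    if not has_distance(S1, p, q, r)
]
```

Every call takes the structure as an explicit argument, which is the point of the sketch: neither the concept nor the axiom can be evaluated until one particular space has been handed over.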
If, for instance, our theory includes a theory of space then, being a theory about a particular system, it necessarily must refer to a particular space. Any theorem about this space, e.g. the triangle inequality, then is a case in point: It says what it says by essentially using quantifiers binding the variables of the terms in which the theorem is formulated. The same holds for our particle theory, for instance, with respect to its equations of motion: They are differential equations submitting the position functions to certain conditions to be satisfied at every time. Likewise a field equation would have to be valid at every point in time and space, and so on. The typical situation as to the axioms of a physical theory is that once we have introduced concepts and want to make a general use of them quantifiers are unavoidable and then represent the propositional generality of the 1st kind that was to be introduced.
II
In the previous section it was argued 1) that a physical theory, if viewed as a statement, is a statement about one individual physical system and 2) that already this statement, although being singular in this sense, involves two generalities: one conceptual, the other one propositional - generalities of the 1st kind as I called them. If we now turn to the problem of universal laws of nature - the main theme of this paper - an entirely different kind of generality comes into play. At any rate this is my second thesis, and I am somewhat puzzled that this thesis does not appear in the relevant literature with sufficient clarity⁶. As distinct from generality of the 1st kind it is essential for the generality of the 2nd kind that it concerns a certain totality of physical systems or - as philosophers are wont to say - of objects. It is the kind of generality that philosophers have in mind when they talk about the universal validity of a law of nature. And, of course, physicists too do not restrict the meaning of their theories to singular statements of the form (1). Somewhat more modestly they speak of the domain of validity of a physical law. This indicates that the generality of the 2nd kind also comes in two parts: one conceptual, the other one propositional. Insofar as we make general use of it a theory is a concept: some physical systems fall under this concept and others do not. Secondly, if we want to express universal validity of the theory we would have to say something like
$$\bigwedge y.\; y \in Y \rightarrow \Sigma[X(y); s(y)] \tag{2}$$
Here $Y$ is the domain of validity and $(X(y); s(y))$ is the structure describing the system $y \in Y$. (2) then says that all systems belonging to a certain domain $Y$ satisfy the axioms of our theory. And this statement will now be the major subject of discussion on the new level of generality, as statement (1) was on the lower level.
A first point to show us that proposition (2) is different in kind from (1) is that the demarcation of the domain $Y$ cannot be produced by an ostensive act, as could be done in the case of (1) where only one individual system had to be pointed out. The only alternative then seems to be a conceptual description. But this in turn cannot be given in the language in which $\Sigma$ is formulated and defines a certain range of structures as its models. To give but one example, gases are described by their pressure, volume and temperature. If we now want to use van der Waals' equation (as $\Sigma$) in a universal statement (2) then, even if we take the risk of claiming the equation for all gases, we still would have to say what we mean by a gas. It would not suffice, as is usual, to restrict generality by restricting our parameters to certain intervals, e.g. to low pressure. In the last analysis the characterization of a gas in the premise of (2) has to be given in a language different from the one used in the conclusion: it has to be characterized by the way a system is given to us or is produced or something of the sort. By contrast we were not forced to do this in the case of stating the singular version (1) of our theory.⁷
6 See, for instance, Hempel 1965, pp. 264ff, 335ff and 354ff; Nagel 1961, Ch. 4
7 The problem how the domain of application of a physical theory can be described is treated more fully in Stegmüller 1976, Ch. IX.4 and 5
For a second consideration that may clarify the situation I want to compare the statement (1) with the philosophical folk formulation
$$\bigwedge y.\; Py \rightarrow Qy \tag{3}$$
of a law. This frequently discussed version is very likely to mislead us because in it only one of the two kinds of propositional generality occurs, and it is usually left undecided which one. From the point of view taken in the present approach it is immediately clear that quantification in (3) is of the 2nd kind if (3) is meant to be a law and the range of the variable $y$ is a class of objects obeying the law. Moreover, propositional generality of the 1st kind does not occur in (3) because the extremely simple description of the objects provided for by (3) is not in need of it. If, on the other hand, (3) is not viewed as a law the quantification may very well be of the 1st kind. $Y$ being the universe of discourse with respect to which (3) is interpreted anyway, in the case of a law the elements $y \in Y$ would be the physical systems. In the other case it may be that a "closed universe" showing lawful behavior is only reached in the form of the entire set $Y$ as, for instance, in geometry. (3) being the analogue of (2), it may be asked what the analogue of the quasi-singular statement (1) is in the philosophical folk case. Evidently, it must be
$$Py \rightarrow Qy \tag{4a}$$
and this is the moment to re-emphasize that it is this statement and not (3) which conveys the important information. (4a) can easily be rewritten as a species of structures in the sense of (1). It is then given by
$$\Sigma[\{P, \bar{P}\}, \{Q, \bar{Q}\}; s_1, s_2] \equiv s_1 \in \{P, \bar{P}\} \wedge s_2 \in \{Q, \bar{Q}\} \wedge (s_1 = \bar{P} \vee s_2 = Q) \tag{4b}$$
where $\bar{P}$ and $\bar{Q}$ are the negations of $P$ and $Q$ respectively. Our system then is a system that, since only two predicates are available for the description, can assume only four states a priori, only one of which (namely $s_1 = P$ and $s_2 = \bar{Q}$) is excluded by the law. It is as if we would restrict the investigation of a circuit having resistance $R$ to one value each, $U_0$ and $I_0$, of potential and current respectively, these values obeying Ohm's law $U_0 = R \cdot I_0$. Proposition (4) expresses this connection for one system only, and the generality of (3) consists by no means in admitting also other values of potential and current. Rather it refers to other systems - other circuits (with resistance $R$) - stating for these the same connection between the two given values $U_0$ and $I_0$. The philosophical folk case has simply degenerated into two properties $P$ and $Q$, and all possibility of variation concerns the systems, not the physical description.
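The difference between varying the description of one system and varying the systems themselves can be sketched in a few lines. This is a minimal illustration under assumptions: the string encodings of the predicates, the function names `sigma` and `universal_law`, and the circuit labels are hypothetical and serve only to mirror (4b) and (3).

```python
from itertools import product

# Predicate values; NOT_P and NOT_Q stand for the negations of P and Q.
P, NOT_P, Q, NOT_Q = "P", "not-P", "Q", "not-Q"


def sigma(s1, s2):
    """Quasi-singular statement (4b) about ONE system in state (s1, s2)."""
    return s1 in {P, NOT_P} and s2 in {Q, NOT_Q} and (s1 == NOT_P or s2 == Q)


# Variation of the description within one system: four a priori states,
# of which the law excludes exactly one.
a_priori_states = list(product([P, NOT_P], [Q, NOT_Q]))
excluded = [state for state in a_priori_states if not sigma(*state)]


def universal_law(systems):
    """Statement of the form (3): every system y in the domain satisfies (4b).
    `systems` maps a label y to its description (s1(y), s2(y))."""
    return all(sigma(s1, s2) for (s1, s2) in systems.values())


# Hypothetical domain Y: the labels are invented for illustration.
circuits = {"circuit-1": (P, Q), "circuit-2": (NOT_P, NOT_Q), "circuit-3": (P, Q)}
counterexample = {**circuits, "circuit-4": (P, NOT_Q)}
```

Here `sigma` never mentions a second system, while `universal_law` adds nothing to the description of any single one; it only ranges over the labels, which is where all the variation of the folk version lies.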
Thus we see that my present point cannot be illustrated in full by the simple universal implication (3), and I mention this case only because, as the philosophers among us know, whole books are filled with considerations on laws of nature without transcending their folk version (3). However, as soon as we turn to real-life examples from physics we can easily recognize the symbiosis of the two kinds of generality, propositional as well as conceptual. Take our standard example of a particle moving in a field according to classical mechanics. The usual physical concepts of this theory as, for instance, the concept of the position of the particle at any time are concepts of the 1st kind used to describe one concrete system to which our theory is applied. But there is also the concept of a system satisfying the theory in question, i.e. the concept of a particle moving in a central field, and so on. And this concept is of the 2nd kind. With it not the system but, in a sense, the theory is described. Correspondingly, in the axioms of the theory we have quantifications with respect to the concepts describing the system, e.g. we require the equations of motion to hold for every time point. This requirement concerns one system only, and its generality is of the 1st kind. But again there is also the claim that what our theory says about any one system holds for every system belonging to the domain of validity of the theory - an obvious claim of the 2nd kind. The distinction in question is particularly perspicuous in the case of frame theories of physics like classical Hamiltonian or quantum mechanics. For these theories it is important that such parts of the describing structures as the phase space or the (quantum mechanical) state space are variable. The concepts of these spaces are therefore genuine concepts of the 2nd kind. By contrast a symplectic metric on a given phase space or an expectation function on a given quantum mechanical state space are concepts of the 1st kind.
It is true that these concepts can be defined, so to speak, with generality of the 2nd kind. But the very definitions then show that before we can speak of a particular metric, a particular expectation function etc., a phase space, a state space etc. must already be at hand. The question: what is the expectation value of observable $A$ in state $s$? cannot be answered unless a particular quantum mechanical structure has been given. By contrast, the question whether a given structure satisfies the axioms of quantum mechanics not only can but must be answered without referring to a second structure. It may be added that from a purely set theoretical viewpoint the elements $X_\mu$ and $s_\nu$ of a given structure are always sets in the proper sense whereas the model class of a theory is always a genuine class (and not a set). But in physics, as we shall see in the last section, it is seldom a whole model class that matters, and the distinction of concepts of the 1st and 2nd kind has also different roots.
One last consideration may show this. In connection with propositions (1) and (2) the concept of a physical system will allow that numerically different real systems are described by the same mathematical structure. This has to be admitted if only because we might never attain absolutely complete descriptions. However, we are free to require that systems with different descriptions really are different. Thus if we are given different data for the states of two gases we are entitled to infer that the given data refer to two (numerically) different systems. We then count two gases as different even if merely two different states of the same material object are prepared. This, of course, is but another way of expressing what has already been put into the notation $X(y)$ and $s(y)$ in (2): In the context of one theory a unique description is assigned to every system. This convention is perhaps less innocent than it looks. For it is meant to imply that different descriptions are even incompatible. If system $y$ is described by structure $(X(y); s(y))$ then we cannot, without leaving the theory, add to this structure in order to get a more complete description. The presence of several systems in the sense of the quantification in (2) is, therefore, altogether different in kind from the presence of the elements of any of the sets $X_\mu$ and $s_\nu$ making up our structure and (possibly) quantified over in (1). There is no competition between, say, the points of space or time as there is a competition between the various descriptions offered by a theory as possible descriptions of a real physical system.
We have, then, no difficulty in conceiving of any two parts of a physical system - parts in a very general sense - as "belonging to the same world". The way in which systems are described by structures clearly shows the cooperative role that the various elements of any set belonging to the relevant structure play in building up this structure and, with it, the system. On the other hand, it has still to be clarified what it means that the various systems of which the universal statement (2) speaks all "belong to the same world". The statement (2) by itself offers no hint whatever to answer this question.
On the contrary, we have just seen that there is a certain competition between these systems which thus may even be unfavorable to their coexistence in a common world. Although everybody believing in laws of the form (2) tacitly implies that the systems in question do belong to our universe, the prevailing interpretation almost seems to contradict this implication. According to the usual understanding the occurrence of any two systems submitted to our theory according to (2) amounts to what is most frequently called by such terms as "repetition" or "reproduction". Indeed it is the age-old methodological requirement of the repeatability or reproducibility of every experiment that stands behind a universal law like (2). We are here not dealing with a repetition in the system-internal sense; it is not a question of a periodic motion - not, say, of two swings of the pendulum. In the context of lawlike behavior repetition of a first instance of some law means a second independent instance of the law - instance or counterinstance - but at any rate a new system with different initial conditions. The system-external generality of (2) sometimes is even raised to the metaphysical level that the systems comprised within $Y$ are so many different worlds. This, in an obvious sense, is not realistic. But there is a realistic approximation that I will call the laboratory view of lawlike generality. It is the view that we are able to produce (or: re-produce) in our various laboratories approximately independent systems with different descriptions but all obeying the same law according to (2). In a sense we can, therefore, practically realize different independent worlds within our universe.
III
However, we should take our problem also as a matter of principle, and in this case it is easily seen that a certain dilemma is coming up. The general proposition (2) - as a 2nd kind generality - no longer has the innerworldly character of the quasi-singular proposition (1). Precisely if the latter refers to such a world substitute as a physical system is taken to be, it is unclear within what new world a proposition (2) is to be understood. As we have seen, to a certain degree we want to conceive of the systems $y \in Y$ referred to in a law (2) as being mutually independent possible worlds. Only this, after all, explains our amazement about the regularity expressed in the law. On the other hand, we know that in any case of a physical theory the systems to which it refers are to be met with, if at all, as parts of one and the same universe, namely ours. This, however, is not expressed in the theory (2) as it is expressed in (1) for the corresponding elements. The question, therefore, poses itself whether a reformulation of (2) as an innerworldly proposition (1) is possible. The physicist probably would find this question to be of little importance. If experiments of a given kind have been repeated at various places on earth at different times, why demand that, apart from the experiment itself, i.e. its kind, also this general fact or even its extrapolation to future experiments of the same kind should be given a separate innerworldly formulation? It is obvious that for the working physicist the interesting part of the task is completed with the description of the experiment itself, i.e. its kind, even if he would admit that it would be of no interest to him if it could not be repeated. However, a philosopher might wish to confront this laboratory view - as I called it - with the cosmological view according to which we do not satisfy ourselves with pragmatic excuses but insist on a strict innerworldly formulation of the universal implication (2).
To see whether the cosmological view of a general law of nature can be strictly maintained let me recall how systems composed of several independent subsystems are treated in physics. In the simplest case the connection between the three descriptions may be indicated by
(5a)
The independence wanted is here expressed by the fact that the statement $\Sigma_{12}$ about the total system $(s_{y_1}, s_{y_2})$ is the bare conjunction of two statements, each referring to one subsystem only. Whereas in (5a) all three statements may still be different, in the attempted reformulation of (2) the two statements $\Sigma_1$ and $\Sigma_2$ would have to be identical. There are even theories $\Sigma$ for which we have
(5b)
if $(s_{y_1}, s_{y_2})$ is a suitable description of the total system. Hamiltonian mechanics is a case in point and with some qualifications also quantum mechanics. The more general conjunctive decomposition (5a) can be found also in the universal parts of a physical theory. What Newton says in his "Principia" on space and time can easily be reconstructed in this way (with space as one and time as the other member of the conjunction). We know, of course, that this was not the last word on the matter, and also for systems in the proper sense we feel that we cannot stick to such decompositions. But before we move on it is important to state that the innerworldly reconstruction of a law (2) that we are after would adequately be prepared by the procedure indicated.
However, let us now turn immediately to certain restrictions to which the formation of the product of independent systems is submitted. The most important restriction - the one on which perhaps all the others are dependent - is the uniqueness of space and time. The most obvious assumption that we make in the innerworldly description of several physical systems is that all systems are to be met with in one and the same spacetime and that therefore spacetime must be a common element of the conjunctive members in (5a). Already Kant said:⁸ "... if we speak of diverse spaces, we mean thereby only parts of one and the same unique space. ... Space is essentially one; the manifold in it, and therefore the general concept of spaces, depends solely on limitations." Today we differ from Kant in several respects. Space has to be replaced by spacetime, and there is a general concept (of the 2nd kind) of spacetimes. Together with other qualifications this concept is basic for general relativity. Still we do not consider theories of the universe or any physical theories in which the universe of discourse would be described by a structure containing several spacetimes.
Now the uniqueness of spacetime has consequences for the presentation of its material content. At the dawn of modern physics Kepler's three laws did not yet allow us to recognize this. But they are a good starting point for showing the difficulty we have to cope with. Kepler's first law, for instance, can be spelled out for any planet without taking into account the existence of the other planets. In these statements space and time as well as the sun are common elements. Apart from them the worlds separate, and we have as many physical systems as there are planets. The law that all planets move in ellipses can essentially be expressed by a (finite) conjunction with identical predications. However, as we have known since Newton, this reduction is only an approximation that eventually becomes grossly false, for instance, in the case of a system consisting of the sun, the earth and the moon. The essential insight was that, since all celestial bodies exist in the same world, they may interact with each other such that only their totality makes up a closed system whose behavior as a whole follows a law. In fact, the matter stands even worse: The mutual gravitation in a system of bodies, according to Newton's theory, leads strictly speaking to a totally irreducible system of equations of motion: If a system of bodies moves according to these equations no subsystem does. As a consequence, it is hopeless to look for an innerworldly reformulation of (2) if $\Sigma$ essentially is given by Newton's equations. Within one and the same space-time strictly speaking at most one gravitational system could be realized.
8 Kant ²1787, B 39
The Kepler/Newton case illustrates a general reciprocity of lawfulness and interconnectedness in nature. Lawfulness in the standard form (2) demands strictly independent instances of the law. In searching for laws the point just is to find such independences. Of course, these independences go together with internal dependences as they constitute the contents of the respective law. At the same time they mark the limits of the latter. As long as we have reason to assume that laws in this sense are realized in nature - strictly realized - there is no total interconnectedness in the universe. On the other hand, the realization in one and the same universe, as it will be required even by a modest empiricism, constantly draws our attention to the possibility of having missed some dependence. And the discovery of any one in the context of an accepted law inevitably will destroy the law. The increase of discovered dependences cannot but lead to a decrease of laws in the usual sense as something fundamental. The physically interesting form of a theory is not the conjunction in (5a) but
(5c)
where int is an interaction term. But if (5a) is dismissed then so is (2) in any innerworldly interpretation.
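The irreducibility claimed for Newtonian gravitation can be spelled out in a short worked equation. The notation ($m_i$, $\mathbf{x}_i$, $G$, and the choice $N = 3$) is the standard textbook one and is an assumption of this sketch, not the formalism of the paper.

```latex
% Newtonian equations of motion for N gravitating point masses
m_i \ddot{\mathbf{x}}_i
  = -\sum_{j \neq i} G\, m_i m_j\,
    \frac{\mathbf{x}_i - \mathbf{x}_j}{|\mathbf{x}_i - \mathbf{x}_j|^{3}},
  \qquad i = 1, \dots, N .

% For N = 3, body 1 obeys
m_1 \ddot{\mathbf{x}}_1
  = -G\, m_1 m_2\,
      \frac{\mathbf{x}_1 - \mathbf{x}_2}{|\mathbf{x}_1 - \mathbf{x}_2|^{3}}
    -G\, m_1 m_3\,
      \frac{\mathbf{x}_1 - \mathbf{x}_3}{|\mathbf{x}_1 - \mathbf{x}_3|^{3}} .

% The two-body equations for the subsystem {1, 2} alone would require the
% second term to vanish, which holds only for m_3 = 0 or in the limit of
% infinite separation |x_1 - x_3| -> infinity.
```

The second term is a cross term of the kind that an interaction term would have to carry: as long as it is present, bodies 1 and 2 do not by themselves satisfy the same equations with $N = 2$, so the two-body law holds for them only as an approximation.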
The interest in natural law and in causal connection, although going together in some sense, conflict with each other as soon as we widen the horizon, and in extreme cases it may follow from a theory (1) that its generalization (2) has at most one realization. Since we know quantum mechanics we have difficulties not only with the interactions but already with the product formation in (5b). Let us consider, for instance, the quantum mechanics of free electrons. The state of an electron is given by its lJi-function that determines for every observable its expectation value in the given state. According to the theory there is a whole Hilbert space of states. Now let lJi1 and lJi2 be any two of them. According to the laboratory view the pair (lJi1 , lJi2 ) again determines a possible description of the situation. In a concrete case we would say that we have prepared both states independently of each other. According to the cosmological view, however, this conjunction is by no means the most general description of the situation: If we take seriously that the particles belong to the same world we have to treat the situation as a 2-particle-system. We must pass from two I-particle ensembles to one 2-particle ensemble. For the latter, however, the pair (lJi1 , lJi2 ) is a correct description only in exceptional cases, the socalled separable cases. In general the two subsystems are inseparable, and
our knowledge about them is not maximal. Rather the information about the total system concerns many correlations between observables of the two subsystems. Again the possibility of an innerworldly reformulation of (2) is paralyzed from the outset. Against this argumentation it may be objected that the difficulties for a cosmological interpretation of (2) in connection with interaction and inseparability do not have any practical importance. All fundamental interactions have a finite range allowing for practically independent and yet internally interacting systems. Similarly, we can prepare practically separable quantum mechanical systems showing all the features of inseparability internally. And both possibilities are in accordance with the respective theories. All this has to be admitted, let alone the overwhelming number of cases where we find the independence in question not by looking at the theories but simply by experience. On the other hand, we have to remember that we are investigating a matter of principle. Theories about gravitation and the mechanism of compound systems are of a fundamental character. Such theories are not proposed only to say afterwards that they are to be taken cum grano salis anyway. And if such theories show us that the Galilean method, successful as it is, in the last analysis not only misses the factual constitution of the universe but also violates its laws, then this deserves to be recognized and understood. The question whether there are any universal laws of the form (2) that are not only approximations but strictly valid is a matter of principle. If the question should be answered in the negative this would mean that the theories from which we can derive the laws in question as approximations cannot themselves be of this kind. And we would then be faced with the question of what kind they are after all. This paper does not allow us to touch on this question - let alone to answer it.
I may only recall one important feature of physical theories in their singular version (1): Together with assumptions contingent upon the theory we can infer other contingent statements from the theory. The best known examples are given by so-called initial conditions. Now whenever a real system is a candidate for satisfying the theory at least some of those additional premises will be true of the system. The conclusions drawn from these premises together with the theory are thus open to test without going beyond our system. It is true that in this way only a fraction of the theoretically possible initial conditions (in the general sense) can be put to use. However, as is shown by celestial mechanics, we sometimes had to be and actually were content with this situation. And the amount of available evidence is restricted anyway. On the other hand, if one day we should come to the conclusion that the regularity view can only be maintained as an approximation this would be a most interesting turning point in methodology. We should neither be afraid of it nor lose sight of its possibility even now.
IV.19 On Limitations of Physical Knowledge*

In her book "How the Laws of Physics Lie"¹ Nancy Cartwright (= NC) is mainly concerned with a certain reciprocity or polarity or complementarity between the explanatory power and the truth of a physical theory. This at least is the aspect of her work I became interested in anew after I had come across another such reciprocity - a reciprocity between the coherence and the generality of a theory or law. In both cases, it is a matter of a pair of epistemological virtues - truth and explanatory power in her case, generality and coherence in mine - virtues that we have learned to appreciate all the time but that, although we do find each realized by itself, cannot be realized together, i.e. in our cases: realized in one and the same theory. One is inclined to add that, in a very loose sense, it is actually a matter of degree - that the common realization can be achieved only more or less, and this is the reason why I speak of 'reciprocities' here. But the main thing is the exclusion and the obvious limitation of our physical knowledge this implies: We would like to have true and general theories in physics with as much explanatory power and coherence as is possible. Unfortunately, however, there seem to be limits to this. The history of epistemology has witnessed such reciprocities almost from its beginning. Aristotle makes the distinction between the πρότερον or γνωριμώτερον πρὸς ἡμᾶς and the πρότερον or γνωριμώτερον τῇ φύσει - the prior or better known for us as distinct from the prior or better known by nature², and he says about it: "the same thing is not prior by nature and prior to us, or better known by nature and better known to us. The things nearer to sense are prior and better known to us, those that are more remote prior and better known without qualification. The most universal things are farthest from sense, the individual things nearest to it; and these are opposed to each other" (transl. by W.
D. Ross). This Aristotelian relation is still and perhaps more than ever relevant to the most recent developments of physics. With respect to his general theory of relativity, Einstein repeatedly deplored the situation by pointing out the reciprocity between a physical theory's closeness to experience and its logical simplicity. "It must be conceded", he says³, "that a theory has an important advantage if its basic concepts and fundamental hypotheses are 'close to experience' ... Yet more and more, as the depth of our knowledge increases, we must give up this advantage in our quest for logical simplicity and uniformity in the foundations of physical theory."

* First published as Scheibe 1998
1 Cartwright 1983
2 Aristotle, Anal. Post. 71 b33 ff; Metaph. Z 1029 b31 ff
3 Einstein 1950, p. 15

In our century, the analytical movement has discovered other limiting reciprocities that lie on the borderline of epistemology and the philosophy of language and concern not only our knowledge but also our understanding. In
his influential article "A Defence of Common Sense" G. E. Moore introduced a reciprocity between our certainty about and analysis of a statement⁴, and even earlier, in his lectures on the philosophy of logical atomism, Russell came up with a similar thing: He points to the "rather singular fact, that everything you are really sure of, right off is something that you do not know the meaning of, and the moment you get a precise statement you will not be sure whether it is true or false, at least right off"⁵. A particularly sophisticated notion in point emerged from the physics of our century. To overcome the epistemological difficulties that occurred in the new quantum mechanics, Bohr suggested and developed his concept of complementarity, according to which "phenomena defined by different concepts, corresponding to mutually exclusive experimental arrangements, can unambiguously be regarded as complementary aspects of the whole obtainable evidence concerning the objects under investigation"⁶. According to Bohr, the two major complementary aspects characterizing the new situation in quantum mechanics are the complementarity 1) between space-time coordination and causality and 2) between the particle and wave picture in the description of phenomena. The union of the former characterized classical mechanics and became impossible in quantum mechanics. With the other pair it is essentially the other way round: The wave and particle pictures, excluding each other in classical mechanics, have been united in quantum mechanics - though with some changes of meaning. For Bohr, epistemological complementarity is not confined to physics but can also be found in other fields of human knowledge: in biology, in psychology, in anthropology and elsewhere. (An overview with references to Bohr's writings has been given by C.
Chevalley in her edition of Bohr's "Atomic Physics and Human Knowledge".)⁷

The main goal of this speech is to acquaint you with the case of coherence and generality. But on my way to this, I also want to touch upon NC's case. I do this because I like to be in her company if the occasion arises and because it will widen our view of this kind of limitation of our knowledge, characterized by the pairwise exclusion of some of its components.
4 Moore 1959, pp. 33 and 53
5 Russell 1956, p. 179
6 Bohr 1939, p. 24
7 Bohr 1991, pp. 396ff; see also Scheibe 1973c, Ch. I

I Explanation vs. Truth

Thus we first have to deal with the thesis that the two main tasks of a physical theory, namely to describe and to explain, exclude each other, so to speak, by degrees. Accordingly, in physics we would be confronted with two main types of theories: those that chiefly describe and those that chiefly explain. NC identifies these two types with phenomenological and fundamental theories
respectively. "In modern physics," she says, "phenomenological theories are meant to describe, and they often succeed reasonably well. But fundamental equations are meant to explain, and paradoxically enough the cost of explanatory power is descriptive adequacy. Really powerful explanatory laws of the sort found in theoretical physics do not state the truth"⁸. The exclusion seems to be almost a matter of pure logic when she says: "I will argue that the falsehood of fundamental laws is a consequence of their great explanatory power"⁹. This is a far-reaching thesis, and the question whether we can believe it will depend on our understanding of the meaning of its terms. Now NC is a little bit light-hearted, if I may say so, in matters of definition, and in the case before us we find ourselves almost in the situation described by Moore and Russell: With our usual understanding of the concepts of truth, explanation, phenomenological and fundamental law, the thesis seems to be quite plausible, but once we start a closer analysis more and more doubts come up. Our usual, historically developed understanding of the matter is dominated by 19th century ideas on atomism: With his atomic mechanics, Boltzmann, for one, wanted to explain the phenomenological laws, as he himself called them, of continuum mechanics and thermodynamics¹⁰. This task of explanation was evidently asymmetric: If it could be accomplished at all, it would show the fundamental laws of mechanics, and not the phenomenological laws, to be endowed with explanatory power. On the other hand, in matters of truth and falsehood, the latter had every chance to outdo the former - if not in their truth, at least in their empirical accessibility. Therefore, at face value the matter does not look paradoxical at all, and were it otherwise I would not present it as an instance of my epistemological reciprocities. Let us now look at NC's argumentation.
8 p. 3; similar formulations on pp. 54, 56, 72, 73; references to pages only refer to Cartwright 1983
9 p. 4; italics mine
10 Boltzmann 1979

As we have heard, the fundamental laws, for all their explanatory excellence, are said to be false or - even worse - are said to lie. I am anxious to emphasize that, as I see it, NC's arguments are not meant to establish that all laws of physics are false. This is something almost taken for granted: In spite of the tremendous success of modern physics, it is safe to say and may be said even in this place that all our physical laws are, strictly speaking, false, i.e. their truth is only approximate. But they are false in different ways, and this is what NC wants to argue. There is, first, a kind of falsehood where we should not speak of a lie at all - as I do hope NC will accept. Laws are false here in all honesty: They are false in the sense that they are capable of an internal improvement. This seems to be even the normal case which the physicists take care of as physicists. In this sense, even phenomenological laws can be false and can be improved. The ideal gas law, for instance, was replaced and improved by the van der Waals equation, geometrical optics was improved and refined by wave optics
etc. Most importantly, moreover, fundamental theories can be corrected in this sense, e.g. in the case of gravitation where Kepler's laws were replaced by Newton's and these in turn were superseded by Einstein's. And Einstein himself made desperate attempts to get beyond his own theory. There is thus an honest falsehood in the sense that, for instance, it was humanly impossible for Kepler to foresee in which specific direction his own laws would go wrong one day but could be improved by some follower. There is, however, also the large realm of cases - the so-called idealizations - in which physicists, for some reason or other, make assumptions of which they either know at the time that they are false or at least foresee that, in a rather well-defined respect, something will go wrong one day. In some sense or other, all our theories are idealizations because in all of them something and indeed almost everything in the world is omitted. It is here where we begin to be deceived by the laws of physics. There is, for instance, the problem¹¹ of what Newton's gravitational law, in its simplest version where the force between any two bodies is specified, says if other, e.g. electrical, forces are also present. In this case, something seems to go wrong at least in the sense that the gravitational equations alone would not yield the correct motions. The physicist would perhaps see no real problem here, comparable, for instance, to the internal problem of the perihelion shift of the planet Mercury. He would simply suggest that the new forces have to be accounted for in our equations, and he would not expect to learn something specific about gravitation from this. However, the question of how one has to proceed in the presence of both kinds of force if so far one only knows how it is done with each kind separately is precisely the question NC wants to draw our attention to.
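For concreteness, the situation can be written out for two point bodies that carry both mass and charge. The following is only a sketch in standard textbook notation: the symbols $m_i$, $q_i$, $G$, $\varepsilon_0$ and the separation vector $\mathbf{r}$ are introduced here for illustration and are not taken from the text, and the last line states the composition rule as an assumption rather than as something derived from the two force laws.

```latex
% Force on body 1 due to body 2, with \mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2,
% r = |\mathbf{r}| and \hat{\mathbf{r}} = \mathbf{r}/r.
\mathbf{F}_{\mathrm{grav}} = -\,G\,\frac{m_1 m_2}{r^{2}}\,\hat{\mathbf{r}},
\qquad
\mathbf{F}_{\mathrm{el}} = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r^{2}}\,\hat{\mathbf{r}}

% Assumed composition rule: the two forces are simply added.
m_1\,\ddot{\mathbf{r}}_1 = \mathbf{F}_{\mathrm{grav}} + \mathbf{F}_{\mathrm{el}}
```

Each force law taken by itself yields the observed motion only when the other term vanishes; neither law says anything about how the two are to be combined.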
It is not self-evident that, in this case, we simply have to add the forces, and this is a point on which many relevant physical textbooks are silent - a laudable exception being Mittelstaedt's "Klassische Mechanik"¹².

11 pp. 56ff
12 Mittelstaedt 1970, p. 66

In mechanics, then, we have a general procedure to handle the composition of causes - a procedure that is already applied within gravitational theory and reduces gravity to 2-body forces. But mechanics is not the most general case. If we write down Maxwell's equations in electrodynamics and the Poisson equation for gravitation, nothing can be inferred from this about a possible interaction between the two fields. It was only a new theory, the theory of general relativity, that taught us that every electromagnetic field and indeed every energy carrier would produce a gravitational field. Moreover, the empirical rule that forces have to be added does not yet solve the philosophical problem of what it is, if anything, that force laws describe whenever their domains of application overlap. Even if, in our example, we now have the correct motions, the question still remains: what has become of the gravitational law and what of Coulomb's law insofar as they have in
a sense become parts of the amended law. Some would say that even in this composition, each describes its proper force: the force due to gravity and the force due to electricity, respectively. But the fact that in this situation the forces are not detectable separately suggests that what is still present is something that, by its very nature, cannot be detected in that situation. And these are "the causal powers that bodies have"¹³. A follower of NC in this important argument is Andreas Hüttemann¹⁴. In his Heidelberg dissertation, he wondered about the way in which physicists calculate the specific heat of solids. Typically they think in terms of 'contributions' adding up to the total specific heat: one contribution due to the crystal, one due to the electrons, one to tunnel systems etc., according to whatever constituents the solid has. This understanding of the matter is hardly compatible with the usual empiricist attitude, according to which a certain behavior of an object, assumed in the theory, has to manifest itself under all circumstances. Hüttemann infers from this that physics in general is only about dispositions (or: tendencies) of objects. The term calculated for a crystal then becomes understandable as a contribution also in the amorphous body, because the body always has the disposition to behave like a crystal, even if this may be latent at times. NC has presented other cases in which, according to her view, the laws of physics lie, and it would certainly be worthwhile to classify them under a unifying viewpoint. I am saying this not without the reservation that in doing this we should be very attentive about the extent to which all these falsehoods really are more than variations of the normal internal failure of a theory, as I have distinguished it from the other more criminal cases.
I am thinking here in particular of the so-called ad verum approximations, where instead of falsifying a law in the usual sense we verify an approximate consequence of it¹⁵. We would also have to consider a less radical description of the whole affair - a description where falsehoods and lies are replaced by inapplicability. Heisenberg describes the idealizing method of modern physics by contrasting its inventor Galileo with Aristotle: Whereas Aristotle still would describe the real motions of the bodies, Galileo gives an answer to the new question how the bodies would fall if there were no air resistance. "The possibility," says Heisenberg, "to infer from the processes in nature simple, precisely definable laws has the price that we cannot any more apply these laws directly to those processes"¹⁶. But it is now high time to say a word also on explanations and their reciprocal behavior with respect to truth: the "trade-off of truth and explanatory power" of a theory¹⁷.

13 p. 61
14 Hüttemann 1997
15 pp. 14f, 107ff
16 Heisenberg 1943, p. 32, italics mine
17 p. 56

In its strongest form, the thesis here is not only that the laws lie while having their great explanatory power but also - as quoted
earlier - "that the falsehood of fundamental laws is a consequence of their great explanatory power"¹⁸. This is in evident opposition to the typical realist view that the explanation of true phenomenological laws by a fundamental law allows the (inductive) inference that the fundamental law is also true and, moreover, that the truth of the former is a consequence of the latter's truth. Now it is clear that this realist view strongly depends on the DN-like character of the concept of explanation that is involved. One is, therefore, inclined to conclude that NC's opposing view requires a new concept of explanation, such that the falsehood of fundamental laws really appears as a consequence of their great explanatory power. And indeed there is a suggestion for a new concept, the simulacrum concept of explanation¹⁹. On the other hand, much of NC's argument exploiting the excellence of explanations by fundamental laws seems to be quite independent of any particular concept of explanation and at any rate independent of the differences between DN-like concepts and the simulacrum account. A case in point is again given by Newton's gravitational law in the presence of other forces as compared with the, so to speak, "naked" case without any disturbing influences. Here it is correct to argue, as NC does, that, whatever concept of explanation we choose, the explanatory power of Newton's law given such a context is by far greater than in cases where we simply disregard the environment and consider only the pure gravitational situation. In physics we are interested, so to speak, in the voice of gravitation not only solo but also in the whole orchestra of all phenomena. So this is certainly a valid inference, even if I find that NC underrates the solo part of gravitation (and other interactions). Her own concept of explanation is used in a case that nicely illustrates the idea of reciprocity of truth and explanatory power.
The central idea of the simulacrum account of explanation seems to be that the explanation of a phenomenological law by a fundamental theory is not given directly but is rather mediated by a model "which fits the phenomenon into [the] theory. The fundamental laws of the theory are true of the objects in the model ... But [these] have only 'the form or appearance of things' and ... not their 'substance or proper qualities'"²⁰. Such mediating models can most readily be found in quantum mechanics, where special choices for Hamiltonians such as the square well, the potential step, the harmonic oscillator etc. take on this role. The point is that "we deploy a small number of well-understood Hamiltonians to cover a wide range of cases. But this explanatory power has its price. If we limit the number of Hamiltonians, that is going to constrain our abilities to represent situations realistically. This is why our prepared descriptions [i.e. essentially the models] lie"²¹.

18 p. 4; italics mine
19 pp. 17ff and 151ff
20 p. 17
21 p. 139

I think the same point could be made for other open theories, i.e. theories such as quantum mechanics, classical mechanics,
general field theories etc. where the dynamics is left open and where we are still free to find a Hamiltonian, a Lagrangian etc. that is as realistic as possible or, alternatively, to keep the number of these possibilities low in order to increase the explanatory power.
II Coherence vs. Generality

Leaving NC's story now, I enter the second part of my talk, in which I want to present still another case of an epistemological reciprocity in the sense of the introduction. There is a mutual exclusion "by degrees" between the generality and the coherence or interconnectedness of a physical law. Put into common speech, a physical law says that all physical systems of a given kind behave in a certain manner. All planets move in elliptical paths with the sun at one focus. All gases have pressure, volume and temperature in accordance with van der Waals' equation. All hydrogen atoms behave according to Schrödinger's equation with Coulomb potential etc. In a law reconstructed in this way, two things come together: 1) the content of the law, and 2) its universal form. The content is what the physicists have to find out by exploring nature. It concerns exclusively the individual system and is different from case to case in accordance with the different kinds of systems investigated. That besides this all systems of a given kind satisfy the same condition (according to the law) adds nothing to the law's content. Yet the second part - the universal form - does something immensely important: according to the common view it is only its generality - this kind of generality - that gives a law its dignity as a universal law and, therefore, as a law at all. So far, so good. The problem, however, is that the two parts into which we have dissected a law work against each other. It seems that a natural law, by its very universality and by the independence of the instances it implies, sets a limit to the proper task of the law, namely the establishing of a lawful connection between physical entities. Moreover, it could even happen, and indeed does happen, that the content of a law produces such an integral connection that it simply disallows (in principle) independence and universality in one and the same world.
In such cases, the whole conception in question might lead to an internal contradiction, and at any rate an extreme coherentism, according to which everything is related to everything else, would in some important way not admit any natural law²².

22 Scheibe 1995c

My argument begins by bringing some piece of formalization into our consideration. The question is: What can be said about the logical form of a physical law? Orthodox textbooks in the philosophy of science answer this question by offering us the Aristotelean

    B belongs to all A    (1a)

or its Fregean version

    for all x: if Ax then Bx    (1b)

or even the counterfactual conditional

    for all x: If Ax were (or: had been) the case then Bx would be (or: would have been) the case    (1c)

as the proper and typical form of a lawlike statement²³. But there are several reasons to reject this offer, one of which is that, if we take the trouble to look into the works of physicists, particularly systematically composed textbooks on theoretical physics, we do not by any means find formulations like (1). Neither the actual presentation of fundamental laws like Newton's lex secunda, Maxwell's equations or Einstein's equations nor that of derived and phenomenological laws like a gas law, Snell's law or Ohm's law reminds us even remotely of anything like (1). Such formulations, therefore, are not a suitable starting point for any discussions about physical laws. What else is? The answer that seems to be closest to physics' usage is: Primarily - I do not say finally but primarily - a law is a formula, mostly an equation, indicating a description of a single physical system. Thus Maxwell's equations indicate a description of a system consisting of a current density and an electromagnetic field generated by it. I say "indicating" and not "giving" a description because in writing down Maxwell's equations, for instance, we do not want to associate a particular physical system with them. Rather, our equations contain variables - in Maxwell's case the variables for the field and the current - having no well-determined values but only a range of possible values assigned to them. Primarily, then, a law is no statement but a propositional form containing free variables. It is neither true nor false but rather tells us which physical systems of a preassigned class are physically possible and which are not. If a given system is the kind of thing that can be judged by Maxwell's equations, then it is physically possible if and only if its electromagnetic field and its current density stand in the relation that is formulated in the equations.
Besides free variables, a law contains generally also constants - in Maxwell's case the velocity of light - and, if properly reconstructed, also bound variables. To find examples for these as well, I must recall that physical laws in the narrower sense do not hang in the air. They are, for instance, built upon certain assumptions about space and time or space-time, and these assumptions belong to the complete formulation of a physical theory. In a Newtonian theory, we would find bound variables for space points and others for time points in the formulation of the assumed geometry and kinematics - bound in this case because usually no particular space or time point would belong to the description of a physical system.

23 cf. Stegmüller 1983, Ch. VI
Thus we find that a physical law such as those quoted earlier is not of the form (1a-c) but may rather be symbolized by a formula

    $\Sigma(a_1, \ldots, a_m; x_1, \ldots, x_n)$    (2a)

where the constants $a_i$ give and the free variables $x_k$ indicate a description of a physical system - the system of interest, so to speak. Besides a physical law in the narrower sense, (2a) would also contain all the basic assumptions made in the physical theory to which the law belongs. If I were asked what propositional forms (2a) are admitted a priori for our business I would answer: They are the forms by which we reconstruct a physical system as a structure in the technical sense in which this term is used in modern mathematics²⁴. But this is not the occasion to dwell on this. To complete our trip to the more formal aspects of physical laws, I still have to mention that, after we have decided which formula (2a) we want to use for the description of our physical system, we can make proper statements of the form

    for all y: if Ky then $\Sigma(a_1, \ldots, a_m; x_1^y, \ldots, x_n^y)$    (2b)
where K is an intended range of application of our theory. The statement then says that all physical systems in K behave in a certain manner, namely such that their descriptions satisfy $\Sigma$. I should not conceal from you that there is a difficulty connected with the identification of K. On the one hand, we do not want to say that all things behave according to the condition $\Sigma$. There must be some true restriction brought about by K. On the other hand, this cannot be done in the language in which $\Sigma$ is formulated - the language of the theory. If K were a predicate like $\Sigma$, the answer to the question "which systems obey Kepler's laws?" could hardly be other than "those that obey Kepler's laws". It follows that by Ky we actually want to say something different in kind from what we say in the object language of a physical theory: We want to say, not what the systems in question, as described in physical terms, are like, but how they are given - are presented to us. This may be done by ostension, by enumeration, by pointing out paradigms, by informal pre-descriptions as they are obtained by words like 'bodies', 'atoms', 'stars', 'gases', 'crystals' etc. and by qualified pre-descriptions such as 'dilute gases', 'incompressible fluids', 'slowly moving particles' etc.²⁵

24 cf. Bourbaki 1968, Ch. IV
25 Stegmüller 1976, Ch. IX.4; see also NC's "as if" operator in 1983, pp. 128ff

The foregoing analysis will now be applied for the introduction of our two main concepts of coherence and generality. Statements of the form (2a), when used as axiom schemata of physical theories, reflect achievements of coherence of the theories. Physicists primarily think of connections between physical objects themselves. Causality as a connection between cause and effect certainly was and in some quarters
still is conceived of as being a connection between physical realities. And the same holds for interactions in the sense of contemporary physics. A case in point is Newton's theory of gravitation. What this theory has to say about one body as being a gravitating body cannot be said other than by relating it to every other body in the universe. Moreover, if we were to find a system of bodies moving exactly according to Newton's theory, this very same theory would permit us to conclude that the system is all-inclusive. No part of a Newtonian system being itself a Newtonian system, the part can only be understood by referring to the whole. This sounds very much like what we are told by the rationalistic coherentist philosophers²⁶. Indeed, physical laws of the type of Newton's gravitational equations fulfill the rationalistic idea of coherence with an accuracy only dreamt of by other intellectual circles who boast that they think more holistically than the natural sciences. However, when it is asked what precisely the coherence of a physical theory is like, we are at a loss for an answer. And for the time being, it seems better to look for an answer by resorting to the descriptions we give of physical reality than to reality itself. Then a typical achievement of coherence is that a theory makes its contingent descriptions more or less redundant. A quasi-physical example is the age structure of our present assemblage. We can first give a totally contingent description of this system: We would note for every one of us his or her membership in this assemblage, and for any two of us, say x and y, we would write down whether x is older than y or not. If there are N persons present we would need N(N + 1) statements for this. Alternatively, we could, as part of our description, use lawlike statements as they are given by the axioms for linear order. This would reduce the number of contingent statements still necessary for a complete description to 2N - 1.
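The two counts just given can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the way the totals are split up (N membership statements plus one verdict for each of the N·N ordered pairs in the full description; N membership statements plus N - 1 links of the age chain in the reduced one) is an assumed reading that reproduces N(N + 1) and 2N - 1, not a breakdown stated in the text. The last function gives the share of statements that the order axioms make redundant.

```python
def full_description_size(n: int) -> int:
    """Fully contingent description (assumed reading): n membership
    statements plus one 'older than' verdict per ordered pair."""
    return n + n * n  # equals n * (n + 1)


def reduced_description_size(n: int) -> int:
    """Description using the linear-order axioms (assumed reading):
    n membership statements plus n - 1 links of the age chain."""
    return n + (n - 1)  # equals 2 * n - 1


def saved_fraction(n: int) -> float:
    """Share of the full description that the axioms make redundant."""
    full = full_description_size(n)
    return (full - reduced_description_size(n)) / full


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print(n, full_description_size(n), reduced_description_size(n),
              round(saved_fraction(n), 6))
```

Since the reduced description grows linearly in N while the full one grows quadratically, the saved share approaches 1 as N increases.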
Thus if there were 100 persons now in this hall we could do this by means of about 200 statements from which, together with the laws, the 9900 other atomic statements of the unreduced set would follow. If we define the degree of coherence to be the quotient of the number of statements saved by the laws and the number of statements in a complete contingent description, in the case before us this quotient converges to 1 if N becomes infinite. Examples less trivial abound in theoretical physics. The contingent description of the motion of a particle or of the space-time distribution of a field can be analyzed as being composed of (infinitely many) elementary propositions, totally independent of each other: no subset of the set of these propositions allows us to infer a proposition that is not already a member of that subset. A law, however, typically changes this situation by inducing dependencies between the otherwise independent elementary propositions: Depending on the law, various reductions to characteristic subsets become possible, the most famous ones being sets of initial and boundary conditions. The classical case is the determinism of systems of differential equations found in celestial mechanics, electrodynamics and elsewhere. Even the most abstract and formal theories of present elementary particle physics induce correlations between single contingent propositions. Most remarkable in view of our main argument is that, in all these cases, we infer conclusions from premises about the same system. Coherence is a matter concerning each physical system, independent of any other system.

²⁶ Cf. Blanchard 1939

It is different with generality - the feature of physical law that is in some sense opposed to coherence. There are two kinds of generality: a system-constituting and a system-transcending kind. And I am anxious to emphasize that it is only the latter that stands in the polar relationship to coherence under discussion. By contrast, system-constituting generality makes essential contributions to the coherence in the description of the single physical system. Take first, as a mathematical example, the structure composed of the integers together with their addition and multiplication. The statement that multiplication is commutative is, so to speak, a mathematical fact about this structure; it is, moreover, a general fact because it says something about any two numbers, and it is system-constituting because it is a coherence-establishing contribution to the total constitution of the ring of integers. In Euclidean geometry, viewed as a theory of physical space, no theorem is quantifier-free and, therefore, all of them are system-constituting generalities leading to the unbounded interconnectedness of the structural elements of Euclidean space. Similarly, every dynamical law connects dynamical variables for all time points of an interval which again is a system-constituting generality. In these and many other cases, generalities contained in the property E of (2a) are at work in the description of a single physical system, adding to its coherence. However, in physics we find a second kind of generality that goes beyond the single system.
It is the system-transcending generality exposed in (2b) where we say something about all physical systems in a certain domain. It goes without saying that the universal quantifier occurring explicitly in (2b) has exactly the same meaning as any universal quantifier that may appear in E. The difference I am talking about is not a difference in meaning but in context. The context of a system-constituting generality in E is the formation of a description of one physical system, and the entities quantified over are, so to speak, co-workers in this formation. By contrast, the context of the system-transcending generality is a domain of application containing several systems whose descriptions exclude each other. The essential statement defining a harmonic oscillator, for instance, is that, for all times, the acceleration is proportional to the elongation of the particle. This clearly is again a system-constituting generality: The pieces put together here - these momentary connections between acceleration and elongation - make up a reasonable oscillator only collectively, for a whole time interval. By contrast, the various solutions of the equation of motion defining an oscillator are competitors of each other. A given oscillator can be described only by one of them, and other solutions cannot add to or even complete this description
but represent other oscillators - other 'worlds', so to speak. The step from (2a) to (2b) is, then, a step that does not already occur during the formation of E. Rather it is a new and final step that cannot meaningfully be iterated as can the system-constituting generality. The situation of rivalry between the instances quantified in (2b) suggests the idea that the physical systems concerned are independent possible worlds. The independence means that data concerning one system do not allow us to infer data referring to any other system. In fact, as a consequence of this independence, the idea of coherence, dominating the formation of each single system, completely disappears as soon as it comes to the simultaneous consideration of several systems. There is a radical and inevitable fictionalism at work in the thinking of modern physics. Initiated by Galileo, it reaches far beyond the question of the universality of laws, and was already alluded to in the first part of this paper under the term 'idealization'. In our physical theories, we abstract from almost everything that constitutes the real world, and even the remnant addressed by the theory is often badly mistreated. But we have no other choice. If we could not know something before knowing everything, we would have no knowledge at all. And by applying Galileo's method of the as if - acting as if our little theatre were the total universe - we do get some knowledge. This by itself is a highly non-trivial fact about our world that justifies, to a certain extent, the assumption that the laws of nature are independent of the contingent shape of the universe. We must believe and I think we do believe that, if two moons of the planet Jupiter were removed from our world the rest would still behave according to Newton's law of gravitation. On the other hand, these circumstances forcefully contrast with a very important aspect of our physical science that brings us back on earth, or rather: to our world. 
They contrast with the requirement, indispensable for any empirical science, that the systems that a theory decrees to be physically possible, even if theoretically viewed as so many different possible worlds, should nonetheless be realized in one and the same, namely our, world at least approximately and in sufficient number. I need not emphasize that we would not have the physics that we do have if this requirement were not fulfilled to an astonishing extent. It is this fact, already pointed out, that allows us to replace, in theory, the physics of our world by a physics of possible worlds so attractive because of its simplicity. The question to be treated in conclusion, however, is: Do the two factors of coherence and generality, whose combination in a physical law was analyzed, still remain compatible if they are to be realized explicitly in the same world? To answer this question we have to find out how statement (2b) can be reformulated as an innerworldly statement or - equivalently - how sets of systems satisfying (2a) can be combined in one total system. It is important to see that this task has not been solved with either of the formulas (2). They give us no information whatever about the existence of a world in which two
or more instances of (2) can exist as subsystems. Now, conceptually, the formation of compounds presents no difficulty. For a mechanical theory, it is remarkably different in the two cases of classical and quantum mechanics. But in both cases having a dynamical law amounts to having a series $H_m$ of Hamilton functions (operators) for which in all interesting cases

$$H_{m+n} \neq H_m + H_n \qquad \text{for all } m, n \geq 1 \qquad (3a)$$

whereas the case of independence would mean having equalities throughout. This was the case for Kepler's laws, but no two solutions of Newton's gravitational equations can be realized in one spacetime - if the theory is applied in all strictness. An analogous difficulty occurs in quantum mechanics already for the states of a system. Whereas in classical mechanics states of a compound system are Cartesian products of the states of the subsystems, in quantum mechanics, they are tensor products. As a consequence, the states of a compound system normally are not factorizable, i.e. we have again the inequalities

$$\psi \neq \varphi_1 \otimes \varphi_2 \qquad (3b)$$

where $\psi$ is a state of the compound system and $\varphi_1$ and $\varphi_2$ are any states of the subsystems. In such a situation $\psi$ does not provide us with any definite information about the states of the subsystems. We only know the state of one of them if we know it of the other one. This is a new kind of coherence totally alien to classical physics. And again we see it as a rival of independence.

Against the foregoing argumentation it may be objected that the difficulties for an innerworldly interpretation of (2) in connection with interaction and inseparability do not have any practical importance. All fundamental interactions have a finite range allowing for practically independent and yet internally interacting systems. Similarly, we can prepare practically separable systems showing all the features of inseparability internally. On the other hand, we have to remember that we are investigating a matter of principle. Theories about gravitation and the mechanics of compound systems are of a fundamental character. Such theories are not proposed only to say afterwards that they are to be taken cum grano salis. And if such theories show us that the Galilean method, successful as it is, in the last analysis not only misses the factual constitution of the universe but also violates its laws, then this deserves to be recognized and understood. For the time being, I cannot see that such an understanding has been obtained.
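The inequality (3b) can be made concrete in the smallest possible case. The following Python sketch is an illustration only and not part of the original argument. It assumes two subsystems with two basis states each (an assumption made here for brevity), writes a compound state as a 2x2 table of amplitudes, and uses the fact that such a state factorizes exactly when the determinant of that table vanishes. The function names are hypothetical.

```python
import math


def tensor(phi1, phi2):
    """Amplitude table of the product state phi1 (x) phi2."""
    return [[a * b for b in phi2] for a in phi1]


def is_product_state(c, tol=1e-12):
    """c[i][j] is the amplitude of |i>|j> in the compound state.

    A 2x2 amplitude table has the form phi1 (x) phi2 exactly when its
    rank is at most one, i.e. when its determinant vanishes.
    """
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol


if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    factorizable = tensor([1, 0], [s, s])  # built as a product, so it factorizes
    inseparable = [[s, 0], [0, s]]         # (|00> + |11>) / sqrt(2)
    print(is_product_state(factorizable))  # True
    print(is_product_state(inseparable))   # False
```

In the second state neither subsystem has a state of its own, although fixing the outcome for one fixes it for the other, which is the situation described in the text.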
V. Reduction
The papers in this chapter are superseded by the publication of a 2-volume monograph on the reduction of physical theories.¹ Reading this chapter may, however, still be worthwhile if one wishes to retrace for oneself the way that finally led the author to the theory of reduction described in the book. As an introduction to the matter, paper [21] is recommended. The other papers may then follow in their numerical order.² [20] and [22] are case studies which, on account of their particularity, are proper supplements to the monograph. The remaining articles [23] and [24] are already sketches of the final theory and can be viewed as preliminary stages of it. Among the papers included in this collection besides those of Ch. V, [15] and [6] are also related to the subject of reduction. Reductions of physical theories always mark some progress in the development of physics, and the opinions of the physicists collected in [6] were of considerable influence on the formation of the theory under discussion.³ With (the introductory) paper [21] I became engaged in the so-called Popper/Kuhn controversy at a time when Lakatos, on the side of Popper, and Feyerabend, on the side of Kuhn, had entered the scene and delivered substantial contributions. Although no superior concept of reduction was yet at hand, the paper clarifies an obscurity in Popper concerning his reliance on the Hempel/Oppenheim explanations besides approximative explanations, both of theories. On the other hand I argue against an excessive interpretation of the Kuhn/Feyerabend incommensurability of concepts and theories which allegedly paralyzes theory explanations. The base of the argumentation is an approximative version of the conditions of progress A1) and A2) in [21]. Together with certain contentual correspondences, approximative explanations are very well possible in many, if not in all, cases of theory succession and are not seriously affected by incommensurabilities.
¹ Scheibe 1997b and 1999
² Further stations on this road have been Scheibe 1971, 1975, 1976b, 1982a, 1986c, 1988d, 1988g and 1989a
³ Gell-Mann 1994, Ch. 9, and Weinberg 1994, Ch. III, and 1995 deserve additional mention.

Confidence in the possibility and efficiency of approximative explanations (or: reductions) also comes from a detailed study of the Kepler/Newton case as it is started in [20] still without a definite and general concept of reduction at hand. The attempt made with formulas (20)-(22) in [20] is in part found again in the concept of a limiting case reduction.⁴ At any rate it turns out that the 2-body case can be reconstructed as a proper asymptotic reduction of Kepler's laws to Newton's theory.⁵ The n-body case is mathematically much more difficult and contains even singular exact (kinematical) reductions.

In physics textbooks reductions frequently are performed only as partial reductions. In quantum mechanics, for instance, we get a reduction of the energy spectrum of a hydrogen-like atom with a nucleus at rest (= electron in a Coulomb field) to the energy spectrum of the same atom with the nucleus in motion. It is shown that the latter energy values are better approximated by the former the larger the mass of the nucleus is with respect to that of the electron. As a matter of course this is not a complete reduction of the theory of the atom with its nucleus at rest to the corresponding theory with the nucleus in motion. Rather the reduction is only partial because only a part of each theory is reduced to the corresponding part of the other - in the present case: the two energy spectra. In [22] paradigms of partial reduction are presented for the quantum mechanical harmonic oscillator. A general treatment of this subject is indicated in [23] and presented in greater detail in the monograph.⁶ It is to be noticed that a partial reduction taken by itself is as good a reduction as any other. It is partial only with respect to more comprehensive theories from which parts are chosen to be submitted to a reduction, and this step can be taken whether or not the larger theories can be made the object of a reduction, too.

But what is a reduction after all? Restricted to the case of theory reduction a new answer is given in [23]⁷ and in [24] with special emphasis on ontological reduction.⁸ The general idea of a reduction of theory T to theory T' remains that T is redundant with respect to T'.
But no attempt is made to become more explicit on the general level. No precise conditions of adequacy are proposed and a fortiori no definition is given. Instead a series of relatively special kinds of reduction are specified, and it is assumed that any two reductions A and B can be combined to yield reductions A · B of a kind different from that of A and B. In this way generality is achieved not as usual by analytic explication on a most general level, but by a method of successive synthesis of the reductions. Besides exact reductions for which no approximations are needed, the approximative reductions are particularly important because they allow theory reduction to be a process of correcting older theories by newer ones. Among the exact reductions we find generalizations, equivalences and refinements of theories; the approximative reductions may be asymptotic, local and limiting case reductions. The combination of two exact reductions is again exact, and an approximative reduction combined with any reduction is again approximative. Particularly important are the equivalences in their role as auxiliary reductions. With their help one obtains conceptual assimilations of theories to be reduced to the reducing theories. A famous example is the assimilation of Newton's field theory of gravitation to the theory of general relativity with the help of the Newton/Cartan theory. The application of the general theory of reduction to outstanding examples from physics is only indicated in [23] (§§ IV and V). It is treated in more detail in the second volume of the monograph mentioned.⁹

⁴ Scheibe 1997b, Ch. V.2
⁵ Scheibe 1997b, Ch. V.1
⁶ Scheibe 1997b, Ch. VI, and 1999 passim
⁷ Scheibe 1997b, Ch. I.3, and Chs. IVff
⁸ Scheibe 1999, Ch. IX
⁹ Scheibe 1999 passim
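The rule stated above for combining reductions (two exact reductions combine to an exact one, and an approximative reduction combined with any other is approximative) can be written down as a small data structure. The following Python sketch is a hypothetical illustration, not the formalism of the monograph: the class and function names are invented here, the requirement that the two reductions share an intermediate theory is a modelling assumption, and the classification of the step from the Newton/Cartan theory to general relativity as approximative is assumed only for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    EXACT = "exact"
    APPROXIMATIVE = "approximative"


@dataclass(frozen=True)
class Reduction:
    reduced: str   # the theory T that is shown to be redundant
    reducing: str  # the theory T' with respect to which T is redundant
    kind: Kind


def combine(a: Reduction, b: Reduction) -> Reduction:
    """Form the combination of a (T to T') with b (T' to T'')."""
    if a.reducing != b.reduced:
        raise ValueError("the two reductions do not share an intermediate theory")
    both_exact = a.kind is Kind.EXACT and b.kind is Kind.EXACT
    kind = Kind.EXACT if both_exact else Kind.APPROXIMATIVE
    return Reduction(a.reduced, b.reducing, kind)


if __name__ == "__main__":
    # An equivalence used as an auxiliary (exact) reduction ...
    assimilation = Reduction("Newtonian gravitation", "Newton/Cartan theory", Kind.EXACT)
    # ... followed by a step assumed here to be approximative.
    limiting = Reduction("Newton/Cartan theory", "general relativity", Kind.APPROXIMATIVE)
    print(combine(assimilation, limiting).kind.value)  # approximative
```

The sketch records only whether a combined reduction is exact or approximative; the finer kinds named in the text (asymptotic, local, limiting case) are left out.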
V.20 The Explanation of Kepler's Laws by Means of Newton's Law of Gravitation*

Responding in a personal letter to a preprint of a paper which I had authored on the concept of physical explanation, C. F. v. Weizsäcker noted that there are solutions to the Newtonian gravitational equations according to which in a system of gravitating bodies one of the bodies performs an inertial motion. This possibility, which I had failed to see in the paper, complicates the explanatory connection I had sketched between Newton's gravitational theory and Kepler's laws. It is a particular pleasure for me to be able to dedicate my investigations since undertaken regarding this explanatory connection to the person who through his corrective comment gave the impetus to an important aspect of this work.

I
Everyone knows that physics does not only gather facts, but that it also provides explanations for them. Yet it is not easy to say what these explanations consist in. The philosophy of logical empiricism advocates a model of explanation, the so-called deductive-nomological (D-N) model, according to which a scientific explanation consists in a logical deduction: The proposition expressing the fact to be explained is logically inferred from other propositions, where at least one of the premises employed must have the character of a law.¹ The D-N model of explanation has been criticized from various corners and for various reasons. In two previous papers, I myself have portrayed it as at least in need of supplementation in the case where what is to be explained is not simply a contingent fact to be explained by means of a law but is itself already a physical law.² For physics sometimes also offers explanations for its laws or even for entire theories. And thus the question arises, whether the D-N model can cover cases of this kind as well. It has been claimed that it can.³ But even if one grants that the D-N model is an adequate and perhaps even the exclusively valid model for the case of the explanation of contingent facts, at the higher level of the explanation of theories, doubts will arise on both counts. In this regard, particular weight is accorded to the fact that, historically speaking, a supersession of a theory T1 (e.g. classical mechanics) by a theory T2 (e.g. quantum mechanics) is regularly accompanied by the phenomenon that T2 in this or that respect puts T1 in doubt. For in such a case it becomes questionable whether T1 and T2 can still be brought into a purely deductive relation or whether perhaps the concept of a D-N explanation loses its applicability. One can ask, of course, why such cases would even demand an explanation, since T1 has been given up in favor of T2. The answer is that as long as a theory T1 remains uncontested and unsurpassed one does not have an explanation for it, and the need for such an explanation can only arise when something changes in this situation, i.e. when T1 begins to reveal certain deficient traits. Here - at the point of culmination of a scientific discipline - we find ourselves in a fundamentally different position with regard to explanation than in the lowlands of contingent facts: Inasmuch as the need for explanation arises at all, what is to be explained has already been recognized as defective. But that such a need can still arise in this case is due to the fact that T1 has proven itself empirically and that it is perhaps not without certain theoretical advantages. This can be reason enough for wanting to understand, on the basis of a new theory T2, to what extent T1 still remains intact and what in the end constitutes the advantage of T2 over T1. Whenever physicists say that T1 is a limiting case of T2 - as, for example, classical mechanics is said to be a limiting case of quantum mechanics - we seem to be dealing with the kind of case just outlined. Yet, whatever physics has said about the relation between T1 and T2, when T1 is a limiting case of T2, has been too sketchy or too general to be able to give us a clear picture of the problem at hand. This situation did not change substantially when the theory of science, aware of the inadequacy of the D-N explanation for accounting for the limiting case scenario, took up the cause. In the last fifteen years or so, several works were published on the subject, usually under the heading of an approximative explanation.⁴ But once again, the arguments are, as a rule, too general. What is missing are detailed case studies which would prepare the way for a general conception of an approximative explanation.

* First published as Scheibe 1973a. Translated for this volume by Hans-Jakob Wilhelm
¹ See, for example, Hempel 1965, pp. 245ff and 331ff
² Scheibe 1970; Scheibe 1971
³ Hempel 1965, pp. 247f and 343ff
In what follows, I shall offer such a case study by means of an investigation into the relation between Kepler's laws (T1) and Newton's theory of gravitation (T2). It will become apparent from the detail that in addition to the concept of an approximative explanation, towards which this investigation as a whole is geared, we must once more discuss the concept of a D-N explanation.

II

Most physicists may be assumed to hold the view that in some sense of the word "explanation"

1) Kepler's three laws can be explained on the basis of Newton's law of gravitation,
2) while conversely Newton's law of gravitation cannot be explained on the basis of Kepler's three laws.

⁴ Popper 1958; Scriven 1963, esp. pp. 109ff and 123f; Hempel 1965, pp. 343f; Feyerabend 1962, pp. 46ff; Feyerabend 1965a, pp. 228ff; Putnam 1965, esp. 206ff
If, in the present case, it is possible to insert the concept of a D-N explanation for the sense of the word "explanation" in question, then - it seems - it should be the case that

1') Kepler's three laws follow logically from Newton's law of gravitation,
2') while conversely Newton's law of gravitation does not follow logically from Kepler's three laws.

As far as the logical relation between Newton's law and Kepler's laws is concerned, philosophers of science usually hold the view⁵ that

1") Kepler's laws do not follow from Newton's law,
2") while Newton's law does not follow from Kepler's laws either.

And these statements are sometimes given the radical form of, and are deduced from, the Duhemian thesis of incompatibility, according to which

DU) Newton's law and Kepler's laws simply contradict each other.⁶

From this it seems to follow that although in the present case the concept of a D-N explanation may be employed with respect to 2), it cannot be employed with respect to the really important thesis 1) and certainly not for the view consisting of 1) and 2):

3") There is no D-N explanation of Kepler's laws on the basis of Newton's law.

Instead, for the purpose of giving an explication of 1) and 2), it is claimed that

1"') Kepler's laws follow from Newton's law at least approximatively,
2"') but that conversely Newton's law does not follow approximatively from Kepler's laws,

and that in this sense

3"') an approximative explanation of Kepler's laws on the basis of Newton's law is possible.⁷

⁵ See the papers quoted in no. 4, and Feyerabend 1963, and Nagel 1961, p. 58
⁶ This thesis goes back to Duhem where it appears, however, in a different context: Duhem 1962, Part II, Ch. VI. In the papers quoted in no. 5 Popper, Feyerabend and Hempel put Duhem's thesis in our present context and infer DU) from it.
⁷ See esp. the papers by Popper and Hempel quoted in no. 4. I do not make any one of the authors quoted in nos. 4-6 fully responsible for the formulation of the statements 1') to 3"'). This formulation is detached from the particular aims of those authors and a reading of the total situation applied to the Kepler-Newton field. Feyerabend has rejected the idea of an approximate explanation altogether.

In what follows, I now want to show that this complex of questions cannot be dealt with as straightforwardly as the authors who write about it seem to imagine. First, we need to get clear about how a comparison in terms of
an establishment of systematic connections between Newton's law of gravitation and Kepler's laws is to be rendered possible at all (Section III). The comparison itself will reveal that Duhem's thesis DU) is false, if it is not limited to suitable domains of application. It is true that the conclusions 1") and 2") (from DU)) remain valid, but not the conclusion 3"), at least not without further assumptions. For this conclusion still admits of the interpretation of conditional D-N explanations of Kepler's laws on the basis of Newton's law, i.e. explanations in which besides Newton's law there are additional conditions - a very common case in D-N explanations. But it is just the compatibility of Kepler's laws with Newton's law, i.e. the fact that DU) is false, which makes explanations of this kind possible. In that case, however, we shall have to ask whether the D-N explanations thus (formally) gained can satisfy as explanations. And since this must be denied, after this temporary return to the concept of a D-N explanation (Section IV), we shall find ourselves relying on approximative explanations after all. Here too, however, things are more complicated than past discussions were able to reveal. Thus 1"') is false precisely in the sense in which 1") is true, that is, in the sense of an unconditional inference. And finally, 2"') is false as well in that, as strange as it may initially seem, in this direction an approximative inference without additional conditions is possible. Yet, in a positive regard, it will become apparent - and oddly enough precisely because 2"') is not correct - that properly understood the concept of an approximative explanation has a significant explanatory function which is at the same time able to express the superiority of the Newtonian conception vis-à-vis that of Kepler (Section V).
III

At the outset of my preparations for the intended comparison between Kepler's laws and Newton's law of gravitation, I want to note that it is impossible to link this comparison immediately to the usual formulations of these laws. It would certainly be instructive to demonstrate this in some detail. The constraints of this paper, however, require me to begin by introducing somewhat dogmatically two comparable formulations. Yet this will already reveal that what is to be compared are not two sets of laws or two theories of a determinate empirical content, but rather two forms of propositions which state that a system of bodies is a Newtonian or a Keplerian system. Afterwards, I shall at least sketch how these two propositional forms are connected with the usual versions of Newton's and Kepler's "theories". I shall continue to refer to these versions as "theories". To begin, we must find a common basis of comparison for the two theories. Without the explicit introduction of such a basis of comparison, any further talk about possible connections such as we are searching for here would only be idle speculation. Now, what is certain is that both Newton's theory as well as Kepler's theory have as their common subject matter the motions of
bodies in space and time. Hence, a common basis of comparison will have to be looked for in a general kinematic theory. As such I choose a certain theory of space and time and of the motions possible within them according to which a class of Galilean inertial systems is characterized as spatio-temporal rest frames. It is imperative that this class be one and the same for both Kepler's and Newton's theories, since otherwise no comparison would result. If in a system $\Sigma$ of $N$ bodies $\sigma_1, \ldots, \sigma_N$ ($N \geq 2$) one idealizes the $\sigma_k$ as centers of mass, then with a given inertial system one can represent every $\sigma_k$ by means of a vector function $\mathbf{r}_k$ such that $\mathbf{r}_k(t)$ represents the location of $\sigma_k$ at time $t$. I shall call every system of such functions $\mathbf{r}_1, \ldots, \mathbf{r}_N$ (subject to suitable conditions of differentiation) a possible spatio-temporal description of the state of $\Sigma$, and I shall think of these as united in the (Galileo-invariant) state-space $\mathfrak{S}$ of $\Sigma$.

The basis of comparison just outlined allows us to define the concept of a Newtonian system $\Sigma$. Besides the spatio-temporal description of state, the characterization of a Newtonian system must also include the masses $m_k$ of $\sigma_k$. The decisive condition is that in an inertial system the $m_k$ together with the $\mathbf{r}_k$ defined by the inertial system satisfy the Newtonian gravitational equations

$$\ddot{\mathbf{r}}_k = -\sum_{l \neq k} m_l\, (\mathbf{r}_k - \mathbf{r}_l)\, |\mathbf{r}_k - \mathbf{r}_l|^{-3} \qquad (1)$$

These equations are Galileo-invariant and can therefore function as equations of motion in the underlying kinematic theory.

Formally speaking, the concept of a Keplerian system is defined in the same manner. The masses $m_k$ are now replaced by the single positive constant $\mu$, which is likewise independent of the inertial system and which I call the Kepler-constant. In place of the equations (1), we have in an inertial system the conditions

$$\begin{aligned}
\ddot{\mathbf{r}}_1 &= 0 \\
\ddot{\mathbf{r}}_k &= -\mu\, (\mathbf{r}_k - \mathbf{r}_1)\, |\mathbf{r}_k - \mathbf{r}_1|^{-3} && (2 \leq k \leq N) \\
0 &> \tfrac{1}{2} |\dot{\mathbf{r}}_k - \dot{\mathbf{r}}_1|^2 - \mu\, |\mathbf{r}_k - \mathbf{r}_1|^{-1} && (2 \leq k \leq N)
\end{aligned} \qquad (2)$$

which for the sake of simplicity may be called Keplerian equations, even though the last group of these conditions consists of inequalities.⁸ The Keplerian equations are obviously also Galileo-invariant and can therefore function as equations of motion in the underlying kinematic theory.

⁸ In (2) the matter, of course, is only that one of the bodies satisfies the first equation. This one is designated by "1".

Having stated these definitions, I briefly want to explicate and justify them. Newton's theory is usually presented in two parts. Besides the spatio-temporal concepts and the concept of mass, the concept of force and the
gravitational constant come into play as well. Within the framework of the fiction of a world consisting of mass points, these parts state:

General mechanics: At any given time there is a total force acting on every mass point which is proportional to its (inertial) mass and instantaneous acceleration.

Law of gravitation: At any given time there is a force - i.e. the force of gravitation - emanating from every mass point on every other mass point in the direction from the latter to the former. The force is proportional to the gravitational constant and the two (inertial) masses and inversely proportional to the square of the momentary distance between the two mass points.
At first glance, both propositions give the impression of being well-determined, empirical propositions intended to be either true or false. On the other hand, it is also immediately clear that neither proposition by itself permits one to draw any conclusions regarding the possible motions in an arbitrary system of gravitating masses. Now, the real problem consists in the fact that even taken together these propositions do not lead to such conclusions forthwith. There are two factors which prevent this. First, in order to produce welldetermined equations of motion, general mechanics obviously requires a consideration of all kinds of forces acting on the mass points in the system concerned. If one leaves forces other than gravitational forces out of consideration, one thereby makes the assumption that other kinds of forces do not exist or that due to some contingent circumstances these may be disregarded in light of the force of gravitation. Second, for the same purpose the law of gravitation just as obviously requires a consideration of all the mass points present in the fictitious world. Thus, if one applies the law to some system of mass points, one again makes the assumption that no further mass points exist outside of this system or that due to contingent circumstances their gravitational action may be disregarded. We are thus led to a choice of a total of four assumptions 9 , of which in each case exactly one must be made an additional assumption such that in conjunction with general mechanics and the special law of gravitation we arrive at equations of motion which concern only gravitation. Mathematically, this would yield for each of the four mentioned cases of application (except for the gravitational constant) the gravitational equations (1). One could not claim point-blank, however, that any arbitrary system of mass points moves according to these equations, but merely that it moves in this way if one of the four mentioned assumptions is fulfilled. 
And moreover, in the three of these four cases in which something is expressly disregarded, this fulfillment of the equations of motion would only be approximatively guaranteed. Leaving this difficulty aside, the introduction of one of the four premises would

⁹ I.e.: a) no other forces or masses, b) no other forces and neglect of other masses, c) neglect of other forces and no other masses, d) neglect of other forces and masses.
V.20 The Explanation of Kepler's Laws
nevertheless give us as the core of the Newtonian theory a determinate general proposition which explicitly contains the gravitational equations and hence the concept of a Newtonian system mentioned above. If we are to compare this with Kepler's theory, we must first remind ourselves of the fact that in its original formulation the latter referred to a single system of bodies: to the solar system consisting of the sun and the six planets known in Kepler's time. It is clear, on the other hand, that the three Keplerian laws formulated for this system can be formulated for any arbitrary system E of bodies $\sigma_1, \ldots, \sigma_N$. In a given inertial system, these laws state:

Kep I: $\sigma_2, \ldots, \sigma_N$ move in ellipses with a common focal point in which $\sigma_1$ is at rest.
Kep II: $\sigma_2, \ldots, \sigma_N$ move with constant areal velocity.
Kep III: the ratio of the cube of the major semi-axis to the square of the orbital period is the same for the elliptic orbits of all $\sigma_2, \ldots, \sigma_N$.

Now, if we begin with the (so-called) Keplerian equations (2) and use the reference frame in which $\sigma_1$ is at rest ($\mathfrak{r}_1 \equiv 0$), then Kep I to Kep III will follow and the ratio stated in Kep III is $\mu/4\pi^2$. This is found in every textbook of mechanics. Conversely, it can be proven that the equations of motion in (2) with $\mathfrak{r}_1 \equiv 0$ follow from Kep I to Kep III, where $\mu$ in turn is equal to $4\pi^2$ times the ratio stated in Kep III.¹⁰ Employing once again Kep I, the inequalities in (2) can be seen to follow: They simply exclude the hyperbolic and parabolic paths still permitted by the equations of motion. Thus it is shown that, applied to an arbitrary system E, the Keplerian laws are capable of a Galileo-invariant generalization in the form of the Keplerian equations (2), a generalization which is required if a comparison with the Newtonian theory is to be achieved. The abandonment of the reference to a particular system of bodies, however, turns Kepler's theory, as a proposition with a determinate empirical content, into a concept, that is, in conjunction with the Galileo-invariant generalization, this abandonment yields the concept of a Keplerian system mentioned above. Yet, in contrast with the Newtonian case, from the point of view of the Keplerian theory as such, it is impossible to see how this concept in turn could be transformed into a general proposition. For the purpose of a comparison with Newton's theory, this leaves us with no other option but to take from the latter likewise only the concept of a Newtonian system and to compare this concept with the concept of a Keplerian system.
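The connection between the second and third Keplerian laws and the Kepler-constant can be checked numerically. The following sketch is only an illustration: it assumes the standard textbook parametrization of an elliptic orbit by the eccentric anomaly, with the attracting body at rest in a focus; the helper names `kepler_state`, `areal_velocity` and `orbital_period` are hypothetical and do not occur in the text.

```python
import math

def kepler_state(a, e, mu, E):
    """Position and velocity on a Kepler ellipse at eccentric anomaly E.

    The central body sits at the origin (a focus of the ellipse);
    a is the major semi-axis, e the eccentricity, mu the Kepler-constant.
    """
    b = a * math.sqrt(1.0 - e * e)        # minor semi-axis
    n = math.sqrt(mu / a**3)              # mean motion
    dE_dt = n / (1.0 - e * math.cos(E))   # from Kepler's equation n*t = E - e*sin(E)
    position = (a * (math.cos(E) - e), b * math.sin(E))
    velocity = (-a * math.sin(E) * dE_dt, b * math.cos(E) * dE_dt)
    return position, velocity

def areal_velocity(position, velocity):
    """Area swept out per unit time: half the z-component of r x v."""
    (x, y), (vx, vy) = position, velocity
    return 0.5 * (x * vy - y * vx)

def orbital_period(a, e, mu):
    """Period obtained as (area of the ellipse) / (areal velocity)."""
    b = a * math.sqrt(1.0 - e * e)
    return math.pi * a * b / areal_velocity(*kepler_state(a, e, mu, 0.0))
```

Evaluating `areal_velocity` at several points of the same orbit gives the same value (Kep II), and `a**3 / orbital_period(a, e, mu)**2` agrees with `mu / (4 * math.pi**2)` independently of `a` and `e` (Kep III).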
From the Newtonian theory and on the basis of the above consideration, we know in which cases it would be claimed that an actual system is a Newtonian system. And the comparison of the concept of a Keplerian system with the concept of a Newtonian system might teach us what - again from the perspective of the Newtonian theory - the conditions of validity look like for this concept

¹⁰ See, for example, Born 1949, Appendix 1.
of a Keplerian system. But first we shall disregard the Newtonian concept of force and, accordingly, let the gravitational constant be equal to 1, as is already provided for by (2).

IV
I now begin the comparison by limiting myself to the simplest and yet most important case in which the concepts of the Newtonian and Keplerian systems are applied to one and the same system E having state-space $\mathfrak{S}$. To do this, we must remove one final obstacle. It consists in the fact that the conceptual characterization of E in terms of Newton's theory contains the concept of mass which does not occur in Kepler's theory and that conversely the conceptual characterization of E in terms of Kepler's theory contains the concept of a Kepler-constant which does not occur in Newton's theory. Both characterizations share the concept of a spatio-temporal description of the state of E. Thus the comparison will have to proceed from the latter, while the other two concepts will have to be eliminated. This can be achieved by defining in each case a Galileo-invariant property of a spatio-temporal description of the state of E: Let $\{\mathfrak{r}_k\}_k \in \mathfrak{S}$ be called Newtonian if there exist masses $m_k$ such that $\mathfrak{r}_k$ and $m_k$ describe a Newtonian system. And let $\{\mathfrak{r}_k\}_k \in \mathfrak{S}$ be called Keplerian if there exists a Kepler-constant $\mu$ such that $\mathfrak{r}_k$ and $\mu$ describe a Keplerian system. We thus arrive at two Galileo-invariant subsets

$$\mathfrak{S}_{\mathrm{New}},\ \mathfrak{S}_{\mathrm{Kep}} \subseteq \mathfrak{S} \tag{3}$$

and everything else will depend on the nature of the relationship between these two sets. My first thesis concerning the relation between $\mathfrak{S}_{\mathrm{New}}$ and $\mathfrak{S}_{\mathrm{Kep}}$ is expressed in

$$\mathfrak{S}^{(\perp)}_{\mathrm{New}} \cap \mathfrak{S}^{(\perp)}_{\mathrm{Kep}} \neq \emptyset \tag{4}$$

Here $\emptyset$ is the empty set, $\mathfrak{S}^{\perp}_{\mathrm{New}}$ and $\mathfrak{S}^{\perp}_{\mathrm{Kep}}$ are the set-theoretic complements of $\mathfrak{S}_{\mathrm{New}}$ and $\mathfrak{S}_{\mathrm{Kep}}$ in $\mathfrak{S}$, and the bracket around "$\perp$" indicates that while this symbol may occur in (4), it is not necessary that it occurs. (4) thus gives us a total of four propositions which together state that the concepts of a Newtonian system and a Keplerian system are in their application to one and the same system E logically completely independent of each other. Two of the propositions (4) are obviously explications of the propositions 1") and 2") in II, while a third proposition (one that is only valid for $N \geq 3$) negates the incompatibility thesis DU) in II:

$$\mathfrak{S}_{\mathrm{New}} \cap \mathfrak{S}_{\mathrm{Kep}} \neq \emptyset \tag{4'}$$
For each $N \geq 3$, we arrive at one class of cases $\{\mathfrak{r}_k\}_k \in \mathfrak{S}_{\mathrm{New}} \cap \mathfrak{S}_{\mathrm{Kep}}$ by means of the following merry-go-round models.¹¹ Let E consist of bodies $\sigma_1, \ldots, \sigma_N$. $\sigma_1$ moves uniformly in a straight line. The $N - 1$ other $\sigma_k$ move with the constant angular velocity $\omega$ in the rest frame of $\sigma_1$ on a circle with the radius $R$ around the center $\sigma_1$, thereby constantly forming a regular $(N-1)$-gon. Each one of these models displays a 1-parameter family of inertial systems: $\sigma_1$ is chosen as the origin and the plane determined by the $(N-1)$-gon of the remaining $\sigma_k$ is chosen as the xy-plane of the inertial system. Let $\sigma_2$ at time $t = 0$ have the coordinates $(R, 0, 0)$ and let the rotation be counterclockwise. If we take the xy-plane as a complex plane and accordingly describe the $\sigma_k$ by means of complex-valued time-functions $z_k$, then on the basis of the definition of the inertial system given so far the description is given by

$$z_k(t) = \zeta_k\, q(t) \tag{5}$$

in conjunction with

$$\zeta_1 = 0, \qquad \zeta_k = \exp\left\{2\pi i\,\frac{k-2}{N-1}\right\} \ \text{for } 2 \leq k \leq N, \qquad q(t) = R\exp(i\omega t) \tag{6}$$

Now it must be shown that every one of these models can be completed to form a Newtonian as well as a Keplerian system. For the Newtonian case, we choose

$$m_2 = \ldots = m_N\ (= m) \tag{7}$$

for the masses of the circulating $\sigma_k$, and $m_1$ and $m$ in accordance with
(8) with the pure numerical function
1(2) =
~ 1 xCN)
1(N) = 2 L
sin -1 (7r ; )
n=l
where X(N)
=
for N 2: 3
(9)
lY.- 1 for even N { ~2l for odd N
With (7) to (9), (5) satisfies the Newtonian equations (1). For the Keplerian case, we choose the Kepler-constant $\mu$ in accordance with

(10)

¹¹ It was the simplest of these models (for N = 3) which C. F. v. Weizsäcker had pointed out to me as the case of a Newtonian system with one body moving inertially. This remark drew my attention to the possibility of (4').
On this basis, (5) also satisfies the Keplerian equations (2). The merry-go-round models are quite trivial even with respect to the Keplerian laws in the sense that in the rest frame of $\sigma_1$, all other $\sigma_k$ move in one and the same circle with one and the same angular velocity. Yet this class of models can be generalized in the following manner. Using again a complex notation, we see that the functions (5) satisfy the Newtonian equations (1) under the conditions

$$\sum_{l \neq k} m_l\,(\zeta_l - \zeta_k)\,|\zeta_l - \zeta_k|^{-3} = -\mu\,\zeta_k, \qquad \zeta_l \neq \zeta_k \ (l \neq k), \qquad \ddot{q} = -\mu\, q\,|q|^{-3} \tag{11'}$$

valid for a particular $\mu > 0$.¹² If, in addition, we assume for $N \geq 3$ that for $2 \leq k \leq N$

(11'')

then the $z_k$ satisfy the Keplerian equations (2) as well. The merry-go-round models turn out to be special cases of these with the $\zeta_k$ and $q$ stated in (6), provided that (7) and (8) are valid for the $m_k$ and $\mu$. But the conditions (11) reach further: Already for N = 3, the most general solution is such that $\sigma_1$ is at rest in the common focal point of two congruent ellipses with a collinear major axis, while $\sigma_2$ and $\sigma_3$ move with equal masses $m$ on these ellipses point-symmetrically with respect to $\sigma_1$ and $\mu = m_1 + \frac{m}{4}$, corresponding with (8) and (10). The merry-go-round model is here merely the special case in which both ellipses coincide (in a circle).

The fact thus demonstrated - that a system E can be a Newtonian as well as a Keplerian system - has consequences for the possibility of the application of the concept of an exact D-N explanation. Of course, an unconditional explanation of this kind, say, an explanation of the proposition that E is a Keplerian system by means of the proposition that E is a Newtonian system (or vice versa), is not possible, for (4) also tells us that these two propositions do not follow logically from each other. But in analogy to (3), one can form new subsets
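(12)

in which Zus (Zus') are, relative to Kep (New), contingent relations between $\{\mathfrak{r}_k\}_k$ and $m_k$ ($\mu$), compatible with New (Kep):

$$\begin{aligned} \mathfrak{S}_{\left\{{\mathrm{Zus} \atop \mathrm{Zus}'}\right\}} &\nsubseteq \mathfrak{S}_{\left\{{\mathrm{Kep} \atop \mathrm{New}}\right\}} \\ \mathfrak{S}_{\left\{{\mathrm{New} \wedge \mathrm{Zus} \atop \mathrm{Kep} \wedge \mathrm{Zus}'}\right\}} &\neq \emptyset \end{aligned} \tag{13}$$

¹² Siegel 1956, pp. 74f.

With an appropriate choice of Zus (Zus'),

$$\mathfrak{S}_{\left\{{\mathrm{New} \wedge \mathrm{Zus} \atop \mathrm{Kep} \wedge \mathrm{Zus}'}\right\}} \subseteq \mathfrak{S}_{\left\{{\mathrm{Kep} \atop \mathrm{New}}\right\}} \tag{14}$$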
is valid in addition to (13). Here one exploits the fact that the Newtonian and Keplerian equations are deterministic. If, in the case of the merry-go-round models, we choose, for example, initial conditions for positions and velocities that are compatible with (5) and (6) and if we take (7) to (9) into consideration for the masses $m_k$, then we obtain a condition Zus for which (13) and (14) (the first case) are valid. Something analogous holds for Zus', if we consider (10) instead of (7) to (9): We then obtain the second case of (13) and (14).

Now, does (14) together with the additional conditions (13) really offer us schemas for D-N explanations? Already at the beginning of this paper we noted that, although the concept of a D-N explanation is also intended with a view to the explanation of laws (and not merely of contingent propositions), precisely in this respect it has so far not been sufficiently tested. With (13) and (14) we now have a case for which such a test is possible. I begin with the assumption that the propositions "E is a Newtonian system" and "E is a Keplerian system" are lawlike propositions about E.¹³ In (14) we are then dealing with the schema of an inference to a lawlike proposition. The premise is a conjunction of a lawlike and a contingent proposition provided that, as was done above, initial conditions are expressed in Zus. The first line of (13) shows that the lawlike premise is necessary for the stringency of the inference. The second line of (13) together with (14) shows that the (lawlike) proposition to be inferred is not already valid for logical reasons, i.e. that it is empirical. Except for the lawlike character of the proposition to be inferred, it is just conditions of this kind which were envisaged as the conditions of adequacy for the D-N model of explanation.

All this becomes again very clear if we suppose for a moment that the lawlike propositions on the right side of (14) are replaced by certain contingent propositions about the state of the system, where these propositions may refer to a different point in time than those appearing in Zus on the left side of (14). It is just the cases generated in this manner which Hempel took to be paradigmatic for D-N explanations.¹⁴ Thus, inasmuch as the D-N model reveals how it is to be applied also to the explanation of laws, (13) and (14) represent such explanatory possibilities. And with this we would have refuted not only the incompatibility thesis DU)

¹³ They are lawful not in so far as E is a single object, but still in so far as the statements are Galileo-invariant.
¹⁴ See Hempel 1965, esp. p. 351.
but also the thesis 3/1) which is based on it, i.e. the thesis regarding the impossibility of D-N explanations in the Kepler-Newton-realm. Of course, this is not the final word about Duhem's thesis, nor is it the final word about the question regarding the possibility of D-N explanations. Duhem's position can be defended within the framework of a dynamic comparison between Newton's and Kepler's theories.¹⁵ Within the framework of a purely kinematic comparison, this thesis can only be saved if it is relativized to suitable contingent domains of application: There are obviously countless domains of application $\mathfrak{S}_0 \subseteq \mathfrak{S}$ such that

$$\mathfrak{S}_{\mathrm{New}} \cap \mathfrak{S}_{\mathrm{Kep}} \cap \mathfrak{S}_0 = \emptyset, \qquad \mathfrak{S}_0 \neq \emptyset \tag{15}$$

and from an empirical point of view our solar system belongs to such an $\mathfrak{S}_0$. As far as the concept of a D-N explanation is concerned, our demonstration of the applicability of this concept is subject to the following objection, one that can be directed either against our reconstruction or against the usefulness of the D-N model for the explanation of laws in general: A $\{\mathfrak{r}_k\}_k \in \mathfrak{S}$ is explainable as a Keplerian case in accordance with (14) (upper index-line) precisely when it is explainable as a Newtonian case in accordance with (14) (lower index-line) and it is explainable as a Newtonian case in accordance with (14) (lower index-line) precisely when it is explainable as a Keplerian case in accordance with (14) (upper index-line). This can be demonstrated, and in addition it turns out that in these explainable cases we are dealing precisely with the $\{\mathfrak{r}_k\}_k \in \mathfrak{S}_{\mathrm{New}} \cap \mathfrak{S}_{\mathrm{Kep}}$, a fact that again demonstrates the complete symmetry. Yet this symmetry contradicts our intention regarding the one-sidedness of the explanatory relation formulated in 1) and 2) of II. Thus it seems that we cannot say that the two lawlike propositions explain each other. Rather, one could understand the existence of their common domain of validity in such a way that within it both lawlike propositions can equally be drawn upon for the purposes of deterministic explanations of propositions about states. If E, for example, is a merry-go-round in the sense defined above, then a state of E at time $t_1$ can be explained through another state at time $t_0$ by means of the proposition that E is a Newtonian system as well as by means of the proposition that E is a Keplerian system.¹⁶ For a true explanation of the Keplerian theory through the Newtonian theory, however, we must look elsewhere.
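That a merry-go-round is at once a Newtonian and a Keplerian case can be checked numerically. The sketch below rests on several assumptions that are not taken from the text: the gravitational constant is set to 1, the Keplerian equations are read as saying that $\sigma_1$ is unaccelerated and every other body is attracted by $\sigma_1$ alone with constant $\mu$, and the balance condition $\omega^2 R^3 = m_1 + \frac{m}{4}\sum_{n=1}^{N-2} \sin^{-1}\!\left(\frac{\pi n}{N-1}\right)$ was derived for this sketch from the symmetry of the regular polygon (for N = 3 it gives $m_1 + \frac{m}{4}$). All function names are hypothetical.

```python
import cmath
import math

def merry_go_round(N, R, m1, m):
    """Positions at t = 0, masses, Kepler-constant and angular velocity.

    sigma_1 sits at the origin; the other N - 1 bodies (equal masses m)
    form a regular (N - 1)-gon of radius R in the complex plane.
    The balance condition below is an assumption derived for this sketch.
    """
    zeta = [0j] + [cmath.exp(2j * math.pi * (k - 2) / (N - 1))
                   for k in range(2, N + 1)]
    f = 0.25 * sum(1.0 / math.sin(math.pi * n / (N - 1))
                   for n in range(1, N - 1))
    mu = m1 + m * f
    omega = math.sqrt(mu / R**3)
    positions = [R * z for z in zeta]
    masses = [m1] + [m] * (N - 1)
    return positions, masses, mu, omega

def newtonian_accelerations(positions, masses):
    """Pairwise inverse-square attraction with unit gravitational constant."""
    acc = []
    for k, zk in enumerate(positions):
        total = 0j
        for l, zl in enumerate(positions):
            if l != k:
                d = zl - zk
                total += masses[l] * d / abs(d) ** 3
        acc.append(total)
    return acc

def keplerian_accelerations(positions, mu):
    """sigma_1 unaccelerated; every other body attracted by sigma_1 alone."""
    z1 = positions[0]
    return [0j] + [-mu * (z - z1) / abs(z - z1) ** 3 for z in positions[1:]]
```

For uniform rotation, the acceleration of each body must equal `-omega**2 * z`; both `newtonian_accelerations` and `keplerian_accelerations` reproduce this value for the configuration returned by `merry_go_round`, so the same paths satisfy both sets of equations.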
V

For our purposes, we must look towards the idea of an approximative explanation. The turn towards this type of explanation does not merely have a

¹⁵ The possibility of such a dynamical comparison as different from the more natural kinematical comparison drawn here will be treated in another paper.
¹⁶ This view I owe to a written communication by Hempel.
negative motivation, that is, the inadequacy of the demonstrated possibilities of a D-N explanation. Such a failure does not yet point in a definite direction where a suitable substitute might be found. We can be steered into a new direction, however, once we realize that already the single proposition that this determinate, empirically given system E is a Newtonian or a Keplerian system, as an empirical proposition, as a rule cannot be stated with the same mathematical precision as the mathematical proposition $(\mathfrak{r}_1, \ldots, \mathfrak{r}_N) \in \mathfrak{S}_{\mathrm{New}}$ (or $\mathfrak{S}_{\mathrm{Kep}}$) which represents it. The reason for this might be the incompleteness and inaccuracy of possible measurements - even when, and especially when, E in itself is a Newtonian or Keplerian system. But the reason might also be that the latter is not the case at all and that instead E is only approximatively Newtonian or Keplerian.¹⁷ This latter case is often coupled with the former case of the inadequacy of the available measurements in such a way that a particular empirical proposition is at first confirmed within the framework of measurements taken with such and such a precision and subsequently put into question by further and more precise measurements. But even if in such a case one is eventually led to regard the proposition as false, one would not want to regard it as grossly false. One might even continue to have an interest in it as having an approximative validity and expect to find an explanation for this approximative validity. This is precisely what happened with regard to Kepler's laws, and this case has not remained an exception. We are fundamentally in a position such that a theory which has proven itself well empirically and has been taken up into the stock of physical science is not immune to being one day empirically refuted and - sooner or later - replaced by a better theory.
But even then one will try to understand within what limits it remains valid, and this justification will obviously have an approximative component. The approximative comparison of Newton's and Kepler's theories proceeds from the same basis as that which was assumed for the exact deductive comparison. In particular, we shall again juxtapose the two concepts of the Newtonian and of the Keplerian system. Limiting ourselves as before to the case in which we are dealing with one and the same system E, we shall again proceed from the sets $\mathfrak{S}_{\mathrm{New}}$, $\mathfrak{S}_{\mathrm{Kep}}$, and $\mathfrak{S}$, as they stand towards each other in the relation (3). For the approximative comparison, however, we shall now need a topology on $\mathfrak{S}$. Any approximative explanation will at a certain point have to appeal to a topology, and this topology must be expressly introduced, since the entire explanation will proceed relative to it. In the present case, I

¹⁷ Internal reasons for the merely approximate status of a Newtonian system follow from the conditions of validity of Newton's theory mentioned in sect. III. External reasons may follow from Einstein's gravitational theory. Corresponding internal reasons for a Kepler system do not exist: A purely kinematical view does not permit the occurrence of "other forces", and "other bodies" cannot make trouble since every subsystem of a Kepler system including the central body is itself a Kepler system. Corresponding external reasons for this case evidently come from Newton's theory, see below.
choose the topology for $\mathfrak{S}$ defined by the system of neighborhoods

(16)

with the arbitrary number $\varepsilon > 0$ and the neighborhood parameter $\{\mathfrak{r}'_k\}_k$. This topology corresponds to the intuition of an approximation of a system of paths of the N-body-system $E = (\sigma_1, \ldots, \sigma_N)$ by means of another such system of paths. Presumably, the situation is then governed by the following relations:

$$\begin{aligned} \mathfrak{S}_{\mathrm{Kep}} &\subseteq \overline{\mathfrak{S}_{\mathrm{New}}} \\ \mathfrak{S}_{\mathrm{New}} &\nsubseteq \overline{\mathfrak{S}_{\mathrm{Kep}}} \end{aligned} \tag{17}$$

Since in (17) the line on top signifies the topological closure (in the topology (16)), the propositions (17) obviously go beyond the purely set-theoretic relations of type (4). In accordance with (17), $\mathfrak{S}_{\mathrm{Kep}}$ lies within the closure of $\mathfrak{S}_{\mathrm{New}}$, but not (as we know already from (4)) within $\mathfrak{S}_{\mathrm{New}}$ itself. On the other hand, $\mathfrak{S}_{\mathrm{New}}$ is no more contained within the closure of $\mathfrak{S}_{\mathrm{Kep}}$ than within this set itself. The second line of (17) will no doubt be accepted. Yet a strict proof of the inclusion above it should be difficult in the case of an arbitrary N. This proof is simple only in the case of N = 2, and it goes as follows: Let $\mathfrak{r}_1$, $\mathfrak{r}_2$ and $\mu$ be an arbitrary solution of the Keplerian equations (2). Furthermore, let $\varepsilon > 0$ be arbitrarily given. Then we choose the masses $m_1$ and $m_2$ in accordance with

(18)

The latter can be achieved for all $t$, since $|\mathfrak{r}_2(t) - \mathfrak{r}_1(t)|$ is bounded on account of the inequality in (2). Furthermore, we assume

(19)

In that case $\mathfrak{r}'_1$ and $\mathfrak{r}'_2$ fulfil the Newtonian equations (1), and (16) is valid as well. From (16) we can see that the smaller the $\varepsilon$ that is given, the smaller we must choose $m_2$ in comparison with $m_1$. In the limiting case, $m_1$ approaches $\mu$. This will also have to serve as the basic idea of the proof for the general case. For the construction of the Newtonian solution, we are free to vary the masses in particular. Since, according to the Keplerian equation, the $\sigma_2, \ldots, \sigma_N$ do not interact, and $\sigma_1$ describes an inertial path, $m_2, \ldots, m_N$ will have to be small compared to $m_1$, and $m_1$ will have to be chosen as
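approximately equal to $\mu$. Furthermore, $m_2, \ldots, m_N$ will all have to be of the same order of magnitude so as to exclude the possibility of a "capture". As was already mentioned, we shall not provide a general proof of the first half of (17) at this point. Here we are merely concerned with a (plausible) conjecture.¹⁸ In any case, with (17) as a whole we now have a suggestion for a conception of a relationship between the concept of a Newtonian and the concept of a Keplerian system which can form a suitable starting point for the derivation of a satisfying concept of an approximative explanation. To begin with, the first line of (17) can also be written in the form

$$\mathfrak{S}_{\mathrm{Kep}} \subseteq (\mathfrak{S}_{\mathrm{New}})_\varepsilon \tag{20}$$

for all $\varepsilon$. We stipulate that generally for a subset $\mathfrak{M} \subseteq \mathfrak{S}$ the set $(\mathfrak{M})_\varepsilon$ consists of all the state-descriptions of $\mathfrak{S}$ lying in the $\varepsilon$-environment of a state-description of $\mathfrak{M}$. We can further expect that an inversion of (20) is valid in the form

$$\mathfrak{S}_{\mathrm{New} \wedge \mathrm{Zus} \wedge \mathrm{Zus}_\varepsilon} \subseteq (\mathfrak{S}_{\mathrm{Kep}})_\varepsilon \tag{21}$$

and

$$\begin{aligned} \mathfrak{S}_{\mathrm{Zus} \wedge \mathrm{Zus}_\varepsilon} &\nsubseteq (\mathfrak{S}_{\mathrm{Kep}})_\varepsilon \\ \mathfrak{S}_{\mathrm{New} \wedge \mathrm{Zus} \wedge \mathrm{Zus}_\varepsilon} &\neq \emptyset \end{aligned} \tag{22}$$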
Here Zus and $\mathrm{Zus}_\varepsilon$ are suitable relations between the masses $m_k$ and the state-descriptions $\mathfrak{r}_k$ which - as in (14) - are added as additional premises to the conditions that define a Newtonian system. In the case of N = 2 it is again readily apparent what the additional conditions look like. If $\mathfrak{r}'_1$, $\mathfrak{r}'_2$ with the masses $m_1$, $m_2$ is a Newtonian solution with $\mathfrak{r}'_0$ as its center of gravity, then for all $t$ Zus is given by

$$\tfrac{1}{2}\,|\dot{\mathfrak{r}}'_2(t) - \dot{\mathfrak{r}}'_0(t)|^2 - m_1\left(1 + \frac{m_2}{m_1}\right)^{-2} |\mathfrak{r}'_2(t) - \mathfrak{r}'_0(t)|^{-1} < 0 \tag{23}$$

and $\mathrm{Zus}_\varepsilon$ by

(24)

If we then define $\mathfrak{r}_1$, $\mathfrak{r}_2$ by means of a resolution of (19) and $\mu$ in accordance with (18), we obtain a Keplerian solution, and (16) is valid. This is the proof

¹⁸ In case the first line of (17) does not hold for the topology (16) one would have to retreat to the following weaker topology: (16) is restricted to a finite time interval, and this time interval together with $\{\mathfrak{r}'_k\}_k$ and $\varepsilon$ is used as parameter for the new neighborhoods. With respect to the asymmetry to be introduced below this topology would be empirically significant since only finite time intervals are admitted.
of (21) and (22) for N = 2. The procedure for an arbitrary N is again difficult. This time, the basic idea will be to obtain the equations in (2) from the equations (1) by eliminating the terms of interaction between the $\sigma_2, \ldots, \sigma_N$. Roughly speaking, in a generalization of (24), the condition $\mathrm{Zus}_\varepsilon$ will be that these terms are small. In order to arrive at the inequalities in (2), a further condition Zus must obtain which generalizes (23).

With (20) on the one hand and (21) and (22) on the other, we have only gathered the mathematical facts decisive for the approximative connection between $\mathfrak{S}_{\mathrm{New}}$ and $\mathfrak{S}_{\mathrm{Kep}}$. In a final step, it is important to restate these facts together with their consequences in a terminology which will demonstrate their significance for the problem of explanation. It is immediately clear that, with regard to the question of the approximative inference of the one theory from the other and vice versa, the situation is just the reverse of the situation that was initially stated in 1''') and 2'''): Inasmuch as one wants to regard (20) and (21) as such inferences, Newton's theory follows approximatively from Kepler's theory, while in the opposite direction, that is, in the direction that is really intended, additional conditions must obtain. If, on account of this, one is in principle prepared to accept additional conditions into the context of explanation, one must furthermore observe the fact that as for the rest, (20) has the direction from Kep to New and (21) the opposite direction. Hence it seems that, as in the case of the exact D-N explanations, with approximative explanations one can also explain in both directions: from Newton to Kepler and the other way around. And in a certain sense this is true. In contrast with the former case, however, we can now note an essential asymmetry with regard to the domain of the respectively explainable cases.
With regard to the exact D-N explanations, we noted earlier a complete symmetry in the sense that every Kepler-case that is (conditionally) explainable by means of New is also a Newton-case that is (conditionally) explainable by means of Kep and vice versa. With regard to the approximative explanations, by contrast, we have the asymmetry that the overwhelming majority of $\varepsilon$-environments of Newton-cases contain no Kepler-case, while conversely every $\varepsilon$-environment of a Kepler-case contains a Newton-case. In technical detail, these relations can be expressed as follows: We define that a Newton-case, that is, a $(\mathfrak{r}'_1, \ldots, \mathfrak{r}'_N) \in \mathfrak{S}_{\mathrm{New}}$, will be explained up to $\varepsilon$ in accordance with (20), if there exists a Kepler-case $(\mathfrak{r}_1, \ldots, \mathfrak{r}_N) \in \mathfrak{S}_{\mathrm{Kep}}$ with (16). And we shall say that it is explained approximatively in accordance with (20), if for every $\varepsilon$ there exists a Kepler-case with (16). Accordingly, we would define that a Kepler-case $(\mathfrak{r}_1, \ldots, \mathfrak{r}_N) \in \mathfrak{S}_{\mathrm{Kep}}$ is explained up to $\varepsilon$ in accordance with (21), if there are additional conditions Zus and $\mathrm{Zus}_\varepsilon$ as relations between elements of $\mathfrak{S}$ and systems of masses $m_k$ or between such elements and $\varepsilon$ such that (21) and (22) are valid, and if in addition there exists a $(\mathfrak{r}'_1, \ldots, \mathfrak{r}'_N) \in \mathfrak{S}_{\mathrm{New} \wedge \mathrm{Zus} \wedge \mathrm{Zus}_\varepsilon}$ with (16). And we shall say that a Kepler-case is explained approximatively in accordance with (21), if all this is valid for every $\varepsilon$. The first consequence is that the smaller $\varepsilon$, the smaller
the number of Newton-cases that are explained up to $\varepsilon$ by means of (20) and in the limit $\varepsilon \to 0$ possibly only those Newton-cases remain which are at the same time Kepler-cases. On the other hand, with regard to (21), all Kepler-cases will survive the process of the respective limit: Let $(\mathfrak{r}_1, \ldots, \mathfrak{r}_N) \in \mathfrak{S}_{\mathrm{Kep}}$ be such a case, and let $\varepsilon$ be arbitrarily given. Because of (20) there is then a Newton-case $(\mathfrak{r}'_1, \ldots, \mathfrak{r}'_N) \in \mathfrak{S}_{\mathrm{New}}$ with (16). Let $m'_k$ be its masses. We choose Zus as the conditions (for an arbitrary $(\mathfrak{r}''_1, \ldots, \mathfrak{r}''_N) \in \mathfrak{S}$ with masses $m''_k$)

$$m''_k = m'_k, \qquad \mathfrak{r}''_k(0) = \mathfrak{r}'_k(0), \qquad \dot{\mathfrak{r}}''_k(0) = \dot{\mathfrak{r}}'_k(0) \tag{25}$$

with $\mathrm{Zus}_\varepsilon$ as the condition

(26)

Hence (21) is valid because of the deterministic character of the Newtonian equations (1). And (22) is trivially valid. According to our construction, we further have $(\mathfrak{r}'_1, \ldots, \mathfrak{r}'_N) \in \mathfrak{S}_{\mathrm{New} \wedge \mathrm{Zus} \wedge \mathrm{Zus}_\varepsilon}$ as well as (16). Yet that is to say that the arbitrarily given Kepler-case is explained approximatively in accordance with (21). Notice that for this proof the conditions Zus and $\mathrm{Zus}_\varepsilon$ are chosen completely on an ad hoc basis, i.e. in dependence on the given Kepler-case. Yet this makes no difference to the proof. In case there are universal additional conditions - as (23) and (24) in the case of N = 2 - these may be relied upon in the proof. In the case of N = 2, for example, one would define the Newton-case by means of (18) and (19), and this in itself would fulfil (23) and (24).

Thus, in the sense of the definitions provided, it has been shown: Some (a few) Newton-cases can be explained approximatively on the basis of Kepler's theory, that is, according to (20). By contrast, all Kepler-cases can be explained approximatively on the basis of Newton's theory. With $\varepsilon$ still open, this result can be expressed as follows: If someone swears by Kepler's laws, and another comes along who has identified a system E up to a certain $\varepsilon$ as a Newton-case, then the first will not be able to offer him a general explanation for this case, because he will fail to provide the specification of the premise of (20). With the help of Newton's theory, however, one can satisfy anyone who puts forward a Kepler-case identified up to $\varepsilon$ - whatever case and whatever $\varepsilon$ this might be. In this sense, the Newtonian conception is superior to the Keplerian conception. The fact that it is only in the direction from Newton to Kepler that one can capture all individual cases by means of an approximative explanation - thus making oneself independent of the individual cases - might prompt one to say that only here are we dealing with an approximative explanation of one theory through another theory.
We would have to bear in mind, however, that, strictly speaking, one would have to talk of concepts (of the Newtonian and of the Keplerian system) instead of theories. By means of this concept of explanation, we can now also state the conditions of the validity of Kepler's laws: They consist of the conditions of
validity of the Newtonian gravitational equations stated in III together with any two conditions Zus and $\mathrm{Zus}_\varepsilon$ for which (21) and (22) are valid for all $\varepsilon$. These conditions may be chosen either in an ad hoc manner (as (25) and (26)) or universally (as (for N = 2) (23) and (24)): Every possibility may in principle be considered. Yet, one would say that the more universal the conditions the better, especially for the purposes of gaining an insight into the limits of the applicability of Kepler's laws. But in this regard there is no simple solution either. For, as was shown, there are strict common solutions of the two equations (1) and (2), as atypical as they may be, as well as the other extreme of a case, e.g. the system of sun-earth-moon, in which Kepler's laws are not suitable at all, that is, not even approximatively. The typical case in the present context is neither the completely undecidable case (from an approximative point of view) nor the easily decidable case, but rather the intermediate case, the decision of which is possible, but - because of a good approximation - practically difficult. The motions of the planets in the solar system represent such a case. For a certain time one was in a situation where already within a practically small $\varepsilon_0$-environment one could not decide empirically whether, strictly speaking, one was dealing with a Keplerian system or with a Newtonian system. Yet, a priori it is clear: If we had been dealing with a strict Keplerian system, no lower limit $\varepsilon < \varepsilon_0$ would have resulted for the applicability of the Newtonian theory. Conversely, in the case of a strict Newtonian system, the possibility of such a limit $\varepsilon_1$ regarding Kepler's laws was to be anticipated, as indeed it was later discovered empirically. Up to $\varepsilon_0$, this Newton-case could still be explained by means of Kepler's laws, i.e. according to (20), but not up to $\varepsilon_1$.
Accordingly, the additional conditions in (21) and hence the conditions of validity specific to Kepler's laws can no longer be chosen in such a way that the case would belong to $\mathfrak{S}_{\mathrm{New} \wedge \mathrm{Zus} \wedge \mathrm{Zus}_{\varepsilon_1}}$.
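The idea behind the case N = 2, namely that every Kepler-case has Newton-cases in each of its $\varepsilon$-environments once the second mass is chosen small enough, can be illustrated for a circular orbit. The construction below is an assumption made for illustration and need not coincide with the choice made in (18) and (19): the Newtonian center of gravity is placed where the Keplerian central body rests, the mass ratio is set to $\varepsilon/(2a)$, and the gravitational constant is 1. All function names are hypothetical.

```python
import cmath
import math

def newtonian_neighbour(mu, a, eps):
    """Masses (m1, m2) of a Newtonian two-body system within eps of a Kepler case.

    The Kepler case is a circular orbit of radius a with Kepler-constant mu.
    The ratio q = m2/m1 = eps / (2*a) is an illustrative choice only.
    """
    q = eps / (2.0 * a)
    m1 = mu * (1.0 + q) ** 2          # so that mu = m1 * (1 + m2/m1)**-2
    return m1, q * m1

def paths(mu, a, m1, m2, t):
    """Keplerian and Newtonian positions of both bodies at time t."""
    omega = math.sqrt(mu / a**3)
    r2 = a * cmath.exp(1j * omega * t)
    kepler = (0j, r2)                  # sigma_1 at rest, sigma_2 circling it
    newton = (-(m2 / m1) * r2, r2)     # center of gravity at the origin
    return kepler, newton

def sup_distance(mu, a, m1, m2, times):
    """Largest distance between corresponding bodies over the sampled times."""
    worst = 0.0
    for t in times:
        kepler, newton = paths(mu, a, m1, m2, t)
        worst = max(worst, max(abs(p - q) for p, q in zip(kepler, newton)))
    return worst

def newton_residual(mu, a, m1, m2, t):
    """Deviation of the constructed paths from Newton's equations at time t."""
    omega = math.sqrt(mu / a**3)
    _, (r1, r2) = paths(mu, a, m1, m2, t)
    # Uniform circular motion: the second derivative is -omega**2 * position.
    acc1, acc2 = -omega**2 * r1, -omega**2 * r2
    d = r2 - r1
    law1 = m2 * d / abs(d) ** 3
    law2 = -m1 * d / abs(d) ** 3
    return max(abs(acc1 - law1), abs(acc2 - law2))
```

In this construction the Newtonian paths satisfy the gravitational equations exactly, and their distance from the Keplerian paths is $(m_2/m_1)\,a$, which falls below any given $\varepsilon$ as $m_2$ is made small in comparison with $m_1$, while $m_1$ approaches $\mu$.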
V.21 Are There Explanations of Theories?* A piece written by Popper in 1957, "The Aims of Science," serves as the point of departure for the following considerations l . As far as I know it is the earliest essay in which a certain type of explanation of theories is hinted at. My own exposition will follow along the same lines, with some modifications. Since Popper's article seems to be little known, my first task will be to present the main features of its argument. Right at the start Popper poses the question as to what is the aim of science, and he replies to the effect that, in his opinion, ''the aim of science is to find satisfactory explanations of whatever strikes us as being in need of an explanation." Immediately following this reply, Popper clarifies what he wants to be understood by a 'satisfactory explanation.' It turns out that he understands it to be more or less what is today well-known as the so-called Hempel-Oppenheim model of an explanation or - as I will call it following Hempel's practice - the deductive-nomological (abbreviated as: D-N-) model of explanation 2 • Popper presents the following points, albeit in a somewhat different order than I do now: 1. An explanation is a class of sentences, one of which (the explicandum) is explained by the others, which together are designated as the explicans. 3 2. The explicandum is a logical consequent of the explicans. 3. The explicans contains sentences which rank as laws of nature. 4. The explicandum is generally known to be true. 5. The explicans is true, even if generally it is not known to be such; in no case however may it be known to be false. 6. The truth of the explicans is testable independently of the explicandum. This is Popper's presentation and it is precisely conditions of more or less this type which today have already found their way, frequently in a standardized form, into the textbooks of philosophy of science as an exposition of the concept of scientific explanation4 . 
* Originally published as Scheibe 1976a, translated by J. A. Novak
¹ Popper 1957 (1958)
² Hempel/Oppenheim 1948; Hempel 1965
³ Popper employs 'explicandum' and 'explicans' in place of the more usual 'explanandum' and 'explanans' respectively. As long as I am considering Popper's article, I will follow this usage.
⁴ For instance, Stegmüller 1969. See also its bibliography.
It is especially conditions 2) and 3) that have led to talk of 'deductive-nomological' explanations. In addition to what Popper explicitly introduces as constitutive of an explanation, he naturally also intends much else that remains unexpressed. Here we can include the idea that the explicandum of an explanation - as distinct from what is explicitly required of part of the explicans - is indeed not a law of nature but a proposition which describes a completely contingent state of affairs. That Popper in the first instance intends things in this way is clear
from an idea which, although initially interwoven with his elaborations on the concept of explanation, subsequently gets freed from these and becomes the dominant theme in his later expositions. The idea is that an explanation will prove to be more satisfactory the more testable is its explicans and especially the law of nature contained therein. Further, a greater testability is obtained as science rises to theories - Popper will so express himself more frequently from this point onwards - of ever richer content, of an ever higher degree of universality and precision. Finally, in conjunction with this advancement, explanations are no longer explanations by means of laws, but become explanations of them; thus, laws constitute whatever it is that is to be explained. For Popper this path of science results in part from the fact that, if the aim of science is to explain, it will also be its aim "to explain what so far [that is, up to this point in his exposition] has been taken to be an explicans, such as a law of nature. Thus, the task of empirical science constantly renews itself. We may go on forever, proceeding to explanations of a higher and higher level of universality ..."⁵ We now finally reach the point - as indicated - where it becomes quite clear that it is not in the later part of his writings, as will be shown, but rather in his initial formulation (if it can be called such) of the notion of explanation that Popper is thinking exclusively of explanations the explicandum of which is a proposition that describes a contingent state of affairs. This agrees, on the whole, with the customary notion that D-N-explanations explain contingent states of affairs. Consequently, in the usual discussions of the matter, at least a glimmer of the notion shines through that precisely this difference, namely, between the explicans containing a law and the absolutely contingent explicandum, belongs to what constitutes an explanation as such.
⁵ Popper 1957, p. 23 (1958, p. 26)
I am thinking above all about the frequently found exemplifications by means of which an author makes clear the necessary occurrence of a law in an explanation by showing a deductive connection in which not only the conclusion but also all of the premises are contingent and then says that these are not explanations. Popper thus finds himself in complete agreement with the orthodox opinion, in that he understands the notion of explanation from which he proceeds in his work as involving a contingent explicandum. However, insofar as he then, as noted, describes the path of science (precisely as explanatory science) as a path to ever deeper laws and explanations, he violates, for the first time, the notion of explanation laid down at the start. I wish to show now, while keeping Popper's article in view, that this is not the last time he does so. On the contrary, in the elaboration of his basic thought, Popper sees himself forced to a notion of explanation which violates not only a tacit supposition, but also two of his aforementioned explicitly posited adequacy conditions for a notion of explanation. One should also note that all of this happens without his indicating this situation even once, that is, without stating explicitly that
he is henceforth saddled with a concept which at decisive points contradicts the one from which he started out. I will omit a passage which for my purposes (if not for Popper's) is insignificant, namely, one in which he rejects ultimate explanations which complete his rational hierarchy. I will pick up his remarks again at that point where he goes on to say more about the deepening of theories and explanations - about precisely that step whereby a heretofore unexplained theory is explained by a new and better theory. Popper is not convinced of a complete logical analysis of such an increase in depth, but he claims nevertheless to have found a sufficient condition for it. He illustrates this condition in the context of the transition from Galilei's law of free fall and Kepler's laws of planetary motion to Newton's gravitational theory. Contradicting the assertion (of anonymous origin) that the latter theory can be deduced from the former laws or even be their conjunction, he puts forward the contrary claim that the Newtonian theory stands in contradiction to both its predecessors: these, therefore, must be false if we assume (unterstellen) the truth of the Newtonian theory. Moreover, he further asserts that Newton's theory also explains its two predecessors. Popper describes what this explanation looks like in the following way: "Newton's theory unifies Galilei's and Kepler's. But far from being a mere conjunction of both these theories - which play the part of explicanda for Newton's theory - it corrects them while explaining them. The original explanatory task was the deduction of the two older theories. It is completed not by deducing them but by deducing something better in their place: new results which, under the special conditions of the older theories, come numerically very close to these older results and at the same time correct them. 
Thus the empirical success of the old theory may be said to corroborate the new theory; in addition the corrections may be tested in turn - and perhaps refuted or corroborated. What is brought out strongly by the logical situation which I have sketched is the fact that the new theory cannot possibly be ad hoc or circular. Far from repeating its explicandum the new theory contradicts it, and corrects it."⁶
⁶ ibid. p. 29ff (p. 33f)
It is obvious - although Popper, as noted, does not expressly raise it - that the concept of explanation which is sketched by him here through an example is quite irreconcilable with the notion of deductive-nomological explanation. While it is characteristic of, even essential to, the latter that the explicandum follow logically from the explicans, we now hear that these are contradictory. While it is a further characteristic of a D-N-explanation, at least usually, that its explicandum is recognized as true, we are now told that it is, in most cases, false. This is because we are told that new results can be deduced from the explanatory theory which correct the theory to be explained and at the same time confirm the new explanatory theory. Less euphemistically phrased this simply means that these new results are the
results of observations which falsify the old theory (the explicandum) even if they perhaps do so only narrowly. We thus find ourselves confronted here not only with another notion of explanation but with one which is inconsistent with the concept of D-N-explanation; one which, moreover, is introduced into a context in which the consideration is no longer about the explanation of contingent states of affairs as in D-N-explanations, but about the explanation of law-like theories. The question yet to be treated is: what is to be thought both of this concept of explanation itself, as well as of the context within which Popper introduces it. This is precisely the question which, at the same time, is an initial specification of my opening question: "Are there explanations of theories?"
The metascientific (wissenschaftstheoretischen) context within which, following Popper, I would like to situate the concept of the explanation of theories is a context which has special relevance today. It deals with the problem whether a demarcationism - restricted, if need be, to the natural sciences but conceivably more universal - can be made dominant. By 'demarcationism' I understand a scientific position which believes in the possibility of the formation of a demarcation criterion, that is, a criterion which is capable of distinguishing in an objective and universal fashion between what is science and what is not. Speaking more precisely one can say it distinguishes between good and bad science, or as one commonly says, between science and pseudoscience. It is an old dream of philosophers to develop criteria for doing good science, but it has acquired a peculiar relevance in our century. Aroused by the significant successes of the natural sciences - especially of physics - which have characterized the first half of our century, the philosophy of science, resting on the contemporary upswing in logic, has made considerable effort to solve the demarcation problem.
It is known that Popper was a dominating figure in this effort. On the other hand, this enterprise has had to confront internal difficulties from the start. The controversy between Popper and Carnap or - to speak of '-isms' - between falsificationism and inductivism is a notable example.⁷ In addition, an external struggle soon allied itself with these difficulties, as a clamor arose placing in question the whole demarcationist program, even when restricted to the natural sciences. We can mention here, first and foremost, the emergence of T. Kuhn and Feyerabend who more or less tried to put down as hopeless a self-contained logical solution to the demarcation problem.⁸
⁷ Compare Schilpp 1963, and Schilpp 1974, pp. 62ff and 976ff
⁸ Kuhn ²1970; Feyerabend 1970b, 1970c and 1973
In order to give a better description of the situation, I would like to recall that recently at least a certain direction within demarcationism, which I myself am inclined to follow, has acknowledged an historical dimension. This direction is traceable to Popper, and it was, for a time, defended by Feyerabend as a so-called theoretical pluralism; it most recently emerged in Lakatos in the form of the so-called "Methodology of scientific research programs."⁹ The basic idea is that the scientific nature of a human enterprise is not immediately apparent but becomes obvious - if at all - only after a time and then as progress. Thus, the unit of appraisal of a criterion of demarcation is no longer - as previously viewed - a single theory proposed at a certain time, but rather a temporal succession of theories or even of different versions of the basic thought pertaining to one and the same theory. Accordingly, the criterion of demarcation has to specify characteristic conditions so that with such a development progress is achieved. Beyond that Lakatos was the first seriously to consider that historical change is to be noted and is unavoidable not only in science but also on the level of methodology, that is, exactly where the criteria of demarcation are formulated. In order to render demarcationism successful he had thus to seek out criteria of rationality even for the development on a methodological plane. Understandably, he tried to resolve this problem in such a way that the solution made his methodology of research programs superior to the late version of Popper's falsificationism, its predecessor.¹⁰ Although these investigations are very impressive, I prefer not to pursue them further here nor to enter into Lakatos' methodology. Rather, I will take a step back and pick up some thoughts from the later period of Popper's attempts to establish a criterion of demarcation in order to connect them with some explicitness to Popper's initially proposed modification of the notion of explanation. In this way two interdependent aspects of the notion of progress - the progressive nature of scientific development on one side and its continuity on the other - should be seen in their interdependence, and thus as stronger than they would appear in Popper and in other authors.
⁹ Popper 1963; Feyerabend 1965c; Lakatos 1970
¹⁰ Lakatos 1971
Among Popper's later ideas modifying his falsificationism in which his criterion of demarcation is increasingly seen under the aspect of progress, I want to concentrate on the idea that - to speak quite loosely - progress does not take place when we simply add new knowledge to old but rather first when through new knowledge we considerably modify (if not overturn) what we had already accepted. Thus, as a result of this view, progress is not merely cumulative but has a function of transforming accepted knowledge. Merely cumulative progress is excluded, and an additional advantage is made possible, to a certain degree, insofar as two consecutive theories T and T₁ are genuinely competitive with respect to their truth and T₁ even has de facto empirical success in a domain in which T empirically fails. Already at the outset we found Popper guided in this matter of explanation by an example in which he emphasized these characteristics. Even if it is not the case that Newton's gravitational theory absolutely contradicts Kepler's theory when rendered comparable in the appropriate way, it remains true that the two theories contradict one another when limited to one or another part of their
common range of application and that in some of these parts Newton's theory holds up while Kepler's fails.¹¹ The notion that progress is tied to contradiction and conflict can thus be shown in an initial way in this historical example. The notion loses the paradoxical appearance which it might have at first glance, if one considers that it is not a question here of the replacement of an initial fully verified statement by another that is equally true, but rather a question of the relation of successive law-like theories which, all the same, we can never definitively confirm. The profit brought from the combination of contradiction and progress perhaps becomes immediately clear, however, when a new theory, which already stands in a priori conflict with its predecessor, contributes to the creation of facts which one would never hit upon if one had only the notion of a cumulative expansion of the existing theories. Still it would be incorrect, or at least insufficient, to build only on progress understood in this way. The development of physics, for example, is in no way a permanent whirlpool of mutually contradictory theories. Rather, a certain continuity dominates by means of empirical verification. Accordingly I here have in view, in the simplest case, pairs of theories which indeed stand in the competitive relationship mentioned earlier but in which, at the same time, the first, older, theory has been verified empirically, perhaps to a considerable degree. It is well known to what a significant degree, for instance, Newton's theory of gravitation, Newton's universal mechanics and finally the entire classical physics were verified before they were replaced by subsequent theories. One is thus forced to make certain that the empirical success of the superseded theory shows itself as such in the new theory.
Only in this way does the genuine asymmetry come into the relationship of the two theories, namely, that on the one hand, failures of the old correspond to certain successes of the new theory, while, on the other hand, however, the converse does not hold. Only in this way can the old theory be overcome also in principle and can there justifiably be any talk of progress. Imagine the converse: empirical successes of theory T which are failures of T₁ are juxtaposed to the empirical successes of theory T₁ which are failures of T. T and T₁ would thus be found to be in a deadlock, so to speak, and no one would be able to say which of the two theories represented progress over the other. Progress always requires a link with the existing situation and proof that the new does not mean a step back where the old has been verified. This consideration of the continuity of the scientific development still needs, of course, some precautions in order to secure it against such insistent objections as were raised by Feyerabend¹².
¹¹ Compare Scheibe 1973b; Scheibe 1973a (this vol. ch. V.20)
¹² Feyerabend 1970a, see esp. pp. 296ff
Feyerabend proposes that in certain cases which, by the way, he takes from physics:
1. the qualified scientists do not seem to feel themselves bound to prove that all empirical successes of a theory T be found as such in a successor theory T₁;
2. beyond this, certain arguments make it probable that in fact this relationship between T and T₁ does not hold without exception.
Feyerabend offers in consideration of the first point, for instance, that the non-relativistic portion of the perihelion shift of Mercury has been calculated up to today not on the basis of Einsteinian gravitational theory, but exclusively with the help of the old Newtonian theory. Regarding the second point, Feyerabend recalls, for instance, the many arguments in favor of the impossibility of obtaining the classical behavior of macroscopic objects (for example, in light of the dichotomy of the existence or non-existence of properties) from quantum theory alone. Now concerning the first point I think it is understandable that the demonstrations in question are at first omitted (for instance, on account of enormous mathematical difficulties) in favor of that research which might, with a new theory, obtain new empirical results. But this forward-looking attitude is still generally linked with the belief that the connection with the old theory is basically possible. A decisive proof that, for instance, the Einsteinian theory of gravitation was not capable of reproducing the empirical successes of Newton's theory would, in my opinion, cause a shock among physicists. One must, however, be careful here, especially in light of the situation in quantum mechanics. This can be done at least roughly in such a way that, to meet the second objection, one makes the proposed relationship of progress between two theories explicitly relative to an empirical domain. For instance, in the case of gravitation it is the cosmological domain; in the case of the remaining interactions it is the atomic domain.
Accordingly, I want to summarize in a somewhat formal fashion what has been said thus far about the notion of progress as follows: In order that a theory T₁ show progress over a theory T in an empirical domain B, two conditions must be fulfilled:
A₁) For some empirical successes of T₁ in B there are corresponding empirical failures of T.
A₂) For all empirical successes of T in B there are corresponding ones of T₁.
It is to be emphasized in this that these conditions are only necessary conditions for the progress of T₁ vis-à-vis T (in B). They can be made stronger in various ways and I will later refer to this. A₁ and A₂ suffice, however, as minimal conditions for making clear what concerns me next - an argument to be introduced primarily only in a general way - namely, that these two conditions already provide a basis for a relationship of explanation between theories T₁ and T. Stronger conditions will then have to be considered, if the further query arises as to what particular types of explanations here come into question.
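The two conditions can be made concrete in a small sketch. It rests on assumptions that go beyond the text: the empirical record of a theory in B is modelled as two finite sets of observation labels, and the "correspondence" between successes and failures is read as identity of the observation concerned. The names Record, satisfies_a1, satisfies_a2 and the labels o1, o2, o3 are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Empirical record of one theory within a fixed domain B (illustrative)."""
    successes: frozenset  # observations in B the theory accounts for
    failures: frozenset   # observations in B the theory gets wrong


def satisfies_a1(old: Record, new: Record) -> bool:
    # A1: some success of the new theory is a failure of the old one.
    return bool(new.successes & old.failures)


def satisfies_a2(old: Record, new: Record) -> bool:
    # A2: every success of the old theory is also a success of the new one.
    return old.successes <= new.successes


def necessary_for_progress(old: Record, new: Record) -> bool:
    # Both conditions are only necessary, not sufficient, for progress.
    return satisfies_a1(old, new) and satisfies_a2(old, new)


# Hypothetical observation labels, chosen only for illustration.
T = Record(successes=frozenset({"o1", "o2"}), failures=frozenset({"o3"}))
T1 = Record(successes=frozenset({"o1", "o2", "o3"}), failures=frozenset())

# A deadlock: each theory succeeds exactly where the other fails.
U = Record(successes=frozenset({"o1"}), failures=frozenset({"o2"}))
U1 = Record(successes=frozenset({"o2"}), failures=frozenset({"o1"}))
```

On this reading the pair (T, T1) meets both conditions in one direction only, which is the asymmetry the argument requires, while the deadlocked pair (U, U1) satisfies A₁ in both directions and A₂ in neither.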
My first reason for anticipating the possibility of explanations in connection with the conditions of progress A₁ and A₂ is the quite generally current opinion that explanations always come into play where we have the possibility of inquiring (with some hope of success) about a thing - heretofore accepted only in terms of what it is - as to why it is the way it is. A scientific theory which is still in the stage of acquiring status is a good example of something concerning the possible explanation of which we are not initially inquiring. It itself still has, at this stage, the active role of explaining something else by its help and thereby is itself proved useful. Still, this position is changed as soon as a new theory emerges which stands to the old in the relationship expressed through A₁ and A₂. This is the first possibility of explaining the old theory through the new, because with the latter one has found at least the kernel of a possible explanation. Further reasons that the new theory will be sufficient for an explanation can be provided through a separate consideration of A₁ and A₂. As far as A₂ is concerned, it implies that the old theory T is empirically verified and is therefore a worthwhile object for an explanation. Over and above the case of the explanation of single states of affairs, there is, in the case of a theory to be explained, an additional factor: that it has been employed successfully towards the systematization of a possibly wide range of empirical facts. Such an undertaking will not be rejected wholesale or even be left only to the past as soon as there are definitive indications (namely in the form of A₁) that a decisive further development of the science can no longer be supported by it but only by a new theory.
This will even less be the case to the extent that what was at one time in the past distinctive of the theory now in principle superseded is the very same thing which, when it was new, was sought in order to establish its future status, namely, its empirical verification. The special statement of A₂, namely that at least in a certain domain the successes of T₁ correspond to the successes of T, gives reason to hope, in addition, that the new theory T₁ will really provide an explanation. With a theoretical description of this domain one will know the conditions which make it understandable why, if they are fulfilled, theory T can be put in place of T₁ and why, in a temporal perspective, it once first held the same position.¹³
¹³ For a more detailed discussion see Scheibe 1970 and 1971
Besides condition A₂, the condition A₁, which I now emphasize in passing, provides, in an especially characteristic way, a basis for the relevant explanatory relationship of two theories in so far as the explanandum also is of a universal or law-like nature. With respect to the explanation of singular states of affairs in daily life we are quite inclined to think in terms of their truth or falsity and to hold the truth of the explanandum as a condition sine qua non of an actual explanation. If anyone asks me why I have remained unmarried, I will reply that I am married. Everyone feels that this answer, to the extent that it is true, frees me, better than anything else, from the obligation
to explain why I am unmarried. Someone who wants to be persistent could possibly go on to the new question: what would be the explanation of my being unmarried, if I were unmarried. The case of science is different from all these relationships well-known to us. For instance, in physics, the question of an explanation, say, of Newtonian gravitational theory is not considered as settled simply by a reference to the fact that this theory is empirically false. This is the case for two reasons. On the one hand, insofar as one would still like to present the matter in terms of truth and falsity, an allusion to the truth of the theory can be made with a far greater right. On the other hand, the condition that in fact even here empirical failures are to be noted creates an additional attraction for the explanation of the theory. Given that a superior theory (in the sense of condition A₁) is available, one would like to understand the old theory not only in so far as it was successful, but also to explain the limits of this success. In distinction from the explanation of single states of affairs, where the question is not posed at all in this way, the explanation of a law-like theory, if it did not also set the boundary between success and failure, would not be at all complete. That this boundary lies somewhere is implied precisely through the condition A₁ vis-à-vis A₂: within the domain of success of theory T₁ there is a transition from the successes to failures of T. It is therefore to be hoped that the boundary in question can be derived from theories T₁ and T themselves by means of eventual help from a conceptual grasp of the relevant domain B. After these general considerations it is time now to ask what sort of explanations we have here to deal with. On the one hand, these explanations guarantee continuity in the development of a science but, on the other, they are compatible with its progress, which appears perhaps at first glance even to be discontinuous.
By this opposition I have already indicated the heart of the difficulties which the formation of a concept of explanation appropriate to the situation has to treat. Progress - in the sense of the introduction of perhaps basic innovations - is here in unavoidable conflict with its being linked back to something respecting which not merely some change but a progression has taken place and must also be justified as such. This tension is already found in the two given conditions A₁ and A₂ insofar as A₁ is the progressive and A₂ the conservative component, and insofar as one, after all, finds explanations primarily oriented toward A₂ even though A₁ holds true. One can immediately clarify this polarity between A₁ and A₂ by raising the obvious question concerning the usefulness of that concept of explanation which most recently has been a prominent theme in the discussions of conventional philosophy of science, namely, the concept of D-N-explanation. We have already seen by way of introduction that it is this concept which, on the one hand, is systematically introduced by Popper as the sole concept of explanation, only to be left aside again more or less unnoticed and to be replaced by another concept which is poorly explicated. The following argumentation should now show that this shift, at least as a shift away from
the concept of D-N-explanation, is quite unavoidable. To present this argumentation as a strict proof would naturally require premises which there is no opportunity to present here. But even the following considerations which leave the finer details in the air will perhaps have sufficient persuasive force. As the simplest case of a D-N-explanation of theory T through theory T₁ I first take the case in which T follows logically from T₁:
T₁ ⊢ T    (1)
If we assume that T and T₁ have the same empirical basis, then we follow not only the very close relationship between T and T₁, but also the spirit in which empiricism advocated the concept of D-N-explanation. That would mean, then, that in A₁ an empirical success of T₁ not only corresponds to a failure of T, as I have cautiously formulated it in anticipation of what appears below, but that a success of T₁ is at the same time a failure of T. However, since according to (1) every failure of T is a failure of T₁ we immediately see that (1) is inconsistent with the progress condition A₁. Now, the supply of possible D-N-explanations of T through T₁ is not exhausted by (1). We can still have the case
T₁ ∧ F ⊢ T    (2)
where F is an additional premiss absent in (1). As can easily be shown, this weakening of the relation between T and T₁ makes room for failures of T that at the same time are successes of T₁. However, if the D-N-explanations are not already excluded automatically on logical grounds through A₁ the experimental situation can still effect this exclusion. In the very example Popper employed as an introductory focus two things are demonstrated. On the one hand, as can be rigorously shown, Kepler's and Newton's theories have common, physically relevant models and this permits D-N-explanations (2) of Kepler's theory through Newton's.¹⁴ On the other hand, the astronomically relevant experiences factually based on the solar system can never constitute an additional premiss in Newton's theory which would have Kepler's theory as a consequence. Otherwise the solar system (if it can be assumed that it is a Newtonian system) would also be a model of Kepler's theory and that is impossible due to the distinctive failures of the latter theory regarding this system. More important, however, than the exclusion through the experimental situation is the condition already mentioned above: that in important cases stronger conditions than A₁ specify the progress from theory T to theory T₁, for example, that besides A₁ and already prior to experience it is valid that
B₁) T and T₁ are contradictory.
¹⁴ See the works cited in no. 11.
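The logical relations among schema (1), schema (2), condition A₁ and the contradiction of T and T₁ can be written out in a few lines. What follows is a sketch under simplifying assumptions not found in the text: an observation statement o counts as a failure of a theory X when X entails ¬o, no theory both succeeds and fails on the same o, and "T and T₁ are contradictory" is read as T₁ ⊢ ¬T. The predicate Fail and the symbol o are introduced here only for illustration.

```latex
\begin{align*}
&\text{Assumption: } \mathrm{Fail}(X, o) \;:\Leftrightarrow\; X \vdash \neg o . \\[4pt]
&\text{Schema (1): } T_1 \vdash T . \\
&\quad T \vdash \neg o \;\Longrightarrow\; T_1 \vdash \neg o ,
  \quad\text{i.e. } \mathrm{Fail}(T, o) \Rightarrow \mathrm{Fail}(T_1, o) . \\
&\quad \text{Hence no } o \text{ is a success of } T_1 \text{ and a failure of } T ,
  \text{ contrary to } A_1 . \\[4pt]
&\text{Schema (2): } T_1 \wedge F \vdash T . \\
&\quad T \vdash \neg o \;\Longrightarrow\; T_1 \wedge F \vdash \neg o ,
  \quad\text{i.e. } T_1 \vdash F \rightarrow \neg o . \\
&\quad \text{Where } F \text{ does not hold, } T_1 \text{ is not committed to } \neg o ,
  \text{ so } A_1 \text{ remains satisfiable.} \\[4pt]
&\text{Contradiction of } T \text{ and } T_1 \text{: } T_1 \vdash \neg T . \\
&\quad T_1 \wedge F \vdash T \ \text{ and } \ T_1 \vdash \neg T
  \;\Longrightarrow\; T_1 \wedge F \vdash \bot .
\end{align*}
```

On this reading schema (2) survives A₁ but can hold together with the contradiction of T and T₁ only vacuously, namely when the additional premiss F is itself incompatible with T₁.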
This relationship of contradiction between T and T₁ can be reconstructed, for example, for classical mechanics and quantum mechanics as well as for pre-relativistic and relativistic kinematics. It is therefore fundamental to two of the most important cases for which the problem of explanation is to be solved. B₁ is, however, plainly incompatible with schema (2) for D-N-explanations. What Popper had in mind in order to meet conditions A₁, A₂ and B₁ is the concept of approximative explanation. A complete explanation of this concept cannot be given here due to the metatheoretical apparatus needed for it. However, I will give the following, albeit grossly simplified, illustration. Instead of the two theories T and T₁ imagine two curves in a plane which represent two functional dependences of two physical magnitudes. The condition for a contradiction B₁ is expressed in this picture by the two curves having no point in common. Still the two curves can be so related to one another that the conflict between the two theories exemplified by them becomes practically meaningless in certain domains. This is the case if the two curves approach one another at infinity. On account of the limited accuracy always present in measuring we must exemplify the results of measurements through small but finite squares. In areas where the curves approach one another very closely the measuring squares will never be intersected by only one of the two but either by none or by both. Condition A₂ is thereby satisfiable insofar as all measuring squares which are intersected by the curve belonging to theory T also contain a piece of the curve belonging to T₁. On the other hand, there can also be measuring squares where the curves diverge, intersecting only the curve belonging to T₁, and thereby condition A₁ would be satisfied.
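The picture of two curves and finite measuring squares can be tried out numerically. The sketch below is an illustration under assumptions of its own: it takes the Rayleigh-Jeans and Planck radiation laws in dimensionless form as the curves for T and T₁, which approach one another toward low frequency (x → 0) rather than at infinity; it checks only squares centred on one of the curves; and it samples each curve on a grid instead of intersecting it exactly. The function names and the accuracy EPS are hypothetical.

```python
import math


def rayleigh_jeans(x: float) -> float:
    """Older theory T in dimensionless form, with x = h*nu / (k*T)."""
    return x ** 2


def planck(x: float) -> float:
    """Newer theory T1 in the same dimensionless units."""
    return x ** 3 / math.expm1(x)


def hits_square(curve, x: float, y: float, half_width: float) -> bool:
    """Does the curve pass through the measuring square centred on (x, y)?

    Simplification: the curve is sampled on a grid across the square.
    """
    steps = 20
    for i in range(steps + 1):
        xi = x - half_width + 2 * half_width * i / steps
        if xi > 0 and abs(curve(xi) - y) <= half_width:
            return True
    return False


EPS = 0.01  # hypothetical measuring accuracy (half-width of a square)

# Domain B: low frequencies; squares centred on the T curve also meet T1.
domain_b = [0.02 * k for k in range(1, 11)]  # x = 0.02 ... 0.20
a2_holds = all(hits_square(planck, x, rayleigh_jeans(x), EPS) for x in domain_b)

# Outside B: a square centred on the T1 curve that the T curve misses.
x_far = 5.0
a1_witness = (hits_square(planck, x_far, planck(x_far), EPS)
              and not hits_square(rayleigh_jeans, x_far, planck(x_far), EPS))

# Spot check of B1 on the sampled points: the curves never coincide.
never_meet = all(planck(x) < rayleigh_jeans(x) for x in domain_b + [x_far])
```

With these choices the squares in the low-frequency domain are met by both curves, the square at x = 5 is met by the T₁ curve alone, and the two curves remain distinct at every sampled point, which mirrors A₂, A₁ and B₁ respectively.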
Finally, in this intuitive analogy an approximative explanation of T through T₁ can be found by explaining, in the sense of a D-N-explanation, only a strip (of finite width) around the curve belonging to T with the help of additional conditions which depend on an approximation parameter just as does the strip. When this parameter approaches zero, for instance, the strip around the T-curve narrows and at the same time moves away into regions in which the two curves asymptotically approach one another. The conception of an approximative explanation of a theory which - as was mentioned - can be formulated in quite a universal and rigorous fashion, is compatible with the conditions A₁ and B₁, and also permits the proof of A₂ in concrete cases.¹⁵
¹⁵ ibid. and Scheibe 1975
In addition to the Kepler-Newton case there are mathematically much simpler cases such as the explanation of the ideal gas equation through the van der Waals equation or the explanation of the radiation law of Rayleigh and Jeans through that of Planck. Still the notion of approximative explanation, as it is presented here and insofar as it can be generalized without basic difficulties, is linked to a presupposition which is not satisfied, or at least not satisfied without qualification, precisely in those cases in which we are inclined to see the transition of theory T to theory T₁ as a major or significant advance. The presupposition is that the theoretical range of application of the older theory T can be essentially taken into the new theory T₁ and, if need be, expanded through the addition of new concepts. By the theoretical range of application of a theory, I understand a choice of theoretical entities (structures) determined by the initial understanding of the basic concepts of the theory such that only within this range do the laws of the theory lead, in turn, to their restriction to the theoretical range of validity of the theory. The theoretical range of application, for instance, of Newtonian mechanics consists of the set of all possible forces, masses, and spatio-temporal movements of bodies. One must first have this range before one can limit it through Newton's axioms to the theoretical range of validity of mechanics. By it is determined - and this also holds generally - with what sort of objects a theory is dealing. It becomes immediately clear that the extent to which one can rely on a theoretical range of application common to both theories is of decisive importance for establishing an explanatory relation between them. Thus, in my preceding illustration of the notion of approximative explanation it was presupposed that the Euclidean plane is the common theoretical range of application of both theories. Only in this way could the two curves drawn on the plane be placed in that relationship which made possible the satisfaction of conditions A₁, A₂, and B₁, as well as the approximative explanation. As was mentioned, the presupposition in question is not satisfied in those cases in which theory T₁ is quite a considerable advance over its predecessor theory T. From the viewpoint of a physicist, Heisenberg especially has discussed this phenomenon.¹⁶ For it he coined the expression 'closed theory' by which, for our purposes, is understood a theory which can no longer be improved through small modifications.
For Heisenberg, Newtonian mechanics is the classical and historically oldest example of a closed theory of physics. He sees quantum mechanics, as one of the successor theories, as an improvement on Newtonian mechanics; he immediately adds, however, that, "it is not a question of a minor improvement, but of the radical revision of the conceptual basis. The behavior, for instance, of electrons in the atom cannot be understood with the conceptual equipment of Newtonian mechanics, but rather with the quite different conceptual apparatus of quantum mechanics."[17] Also in the terminology of Heisenberg it is a question - in the case of such a conceptual revision - of the change of the theoretical range of application, "which," as he himself says, "is essentially already determined through the concepts employed in the theory." In fact, the difference of the theoretical ranges of application or of the systems of basic concepts of classical mechanics and quantum mechanics extends so far that even the notion of a contingent property of a physical object is affected by it. While Heisenberg's reflections emphasize more the finality of certain parts of physics and correspondingly have the notion of a closed theory for their
[16] Heisenberg 1948 and 1973
[17] Heisenberg 1973, p. 141
focus, the same issue is treated in quite a different way by T. Kuhn and by Feyerabend from the perspective of the history of science and from a perspective of the philosophy of science.[18] Their notion of the incommensurability of two theories stands immediately for the relation between two theories which can surely be related to the same object in a sense which at first appears to be merely empirical, but which still show a more or less thoroughgoing difference in their conceptual content. I cannot examine more closely here the possible differences in the conception of the notion of incommensurability as they may exist between Kuhn and Feyerabend. Nor can I examine to what extent each of them has offered us something sufficiently exact with this concept. It is clear enough that they have before their eyes, in historical perspective, cases of greater revolutions in the development of the natural sciences. It is likewise clear enough to everyone who has ever attempted it that - considered now from a systematic point of view - the explanation of Newtonian mechanics through quantum mechanics, for instance, gives rise to difficulties at precisely that point which the notion of incommensurability would indicate. Undoubtedly one has to consider in the explanation of a theory T through a theory T1 that, although it is not necessarily to be taken as a condition of progress, still it is occasionally an important historically factual condition of progress that T and T1 are incommensurable.
Plainly C1 by itself cannot constitute a notion of progress. If, however, T1 represents progress vis-à-vis T, for instance in the sense of conditions A1 and A2 while C1 is added, then - it is clearly evident - it will be a matter of significant progress. The question is, however - and I would like to say a few words about this in closing - what is the situation with regard to the compatibility of all conditions propounded thus far, and can C1 actually be inimical to the thought that we can and should explain successful but superseded theories - which is the idea under investigation. Leaving aside the less interesting question - discussed elsewhere - of the compatibility of C1 and B1,[19] I would like to note that at least as regards the fundamental conditions of progress A1 and A2 the most necessary requirement in order to guard against difficulties of a possible incommensurability of the two theories T and T1 has already been anticipated. We carefully distinguish between failures or successes of theory T on the one hand and the successes of theory T1 on the other hand. Accordingly, the relationship that must by nature hold between them is grasped not as an identity but only as a correspondence. This measure permits the conditions of progress to be maintained also in the case of different ranges of application, i.e. in the case of incommensurability. So it can be said in reference to A1, for instance, that the so-called perihelion shift of a planet resulting from the Schwarzschild solution was a success of Einsteinian gravitational theory. It cannot be said, however, that precisely this result was a failure of the Newtonian theory, because it is represented in a non-Euclidean geometry and already on this account cannot be a constituent of the Newtonian theory. There does indeed correspond to it a Euclidean description of which it can be said positively that it is a failure of the Newtonian theory. What is generally required in A1 and A2 is, consequently, only the formation of a correspondence, broadly speaking, in the form of a representation of the empirically relevant part of the theoretical range of application of theory T in the corresponding range of T1. The incommensurability objection of Kuhn and Feyerabend directed against the possibility of the formation of criteria of progress immanent to science can thus be met if the noted correspondences are established. To be sure, this problem has not been solved completely so far. But in light of the present proposals there is no reason to maintain a priori that the situation is hopeless. The same measure promises, beyond its relevance to the problem of compatibility, some success in the formation of a concept of retroactive explanation of a superseded theory. I readily admit that the concept of D-N-explanation drops out here, and we have seen above that this has its basis already in condition A1 or, in any event, in B1. Regarding the incommensurability condition C1 it can be argued further that schema (2) requires the same interpretation of T and T1 and that the incommensurability of T and T1 stands in the way of this. The same argument is likewise still valid if it is directed against approximative explanations which rest on mainly common theoretical ranges of application of T and T1. On the other hand, Kuhn and Feyerabend concede certain states of affairs which I would emphasize much more strongly than they do.
[18] See the works cited in no. 8.
[19] See the works cited in no. 15.
For instance, Kuhn offers some considerations which are supposed to show that Newtonian mechanics cannot be deduced from relativistic mechanics - at least not in the usual sense of the word 'deduce,' nor indeed in any even approximate sense. In the course of this consideration he admits however: "Our [object-theoretical!] argument has, of course, explained why Newton's laws always seemed to work. In doing so it has justified, say, an automobile driver in acting as though he lived in a Newtonian universe ..."[20] I would say that if one could really achieve this, one achieves all that one could reasonably hope for. In the same context of physics we hear Feyerabend saying, "It is of course true that the relativistic scheme very often gives us numbers which are practically identical with numbers we get from classical mechanics ..."[21] or again, "Now it must be admitted that both theories have certain formulas in common and that one can derive from Einstein's theory ... a series of formulas which are identical with certain formulas of Newtonian mechanics."[22] Again, such an admission is only made in
[20] Kuhn 1970 (2nd ed.), p. 101, italics mine
[21] Feyerabend 1970c, p. 221
[22] Feyerabend 1973, p. 102
an argument which places the emphasis on the incommensurability of content of the two theories. Now, however, it must be admitted that the mere agreement of formulas or numerical values does not yield a sufficient basis for the explanatory relationships between theories. These agreements are in relevant cases by no means accidental, but they are accompanied by correspondences of content. Without these being present and without knowledge of them we would neither consider the aforementioned correspondences as important nor even discover them at all. In fact, certain relativistic concepts correspond to the Newtonian concepts, which, despite their difference from the former, still can be unequivocally coordinated to them so that the relativistic mechanics assumes the position of succeeding the Newtonian mechanics not only as a whole but even down to detailed concepts. Should we co-ordinate the concepts incorrectly (verkehrt), we would not get the aforementioned agreements at all; the fact that the latter are only approximatively valid merely reflects the fact that corresponding concepts have different content. Still, a correct co-ordination completely guarantees an explanation of even the entire conceptual structure of Newtonian mechanics. Finally one can claim what Feyerabend states in passing when he says:
"Besides, why should the notion of explanation be burdened by the demand for conceptual continuity? This notion has been found to be too narrow before (demand of derivability) and it had to be widened so as to include partial and statistical connections. Nothing prevents us from widening it still further to admit, say, 'explanation by equivocation'."[23]
And although I would not go so far as to appeal here to equivocations - which, by the way, is not at all necessary - I would still say that correspondences of content justified through quantitative approximations yield a sufficient basis for a future rational notion of explanation of the type sought here.
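How close the "practically identical" numbers are can be checked for Kuhn's automobile driver. The sketch below compares the Newtonian and the relativistic kinetic energy of the same body; the chosen speeds are illustrative assumptions, and decimal arithmetic is used only because the difference at everyday speeds lies below ordinary floating-point resolution.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50      # enough digits to resolve corrections of order 1e-15
C = Decimal(299792458)      # speed of light in m/s (exact by definition)

def kinetic_newton(mass, speed):
    """Newtonian kinetic energy m v^2 / 2."""
    return mass * speed * speed / 2

def kinetic_einstein(mass, speed):
    """Relativistic kinetic energy (gamma - 1) m c^2."""
    beta_sq = (speed / C) ** 2
    gamma = 1 / (1 - beta_sq).sqrt()
    return (gamma - 1) * mass * C * C

def relative_excess(speed):
    """(E_einstein - E_newton) / E_newton; the mass cancels out."""
    mass = Decimal(1)
    newton = kinetic_newton(mass, speed)
    return (kinetic_einstein(mass, speed) - newton) / newton

car = float(relative_excess(Decimal(30)))   # assumed motorway speed, 30 m/s
fast = float(relative_excess(C / 2))        # assumed speed of half the speed of light

print(f"car:  {car:.3e}")
print(f"fast: {fast:.3f}")
```

The agreement at low speed is a quantitative approximation in the sense used above; it says nothing yet about the correspondence of the concepts (mass, energy) that makes the comparison meaningful in the first place.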
[23] Feyerabend 1970c, p. 227
V.22 A Case Study Concerning the Limiting Case Relation in Quantum Mechanics*

I

In recent years, there has been an increased interest on the part of philosophers of science in the inter-theoretical relations in physics. Attention was especially paid to those relations between physical theories in which the theories have overlapping domains of application and are to that extent in competition with one another. The main question in this regard was what, if anything, would justify us in claiming in such a case that one of the theories in question is better than its competitors, and which conceptual apparatus would need to be developed for the formulation of such a claim. This question has been tackled at different levels: at the level of an optimistic ideology of progress that integrates, on a grand scale, the whole development of physics[1]; at the level of dry and precise, yet also very general concept-formations[2]; and finally in the lowlands of special case studies of correspondingly more limited range[3]. Although this is not precluded by the difference in the levels of attack, the approaches hitherto pursued do not yet form a coherent whole, and on the basis of various abstract paradigms from which they take their starting-point, they indeed give different names to their object such as progress, reduction, embedding, and explanation. Perhaps with the exception of the mentioned relation of competition, not even the unity of the object of investigation is secured, and hitherto no one has assembled a sufficiently comprehensive list of concrete model cases and checked to see whether the conceptual tools already developed can cover all these cases. The following investigation is a case study in the area just sketched, and as such it can afford to leave general conceptual questions somewhat in the balance. Yet the study is not conceived without relation to just these questions.
The case with which it is concerned - the quantum mechanics of the linear harmonic oscillator in relation to a 2-particle generalization of the same - is, considered by itself, rather uninteresting. Above all, we are not concerned with a historical case study, but rather with a systematic study in the classic sense in which one seeks to profit from the simplicity of the particular for the sake of mastering general difficulties. In this sense, there are especially two more general problems underlying the following considerations. They result from the popular view in physics that in important cases of a relation of competition between two theories, the losing theory T* proves to be a
* First published as Scheibe 1981b. Translated for this volume by Hans-Jakob Wilhelm. The composition of this paper was made possible by a Visiting Fellowship at the Center for Philosophy of Science of the University of Pittsburgh.
[1] Lakatos 1978
[2] Sneed 1971, Ch. VII and VIII; Ludwig 1978, Sect. 8
[3] Scheibe 1973a (this vol. ch. V.20)
limiting case of the successor theory T, e.g. classical mechanics as a limiting case of quantum mechanics. That is to say that through T, one comes to know the limits of the successful applicability of T*, and that, moreover, one can state these limits quantitatively through a mutual estimate of certain characteristic parameters and that by maintaining these limits, one can approximatively replace theory T with theory T*. It is true that this view which assumes an approximative relation between the two theories thereby from the start avoids the danger of wanting to reconstruct the relation in question in a purely qualitative-logical way. On the other hand, it is often not sufficiently emphasized (especially in certain general plausibility considerations of textbooks of physics) 1) that in such limiting cases, one should also expect a comparison of possibly very different conceptual structures and 2) that even the quantitative relations cannot so easily be stated generally in the sense that they would be sufficient in every particular case. Thus, for example, the properties of a quantum-mechanical object follow an unorthodox "logic", and the qualitative difference which appears in relation to classical logic cannot be represented by a real parameter alone. Furthermore, the smallness of the quantum of action always plays a role as well in the mentioned estimates for the classical limiting case of quantum mechanics. But this alone is not necessarily decisive, and the other quantities which enter into such estimates will vary from case to case and will scarcely permit a general statement. The following case study will illustrate especially the second problem. The first problem of conceptual incommensurabilities has been emphasized in recent times by Feyerabend in a number of studies.[4]
I take this matter to be more grave than textbooks of physics less troubled by science-theoretic scruples suggest, but by no means do I hold it to be as hopeless as presented by Feyerabend.[5] Yet, our example will also show in this connection how easily the conceptual problems can be veiled by the elegance of the formalism. The even more general question, to what extent the use of the empirical test of the theories involved is necessary or sufficient for the characterization of the limiting case relation (and especially for its highly desired asymmetry) must be completely disregarded in the present context. It will be useful, however, for the reader to bear in mind the schema inspired by this question according to which T must not drop back behind T* in terms of the latter's empirical successes (in other words, that it must be able to reproduce these successes) and that, on the other hand, it must demonstrate some successes in places where T* has failed. A direct transfer of this asymmetry could hardly occur otherwise than through an additional "marking" of the parts of T and T* with which one was successful or unsuccessful. Leaving completely aside the question of how this is to be done in detail, we note at the theoretical level that all the important theoretical relations are much closer than the mostly patchy empirical data would be able to express. Hence, for purely theoretical
[4] Feyerabend 1975 (1976), Chap. 17
[5] Scheibe 1976a (this vol. ch. V.21) and 1976b
considerations such as the following, it is a more appropriate approach, albeit one essentially in need of supplementation, to abstract in the above schema from the empirical and to consider the absorption of T* by T as well as the lack of an inversion of this relation just as such and with the full exploitation of the theories themselves.

II

In our case study, the central field approximation well known from the quantum mechanics of multi-particle systems is simulated by means of a simple example. The central field approximation is due to the fact that many calculations cannot be precisely carried out for multi-particle systems, and, as a consequence of this, one even loses the explicit control over the approximation. The said example, however, will be so simple that, within certain limits, control is relatively easy to maintain. Precisely this is also quite necessary in order to create, fairly explicitly, a limiting case relation between the original theory and its approximative simplification. For the simplest atomic case of the hydrogen atom, after treating the same as an electron in a Coulomb field (CE), its treatment as a genuine 2-particle system consisting of proton and electron (PE) is textbook material. In retrospect, the CE-theory then appears as a central field approximation of the PE-theory. As a rule, however, this view is only illustrated through a comparison of the energy spectra. This comparison states that the spectrum of the CE-theory arises from the spectrum of the PE-theory, if one allows the mass of the proton to approach infinity. This is without doubt a good limiting case relation of two theories, if one accepts that it is only partial and actually most incomplete. Yet, if one extends it, perhaps by also including the changes of state in accordance with the two theories, one will no longer be able to expect that even this extension will still be regulated by the simple limiting process which only involves the mass of the proton.
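The comparison of the energy spectra mentioned here can be written out in a few lines. This is a sketch under the usual non-relativistic textbook assumption that the PE levels differ from the CE levels only by the reduced-mass factor; the function names and the rounded constants are illustrative choices.

```python
# Hydrogen levels for a fixed Coulomb field (CE) and for the 2-particle system (PE).
# Assumption: non-relativistic Bohr-type levels; constants are rounded reference values.

RYDBERG_EV = 13.605693            # binding energy scale for infinite nuclear mass, in eV
PROTON_TO_ELECTRON = 1836.15267   # proton mass in units of the electron mass

def level_ce(n):
    """Level n for an electron in a fixed Coulomb field."""
    return -RYDBERG_EV / n**2

def level_pe(n, nucleus_mass=PROTON_TO_ELECTRON):
    """Level n for the 2-particle system; nucleus_mass is given in electron masses."""
    reduced = nucleus_mass / (1.0 + nucleus_mass)  # reduced mass in electron masses
    return reduced * level_ce(n)

def relative_gap(n, nucleus_mass):
    """Relative difference between the PE and the CE value of level n."""
    return abs(level_pe(n, nucleus_mass) - level_ce(n)) / abs(level_ce(n))

# Let the mass of the nucleus grow: the PE spectrum approaches the CE spectrum.
for mass in (PROTON_TO_ELECTRON, 1e4, 1e6, 1e8):
    print(f"M/m_e = {mass:.3e}  relative gap = {relative_gap(1, mass):.3e}")
```

Only the mass ratio enters this limiting process, which is exactly why the comparison of spectra is partial: it is silent about the changes of state.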
For although we are dealing here with the comparison of two quantum theories, we are also dealing with a latent classical limiting case. The CE-theory turns the hydrogen atom of the PE-theory into a chimera of classical Coulomb field and quantum-mechanical electron. Hence, an approximative condition which does not contain the quantum of action can by itself not mediate the limiting process - to say nothing of the conceptual problems of comparison. Yet, what will be that further condition? To answer this question, we shall, as soon as we are dealing with concrete calculations in this study, consider an (in comparison with the hydrogen atom) even simpler, that is, a 1-dimensional case. Before this can occur, however, certain general definitions still have to be made regarding the question of how a comparison of two physical theories which stand in a limiting case relation should be approached and what, since we are dealing here with two quantum theories, is to be expected from a quantum theory as such. In the first regard, we shall, as far as possible, make use of the deductive-nomological D-N model of explanation.[6] And here it does not matter very much whether the usual meaning of the word "explanation" even partially covers the limiting case relation such that in a usual sense it can be said of the theory T* representing the limiting case that it is "explained" by the other theory T. What matters is rather that, as an explication of the usual meaning of explanation, the D-N model at least in part correctly describes the limiting case relation. It will be shown that it actually does - even if only in part. The logical core component of a D-N explanation is a deduction of the theory to be explained from the explaining theory together with certain further premises stating the special conditions under which the explanation is possible. Let a (a*) contain the parameters on which the two theories T (T*) depend, and let T(a) (T*(a*)) be their axioms.[7] It will often be possible to express the partial identity of the object-relation of T and T* by the fact that the parameters a* can be defined from the parameters a. The corresponding definitions D(a; a*) will certainly belong to the mentioned additional premises. In addition, there may be a possible limitation of the domain of application of T the description of which will here be achieved by C(a). The mentioned deduction of the D-N model would then have the form
$$T(a) \wedge C(a) \wedge D(a; a^*) \rightarrow T^*(a^*) \tag{1}$$
It is quite obvious that this model will fail in its immediate application when it concerns a genuine limiting case relation.[8] And this will also be the case in the following example. On the other hand, as was already said, (1) is never so false that a part of T* could not be gained with the help of it. Before this too can be shown by means of our example, we have to make the other general preparation: We must define certain framework conditions for a quantum theory and state how T and T* are to be identified as quantum theories. It will suffice for what follows to assume as dramatis personae of a quantum theory (of a closed system) a Hilbert space $\mathcal{H}$, a Hamilton operator $H$, a set of quantities $\mathfrak{G}$ and a change of state $W$. The axioms $QT(\mathcal{H}, H, \mathfrak{G}, W)$ which these entities are supposed to satisfy are partly evident from the chosen designation. Inasmuch as this is not the case, the following remarks will complete the explanation. The set $\mathfrak{G}$ which comprehends $H$ as energy is provided for the purpose of the physical interpretation of the states in $W$. Thus, we are dealing with a selection of quantum-mechanical quantities (observables) which are to a justifiable extent experimentally accessible such as positions, momenta, angular momenta etc. $\mathfrak{G}$ is to contain a basis of a maximally Abelian subset of all formally possible quantities so that the (instantaneous) description of state can be completely provided by $\mathfrak{G}$. The change of state $W$ is intended as a temporal function with von Neumann
[6] Hempel 1965, pp. 245 ff. and 331 ff.
[7] Scheibe 1979 (this vol. ch. III.11)
[8] Scheibe 1973a (this vol. ch. V.20) and 1976a (this vol. ch. V.21)
operators as values. It satisfies the generalized Schrödinger equation
$$i\hbar\,\dot{W} = HW - WH. \tag{2}$$
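Equation (2) can be checked numerically on the smallest possible example. The sketch below assumes a two-level system with a diagonal Hamilton operator, units with hbar = 1 and an arbitrarily chosen initial statistical operator; none of these values comes from the text. It compares a finite-difference estimate of the left-hand side with the commutator on the right-hand side.

```python
import cmath

HBAR = 1.0                     # assumption: units with hbar = 1
ENERGIES = (0.3, 1.1)          # assumed eigenvalues of a diagonal Hamilton operator H
W0 = ((0.7, 0.2 + 0.1j),       # assumed initial statistical operator
      (0.2 - 0.1j, 0.3))       # (Hermitian, trace 1, positive)

def state(t):
    """W(t) = exp(-iHt/hbar) W0 exp(+iHt/hbar), written out for diagonal H."""
    return [[W0[j][k] * cmath.exp(-1j * (ENERGIES[j] - ENERGIES[k]) * t / HBAR)
             for k in range(2)] for j in range(2)]

def commutator_with_h(w):
    """HW - WH for diagonal H: entry (j, k) equals (E_j - E_k) * W_jk."""
    return [[(ENERGIES[j] - ENERGIES[k]) * w[j][k] for k in range(2)]
            for j in range(2)]

def time_derivative_side(t, dt=1e-6):
    """i * hbar * dW/dt, estimated by a central finite difference."""
    plus, minus = state(t + dt), state(t - dt)
    return [[1j * HBAR * (plus[j][k] - minus[j][k]) / (2 * dt) for k in range(2)]
            for j in range(2)]

def max_residual(t):
    """Largest entrywise mismatch between the two sides of equation (2)."""
    left, right = time_derivative_side(t), commutator_with_h(state(t))
    return max(abs(left[j][k] - right[j][k]) for j in range(2) for k in range(2))

print(max_residual(0.0), max_residual(2.5))
```

The residuals should be of the size of the finite-difference error only; the trace of W(t) stays equal to 1 because the diagonal entries do not change.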
Finally, when an arbitrary theory T(a) is presented - as assumed in (1) for T and T* - then it is identified as a quantum theory in the sense sketched by means of the definitions $D(a; \mathcal{H}, \ldots)$ of the quantum-theoretical parameters $(\mathcal{H}, \ldots)$ from the parameters a such that
$$T(a) \wedge D(a; \mathcal{H}, \ldots) \rightarrow QT(\mathcal{H}, \ldots) \tag{3}$$
is valid.
III

After these general definitions, we can now turn to the two theories T and T* between which a limiting case relation is to be established. Let T be a 2-particle theory with the Hilbert spaces $\mathcal{H}_i$, the position operators $X_i$, the momentum operators $P_i$, and the masses $m_i$ for the two particles $a_i$. Further, let k be a real parameter as coupling constant and W a change of state in the Hilbertian tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2$ satisfying the equation of motion (2) with the Hamilton operator H to be defined shortly. Thus, we have indicated the parameters a and the axioms T(a) for T. The identification of T as a quantum theory in the sense of (3) occurs by means of definitions which are self-explanatory within the general context - e.g. $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ and the combination of the quantities in $\mathfrak{G} = \{X_1 \otimes 1, 1 \otimes X_2, \ldots\}$ - as well as by means of the definition of the Hamilton operator
$$H = \frac{1}{2m_1}(P_1 \otimes 1)^2 + \frac{1}{2m_2}(1 \otimes P_2)^2 + \frac{k}{2}(1 \otimes X_2 - X_1 \otimes 1)^2 \tag{4}$$
which states the special interaction between a1 and a2. More precise definitions of the initial data will then make possible a perfect demonstration of (3). With regard to theory T*, one proceeds in a perfectly analogous way. T* is a 1-particle theory with the Hilbert space $\mathcal{H}^*$, the position operator $X^*$, the momentum operator $P^*$, and the mass $m^*$ for the particle a* as well as again a coupling constant $k^*$ and the change of state $W^*$ of a* satisfying the equation of motion (2) together with the Hamilton operator which, aside from the combination of quantities, is now the only thing left to be defined:
$$H^* = \frac{1}{2m^*}P^{*2} + \frac{k^*}{2}X^{*2} \tag{5}$$
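For orientation, the Hamilton operator (5) has the familiar equidistant spectrum of the linear harmonic oscillator, with energy values hbar * sqrt(k*/m*) * (n + 1/2) for n = 0, 1, 2, ... The following sketch evaluates the first few values; the mass and the coupling constant are purely hypothetical sample numbers.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J*s

def oscillator_levels(mass, coupling, count):
    """First `count` energy values of H* = P*^2/(2 m*) + (k*/2) X*^2, in joules."""
    omega = math.sqrt(coupling / mass)  # classical angular frequency sqrt(k*/m*)
    return [HBAR * omega * (n + 0.5) for n in range(count)]

# Hypothetical sample values: m* = 1e-26 kg, k* = 100 N/m.
levels = oscillator_levels(mass=1.0e-26, coupling=100.0, count=4)
for n, energy in enumerate(levels):
    print(n, energy)
```

The spectrum depends on the parameters of T* only through the ratio k*/m*, so any limiting case relation for the energy values has to be stated in terms of this ratio.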
The identification (3) as a quantum theory turns out to be even simpler than with T. In addition, we see that T* is simply the quantum theory of the linear harmonic oscillator and that T improves upon this theory in the sense
that the classical center of oscillation of T* is turned into a particle subject to quantum mechanics. Accordingly, we must now begin the comparison of T and T* with the definition that the particle a* of T* is the other - let us call it the second - particle a2 of T. From this definition - which partially describes the object reference of T and T* very precisely - the form of the definitions contained in (1) follows almost unambiguously: The parameters in T* describing only the particle a* must be equivalent to the corresponding parameters of T. In the cases (6a) this even leads to an equating with certain independent parameters of T. Not so in the final case of changes of state to be considered here: the W of T refers to the system of both particles. As is generally known, however, by means of W on $\mathcal{H}_1 \otimes \mathcal{H}_2$, a $W_2$ on $\mathcal{H}_2$ is definitely determined as that von Neumann operator which delivers the same probability distributions in $\mathcal{H}_2$ as does W. The condition for this is
$$\mathrm{Tr}(W(1 \otimes P)) = \mathrm{Tr}(W_2 P) \tag{7}$$
(with Tr = trace) for all projectors P on $\mathcal{H}_2$. Thus, (6a) must be complemented by (6b) with the $W_2$ deriving from W. Finally, as far as the coupling constant k* of T* is concerned, its definition does not follow plainly from the equating of the content of a* and a2 (a* = a2). For it links a* to the classical center of oscillation at the zero point or - if you will - to a 1-dimensional degenerate classical field. And indeed, the negative results to be mentioned in a moment will be independent of the choice of k*. Only the later, positive approximative comparison will lead definitely to the value
$$k^* = k \tag{6c}$$
for k*. Although this is not necessary, we shall, for the sake of simplicity, accept (6c) already now as a definition. Since now all the independent parameters occurring in T* have been defined, we can introduce a theory $T_2^*$ as the set of the inferences from T and the definitions (6) inasmuch as they can be expressed in the parameters defined by means of (6). The index "2" in "$T_2^*$" is supposed to recall that here we are dealing with the consequences which the theory T has for the particle a2. Because a* = a2, the comparison of T and T* amounts to a comparison of $T_2^*$ and T*. With a view to the usefulness of the D-N explanation (1), we begin this comparison with the observation that $T_2^*$ indeed has many theorems in common with T*, namely, all those which only contain the parameters
defined in (6a). Because of the simple form of these definitions, this can be readily seen, and we do not need to apply any restrictions in the sense of the addition C(a) in (1). There is thus no question that in the present example a part of T* can be explained in the usual sense from the theory T. But by no means everything: The correspondence of $T_2^*$ and T* finds its definite limit in the changes of state W* of a* = a2 (and this is so - to say it once more - independently of the choice of k*, that is, especially of (6c)). If $S_2^*$ ($S^*$) is the set of the changes of state allowed by $T_2^*$ (T*), then the rest of T* would follow from $T_2^*$ as well, as long as
$$S_2^* \subseteq S^* \tag{8a}$$
would be valid. But this is impossible. It follows from the equations of motion (2) with the Hamilton operator (4) for $S_2^*$ and (5) with the definitions (6) for $S^*$ that if $k \neq 0$, then in $S_2^*$ there are still many projections of changes of state of the total system which for $t = 0$ have the form $W_1^0 \otimes W_2^0$ with a fixed $W_2^0$. In $S^*$, however, there is exactly one W* with the initial value $W_2^0$. If $k = 0$, however, then $S_2^*$ contains projections of changes of state W such that they at no time have the product form $W_1 \otimes W_2$, and then it is these which cannot belong to $S^*$. Thus, (8a) is not valid. Moreover, it is very likely that for $k \neq 0$ even
$$S_2^* \cap S^* = \emptyset \tag{8b}$$
is valid. And this is just how things stand - mutatis mutandis - with respect to the Kepler-approximation of the Newtonian gravitational theory in the case of two bodies.[9] If (8b) is valid, then no additions C(a) in (1) would help to obtain from $T_2^*$ at least a part of the changes of state envisaged by T*. But even if (8b) is not valid, the D-N explanation of T* which would then be possible would not fully utilize, under some additional conditions of application, the essential connection between T and T*: schema (1) says nothing about their approximative relations.

IV
Indeed, such relations exist precisely where, due to the fact that (8a) is invalid, at least certain gaps arise in the exact connection between the respective changes of state: Some (perhaps even all?) solutions of the equation of motion of T* result from T not in an exact but in an approximative manner. Precisely this then also gives expression to the fact that T* is not only extended by T but that it is also corrected and that under certain boundary conditions these corrections become arbitrarily small (in relation to a topology). We shall not be able to provide a substitute for (1) which captures T* just as completely as (1) does where it is fully applicable. But we shall be able to create two very special limiting case relations which concern
[9] Scheibe 1973a (this vol. ch. V.20)
(A) the spectrum of energy
(B) the stationary changes of state
for the particle a* in accordance with the theory T*. In the logical sense, (A) and (B) are to come into play as follows. In the case of (B), the contingent assumption C*(a*) that W* is a stationary change of state will be added to the axioms T*(a*). Very explicit propositions $C_0(a^*)$ about these changes of state will then follow from the extended theory $T_0^*$. In a further step, T(a) is then extended into a theory $T_0$ by means of the contingent assumptions C(a). While the extension from T* into $T_0^*$ is arbitrary, that is, it only rests on the decision to explain, say, (B) and not something else, the choice of C(a) must be adapted to this decision: certain (strict) inferences $C_0(a)$ from $T_0$ are supposed to be such that, together with 1) certain limiting case conditions Gr(a) and, of course, 2) the definitions D(a; a*) (here (6)), they approximate the propositions $C_0(a^*)$. In place of (1), we now obtain the following schema:
$$
\begin{array}{ccc}
T(a) \wedge C(a) & \xrightarrow{\ \text{corresponds in meaning through } D(a;\,a^*)\ } & T^*(a^*) \wedge C^*(a^*) \\[4pt]
\Big\downarrow \text{deductively implies} & & \Big\downarrow \text{deductively implies} \\[4pt]
C_0(a) & \xrightarrow{\ \text{approximates with } Gr(a)\, \wedge\, D(a;\,a^*)\ } & C_0(a^*)
\end{array}
\tag{1'}
$$
This schema is standard-setting for many limiting case relations in physics. If, omitting Gr(σ) on the lower line, we were dealing with a strict deduction with the help of D(σ; σ*), then (1') would state that T and T* both deliver a D-N explanation of C_0. In the case before us, however, this is impossible: due to the motion of σ_1, the stationary changes of state of σ_2 in accordance with T* cannot be strictly explained from T. Strictly speaking, they do not even exist in T, and, at best, it will be intelligible that, under certain limiting case conditions, they are more or less adequate approximations.

Thus, with the expectation that limiting case relations can be established here, we recall, for the further calculations in the 2-particle theory T, the transition from position coordinates to center of gravity and relative coordinates and their counterpart in momentum space. With regard to the quantities in T, we introduce the new quantities
(9a)
(9b)
Besides the new tensor product factorization ℋ = ℋ_S ⊗' ℋ_R, which was already used in (9), we obtain the important decomposition

\[ H = H_S \otimes' 1 + 1 \otimes' H_R \tag{10a} \]

of the Hamilton operator (4) with

\[ H_S = \frac{1}{2m_S} P_S^2 \tag{10b} \]

\[ H_R = \frac{1}{2m_R} P_R^2 + \frac{k}{2} X_R^2 \tag{10c} \]
which in a familiar way leads to the separation of the variables for the corresponding Schrödinger equation.

Let us now first turn to the problem (A): We add to T* the proposition (C* above) that the set of the possible energy values is given by the spectrum of (5). It then follows for this set that (C_0^* above)
(11)
In order to explain this, we add to T the proposition (C above) that the set of the possible energy values is the spectrum of the operator of the inner energy 1 ⊗' H_R. It then follows (C_0 above) that the energy values are given by (11) with m_R in place of m_2. Since for

\[ m_1 \gg m_2 \tag{Gr1} \]

m_R comes close to m_2, the energy values from T come close to those calculated from T*. We thus have a limiting case relation for (A) before us in which - as in the case of the hydrogen atom - only the masses enter into the limiting condition.

Things are not quite as straightforward in the case of (B). The extension of T* with C* restricts W* to stationary states which in the position representation result in (C_0^* above)
(12a)
with the proper values (11) and the proper functions
where H_n are the normalized Hermite polynomials and
is posited in an abbreviated form. For this we must find an extension T_0 which restricts the change of state of the total system in such a way that its
projections (7) lead to changes of state of σ_2 which approximate the changes of state (12). It was already stated that stationary changes of state do not exist for the total system. The approximation of the energy values (11) as proper values of (12), however, suggests considering maximal solutions Φ of the Schrödinger equation of (4) which are in addition proper states of 1 ⊗' H_R:
(13a)
In addition, one will assume a center of mass of the total system resting in the mean, i.e.

\[ \langle \Phi | X_S | \Phi \rangle = \langle \Phi | P_S | \Phi \rangle = 0 \tag{13b} \]

since such is valid in T_0^* for σ_2 because of (12) as well as for the classical limiting case of σ_1 there. Finally, we must reckon with the indeterminacy relations for σ_1 which are dropped in T_0^*, and here we want to assume the most favorable case, as far as the approximation is concerned, i.e. that at time t = 0, we have the extreme case

\[ \Delta(X_S \otimes' 1) \cdot \Delta(P_S \otimes' 1) = \frac{\hbar}{2}. \tag{13c} \]

Φ follows explicitly from the theory T_0 defined by means of these contingent additional assumptions. Because of (13a), Φ must be a product

\[ \Phi = \varphi_S \otimes' \varphi_R \tag{14a} \]

where because of (10)

\[ i\hbar\,\dot{\varphi}_S = H_S \varphi_S \tag{14b} \]

\[ i\hbar\,\dot{\varphi}_R = H_R \varphi_R \tag{14c} \]
From (10b) as well as from (13b) and (13c), it further follows that φ_S is the Gaussian wave packet in the position representation

\[ \varphi_S(x_S, t) = (2\pi)^{-1/4}\, \sigma^{-1/2}\, d^{-1/2} \exp\left\{ -\left( \frac{x_S}{2\sigma} \right)^{2} d^{-1} \right\} \tag{15} \]

\[ d = 1 + i\hbar t\,(2 m_S \sigma^2)^{-1} \]

Because of (10c), however, the position representation of φ_R is simply (12) with x_R for x_2 and m_R for m_2 (altogether C_0 above). Although this explicitly gives us Φ, it is still not an easy task to compare its projections in ℋ_2 with the eigenfunctions (12). One must bear in mind that, unlike the latter, the former are not maximal and thus cannot be represented by means of vector functions in ℋ_2. Aside from the technical difficulties of calculation that this poses, for the purpose of comparison, one would first of all need to find a suitable topology for the set of all temporal functions of von Neumann operators in ℋ_2. So here we must rely on the hypothesis that (12) is a limiting case of (13). Yet, in conclusion, we shall at least want to test this hypothesis for the pertinent densities of location and momentum. For the position density of Φ, the integral
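(16)
needs to be evaluated. The calculation results in an expression of the form (a new C_1)

\[ a(m_R)^{-1} \gamma\, \exp\{ -\gamma^2 q(x_2, m_R)^2 \} \cdot \left[ H_n^2\big( \mu\gamma^2 q(x_2, m_R) \big) + G_n\big( \gamma^2 q(\sqrt{2}\,\sigma|d|, m_R);\ \mu\gamma^2 q(x_2, m_R) \big) \right] \tag{17a} \]

with
(17b)
and a polynomial G_n(α; β) with

\[ G_0(\alpha, \beta) = 0, \qquad \lim_{\alpha \to 0} G_n(\alpha, \beta) = 0 \quad \text{for } n \geq 1. \tag{17c} \]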
(17) is to be compared with the position density (a new C_1^*)
of the theory T_0^*. It is immediately clear that this time (Gr1) is not sufficient, but that rather the second condition
(Gr2)
must also obtain. Except for the factor (2n + 1)^{1/2}, we have on the left the indeterminacy of position of the relative motion (see below) and on the right the indeterminacy of the motion of the center of gravity. Together with (Gr1), this means that the motion of σ_1 must occur in a "classical" way as contrasted with that of σ_2 - as was to be expected. It must be borne in mind, however, that because of the t-dependence of d - i.e. because of the spreading of the wave packet (15) - (Gr2) can only be maintained for finite stretches of time. These can be of any length, as long as m_1 is sufficiently large.

The situation is again more straightforward with regard to the densities of momentum if one considers them separately. In the representation of momentum, (15) is
(19) while
X2
with PR and of
The evaluation of the integral of the density of momentum (21) yields an expression of the form (a new C_2)

\[ b(m_R)^{-1} \gamma'\, \exp\{ -\gamma'^2 r(p_2, m_R)^2 \} \cdot \left[ H_n^2\big( \gamma'^2 r(p_2, m_R) \big) + G_n\big( \mu\gamma'^2 r(\sqrt{2}\,\tau, m_R)^2;\ \gamma'^2 r(p_2, m_R) \big) \right] \tag{22a} \]
with
(22b)
This is to serve to approximate the density of momentum delivered by T_0^* (a new C_2^*)

(Gr1) would suffice for this purpose, if we were only dealing with this approximation. The densities of momentum, however, must be approximated together with the densities of position, since, properly speaking, we are even concerned with the approximation of two changes of state about which the former densities only provide partial information. Thus, so that γ ≈ 1 and γ' ≈ 1,
(Gr3)
must be considered with regard to its compatibility with (Gr2). This is ensured, however, since (Gr2) requires, besides a small σ, a large m_S σ. And (Gr3) does just the latter as well.

In conclusion, it should be emphasized that with these results for the densities of position and momentum, our hypothesis - that the changes of state (12) of the theory T_0^* are approximated by the changes of state (14) of the theory T_0 in accordance with (Gr) - has not been proven. Nevertheless, we
have already been able to make clear what should be the most important issue in this regard: As was already mentioned in the introduction, T* conceals a classical limiting case. So it seems at least, if one regards T* in light of the theory T, i.e. when one considers T to be the better theory. Since T* is still a quantum theory, its classical "sins" are not immediately apparent. It is true that there are also independent indications that T* is incomplete, such as the fact that this theory is not even invariant under spatial translations and that thus the conservation of momentum is not valid either. Yet, it is only from the perspective of T that one can judge what T* is lacking as a whole. Here it becomes apparent that the full quantum theory of a second particle (our σ_1) was neglected in T*: A whole Hilbert space has disappeared. Thus, it is no wonder that the limiting case condition (Gr1) is insufficient. The two other conditions (Gr2) and (Gr3) which we have come to know have the obvious generalizations

\[ \Delta(1 \otimes' X_R) \gg \Delta(X_S \otimes' 1) \]

\[ \Delta(1 \otimes' P_R) \gg \Delta(P_S \otimes' 1) \cdot \mu_2 \]

Together with (Gr1), they express a relatively classical behavior of σ_1 as compared to σ_2. When we generalize our question to include the approximation of further changes of state of T*, these generalizations also allow us at least to articulate possible limiting case conditions. Whether or not these are also sufficient remains a completely open question.
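The limiting case relation for problem (A) can be illustrated numerically. The following is only a sketch under stated assumptions: it uses the textbook oscillator spectrum E_n = ħ·sqrt(k/m)·(n + 1/2) for a Hamiltonian of the form (10c) and the usual reduced mass m_R = m_1·m_2/(m_1 + m_2), neither of which is quoted from the text above, and all function names are hypothetical. The last function evaluates the width σ|d| of the packet (15), which bears on the remark that (Gr2) can only be maintained for finite stretches of time.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J*s


def reduced_mass(m1, m2):
    """Assumed standard reduced mass m_R = m1*m2/(m1 + m2)."""
    return m1 * m2 / (m1 + m2)


def oscillator_energy(n, k, m):
    """Assumed textbook eigenvalue of P^2/(2m) + (k/2) X^2, cf. (10c)."""
    return HBAR * math.sqrt(k / m) * (n + 0.5)


def relative_energy_error(n, k, m1, m2):
    """Relative deviation of the T-value (mass m_R) from the T*-value (mass m2)."""
    e_t = oscillator_energy(n, k, reduced_mass(m1, m2))
    e_t_star = oscillator_energy(n, k, m2)
    return abs(e_t - e_t_star) / e_t_star


def packet_width(t, sigma, m_s):
    """Position spread sigma*|d(t)| with d = 1 + i*hbar*t/(2*m_s*sigma^2), cf. (15)."""
    d = complex(1.0, HBAR * t / (2.0 * m_s * sigma ** 2))
    return sigma * abs(d)


if __name__ == "__main__":
    # The deviation shrinks as m1 grows relative to m2 (illustrative values).
    for m1 in (1e1, 1e3, 1e6):
        print(m1, relative_energy_error(n=0, k=1.0, m1=m1, m2=1.0))
```

Under these assumptions the relative deviation equals sqrt(1 + m_2/m_1) - 1, independently of n and k, so only the masses enter; the packet width, by contrast, grows with t and grows more slowly the larger m_S is.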
V.23 A New Theory of Reduction in Physics*

Science is patchwork. Even physics, perhaps the most successful part of science, did not achieve its goal in one attempt. It cannot yet be presented as one unified theory. Students of physics are introduced to various theories: mechanics of mass points, continuum mechanics, electrodynamics, thermodynamics, quantum mechanics, and so on. Moreover, there are other natural sciences besides physics: chemistry and, most importantly, biology. Sometimes it is said that by the invention of quantum mechanics chemistry could be reduced to physics, very much as, within physics, optics was reduced to electrodynamics or thermodynamics to statistical mechanics. Similarly, it is claimed that quantum mechanics has superseded classical mechanics and that, at least in principle, the latter could be replaced by the former and thus be eliminated from the foundations of physics. Thus, in addition to the vast number of scientific theories and disciplines, which today can hardly be comprehended by any one individual, there are also intertheoretic relationships, some of which seem to have a reducing or unifying effect. Galileo's law of free fall and Kepler's laws of planetary motion could be unified in Newton's gravitational theory. Although not always the consequence of a proper unification, these relationships show the totality of theories as being an orderly net rather than a randomly accumulated heap.

Do we have a satisfactory theory describing or explaining the reduction net of the natural sciences or even of physics? I am afraid we do not. The various attempts to find one always proceeded in the way usual for enterprises of this kind. First one looks for a sufficiently (if not excessively) general concept and only then, by way of specializing it, gets down to cases of application.
In this manner the unity of the concept in question is given priority: Insofar as there are different kinds of reduction at all they are at once recognized as being only so many special cases of the one 'true' idea of reduction common to them all.1 Although this is the usual analytic strategy in concept explication followed up in all the attempted explications of metascientific concepts well known from the positivistic tradition, in the case of theory reduction, I want to argue, it was a mistake. In the following, by turning the tables, I am going to develop what might be called a synthetic theory of reduction. In this theory we start from a couple of special and widely differing concepts of reduction and gain the most general case by their recursive combination. The most general reduction thus always appears as a certain combination of those elementary, pure cases from which the recursion started, and this feature seems to meet most happily the demands of the actual cases known from science.

* First published as Scheibe 1993a
1 The "structuralist" theory of reduction is the most recent example. See Balzer et al. 1987, chaps. 6 and 7. Note, however, that in another sense the structuralists are more restrictive in their usage of the term "reduction" than is suggested in this approach (see note 9).

The various reductions occurring in physics are most conveniently analyzed,
not by subsuming them under one general preconceived concept, but by recognizing them as so many combinations of a finite number of elementary reductions.2

We can at once indicate the new approach by correcting another mistake due to the received view. Philosophers of science seem to hold a conviction that theory reduction is a binary intertheoretic relation: One theory T is reduced to another one T'. In fact, however, a third element is always involved: the one that actually does the reduction. The binary relation view seems to be a consequence of the positivistic theory of reduction according to which a reduction of T to T' is essentially a logical deduction of T from T'.3 We will see in the sequel that this is not the case without important qualifications. By these qualifications, among other things, the third element, the vehicle v of the reduction - as I will call it - is introduced. An explicit notation for a reduction statement would then be:

\[ \mathrm{red}_i(T, T'; v) \tag{1} \]

expressing that theory T is i-reduced by v to theory T'. Here the variable i indicates the various kinds of reduction that come into play, for example, (as we will see) "generalizing". Starting with initial reductions i = 1, ..., n, the most general reduction is generated by the following rules of recursion:

\[ \mathrm{red}_1, \ldots, \mathrm{red}_n \text{ are reductions;} \tag{R$_1$} \]

\[ \text{if } \mathrm{red}_i \text{ and } \mathrm{red}_k \text{ are reductions then also } \mathrm{red}_{(ik)} \text{ is a reduction.} \tag{R$_2$} \]
Here "red_(ik)" stands for the reduction generated by first i-reducing any theory T to theory T' by means of vehicle v and then k-reducing T' to a third theory T'' by means of vehicle w. Accordingly, if v·w is the new vehicle, the operation of combining reductions to yield new reductions has to satisfy the condition

\[ \text{If } \mathrm{red}_i(T, T'; v) \text{ and } \mathrm{red}_k(T', T''; w) \text{ then } \mathrm{red}_{(ik)}(T, T''; v \cdot w). \tag{C} \]
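The rules (R_1), (R_2) and the condition (C) can be sketched as a small data structure. This is only an illustrative reading, not part of the theory itself: the class and function names are hypothetical, theories are represented by bare labels, and the composite vehicle v·w is modelled, as an assumption, by concatenating the components of v and w.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reduction:
    """A reduction statement red_i(T, T'; v); all field names are illustrative."""
    kind: str                 # the index i, e.g. "i" or a composite "(i k)"
    reduced: str              # label of the reduced theory T
    reducing: str             # label of the reducing theory T'
    vehicle: Tuple[str, ...]  # components of the vehicle v


def compose(first: Reduction, second: Reduction) -> Reduction:
    """Rule (R_2) under condition (C): combine red_i(T, T'; v) and red_k(T', T''; w)."""
    if first.reducing != second.reduced:
        # (C) presupposes that the intermediate theory T' is shared by both steps.
        raise ValueError("intermediate theories do not match")
    return Reduction(
        kind=f"({first.kind} {second.kind})",
        reduced=first.reduced,
        reducing=second.reducing,
        # Assumption: the composite vehicle v.w is the concatenation of v and w.
        vehicle=first.vehicle + second.vehicle,
    )


if __name__ == "__main__":
    step1 = Reduction("i", "T", "T'", ("v",))
    step2 = Reduction("k", "T'", "T''", ("w",))
    print(compose(step1, step2))
```

Keeping the vehicle as an explicit field makes the point of (C) visible: without v and w there would be nothing from which to build the vehicle of the composite reduction.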
Evidently, from this requirement and the recursion rules, the vehicle of reduction is all-important for the generation of new reductions from given ones and its usual repression is a fatal destruction of the whole concept of reduction. In the following the synthetic theory of reductions based on rules (R_1) and (R_2) is presented mainly by way of illustration. Some general remarks, however, are in order. First, my main emphasis is on the various kinds of reduction - elementary and compound. Once reductions are indexed and an i-reduction is understood to be a (ternary) relation, different from the relation given by k-reduction for k ≠ i, the emphasis is hardly necessary. Accordingly, in formulating the above rules the simpler expression "reduction" is used instead of "kind of reduction."

2 Some considerations in Ludwig, 1978, chap. 8, are suggestive of a synthetic theory of reductions. A comparison with the present approach is an urgent demand.
3 Nagel 1961, Ch. 11, sect. 2; and Scheibe 1989a

Second, the number of elementary reductions is
deliberately left open: We simply do not know what new kinds of reduction will have to be introduced in order to understand the further development of physics. Only combinations of reductions already given are a priori known to be reductions again. At the same time, the complex reductions make our general concept of reduction nontrivial even if the list of elementary reductions is not closed. Third, even more illusory than to know all kinds of reduction is the aim to precisely characterize any given kind in the sense that all its instances are genuine reductions of this kind. Genuineness here is a question of more or less, and often there will be degenerate cases, for instance, of self-reduction whose exclusion would be completely ad hoc. Fourth, the domain where a reduction of a given kind is defined depends on this kind, and this has to be observed in applying rule (R_2). Finally, (R_2) seems to be idempotent in the sense that red_(ii) is always identical with red_i.

In this essay considerations are mainly restricted to the formalisms of physical theories, neglecting questions of interpretation and application beyond the absolutely necessary. With some formalization of first order set theory as our invariable background, the formalisms reduce to the axiom systems of theories. The reduction of physical theories thus becomes the reduction of set theoretical formulas functioning as physical axioms, and the vehicles of the reductions also become certain syntactical entities like formulas or terms. Accordingly, so far the status of reduction statements is essentially that of set theoretical theorems. Despite this purely syntactical background, the following presentation will occasionally use the "material mode of speech." Formulas and terms are then replaced by classes of structures and mappings between them in a fictitious set universe.4

In reconstructing physical axioms as set theoretical formulas we can go one step further and take them to be species of structures in the sense of Bourbaki.5 By using mathematical structures for representing physical systems, a physical axiom says about a structure what we want it to say about the physical system represented in (or even: by) that structure. In this way we reconstruct a physical theory as being a concept of physical systems, and this nicely meets the terminology of physics according to which we speak of Newtonian, Hamiltonian, quantum mechanical systems according to whether a system is submitted to Newtonian, Hamiltonian or quantum mechanics, respectively. It is then only with respect to a well-determined range of application T that a theory makes an empirical claim, namely, that all systems belonging to T are represented in mathematical structures satisfying the axioms of the theory.

4 It has become fashionable to replace the syntactical approach of (the earlier) Carnap by a so called "semantical" approach (see note 1; Stegmüller 1979, sec. 1 and 2; van Fraassen 1980, chap. 3, sec. 6). Apart from the elimination of some irrelevancies I can see no essential advantage of this approach. If we come down to special cases, it is hard to see how we can do without having to resort to the languages.
5 Bourbaki 1968, Ch. 4; for details see Scheibe 1986d, this volume ch. VIII.36
I

After these preparations, let us now introduce reductions by direct generalization. In direct generalization we pass from a theory T to a theory T' more general in the sense that every physical system admitted by T is also admitted by T' but (as a rule) not vice versa. We reduce T to T' by knowing a contingent condition such that T' together with this condition leads back to T. The idea thus is to give T' a wider range of application than T. A very condensed formulation of direct generalization is

\[ \Sigma' \wedge c \equiv \Sigma \tag{2a} \]

where Σ and Σ' are the axioms of T and T', respectively, c is the additional condition restricting T' to T, and all three statements are formulated in the same terms. The (set theoretical) equivalence ≡ is, therefore, an ordinary reaxiomatization. In the terminology introduced previously, condition c is the vehicle of this kind of reduction. Systems admitted by Σ' but not by condition c are possible advances of T' over T, typical for direct generalizations. The formula

\[ \mathrm{New} \wedge \mathrm{Eny} \equiv \mathrm{Kep} \tag{2b} \]
illustrates the general situation: Kepler's three laws are reduced by generalization to Newton's gravitational theory (in its central field approximation) with the help of the condition that the energy of the planet is negative. The new possibilities introduced in this case are the unbounded paths admitted by Newton's but not by Kepler's theory and again excluded by the energy condition.6

6 In (2b) both sides of the equivalence are to be understood as statements about the path of the planet.

Prior to any comments on our first type of reduction I will now introduce another: reductions by refinement. Theory T' is a refinement of T by means of P if P is a surjective mapping assigning to every system possible according to T' a possible system of T in such a way that for any two systems related by the mapping the description of the T'-system (according to T') is more complete (or more refined) than the description of the T-system (according to T). In this case, therefore, the possible advances of T' over T are not new physical systems but rather a more refined description according to T' of each system already allowed by T. Accordingly, T' has the same domain of application as T. In syntactical terms we may write

\[
\begin{aligned}
\Sigma'(s') \wedge s = P(s') &\vdash \Sigma(s) \\
\Sigma(s) &\vdash \exists s'.\ \Sigma'(s') \wedge s = P(s'),
\end{aligned}
\tag{3a}
\]

for reduction by refinement where besides the axioms Σ' and Σ of T' and T, respectively, P is a term that "defines" a model s of Σ out of any given model s' of Σ'. In semantical terms, P stands for the mapping mentioned and is the vehicle of the reduction. The second line of (3a) makes P surjective.7 As an example from black body radiation let us take the reduction of Stefan-Boltzmann's law SB to Planck's law PL:

\[
\begin{aligned}
PL(\rho, \theta) \wedge u = \Delta(\rho) &\vdash SB(u, \theta) \\
SB(u, \theta) &\vdash \exists \rho.\ PL(\rho, \theta) \wedge u = \Delta(\rho).
\end{aligned}
\tag{3b}
\]
Here ρ and u are the spectral and integral energy density respectively and θ is the temperature. The refinement is evident from the defining relation Δ: Whereas in SB the radiation is only described by the integral energy density u, in PL the entire spectral distribution ρ as function of the frequency ν of the radiation is part of the description.8 If we compare (3b) with the corresponding deduction in (2b) we can realize how little it means to characterize reductions as derivations; we have derivations in both cases. Still the effect is entirely different in the two cases according to the different nature of the additional premise - the vehicle: In the Kepler-Newton case it restricts the totality of Newtonian systems to a subset; in the other case it coarsens the description of each single system.

The difference in question - in contrast to the mere derivation - can also elucidate the different kinds of progress that are brought about in the two cases. In reduction by generalization possible progress lies in the discovery of new physical systems allowed by the reducing theory but excluded by the reduced one. Experience has convinced us that hyperbolas are physically possible orbits in a central gravitational field. It was just a mistake to exclude them. On the other hand, in reduction by refinement nothing of this sort is involved. The more refined description of each system made available in the reducing theory can be an advance if we are provided with confirming experiences. Let us point out that both kinds of progress would mean nothing if the reductions did not allow us to recover the reduced theories in some sense. If they had been empirically adequate theories and could not be recovered we would hardly be in the position to talk of progress. It is here, in the conservative part of reduction, where the importance of deduction lies: By being deduced the reduced theories are recovered in a most literal sense.
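The coarsening performed by the vehicle in (3b) can be made concrete numerically. The sketch below rests on assumptions not spelled out in the text: it uses the standard form of Planck's spectral energy density, ρ(ν) = (8πhν³/c³) / (exp(hν/kθ) - 1), takes the defining relation to be integration of ρ over all frequencies, and compares the result with u = aθ⁴ for the radiation constant a = 8π⁵k⁴/(15h³c³). The function names and the cut-off parameters are illustrative choices.

```python
import math

H = 6.62607015e-34   # Planck constant in J*s
C = 2.99792458e8     # speed of light in m/s
K_B = 1.380649e-23   # Boltzmann constant in J/K

# Radiation constant a in u = a * theta^4 (assumed standard value of the SB law).
RADIATION_CONSTANT = 8.0 * math.pi ** 5 * K_B ** 4 / (15.0 * H ** 3 * C ** 3)


def planck_density(nu, theta):
    """Spectral energy density rho(nu) of Planck's law at temperature theta."""
    x = H * nu / (K_B * theta)
    return 8.0 * math.pi * H * nu ** 3 / C ** 3 / math.expm1(x)


def delta(rho, theta, steps=20000, x_max=60.0):
    """Coarsening map: integrate rho over frequency with the midpoint rule.

    The integral is cut off at h*nu = x_max * k * theta, where rho is negligible.
    """
    nu_max = x_max * K_B * theta / H
    d_nu = nu_max / steps
    return sum(rho((j + 0.5) * d_nu, theta) for j in range(steps)) * d_nu


def stefan_boltzmann_density(theta):
    """Integral energy density u according to the Stefan-Boltzmann law."""
    return RADIATION_CONSTANT * theta ** 4


if __name__ == "__main__":
    for theta in (300.0, 6000.0):
        print(theta, delta(planck_density, theta), stefan_boltzmann_density(theta))
```

The map goes one way only: many spectral distributions ρ share the same integral u, which is the sense in which the vehicle here coarsens the description of each single system instead of selecting a subset of systems.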
7 The first line of (3a) is a deduction in the sense of Bourbaki (1968, chap. 4).
8 In (3b) the temperature θ is unaffected by the derivations. This case is covered by (3a) with partially trivial terms P.

Let us now look closely at the work done by the vehicle of a reduction if the latter is to fulfill its explanatory role. For generalization, (1) alternatives to the special theory T are available within the more general theory T', and (2) a condition c is distinguished under which, given T', theory T and not any of its alternatives is bound to hold. In this sense, we know why T. The explanatory function of generalization transpires most clearly from the situation where the old theory T once had been declared to be valid on a priori grounds. This, of course, not only means that no alternatives to T were known but that on the
alleged evidence of those grounds no alternatives could exist. Consequently, no explanation of T apart from self-explanation could be given. But as soon as alternatives are made available through generalization the major question becomes: Why T and not any of those alternatives? This question is answered by means of the condition c. A story of this kind is given by the development of geometry in the time between Kant and Einstein (to be discussed later).9 An explanation different in kind is provided by reduction by refinement. Here the terms P take over the burden of explanation by telling us how the basic concepts of theory T depend on those of theory T' and therefore the axiom Σ holds if Σ' is assumed to hold.10

9 Despite the evident explanatory and reductive features of generalization, it is not classified as a reduction in Balzer et al. 1987
10 Equations (3a, b) involving the vehicle are here taken to be analytical statements, thus allowing the whole reduction statement to be analytical in character. This situation may occur even in the proper case of a reduction in which the reduced theory T is known prior to T'. The equations in question then are a real (as opposed to a nominal) definition. The case where the equations are a so-called "synthetic identity" has to be treated separately; see Nagel's 1961 treatment and more recently Sklar 1967 and Scheibe 1988a.

II

Before considering combinations of generalization and refinement let us address in what sense, if any, these kinds of reduction are elementary. If this is meant in an absolute sense, I do not know the answer. However, interesting as the answer may be, it certainly need not be given in order to establish our theory; for we are free to declare these and further reductions to be elementary in the relative sense that by fiat they are chosen to be the initial settings of the recursion (R) and thus made the building blocks of the theory. In the case before us we can distinguish subkinds of direct generalization and refinement so that in this sense these reductions are simply not elementary. An obvious case in point is equivalence as a subkind of refinement: T and T' are equivalent if they are mutual refinements of each other. This implies that the terms P and P^{-1} transforming T into T' and T' into T according to (3a) are inverses of each other. As reductions these equivalences may be called "degenerate" because equivalent theories can be reduced to each other. Still in physics we cannot do without them. Being changes in the conceptual basis of a theory they teach us to view one and the same physical system in different perspectives, and their application sometimes leads to unexpected insights into intertheoretic relations. A first type of combination of two reductions, one of which is an equivalence, may confirm this.

Reductions by direct generalization are much too special to be an adequate explication of our usual understanding of generalizations. A case in point is geometry. One of the momentous generalizations in the history of mathematics was the step from Euclidean to Riemannian geometry. Its application to physics by Einstein led to the analogous generalization of
Minkowskian to Lorentzian geometry of spacetime.11 In both cases it was a step from a metrically flat to a curved manifold. And it was a genuine reduction of the special to the general case with respect to the vanishing of the Riemann tensor as the reducing subsidiary condition (the vehicle). But it was not a direct generalization. Direct generalization assumes - as is implicit in formulas (2) - that the theories related are formulated in the same language. This, however, was hardly the case for the present example: Before Riemann nobody thought of defining Euclidean space by means of a metric tensor field. To include this case and many others known from the history of physics in our theory we have to allow for reductions by generalization involving equivalences. More precisely, "generalization" is direct generalization preceded or followed by an equivalence. A Euclid-style version of Euclidean geometry has first to be equivalently transformed into its Riemann-style formulation which in turn leads to Riemannian geometry by direct generalization.

Furthermore, there is also the important case of combining refinement with (direct or indirect) generalization. This type frequently occurs as forward reduction beginning with specialization followed by coarsening. In the systematic sense a forward reduction is just the inverse of a reduction proper. There is, however, a pragmatic difference between the two. In a reduction proper we assume that the theory to be reduced was known prior to the reducing theory. Many applications of our concept exist, however, where it is the other way round. In particular specialization and coarsening as the relations inverse to generalization and refinement are frequent steps taken in physics. If, for instance, Hamiltonian mechanics is put to use, the first step is always a specialization leading to a fairly definite phase space and Hamiltonian.
In a second step this theory is coarsened by the definition of some quantities for which an empirical law is derived from the equations of motion, for example, Kepler's third law from Newton's equations. Another example is the scattering theory in quantum mechanics. Here, on a very general level, in a first step (specialization) the bound states, if any, are eliminated which in a second step (coarsening) allows the definition of the S-operator and the derivation of its characteristic properties.

The importance of equivalences as constituents of complex reductions is conceptual assimilation. We have already seen this in the case of geometry in which Riemannian geometry assimilates basic concepts of Euclidean geometry. Stressing the heuristics in the process one could also say that conceptual transformation is part of the insight as to how Euclidean geometry can be fruitfully generalized. Such assimilation and transformation is brought about by equivalence. A well-known lesson that things, looking entirely different at face value, eventually turn out to be the same was taught by quantum mechanics: the equivalence of matrix and wave mechanics. An even more amazing case was Cartan's geometrization of Newton's gravitational field theory.12 He and his followers could show that under suitable boundary conditions Newton's theory is equivalent to the differential geometry of a 4-dimensional flat Galilean metric together with a nonvanishing symmetric connection compatible with the metric. The equivalent of Newton's field equation is a field equation expressed with the help of the Ricci tensor of the connection. The equivalent of Newton's equations of motion are the equations for the geodesics of the connection. The proof of this equivalence was the first step in the attempt to reduce Newton's theory to Einstein's theory of general relativity. However, to obtain this reduction no combination of the elementary reductions introduced so far would suffice. Rather our list of elementary reductions has to be enlarged.13

11 Although the geometry of general relativity is a generalization of the geometry of special relativity, the relation between general and special relativistic physics is more complex.

III

We here meet with the weakness of the elementary reductions considered in the foregoing sections: They as well as their combinations always allow for an exact recovery of the axioms of the theories superseded by their historical successors, and the progress, if any, does not include any essential correction of the earlier theories. Such cases, however, rarely occur in the history of physics. As a rule a theory and its presumable successors are rivals, their axioms contradict each other, and the progress that is made does include a modification of the axioms of the theory superseded. Such is the case with Einstein's theory of gravitation vis-à-vis Newton's; it is even the case with the latter vis-à-vis Kepler's laws of planetary motion, and indeed we can find very simple illustrations of this kind of progress.
The main difficulty in finding new types of reduction taking account of rivalry between theories is to reconcile this situation with the demand that the predecessor theory, insofar as it had been confirmed, must be recovered in some sense from the theory superseding it; otherwise we would not be entitled to speak of any progress attached to the reduction. The essential idea for finding reductions relating incompatible theories is to allow for approximations: Instead of an exact recovery, the reduced theory has to be reproduced only approximately. Let us see what this can mean by looking at the simple case of the reduction of the ideal gas law
$$p \cdot v = R \cdot T \tag{4a}$$
to van der Waals' law
$$\left(p + \frac{a}{v^{2}}\right) \cdot (v - b) = R \cdot T \qquad (a, b > 0). \tag{4b}$$
It is obvious that these laws contradict each other in the sense that, given the constant $R$, there is no solution $(a, b, p, v, T)$ of (4b) such that $(p, v, T)$ is a solution of (4a).
12 See, for instance, Künzle 1972.
13 Conceptual assimilation can also be brought about by refinements proper if they concern the reducing theory. See the case of microreductions in section IV.
As to reduction, we can first specialize van der Waals' law to be a law for one sort of gas only, say, oxygen. In this case we have numerically fixed values $a_0$ and $b_0$ for the van der Waals constants. The parameters left, that is, $p$, $v$ and $T$, are then the same in the two theories, and it is evident that for (4c) given values $p$, $v$, $T$ satisfying (4b), though not also satisfying (4a), are located very close to values that do satisfy (4a). Whereas in this case of, as it might be called, asymptotic reduction the two theories are built over the same conceptual material, the general van der Waals law with variable $a$ and $b$ in fact includes a refinement of the description of a gas as compared with the ideal gas law. In this case the approximation is most conveniently reconstructed as an ordinary limit process: If we look at $a$ and $b$ in van der Waals' equation as parametrizing the solutions $p$, $v$, $T$, then the union of these 2-dimensional hypersurfaces (in the 3-dimensional space of all $p, v, T > 0$) has the hypersurface made up of the solutions of the ideal gas law as its boundary. Thus here the reduced theory appears as a limiting case¹⁴ of the reducing theory for $a, b \to 0$.
The foregoing example is typical for a large class of reductions in which the two theories involved are not essentially different in their most fundamental, that is, topological or geometrical, parts but have contradicting laws in the proper sense of the term. This holds in particular for the difference between asymptotic reduction (I) and limiting case reduction (II). We give a parallel description of these two kinds in three parts according to the three components of their respective vehicles.¹⁵
(1) The first component appears as the common part $E_0$ in the assumed decompositions
$$E'(S_0, s) \equiv E_0(S_0) \wedge s \in M'(S_0), \qquad E(S_0, s) \equiv E_0(S_0) \wedge s \in M(S_0) \tag{5a}$$
where $M'(S_0)$ and $M(S_0)$ involve typifications by the same scale set $\sigma(X_0)$, with $X_0$ the principal base sets of $S_0$. In case (I) $E'$ and $E$ are the reducing and reduced axioms respectively, while in case (II) $E'$ is already a "refinement"
$$E'(S_0, s) \equiv \exists S_0', s'.\; E''(S_0, S_0', s') \wedge s = q(S_0, S_0', s') \tag{5b}$$
of the proper reducing axioms $E''$, and the term $q$ also belongs to the vehicle. Though $E'$ and $E$ contradict each other (see (2)), they have common partial models in the sense¹⁶ that for all $S_0$ such that $E_0(S_0)$ there are $s$, $s'$ such that
$$s' \in M'(S_0) \quad \text{and} \quad s \in M(S_0). \tag{5c}$$
14 The expression "limiting case" is used by physicists. The earliest usage I could find is the German equivalent "Grenzfall" in Hertz ([1892] 1914, 21ff.).
15 For details concerning species of structures the reader must be referred to Bourbaki 1968. We use a vector notation for the structures: $S_0$ is a tuple of base sets and typified sets, $s$ is a tuple of typified sets, and so on.
Evidently, these assumptions allow us to define a relation between any two structures satisfying $E'$ and $E$, respectively: They stand in this relation if their fragments typified according to $E_0$ are identical. The further reduction only concerns $M'(S_0)$ and $M(S_0)$ for any given $S_0$, and we drop the argument.
(2) The second component of the vehicle is a topological space $(M_0, t_0)$ depending on and deducible from $S_0$ by means of $E_0$ such that
$$M, M' \subseteq M_0 \subseteq \sigma(X_0). \tag{6a}$$
Moreover, with the bar indicating topological closure,
$$\bar{M} \cap M' = M \cap \bar{M'} = \emptyset \tag{6b}$$
in case (I) and
$$M \cap M' = \emptyset \tag{6c}$$
in case (II) hold as expressing mutual inconsistency of the laws. In case (I) we actually assume $t_0$ to be a uniform structure.
(3) Finally, the third component of the vehicle is a monotonically decreasing family of sets $C_\delta \subseteq M_0$ ($\delta > 0$), independent of $M'$ in case (I) and subsets of $M'$ in case (II). With their help the reduction statement in case (I) is
$$\text{for all } u \in t_0 \text{ there is } \delta > 0 \text{ such that } M' \cap C_\delta \subseteq M_u \tag{7a}$$
(where $M_u$ is the $u$-neighborhood of $M$) and in case (II) is
$$M = \{\, s \in M_0 \mid s \notin M' \text{ and for all } u \in t_0,\ s \in u, \text{ there is } s' \in M_0 \text{ such that } s' \in C_\delta \cap u \,\} \tag{7b}$$
16 The class of physical systems admitted by a physical theory is in general much smaller than the class of corresponding structures satisfying its axioms. The further restriction is then brought about by fixing one particular structure as, for instance, $S_0$ in (5a). Such is the case in the gas examples given. This qualification is suppressed in our treatment for the sake of simplicity.
What does this mean? Because of (6b), in case (I) there is no question of any solution of law $M$ being a limiting case of solutions of $M'$. There are, however, common accumulation points (possibly at infinity) in the sense that for all neighborhoods $u \in t_0$: $M_u \cap M' \neq \emptyset$ and $M \cap M'_u \neq \emptyset$. Equation (7a) then says that under the special conditions of $C_\delta$ any solution of $M'$ comes very near a solution of $M$. Thus we have a typical asymptotic behavior in this case.¹⁷ By contrast, in case (II) the solutions of $M$ are limiting cases of solutions of $M'$: It follows from (7b) that $M$ is a subset of the boundary of $M'$ in the narrow sense that $M \subseteq \bar{M'} \setminus M'$. The function of $C_\delta$ in this case is to pick out that part of the boundary that coincides with $M$: In general, a law may have several limiting cases.
Obviously, from (7a) asymptotic reduction is an approximate variant of direct generalization. In view of (5b) it seems that limiting case reduction, if $E''$ is the reducing axiom, may even properly be decomposed into an exact refinement preceded by a truly approximate reduction as described in (7b). So why not just omit (5b) in (II) and take $E'$ to be the reducing axiom, thus making full use of the very idea of a synthetic theory of reduction? From a purely formal point of view we would be entirely justified in doing this. However, there is the difficulty that in successive reduction of axioms $E$ to $E'$ and $E'$ to $E''$ and thus of $E$ to $E''$ it may happen that, whereas $E$ and $E''$ represent ordinary physical theories $T$ and $T''$, respectively, no reasonable theory $T'$ corresponds to $E'$. In our above example the ideal gas law ($E$) and van der Waals' law ($E''$) are reasonable physical laws. But $E'$ in this case would be the statement (about $p$, $v$, and $T$) that there are (constants) $a$ and $b$ such that van der Waals' law holds. And this, though a perfectly clear statement, is not the kind of statement that we would like to call a physical law. Thus the insertion of $E'$ into this reduction is only "virtual" in the sense of introducing virtual particles in a physical interaction. A solution of this problem cannot be given in the present essay.
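The difference between the two kinds of approximation in the gas example can be made tangible numerically. The following is only a minimal sketch under stated assumptions: the oxygen constants are approximate textbook values, $v$ is read as the molar volume, dilute states (large $v$) stand in for the special conditions of case (I), and the closeness of the two laws is measured by the relative pressure difference at equal $v$ and $T$, which is a simplification of the topological formulation given above. All function names are hypothetical.

```python
# Toy numerical comparison of the ideal gas law (4a) with van der Waals' law (4b).
# Assumptions: SI units, v is the molar volume, and the oxygen constants below
# are approximate textbook values used purely for illustration.

R = 8.314462618          # gas constant, J/(mol K)
A_O2 = 0.1382            # assumed a_0 for oxygen, Pa m^6 / mol^2
B_O2 = 3.186e-5          # assumed b_0 for oxygen, m^3 / mol


def p_ideal(v, T):
    """Pressure from (4a): p * v = R * T."""
    return R * T / v


def p_vdw(v, T, a, b):
    """Pressure from (4b): (p + a / v**2) * (v - b) = R * T, solved for p."""
    return R * T / (v - b) - a / v**2


def relative_gap(v, T, a, b):
    """Relative distance between the two pressures at the same (v, T)."""
    return abs(p_vdw(v, T, a, b) - p_ideal(v, T)) / p_ideal(v, T)


T = 300.0  # kelvin

# (I) Asymptotic reduction: a_0, b_0 stay fixed; the gap shrinks for dilute states.
asymptotic_gaps = [relative_gap(v, T, A_O2, B_O2) for v in (1e-3, 1e-2, 1e-1)]

# (II) Limiting case reduction: the state (v, T) stays fixed; a and b are scaled to 0.
limiting_gaps = [relative_gap(1e-3, T, s * A_O2, s * B_O2) for s in (1.0, 0.1, 0.01)]

if __name__ == "__main__":
    print("asymptotic:", asymptotic_gaps)
    print("limiting:  ", limiting_gaps)
```

In the first list the law itself is untouched and only the region of state space changes; in the second the state is fixed and the law is deformed toward the ideal gas law, which is the contrast drawn between cases (I) and (II).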
IV
Asymptotic reduction as described before is still not general enough to cover cases in which the approximations concern only partial descriptions of the respective physical systems. Such is the case, for instance, with the reduction of the Rayleigh-Jeans radiation law to Planck's law. But an appropriate generalization is easily obtained, and I will leave this case in order to address more important ones as we find them in relativity and quantum theory and microreduction. In this essay, I cannot even touch upon the deep questions that can be and have been raised in connection with intertheoretic relations in these fields. It goes without saying that my work on theory reduction was motivated by the well-known difficulties concerning the folk views on progress and unity of science as they were pointed out by Kuhn and Feyerabend and, even earlier, by the physicists themselves.¹⁸ As regards these difficulties I have nothing to add to the main general idea on which the present approach to reduction is founded. But I hope to show in the last two sections that the greater flexibility that this idea gives to theory reduction is an advantage in the treatment even of those difficult cases.
17 It seems that in fact we always also have $M \cap C_\delta \subseteq M'_u$, which makes asymptotic reduction an almost symmetric affair.
18 See Kuhn 1983; Feyerabend 1975, chap. 7; and Scheibe 1988b (this volume, ch. II.6).
The main lesson taught by a comparison of the various (special and general) relativity theories with their nonrelativistic predecessors is the effect of equivalence transformations. As mentioned earlier, the main reduction statement was either a statement of equivalence itself or of equivalence combined with direct generalization or refinement. Now that approximate reductions are also at our disposal, new combinations with equivalence are possible. Their major effect in the case of rival theories is that conceptual assimilation allows not only the very formulation of rivalry but also the wanted approximate reduction with respect to the conceptual bases obtained. On account of their rich equivalence structure the situation is best illustrated by physical theories based on geometries. There is, for instance, a common formulation for field theories based on either Galileo or Lorentz manifolds. In it each metric is described by a triple consisting of two tensor fields $g$ and $h$, roughly standing for time and space, and a real non-negative parameter $\lambda$. This may be symbolized by
$$g_{ab} h^{bc} = -\lambda\, \delta_a^{\,c} \ \text{ and other axioms}, \qquad \begin{cases} \lambda = 0 & \text{Galileo} \\ \lambda > 0 & \text{Lorentz} \end{cases} \tag{8}$$
where it is understood that the entire difference between the theories is expressed by the different values of the parameter $\lambda$.¹⁹ The formulation signals that approximation, as far as it is possible at all, is of the limiting case type ($\lambda \to 0$).²⁰ At the same time it is not to be expected that the possibility of an approximate reduction is invariant under equivalence transformations. This remark is, if not all, then at least a great deal of what can be said about the alleged difficulties with conceptual incommensurabilities. Their major source is that axioms of theories contradicting each other on a common conceptual basis allow us to define new concepts in either theory whose definition is not possible in the other. Thus, starting with the given version of field theory, the Galilean axioms allow the introduction of an absolute time while this is not possible in the Lorentz-Einstein case. It is then natural to look for an equivalent formulation of the Galilean theory using the concept of absolute time as primitive. If this formulation is compared with the original version of the Lorentz-Einstein theory, incommensurability of the theories becomes evident. In general, however, it amounts to no more than that contradicting theories $T$ and $T'$ allow for equivalent formulations $T_1$ and $T_1'$ using concepts that are not definable in $T_1'$ and $T_1$, respectively. From this I draw the conclusion that so far it is not incommensurability as such that we have to be puzzled about. It is rather particular instances of it that happen to be of a fundamental character. However, this conclusion is not final as long as we have not taken quantum theory into consideration.
19 See Ehlers 1986.
20 It is still not clear what kind of approximate reduction has to be applied in reducing Newton's gravitational field theory to Einstein's, let alone Newton's n-body theory of gravitation. See Künzle 1976, Ehlers 1981, and Lottermoser (1988).
Perhaps the most recalcitrant reduction problem within physics is the one of reducing classical mechanics to quantum mechanics. The present approach offers no solution to this problem. The following observation may throw some light on why the problem is so difficult. On the one hand, the divergence of quantum mechanics from its classical predecessor begins already at the deepest possible level of state descriptions: Quantum mechanical observables and states in their usual representation on Hilbert space seem to have almost nothing in common with their classical counterparts on phase space. On the other hand, some identification on this level was the basis of all approximate reductions considered so far. In (5c) we have sharpened this condition as the existence of (sufficiently many) common partial models of the two theories. In the case now before us, what would the common theory $E_0$ occurring in (5c) be like? Configuration space is a suitable common partial model of the two theories.²¹ There are formulations of the theories in which configuration space plays just this role. But if in these formulations we look at the other concepts to be compared the situation seems hopeless. In classical mechanics the observables, for instance, are represented by real-valued functions on the cotangent bundle over configuration space. In quantum mechanics they are represented by self-adjoint operators on a Hilbert space of complex functions on that space. We are thus left with two classes of entities of completely different types with respect to configuration space, suggesting no comparison. Moreover, in this reconstruction the theories do not even contradict each other.
21 See Ashtekar 1980.
If, on the other hand, we follow the formulation, very popular for quantum mechanics, which makes the set of observables the principal base set of the describing structure, then we easily find contradicting consequences of the two axiom systems. But in this reconstruction the two theories seem to have no common partial models including such basic entities as observables (and states). Under fairly restrictive assumptions to be specified in a moment the difficulty may be overcome by means of the so-called Weyl-Wigner transformation. It provides us with two real Lie algebras of observables having the same underlying vector space $A$ but two different Lie products: one classical, $[A, B]$, the other quantum mechanical, $[A, B]_h$, depending on $h$ (= Planck's constant). In the classical phase space representation, $[A, B]$ is the Poisson product; in the Hilbert space representation $[A, B]_h$ is $(hi)^{-1}$ times the commutator product. The first representation of $[A, B]_h$ and the second of $[A, B]$ are more complicated. At any rate, the situation permits a topological comparison of the two products (as subsets of $A^3$), and it turns out that in a sense
$$\lim_{h \to 0} [A, B]_h = [A, B]. \tag{9}$$
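For orientation only, the formal side of (9) can be written out in the phase space representation, where the quantum Lie product is commonly expressed as the Moyal bracket. The display below is a sketch under assumptions that go beyond the text: $h$ is read as the reduced Planck constant, $A$ and $B$ are taken to be smooth functions of one pair $(q, p)$, and the arrows indicate on which factor a derivative acts. It is a formal power series statement and says nothing about the topology in which the limit is taken.

```latex
% Assumed notation: h is the reduced Planck constant; A, B are smooth functions of (q, p).
[A,B]_{h}
  \;=\; \frac{2}{h}\, A \,\sin\!\Big(\tfrac{h}{2}\big(
        \overleftarrow{\partial_q}\,\overrightarrow{\partial_p}
      - \overleftarrow{\partial_p}\,\overrightarrow{\partial_q}\big)\Big)\, B
  \;=\; [A,B] \;+\; O(h^{2}),
\qquad
[A,B] \;=\; \partial_q A\,\partial_p B \;-\; \partial_p A\,\partial_q B .
```

The first term of the sine series reproduces the Poisson product, and all corrections carry at least a factor $h^2$; for polynomials of degree at most two in $q$ and $p$ the corrections vanish identically.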
In its topological features this approximation may be reconstructed as a limiting case reduction in the sense of the previous section. However, it has to be emphasized that the result (9) has been proven only for Hilbert-Schmidt operators (in Hilbert space representation) and does not include observables as important as position and momentum.²² Thus, while in general relativity the remaining reduction problems seem to be mainly of a mathematical nature, in quantum mechanics the situation still is beset with conceptual difficulties having an ontological root. They cast their shadow also on microreduction. The major problem here is, of course, the explanation of the essentially classical behavior of our macroscopic environment from nonclassical, quantum theoretical premises about the microscopic constituents. Evidently this is a very specific problem whose solution will hardly follow from the idea of the recursive nature of reduction in general. However, it seems that the solution will hardly be found without that idea. At any rate we should carefully study the contributions coming from the various components of a complex reduction. It has been mentioned already for relativity theory that the definability of concepts may significantly depend on the axioms of a theory: Concepts that are not definable in a given theory may become so in one of its specializations under additional contingent assumptions. Trivial as this step may appear from a purely logical point of view, it can have amazing effects in a field where the weight of additional contingent assumptions is considerably high when compared with the weight of the original laws. Such a field is microreduction. Many, if not all, of the so-called emergent properties of a physical system may come about by such restrictions.
22 See Emch 1983; Pool 1966; Baker 1958.
Equally important in the field is the admission of and even emphasis on coarsening and approximate reductions, in particular, their combination as it occurred in the limiting case reduction of the previous section. A case in point is the reduction of the Boltzmann equation to statistical mechanics with the usual collision dynamics of hard spheres. In the reducing theory ($E''$) a single system receives a microdescription essentially consisting of Boltzmann's $\mu$-space $X$, the particle number $n$, the dynamics $H_n$ mentioned and a (temporal) trajectory $(q_i, p_i)$ [...] (10)
where $\Delta$ is any Borel set in $X$ and $\chi_\Delta$ its characteristic function. This step corresponds to (5b), leading to the "virtual" theory $E'$ of macrodescriptions (10) generated by any microdescription as indicated. The essential idea then is to approximate the Boltzmann density $f$ by the measures (10) for large $n$ (and vanishing diameters of the spheres). It turns out that this is not possible for single systems but only for statistical ensembles (in $X^n$) in the sense that for "almost all" members the approximation in question can be obtained.²³ This leads to considerable complications in the formulation of the reduction, and no reasonable generalization comparable with the one of the previous section is known. Without doubt, however, one will be found soon, and it may be noted already now that in addition to coarsening and approximation, a specialization of the probability distributions (in $X^n$), essential for bringing about irreversibility, enters the scene, again nicely illustrating our synthetic theory of reduction.
V
For some of the more basic unsolved problems mentioned in the last section, I will offer a provisional solution. It is characterized by the concept of partial reduction. It is a fact that not merely most but indeed all statements to be found in the relevant textbooks to the effect that Newtonian mechanics is a limiting case of relativistic mechanics, that Newton's theory of gravitation is a limiting case of Einstein's, and statements of a similar stature simply are not founded on any proof of a corresponding reduction in general but just refer to so many partial reductions that indeed can be established in the field. At the same time the present synthetic theory of reduction is well suited for a treatment of just these partial reductions. In partial reductions several elementary reductions are put together to form a symmetric structure, a reduction square, thus exploiting the basic feature of our theory in a natural and elegant way.
Fig. 1. [Reduction square with corners Newton, CFA, 3rd Kepler corrected, and 3rd Kepler.]
23 Lanford 1976 and 1975, esp. sect. 6 and 7.
Let us first look at an example of a closed reduction square in figure 1. Suppose we want to explain Kepler's 3rd law from Newton's gravitational theory. We can either explain the law approximately from its well-known Newtonian correction and then reduce this exactly (indeed by a refinement followed by a generalization) to Newton's theory, or we can begin with an exact reduction (again a refinement and a subsequent generalization) of our law to the central field approximation of Newton's theory and explain this approximately according to its name. The procedure can be visualized in general by the square shown in figure 2, where the arrows point to the respective theory that is reduced. The vertical reductions $ER$ are exact, and they correspond to each other in the sense that $ER_2$ imitates $ER_1$ as far as the new circumstances permit this. By contrast, the horizontal reductions $AR$ are approximate and may be different. The main question answered by a closed reduction square is as follows: Given an approximate reduction $AR_1$ of $E$ to $E'$, what about corresponding reductions $AR_2$ of exact "consequences" $\theta$ of $E$ to exact "consequences" $\theta'$ of $E'$? The example given shows that under favorable circumstances the answer to this question is in the affirmative.
Now our present interest in the reduction square does not apply to the closed but to the open one. A reduction square is open if we omit $AR_1$. This omission may be deliberate. But, of course, our interest derives from cases where we do not know whether $E$ can be reduced to $E'$ or where we even surmise that it cannot be reduced. It is in such cases that we must have recourse to partial reduction of $E$ to $E'$ in the sense that merely special cases, coarsenings, or their combinations, derived from $E$, can be reduced (approximately) to special cases, coarsenings, or combinations of such, derived from $E'$. As mentioned earlier, general relativity and quantum mechanics are cases in point.
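Returning to the closed square of figure 1, the approximate (horizontal) step at the bottom can be illustrated with numbers. The sketch below assumes the standard textbook forms of the two laws, namely $T^2 = 4\pi^2 a^3 / (GM)$ for Kepler's third law in the central field approximation and $T^2 = 4\pi^2 a^3 / (G(M+m))$ for its Newtonian correction; the constants are approximate and the function names are hypothetical.

```python
import math

# Approximate constants, used for illustration only.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg

# (semi-major axis in m, planetary mass in kg), approximate values
EARTH = (1.496e11, 5.972e24)
JUPITER = (7.785e11, 1.898e27)


def period_kepler(a, M=M_SUN):
    """Kepler's 3rd law in the central field approximation: T^2 = 4 pi^2 a^3 / (G M)."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * M))


def period_corrected(a, m, M=M_SUN):
    """Newtonian two-body correction: T^2 = 4 pi^2 a^3 / (G (M + m))."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (M + m)))


def relative_deviation(a, m, M=M_SUN):
    """How far the uncorrected period lies from the corrected one (relative)."""
    corrected = period_corrected(a, m, M)
    return (period_kepler(a, M) - corrected) / corrected


if __name__ == "__main__":
    for name, (a, m) in (("Earth", EARTH), ("Jupiter", JUPITER)):
        days = period_corrected(a, m) / 86400.0
        print(name, round(days, 1), "days, deviation", relative_deviation(a, m))
```

The deviation depends only on the mass ratio $m/M$ and is roughly $m/(2M)$, so the two laws contradict each other for every planet of nonzero mass while remaining numerically close, which is the sense in which the uncorrected law is recovered only approximately.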
Fig. 2. [Reduction square: $E'$ to $E$ via $AR_1$ in the top row, $\theta'$ to $\theta$ via $AR_2$ in the bottom row.]
Not knowing the solution of the reduction problem in general we may, for instance, specialize Newton's and Einstein's theory to static, spherically symmetric solutions in empty space and then approximately reduce Newton's theory thus specialized (the central field approximation) to Einstein's theory specialized correspondingly (Schwarzschild metric). Similarly, the position distribution of a classical harmonic oscillator in thermodynamic equilibrium can easily be reduced asymptotically to the corresponding quantum mechanical case, although we do not know how to handle the general case. Many, many examples of this kind could be given from the quantum and relativity domain showing the importance of partial reductions. However, the concept of partial reduction is essentially incomplete as long as one has not specified the correspondence between the two vertical reductions in the square: They have to be the same as far as possible in view of the difference between $E$ and $E'$. To make this precise is no easy task in the case of an open reduction square, though it is certainly easier than the reduction itself.
In this essay I have indicated a new approach to theory reduction. It was called "synthetic" because, on the basis of rules (R), the most general reduction is obtained recursively from certain initial reductions. The method of rule-generated reductions was chosen because the typical situations that we meet with in physics are reductions that are combinations of other reductions, sometimes widely differing in kind. Once one has analyzed a given reduction as the combination of, say, an asymptotic reduction followed by a direct generalization followed by equivalence, it is hard to see what other description of the situation could be given that would be as satisfactory as the one suggested here. Partial reductions as they were considered at the end of the essay obviously are not reductions in the sense of our notion, but their definition rests on this very notion, and again it is hard to see how else one could obtain this extension. On the other hand, there certainly are extensions of our concept in the sense that new elementary (initial) reductions are introduced. They may be useful in microreduction and also elsewhere.
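The harmonic oscillator case mentioned above as an example of a partial reduction admits a very short numerical illustration. The sketch assumes the standard results that both equilibrium position distributions are Gaussian, with classical variance $kT/(m\omega^2)$ and quantum variance $(\hbar/(2m\omega))\coth(\hbar\omega/(2kT))$; the dimensionless variable $x = \hbar\omega/(2kT)$ and the function name are introduced here for illustration only.

```python
import math


def variance_ratio(x):
    """Quantum over classical position variance of a thermal oscillator.

    x = hbar * omega / (2 * k * T) > 0; the ratio equals x * coth(x).
    """
    return x / math.tanh(x)


# Smaller x means higher temperature or, formally, a smaller Planck constant.
ratios = [variance_ratio(x) for x in (1.0, 0.1, 0.01)]

if __name__ == "__main__":
    print(ratios)
```

Since two centered Gaussians differ only in their variances, the ratio tending to 1 for small $x$ is what the closeness of the two distributions amounts to in this special case.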
V.24 The Rationality of Reductionism*
The word "reductionism" is a dirty word. This is so not only because it is an "ism" word. As such, reductionism, or rather special reductionism, is easily defined as the belief in the possibility of reducing something to something else or as the advocacy of a program to reduce something to something else. General reductionism, then, is the positive attitude towards reductions independent of the reduction partners. Mechanicism was the belief that all of physics could be reduced to mechanics, and logicism was the belief that all of mathematics could be reduced to logic. In both cases, it turned out that the corresponding programs were impracticable, and today we no longer believe in them. Nonetheless, the work done on them showed that reductionist programs are a kind of enterprise that scientists like to embark on. Regardless of the successes of special reduction programs, general reductionism seems rational because it fulfills an ideal of science with its theoretical economy and systematic unity and, as we all know, besides failures there have been successes in this field. One great and undeniable success was the unification of electrostatics, magnetostatics and optics in classical electrodynamics. A case like this shows that, at least in physics, the subject allows reductions and that general reductionism is rational in this sense too. The cases mentioned so far have been failures or successes of reductionism almost independently of the meaning of reduction. The failure of mechanicism could not be rectified by the argument that we had not used the right concept of reduction after all. Similarly, Maxwell's success could not be shaken by the objection that his achievement did not conform to our normal understanding of what a reduction is. However, if we leave these paradigm cases in favor of more controversial ones, it is the meaning problem where reductionism raises its ugly face.
A heatedly disputed case in point is the reduction of biology to physics (or to physics and chemistry, if you do not even believe in the reduction of chemistry to physics). In this field, people started to make distinctions between different concepts of reduction, and they argued that if reduction is taken to mean such and such, then they would believe in the reduction, but if it is taken to mean such and such, they would not. Hans Primas,¹ for instance, has distinguished between strong and weak reduction and has denied the reducibility of chemistry in the strong sense, but affirmed it in the weak sense. At a famous conference on reduction in biology² one of the organizers, Francisco Ayala, introduced a trichotomy of ontological, methodological and epistemological reductions,³ and a similar classification was defended by Ernst Mayr.⁴ Both have subsequently used their distinctions to make clear that ontological reductionism is no longer controversial among biologists, whereas the other two cases still are.
* First published as Scheibe 1995a.
1 Primas 1985.
2 Ayala/Dobzhansky 1974.
3 Ibid., p. VIII.
4 Mayr 1982, p. 59.
However, in spite of these attempts at conceptual clarification the controversy has continued, and since it is a typical meta-scientific issue philosophers of science have also interfered. A case in point is Ernest Nagel's treatment of the subject in his very influential book The Structure of Science.⁵ In Chap. 11, devoted to the reduction of theories, Nagel states the formal requirement that "a reduction is effected when the experimental laws of the secondary science (and if it has an adequate theory, its theory as well) are shown to be the logical consequences of the theoretical assumptions ... of the primary science".⁶ In the course of this paper it will become clear that: (a) from the general idea of what a reduction is, it is very tempting indeed to establish such a requirement, but (b) from the viewpoint of what a reduction can be reasonably expected to achieve, the requirement is much too demanding. In other parts of his book,⁷ Nagel himself seems to be quite aware of this. However, as is usually the case, these inconsistencies have mostly gone unnoticed, and the treatment of the matter in the said chapter has been standard for quite a while now. In particular, the passage quoted is perhaps the most frequently quoted passage in the whole field. Accordingly, in the introduction to the above-mentioned conference,⁸ Ayala also falls victim to the situation and has little trouble in objecting that "at present there is no class of statements belonging to physics and chemistry from which every biological law could be derived".⁹ How muddled the situation has become can be seen from another contribution to the Bellagio Conference. Peter Medawar,¹⁰ arguing in favor of reductionism, compares the well-known hierarchy of sciences corresponding to the ontological levels from elementary particles up to social groups with the hierarchy of geometries according to Felix Klein's Erlangen program.
In this comparison, according to their respective degrees of generality, topology corresponds to elementary particle physics and Euclidean geometry to sociology. Obviously, this arrangement suggests that the matter is exactly the opposite of what Nagel would have it: elementary particle physics would follow from sociology as topology follows from Euclidean geometry and not the other way round. I do not want to make the situation look more confused than it is, but I should mention the trouble that has been added to the field by men such as Thomas Kuhn and Paul Feyerabend.¹¹ The cases they presented belong to the dynamics of theories, i.e. questions of the comparability and reducibility of theories succeeding each other in the development of a discipline and of the progress connected with such a series. Once more, this intervention showed that we have here a field badly in need of conceptual clarification.
5 Nagel 1961.
6 Ibid., p. 352.
7 Ibid., pp. 433f.
8 See note 2.
9 Ibid., pp. Xf.
10 Medawar 1974.
11 Kuhn 1970 (2nd ed.); Feyerabend 1975.
I know that philosophers are inclined to generate problems that are not problems for scientists. I was, therefore, positively relieved when I became aware of the controversy between Steven Weinberg and Ernst Mayr¹² on our subject, where at the beginning of his last attempt to convince his adversary Weinberg says: "I can't help but feel that if I could only express myself a little more clearly, Ernst Mayr and I would see that there is really no disagreement between us." This confession, I feel, justifies the following attempt at rationalizing reductionism also with respect to the meaning of its basic concept. In some quarters the term "reduction" is reserved for reductions where a concrete whole in a sense is understood in terms of its parts; the general meaning of reduction, however, is by no means restricted to ontological reduction. Physics is full of reductions of theories to their historical successors with no change in the original domain. Kepler's theory of the planets can in a sense be reduced to Newton's gravitational theory applied to exactly the same bodies. What it means when we say that A is reduced to B, then, is much more general. It seems to be that the reduced A somehow becomes incorporated in the reducing B or is preserved in B or, conversely, that B makes A redundant or superfluous. We could also say that the reduced A depends in its existence wholly on the reducing B, such that a complete inventory of the field in which the reduction takes place, if it includes B, need not also mention A. Moreover, the relata of a reduction, i.e. the entities related to each other by reduction, though they may be, as in ontological reduction, concrete things such as physical bodies, fields and particles, need not be. Rather they may be abstract entities such as concepts, theories, statements and methods. This is already implied by the threefold classification by Ayala and Mayr mentioned above.
Thus, apart from being something quite general, the existence of reductions crucially depends on the existence of certain procedures or operations in the codification and systematization of our knowledge that, if applied, do not lead to anything new. The application of these reduction procedures does no more than unfold what in a sense was already at hand. Paradigms of such procedures are the explicit definition of a concept from other concepts and the logical derivation of a statement from other statements, and it is precisely for this reason that Nagel and other philosophers have suggested that from a formal point of view reductions should essentially be logical derivations combined with definitions. Now, in principle, the search for a concept of reduction is the typically meta-scientific business of looking out for those analytic procedures, but the standard conception was in one sense too strong and in another sense too weak. Very roughly speaking, it was too strong because it did not allow for approximations and it was too weak because it did not sufficiently take care of the demands of the entities to be reduced. The following is a very brief sketch of a new approach to the matter 13.
12 Mayr/Weinberg 1988
13 Scheibe 1993a (this vol. ch. V.23)
In this approach, the usual method of explication by logical analysis is dismissed in favor of a synthetic method closely accommodated to the nature of the case. Reductions admit of a natural product operation by simple concatenation; the product being again a reduction, the operation preserves the general feature of reductions that was described above. We all believe that if we had a reduction of biology to chemistry and one of chemistry to physics, then we would also have a reduction of biology to physics. On the other hand, an analysis of physics shows that there are very special and widely differing kinds of reductions such that the product of any two reductions of different kinds belongs to a new kind and that the reiteration of the product operation leads to still other kinds of reductions. The basic idea of the new method, therefore, is to generate the most general reduction or kind of reduction by starting with some elementary reductions and successively applying the product operation in any way we like. For comparison, you may think of the manner in which natural numbers are generated by successive multiplication of the prime numbers. As a matter of course there are differences: our product operation is not commutative, it is idempotent for the elementary or pure reductions, and these, therefore, are defined not simply by indecomposability, but by indecomposability except for idempotence. The main thing is that we can generate ever new kinds of reductions in the way described, and for the time being we have no other method of doing this, as we do have in the case of the number system, where a theorem of prime number decomposition is proved. The usual explication method of logical empiricists would be one such alternative, but it seems hopeless that it would succeed in the case of reductions. Let me now illustrate the foregoing general considerations by some examples.
The illustration is confined to theory reduction and is mainly inspired, as I readily admit, by examples from physics. This has obvious disadvantages when we think of the whole of natural science, but it has the advantage of accuracy from a formal point of view. Indeed, the whole matter even allows a purely syntactical treatment. To make understanding easier, however, I will mostly use the material mode of speech. A theory will then be viewed as a class of possible physical systems - the models of the theory. Electrodynamics is the class of possible electrodynamic systems, quantum mechanics the class of possible quantum mechanical systems, and so on. With this understanding, the first kind of reduction to be considered is generalization. In generalization, we pass from a theory T to a theory T' that is more general than T in the sense that every physical system admitted by T - every model of T - is also a model of T', but (as a rule) not vice versa; we reduce T to T' by means of a contingent condition which together with T' leads back to T. This kind of reduction goes back to Aristotle. He asked, for instance, the question: What is man? And he gave the answer: man is a rational animal. Here, the step from man to animal is the generalization which is then turned into a
reduction by adding the condition of rationality characterizing men among animals. What is the importance of generalization? It is small, I think, as long as generalization is performed alone, but in combination with other pure reductions it may become extremely important. I shall discuss this importance tacitly assuming the relevant context. Firstly, if we look in the direction of generalization proper, the generalized theory widens our horizon by taking into account new possibilities not included in the old theory. Part of Newton's generalization of Kepler's laws was the insight that in addition to ellipses, hyperbolas are also possible curves for the motion of a celestial body in the gravitational field of the sun. This was a highly non-trivial achievement at the time, because on a priori grounds there is an uncountable number of other classes of curves that might have been the correct generalization. Secondly, if we look in the opposite direction, the direction of specialization, it is the specializing condition that becomes important. At the same time, it is here that the matter may and actually has become controversial 14 . On the one hand, as long as we are free to choose the condition in question as we like, we are engaged in a procedure which evidently does not carry us beyond the given general theory. Look at a textbook, say, on Hamiltonian mechanics and ask yourself what the book is based on and you will see that specialization is one of the most presentable cases of intra-theory development. On the other hand, if besides the general theory a more special one is also given, we are no longer free in choosing the specializing condition, having to find it instead, so to speak, by solving an equation. This task may become very hard. Earlier, we heard from Ayala that "at present there is no class of statements belonging to physics and chemistry from which every biological law could be derived." 
Relying on the standard conception of reduction, this is an utterly misleading statement, for on the basis of this conception, the reductions in question are impossible not only at present, but in principle. It is different, I think, with the new approach. With it we are on the right track in principle, the difference being that it (but not the standard conception) allows for specialization. We may, however, be faced with tremendous difficulties in practice, for solving the equation mentioned above means finding the specializing condition in terms of the more general theory but, at the same time, meeting the contingent situation presupposed by the more special theory. In the case of complex systems, this task may simply exceed all the ingenuity and inspiration of the scientist. A point is reached where the anti-reductionist will no longer accept any distinction between what is impossible in practice and in principle. Yet, the inventor of a reduction concept is not guilty of the complexity of nature, and the charge of impracticability of a reduction is not a charge against the concept of that reduction. It is conceivable that we are in possession of the right theory of reduction, but cannot know it for practical reasons.
14 Hoyningen-Huene 1985
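The interplay of generalization and specialization discussed above can be made concrete in a toy sketch. It assumes, purely for illustration, that an orbit is characterized by its eccentricity alone; the names `Orbit`, `is_newton_model`, `is_bound` and `is_kepler_model` are hypothetical and do not come from the text. The more general class admits all conic sections, and the contingent condition of being bound leads back to the more special class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orbit:
    """Toy model of a celestial orbit: a conic section given by its eccentricity."""
    eccentricity: float


def is_newton_model(o: Orbit) -> bool:
    # General theory T': ellipses, parabolas and hyperbolas are all admitted.
    return o.eccentricity >= 0.0


def is_bound(o: Orbit) -> bool:
    # Contingent specializing condition: the body stays bound to the sun.
    return o.eccentricity < 1.0


def is_kepler_model(o: Orbit) -> bool:
    # Special theory T: closed ellipses only (the circle is the case e = 0).
    return 0.0 <= o.eccentricity < 1.0


candidates = [Orbit(e) for e in (0.0, 0.2, 0.9, 1.0, 1.5)]

# T' together with the condition leads back to T ...
specialized = [o for o in candidates if is_newton_model(o) and is_bound(o)]
kepler = [o for o in candidates if is_kepler_model(o)]

# ... while T' alone admits new possibilities that T excludes.
new_possibilities = [o for o in candidates
                     if is_newton_model(o) and not is_kepler_model(o)]
```

In this encoding the specializing condition can be read off directly; the hard case described in the text is the one where both theories are given independently and the condition has to be found.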
Let me now come to a second kind of pure reduction. I call it "refinement" and its converse, accordingly, "coarsening". Theory T' is a refinement of T if, with respect to a surjective mapping P, every physical system admitted by T' is mapped onto a model of T such that the description of the T' system (according to T') is more complete (or more refined) than the description of the T system (according to T) that is its image. Syntactically, refinement amounts to a derivation of T from T' in terms of P. Planck's radiation law as a refinement of the Stefan-Boltzmann law is a typical example. The latter law is about the integral energy density inside a black body, and the former is about the spectral energy density, which obviously gives a much more informative description of the system. Thus, in reduction by refinement, new physical systems are not the possible advance of theory T' over T, but rather a more refined description according to T' of each system already admitted by T. If we go on comparing refinement with generalization on the syntactical level, we see how little it means to characterize reductions as derivations. We have derivations in both cases, yet the effect is entirely different according to the different role played by the additional premises. In generalization, the premise restricts the totality of models of the reducing theory to a subclass, and in the other case it coarsens the description of each single system. Whereas in the first case redundancy, as we have seen, comes from specialization, in the second case it comes from loss of information. In both cases, if the premise becomes trivial, the derivation becomes equivalence, i.e. it becomes trivial, too. In other words, derivation proper plays no role in theory reduction as considered so far. As to the importance of refinement in physical reductions, the situation is quite similar to what we have found for generalization.
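The Planck and Stefan-Boltzmann example can be checked numerically. The following sketch assumes the standard form of Planck's law for the spectral energy density and takes the coarsening map P to be integration over all frequencies; the function names and the simple trapezoidal integration are illustrative choices, not part of the text.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K = 1.380649e-23     # Boltzmann constant, J/K


def spectral_energy_density(nu: float, temp: float) -> float:
    """Planck's law: energy density per unit frequency inside a black body."""
    if nu == 0.0:
        return 0.0
    x = H * nu / (K * temp)
    return 8.0 * math.pi * H * nu**3 / C**3 / math.expm1(x)


def coarsen(temp: float, steps: int = 20000, x_max: float = 50.0) -> float:
    """Assumed map P: integrate the spectral density over frequency.

    Trapezoidal rule on [0, nu_max]; both endpoint values are negligible,
    so only interior points are summed.
    """
    nu_max = x_max * K * temp / H
    d_nu = nu_max / steps
    total = sum(spectral_energy_density(i * d_nu, temp) for i in range(1, steps))
    return total * d_nu


def integral_energy_density(temp: float) -> float:
    """Stefan-Boltzmann form u = a T^4 with a = 8 pi^5 k^4 / (15 h^3 c^3)."""
    a = 8.0 * math.pi**5 * K**4 / (15.0 * H**3 * C**3)
    return a * temp**4


ratio_300 = coarsen(300.0) / integral_energy_density(300.0)
ratio_5000 = coarsen(5000.0) / integral_energy_density(5000.0)
```

The spectral description determines the integral one, but not conversely: many different spectral densities would integrate to the same total, which is the loss of information that the text associates with coarsening.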
In particular, we can imagine the importance of coarsening in every case where the problem is to simplify the treatment of complex systems and connect it with their known behavior on a phenomenological level. Generalization and refinement obviously imply some progress connected to the passing from the reduced to the reducing theory, but this progress does not include any essential correction of the reduced theory. The converse of a reduction is an exact recovery of the reduced theory, and at face value this seems to be a necessary condition in view of the redundancy or analyticity required of the process. It is an unwelcome feature, though, in view of the idea that reductions are the typical accompaniments of scientific progress and therefore have to allow for corrections after all. Almost all interesting examples of theory succession in physics involve corrections, in some cases even drastic ones. Such is the case with Einstein's theory of gravitation vis-à-vis Newton's and with quantum mechanics vis-à-vis classical mechanics. Here, the new theories are rivals of their predecessors, contradicting them or showing conceptual incommensurabilities of an even more serious kind. The major difficulty in finding new kinds of reduction taking account of rivalry is to reconcile this situation with the requirement that the preceding theory, as far as it has been empirically confirmed, must be recovered from
the theory superseding it in some sense. Otherwise we are not entitled to speak of progress, and reductionism would come to an end. The essential idea behind finding reductions relating incompatible theories is to allow for approximations: instead of an exact recovery, the reduced theory has to be recovered only approximately, and this is in accordance with the general idea of reductions, because we cannot require more than recovery to the extent to which the reduced theory is confirmed. By the nature of the case, theory approximation marks the points where the matter becomes mathematically more involved, and this is not the right occasion to present any details in this respect. The general idea along which to proceed is simple enough: theories are presented by model sets, and if we succeed in embedding two model sets in a common superspace, we can investigate their relative topological situation. It turns out that, as in exact reductions, there are several pure kinds of approximate reductions, most notably asymptotic and limiting case reductions. They can be instantiated by examples from physics and added to our stock of pure reductions. On several occasions I have emphasized that real life is breathed into the subject of reductions only when it comes to combine the pure reductions that have been reviewed so far. Here, I can consider such combinations only incidentally, and my major theme will be ontological reduction. This is a notorious subject that is even more confused than the general situation described above. Very much depends on formulation, and I can give only some hints on how ontological reductions can be incorporated in the present approach and how they cannot. One first thing that we have to bear in mind is that ontological reduction is not the reduction of theories to other theories, but the reduction of physical objects to other physical objects. 
Its general principle, though, is again one of economy and redundancy: the reduction is to objects being the constituents, possibly the ultimate constituents, of a given object, not that this object exists whenever the other ones exist. However, the reduced object certainly would not exist if the objects constituting it did not. In this sense, an inventory of the objects that could exist in the universe may dispense with the ontologically reducible ones. With this understanding, the major problem with ontological reduction is whether or not it implies theory reduction, i.e. whether or not you can consistently hold that - at least in certain cases - the theory of objects of kind B cannot be reduced to the theory of objects of kind A, although you believe that each object of kind B does consist of objects of kind A. In this so-called problem of emergence, both parties, the believers and the disbelievers, take ontological reduction for granted. The difficulties, therefore, seem to be difficulties in theory reduction vis a vis ontological reduction somehow involved in the former. In this situation, it is somewhat disturbing to see that most, if not all, contributions to the problem do not contain any closer investigation of how the premise of ontological reducibility comes to bear on the reducibility of the corresponding theories. The usual approach is by means of a linear hierarchy
of levels or layers, beginning with the layer of elementary particles and ending up with the layer of social groups 15. However, apart from the fact that this picture does not answer our question, it is not clear how to assign theories on a one-to-one basis to the layers such that a sufficiently definite reduction problem arises. Rather popular is, for instance, the assumption of the two successive layers of atoms and molecules, but what would be the theory of molecules as distinct from a theory of atoms and to be reduced to it? It seems that in the treatment of atoms, we already make use of a general theory of many particle systems under mutual Coulomb interaction, and we are not worse off with this very same theory in treating molecules. Both theories seem to be specializations of a common generalization, and a theory of molecules will at best appear as a special case of a theory of atoms. The matter can be more clearly discussed if we recall that there are fairly precise rules for the treatment of composite systems that answer the question of how ontological reduction is involved in theory reduction. The rules are different, though similar, in classical physics and quantum theory. In the following, I have to confine my attention to the former case. For classical dynamic systems, the totality of contingent properties of a system consisting of two subsystems is essentially, i.e. apart from topological questions, the Boolean product of the Boolean algebras of contingent properties of the two subsystems 16. Here, the contingent properties of any system are just sets of possible states of the system, namely of those states in which the system would have the property in question.
Secondly, the dynamics of the total system, represented by a set of changes of state admitted on account of its theory, is the Cartesian product of the dynamics of the subsystems, again each dynamic being represented by a set of changes of state admitted by the theory of the respective subsystem. Now, of these two rules the first, concerning the contingent properties, can almost always be enforced and even defines the case to be one of a system composed of subsystems. Insofar, then, the reduction is trivial: we know the possibilities of the total system when we know those of its subsystems, at least in principle. Exactly the same is true of the dynamics, yet there is a profound difference: whereas we would not even know what a composite system is if there were not something like the Boolean product structure at hand, we do not wait for the subsystems to have their own dynamics in order to be subsystems. Rather, experience teaches us that though subsystems may develop independently, there are very important cases where they do not. These are the cases in which we say that they interact, and the question of what any interaction is like cannot be settled by looking at the subsystems separately. We cannot learn what a system of gravitating bodies is by investigating each body by itself.
15 Oppenheim/Putnam 1958
16 Sikorski 1969, § 13
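The two composition rules for classical systems can be illustrated with a small finite sketch. It assumes, for illustration only, two subsystems with a handful of discrete states, properties given as sets of states, and a dynamics given as a set of (state, next state) pairs; all names below are hypothetical. For finite state sets the Boolean product of the property algebras is simply the algebra of all subsets of the set of state pairs.

```python
from itertools import product

# Hypothetical finite state spaces of two subsystems.
STATES_1 = ("a", "b")
STATES_2 = (0, 1, 2)

# States of the composite system: ordered pairs of subsystem states.
STATES_12 = tuple(product(STATES_1, STATES_2))


def lift_1(prop):
    """A property of subsystem 1, read as a property of the whole system."""
    return {s for s in STATES_12 if s[0] in prop}


def lift_2(prop):
    """A property of subsystem 2, read as a property of the whole system."""
    return {s for s in STATES_12 if s[1] in prop}


def product_dynamics(dyn_1, dyn_2):
    """Independent evolution: Cartesian product of two sets of state changes."""
    return {((s1, s2), (t1, t2)) for (s1, t1) in dyn_1 for (s2, t2) in dyn_2}


def factorizes(dyn_12):
    """Does a composite dynamics equal the product of its two projections?"""
    dyn_1 = {(s[0], t[0]) for (s, t) in dyn_12}
    dyn_2 = {(s[1], t[1]) for (s, t) in dyn_12}
    return product_dynamics(dyn_1, dyn_2) == set(dyn_12)


FLIP = {("a", "b"), ("b", "a")}    # dynamics of subsystem 1 taken alone
CYCLE = {(0, 1), (1, 2), (2, 0)}   # dynamics of subsystem 2 taken alone
free = product_dynamics(FLIP, CYCLE)

# Interaction: subsystem 2 advances only while subsystem 1 is in state "a".
interacting = {
    ((s1, s2), ("b" if s1 == "a" else "a", (s2 + 1) % 3 if s1 == "a" else s2))
    for (s1, s2) in STATES_12
}

# First rule: every atomic property of the whole is a meet of lifted properties.
atoms_ok = all(lift_1({s1}) & lift_2({s2}) == {(s1, s2)} for (s1, s2) in STATES_12)
```

Here `factorizes(free)` holds while `factorizes(interacting)` fails: the property structure of the whole is fixed by the parts, but the interacting dynamics cannot be read off from the subsystems taken separately.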
The reconstruction of ontological reduction underlying the foregoing argument was meant to replace the rather vague usual formulations of the whole-part type. If this is accepted, then the conclusion is, as we have seen, that ontological reductionism is either trivially true or trivially false. In particular, we cannot understand the behavior of composite systems from the behavior of the components if any theory of the latter presupposes a theory of the former. The only escape from this is that in certain limiting cases, the features peculiar to the subsystem level become irrelevant on the next, higher level. Is this the end of the matter? I think not. For a brief outlook, I may come back to my favorite idea of recursively generated reductions to be applied in various combinations. A paradigm case is the reduction of some hydrodynamic equations via the Boltzmann equation to the laws of classical particle mechanics. I am aware of the fact that this reduction has not yet been rigorously established, but it seems good enough to illustrate how combinations of pure reductions can be used in microreduction 17 . Although here we have the two levels of particles and macro-observables, there is no theory of the particles other than the one that from the very beginning includes the interaction between them. Accordingly, the theory to be reduced is not a theory about objects different from the level corresponding to this theory. Rather, the reduced theory is about the very same system which the reducing theory deals with, viewing it, however, under quite different aspects according to the different epistemic situation, and these aspects come in via the various pure reductions into which the Boltzmann reduction can be decomposed. Here we meet our old friends again, each one with an important function. 
Refinement takes over the function of ontological reduction, specialization, being the converse of generalization, shows the emergence of new concepts having meaning only under assumptions not admitted on the general level of particle mechanics and the approximations involved correct the picture of continuous matter implicit in the hydrodynamical model. It goes without saying that the relative success of the Boltzmann reduction analyzed in this way is no more a rigorous disproof of the whole-part conception of theory reduction than the previous argument had been, but it is a proof that this conception is not the only viable one in the field.
17 Lanford 1976, Mintzer 1967
VI. Foundations of Quantum Mechanics
In the six papers devoted to the foundations of quantum mechanics four subjects are treated. They are distributed over the papers as follows:
[25] Quantum logic
[27] The Copenhagen interpretation and Bohm
[26], [28]-[30] Hidden parameters and Bell's theorem
[28]-[30] EPR-situation and Bell's inequality
In the following introduction [27], being merely an account, is touched on only occasionally. [25] is a short critique of the state of the so-called 'quantum logic' around 1984. The core of this deviant logic is the mathematical fact that the closed subspaces of the Hilbert space of quantum mechanics form an algebra with respect to intersection, linear union and complementation which is not boolean. The criticism is not directed against the thesis that the probability-free contingent propositions of quantum mechanics obey a non-classical logic but against certain defects in its presentation and treatment. The first criticism is that all calculi presented as generalizations of the Hilbert space algebra are passed off for quantum logic without a completeness proof, i.e. a proof that not only all formulas derivable in the calculus are valid in Hilbert space but also, conversely, that all laws valid in Hilbert space are derivable in the calculus. Such a proof was not available in 1984 and still does not seem to have been given. Moreover, as a rule not even the question of completeness is raised. Perhaps for this reason quantum logic often is not viewed as a calculus at all but rather as a model of one - namely the Hilbert space algebra mentioned at the beginning. But this view does not escape blame either, because the relations valid in some model need not be logical in character. The criticism is also directed against the neglect of the important fact that with the unique exception of the probability-free contingent propositions (corresponding to the closed subspaces) all other propositions occurring in quantum mechanics obey classical logic.
The proper task, therefore, is the establishment of a logic combining both classical and quantum elements. In particular, all propositions expressing relations between those contingent properties (or: propositions) as, for instance, the transitivity of implication
are classical propositions. Nevertheless in quantum logic they serve to formulate the quantum logical laws. Moreover, one would have to tell us which status is ascribed to mixed propositions like
Another group of propositions obeying classical logic is formed from propositions by which we express the performance of measurements and their results. With the help of such propositions and, of course, the usual probability statements, likewise obeying classical logic, quantum mechanics can be formulated without the use of a quantum logic. Quantum logicians would have to tell us how such a (complete, for that matter!) formulation of quantum mechanics fits into their schema. The classical character of the quantum mechanical probability statements certainly is the reason why most physicists are so thoughtless about quantum logic. A quantum logician, however, has to explain how the classical probability statements are built up from probability-free, contingent statements obeying a non-classical logic. Among the attempts to find an alternative to the Copenhagen interpretation of quantum mechanics re-introducing classical modes of thought, the attempts to establish a classical theory of hidden parameters (cf. [27]) have been particularly informative. The first achievement of such endeavours that drew some attention was the (first) theory of Bohm (1952). It already had to stand its ground against a proof that such a theory, and even the hidden parameters themselves, could not exist. In this proof, given in 1932 by v. Neumann, it is assumed that the hidden states are functions of the quantum mechanical observables whose values are objective in contrast to the expectation values of proper quantum mechanics. However, Bohm could clarify that his hidden parameters were something different so that v. Neumann's proof was irrelevant for his theory (cf. [27]). By contrast, in the investigations of [26] and [28]-[30] v. Neumann's condition on the hidden parameters is accepted. We adopt it in the (discrete) form that a hidden state assigns to every alternative, i.e.
to every decomposition of the Hilbert space into pairwise orthogonal closed subspaces, one of its elements as the 'value' of the alternative in this state. In this way we arrive at a state space $S$ with the classical feature that in every state $s \in S$ the value of every alternative is uniquely determined. $S$, with no further conditions put on it, is not empty (the axiom of choice!). But in the following proofs of the non-existence of a classical theory of hidden states, $S$ will be restricted in one way or other. In the case of v. Neumann's proof the restriction is that, given any state $s \in S_{\mathrm{neu}}$, for all quantum mechanical observables $A, B \in \mathrm{Ob}_q$

$$s(A) \in B \;\lor\; s(B) \in A \;\Rightarrow\; s(A) = s(B) \tag{1}$$

holds. In other words: the result of a measurement is independent of any alternative the measurement of which can have this result. Under this assumption the restricted space $S_{\mathrm{neu}}$ is empty. The attempt to form a theory
of hidden states already fails at this point. The emptiness follows solely from the 'logical' structure of $\mathrm{Ob}_q$, namely from the existence of sufficiently many incommensurable alternatives. A second proof, not occurring explicitly in the literature, is founded on Heisenberg's indeterminacy relations. In this case $S$ is restricted by the condition (2) for all $A \in \mathrm{Ob}_q$. Result: $S_{\mathrm{hei}}$. Unlike the v. Neumann condition this condition still allows for many hidden states, and the impossibility of a corresponding theory must come from elsewhere. If we represent an observable $A \in \mathrm{Ob}_q$ by the function

$$X_A : S_{\mathrm{hei}} \to \mathbb{R}, \qquad X_A(s) = s(A) \tag{2a}$$

then it follows from (2) that (2b). Now, to every expectation-value function $E_q$ there should exist a classical $E_c$ such that (2c) holds for all $A \in \mathrm{Ob}_q$. Because of (2b) this has the consequence that for the corresponding standard deviations (2d) holds. However, because of the indeterminacy relations the quantum mechanical standard deviations cannot show the same behaviour as the classical ones. (For the details of this proof the reader is referred to [28], §III, and [30], §3.) The proof which makes use of Bell's inequality is quite similar. In this case we have a physical system consisting of two subsystems I and II. The condition restricting $S$ is the factorization
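
$$s(F \otimes G) = s(F \otimes 1) \cdot s(1 \otimes G) \tag{3}$$

for all $F \in \mathrm{Ob}_q^{\mathrm{I}}$ and $G \in \mathrm{Ob}_q^{\mathrm{II}}$, where $\cdot$ is the intersection. As in (2a) we introduce the classical quantities $X_A$ on the restricted $S_{\mathrm{loc}} \subset S$, and we get, corresponding to (2b),

$$X_{F \otimes G} = X_{F \otimes 1} \cdot X_{1 \otimes G} \tag{3b}$$

If we now try to reduce the $E_q$ to the classical $E_c$ as in (2c) the result is (3c)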
and thereby, according to [28], II.(1), for the Bell function - corresponding to (2d) -

$$\Delta(E_q; F, F'; G, G') = \Delta(E_c; X_{F \otimes 1}, X_{F' \otimes 1}; X_{1 \otimes G}, X_{1 \otimes G'}) \le 2 \tag{3d}$$

for all $F, F' \in \mathrm{Ob}_q^{\mathrm{I}}$ and $G, G' \in \mathrm{Ob}_q^{\mathrm{II}}$, which is impossible (Bell's theorem). Papers concerning Bell's theorem frequently mention the famous argument by Einstein, Podolsky and Rosen which shows that the quantum mechanical $\psi$-function does not give a complete description of the state of a physical system. Such is the case also in Bell's relevant papers. In his case the reason seems to be that, the EPR-incompleteness being interpreted as the absence (in quantum mechanics) of classical parameters which, if duly considered, would yield a complete description, Bell's theorem has now to show that such parameters cannot be introduced. Now, whatever this argument can reveal, it makes no use of the crucial construction on which the EPR-argument itself is based. But there is indeed a close connection between this construction and, if not Bell's theorem, then at any rate his inequality. With the same setting as before the classical inequality is

$$\Delta(E; F \otimes 1, F' \otimes 1; 1 \otimes G, 1 \otimes G') \le 2$$

for an arbitrary (statistical) state $E$ and quantities $F, F'$ of I and $G, G'$ of II. The corresponding inequality in quantum mechanics is violated for appropriate choices of the variables. This situation invites us to look for additional premises about the variables such that, if they hold, then the inequality is valid also in quantum mechanics. Such premises have indeed been found by using appropriate classical features of quantum mechanics (cf. [28] and [29]). As it turns out a case in point is the EPR-situation where $F$ of I and $G$ of II as well as $F'$ of I and $G'$ of II are EPR-correlated. This means that the values $f_i$ of $F$ and $g_i$ of $G$ and the state $E$ are such that the conditional probability in $E$ for the occurrence of $g_i$ immediately after the occurrence of $f_i$ is 1, and the same holds for $F'$, $G'$ and $E$. If these conditions are satisfied then the inequality also holds in quantum mechanics. 1 This theorem does not tell us how many cases of EPR-correlations there are - not even whether there exists at least one. EPR have constructed one case in the continuum: in a common eigenstate of

$$Q = Q_1 + Q_2, \qquad P = P_1 - P_2$$

the canonical variables $Q_1, P_1$ of I and $Q_2, P_2$ of II are EPR-correlated. In the discrete case, pairs of spin operators with opposite spin directions are EPR-correlated in the state

$$\varphi = \frac{1}{\sqrt{2}} \left( \phi_1 \otimes \psi_2 - \phi_2 \otimes \psi_1 \right)$$

1 As conjectured first in [29] together with a proof for the dimension 2 of the Hilbert spaces. A proof for the general case is found in Schlieder 1995
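The contrast between the classical bound and its quantum mechanical violation can be checked in a short numerical sketch. The definition of the Bell function is given in [28] and is not reproduced here, so the sketch assumes the familiar CHSH form; the choice of spin directions in the x-z plane and all function names are likewise assumptions made for illustration. On the classical side a hidden state fixes values of +1 or -1, and the value of a product quantity is the product of the values, in the spirit of the factorization (3).

```python
import math
from itertools import product


def bell(e_fg, e_fg2, e_f2g, e_f2g2):
    """Assumed CHSH form: |E(FG) + E(FG') + E(F'G) - E(F'G')|."""
    return abs(e_fg + e_fg2 + e_f2g - e_f2g2)


# Classical side: enumerate all deterministic value assignments to F, F', G, G'.
classical_max = max(
    bell(f * g, f * g2, f2 * g, f2 * g2)
    for f, f2, g, g2 in product((-1, 1), repeat=4)
)


def spin(theta):
    """Spin component along a direction at angle theta in the x-z plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]


def kron(a, b):
    """Kronecker product of two 2x2 matrices, as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


# Singlet state (|01> - |10>) / sqrt(2) in the basis |00>, |01>, |10>, |11>.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]


def expectation(theta_1, theta_2):
    """Expectation value of spin(theta_1) (x) spin(theta_2) in the singlet state."""
    m = kron(spin(theta_1), spin(theta_2))
    return sum(SINGLET[r] * m[r][c] * SINGLET[c]
               for r in range(4) for c in range(4))


quantum = bell(
    expectation(0.0, math.pi / 4),
    expectation(0.0, -math.pi / 4),
    expectation(math.pi / 2, math.pi / 4),
    expectation(math.pi / 2, -math.pi / 4),
)
```

Every deterministic assignment gives the value 2, and averaging over hidden states cannot exceed it, whereas the singlet state reaches 2√2 for these directions. Equal directions give the expectation value -1, the perfect anticorrelation that underlies the EPR-correlation of opposite spin directions.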
VI.25 Quantum Logic and Some Aspects of Logic in General* In a book on the contingent propositions in physics, published twenty years ago, I have given a rigorous formulation of quantum theory based on classical logic. 1 The formulation included and was actually centered around propositions about the behavior of single particles, and it was intended to show that there is no need for the introduction of a new logic - quantum logic - into physics. I don't mention this past endeavor of mine in order to argue that, if only you had carefully read my book you would not have wasted your time in developing quantum logics. I rather mention it to apologize that, having drawn this consequence for myself, I am not on top of the work done in this field in the meantime and, therefore, had perhaps better remained at home. However, one cannot help being impressed by the ever growing amount of papers devoted to the subject, and it is already this amount that makes me more and more uncertain whether I really succeeded in showing that there is no need for quantum logic, let alone the question whether such a logic might not be desirable after all. You will therefore understand that I take this occasion to make a fresh start in getting clear about what is at stake. My own contribution will be confined to a few remarks and questions concerning the relation of quantum logic to some general views on logic and to ordinary logic itself. In due course I shall also say something about myoid approach. But I will start with a different matter. As you all know the first argument in favor of quantum logic had been an argument by analogy: Von Neumann argued that the lattice of closed subspaces of a Hilbert space has to be taken as the quantum mechanical analogue of the lattice of Borel sets of a classical phase space. The latter is boolean and insofar mirrors the laws of classical logic. The former not being boolean v. 
Neumann concluded that in quantum mechanics we have a new non-classical logic before us.2 That this argument is still in use can be seen from a recent paper by Putnam in which he argues: "There are operations approximately answering to the classical logical operations, namely the ∨, · and − of quantum logic. If these are not the operations of disjunction, conjunction, and negation, then no operations are."3 With respect to this analogy my first remark will be that for classical mechanics the problem of the logic of its contingent propositions has been solved in a way in which it has not yet been solved for quantum mechanics. Saying this I do not want to suggest that it cannot be solved also in this case. I only want to point out that in spite of the 381 books and papers drawn up in a recent bibliography on quantum logic4 the problem in question, to the best of my knowledge, has not been solved.

* First published as Scheibe 1985.
1 Scheibe 1964
2 v. Neumann 1932, Ch. III.5; Birkhoff and v. Neumann 1936
3 Putnam 1969, p. 235
4 Beehner 1980
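The non-boolean character of the subspace lattice on which v. Neumann's analogy rests can be checked numerically in the smallest possible case. The following is a minimal sketch, assuming a real two-dimensional space in place of a Hilbert space and three lines through the origin as propositions; the helper names `line`, `join` and `meet` are illustrative and not taken from the text.

```python
import numpy as np

def line(theta):
    """Projector onto the line through the origin of R^2 at angle theta."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

def join(p, q, tol=1e-10):
    """Projector onto the span of the ranges of p and q (lattice join)."""
    eigenvalues, eigenvectors = np.linalg.eigh(p + q)
    cols = eigenvectors[:, eigenvalues > tol]
    return cols @ cols.T

def meet(p, q):
    """Projector onto the intersection of the ranges (lattice meet)."""
    ident = np.eye(p.shape[0])
    # De Morgan in an ortholattice: the meet is the complement
    # of the join of the complements.
    return ident - join(ident - p, ident - q)

A, B, C = line(0.0), line(np.pi / 4), line(np.pi / 2)

lhs = meet(A, join(B, C))           # A and (B or C)
rhs = join(meet(A, B), meet(A, C))  # (A and B) or (A and C)

print(np.round(lhs, 6))
print(np.round(rhs, 6))
```

In this example B and C together span the whole plane, so the left-hand side is A itself, while A has only the zero subspace in common with B and with C, so the right-hand side is the zero subspace. The distributive law therefore fails for these three subspaces.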
To make clear what I mean let us assume that a particular structure S, consisting of the Borel sets of a classical phase space provided with the usual operations of intersection, union and complementation, is given. We may then look at this structure as a set of possible propositions that can be made about the state of a mechanical system at any time, together with operations to be performed on these propositions. When I said a moment ago that the problem of the logic of these propositions has been solved I meant to say the following: We succeeded in subsuming the particular structure in question under a whole class of structures, the class E of boolean algebras, satisfying conditions like the following:

0) E is an elementary class definable by an open (first order) theory.
1) The sets of identities satisfied in S and in E respectively are the same.
2) There is a calculus in the strictest sense of the word that enumerates all identities satisfied in E or - equivalently - in S.
3) The algebra of ω-placed polynomials in S is a free algebra over E with the projections as its free generators.

Again, having said that the quantum mechanical analogue of our problem has not been solved I meant to say that, given a quantum mechanical structure S' corresponding to S, i.e. the set of closed subspaces of an infinite-dimensional Hilbert space provided with the canonical operations, we don't know the (unique) class E' corresponding to E, i.e. having the properties 0) - 3). Thus I don't claim that this class doesn't exist in the quantum mechanical case but only that we don't know it to exist.5

Why do I say that, given a structure S of propositions, we have solved the problem of their logic when we have found the class E being related to S by conditions such as 0) - 3)? The first reason, implicit in 2), is that the laws of a logic should be enumerable by a calculus, i.e. a strictly effective procedure the rules of which are rules of logical inference.
It doesn't matter that the laws, as is the case in 2), are laws for logical equivalences. They could equally well be laws for logical truth, logical implication or what not. The important thing is the effectiveness of their presentation. A second point is the presence of a language of propositional forms as it is guaranteed by 3): The polynomial algebra of S, if it is a free algebra over E, can uniquely be represented as the quotient of an absolutely free algebra modulo a congruence relation of logical equivalence. The latter is the algebra of propositional forms. In the classical case the former is essentially the Lindenbaum algebra. A third reason is shown by condition 1). As you know, textbooks in mathematical logic don't start their business by assuming some particular structure of propositions being given. They rather start with aspects connected with properties 2) or 3): with calculus and language. But if our starting point is a particular structure of propositions, as it actually has been the situation in quantum mechanics, then the problem of abstracting a logic from this structure would be hopelessly ambiguous unless this structure is assumed to be representative of the logic we are looking for. And if we ask ourselves what that means we are led to condition 1) saying that the laws of our logic and only these are realized in the particular structure given. Here again it doesn't matter that I have chosen logical equivalence to formulate 1). Finally, as regards condition 0), the three other conditions would still allow E to consist of S alone. Although this would trivialize only 1) it is better to have a condition of maximality and uniqueness of E, and this is accomplished with 0).

5 In the discussion Prof. van Fraassen and Prof. Kalmbach informed me about the following result recently obtained by Goldblatt and related to the problem posed in the main text: The class of orthomodular frames is not elementary, indeed not even Δ-elementary. See Goldblatt 1973 and 1984. See also Kalmbach 1983, p. 348, problem 29.

Leaving this matter I now come to the second part of my paper in which I want to protest against a piece of terminology that has come into use in the field of quantum logic and general axiomatics of quantum theory.6 I am referring to the usage according to which a logic is a member of a certain class of (first order) structures in the same sense in which a group is a member of another such class, a topological space a member of yet another class etc. I am quite prepared to accept that there is not only one logic. But I find it utterly misleading to use the word 'logic' as a name for a huge class of structures, indeed a class that is not even a set, suggesting thereby that there are as many different logics as there are non-isomorphic structures in this class. Let me give one argument to show you that my quarrel is not only about a word.
Although every boolean algebra is representative of classical propositional logic and although the corresponding Hilbert space algebra may be representative of a logic still to be found, we have to distinguish conceptually between logical laws of a relation of implication or equivalence and, on the other hand, a relation of logical implication or equivalence. If every member of a class of lattices having some further nice properties is called a logic, this terminology suggests thinking, say, of the fundamental implication of the lattice as being eo ipso a logical implication. But although the implication in a boolean lattice of propositions satisfies all the laws of classical propositional logic it is not always itself a logical implication. It is a logical implication if and only if the boolean lattice, viewed as an algebra, is free. For if it is free and if a set of free generators is specified then, given any two propositions p and q, these propositions have polynomial representations in terms of the generators. These representations are not unique. However, given any two of them, one for p and the other for q, p implies q in the sense of the lattice relation if and only if q can be inferred from p on account of the representations chosen and the generally acknowledged rules of classical logical inference. If, on the other hand, our boolean algebra is not free then it is essentially the quotient of a free boolean algebra modulo a non-trivial congruence relation.7 Again we have representability in the sense mentioned before. But this time, if p implies q in the sense of the lattice relation there always are representations admitting no logical inference. Rather some additional, non-logical 'axioms', entering the scenario via the non-trivial congruence relation, have to be invoked. And these axioms may very well be ordinary empirical statements of the 'all ravens are black'-type. Well known examples are the Lindenbaum algebras of first order theories. If a first order theory is complete then its Lindenbaum algebra consists of two elements and the implication relation reduces to material implication as opposed to logical implication.

6 The anthology Hooker 1975 and 1979 may be taken as representative for the usage in question.
7 See Monk 1976, p. 160, Theor. 9.60, and Sikorski 1969, p. 44, Theor. 14.4

Now, the second half of this argument is not quite correct: Although every boolean algebra is isomorphic to the quotient of a free boolean algebra modulo some congruence relation, it may in general not be given in this form. Indeed the classical phase space algebras are neither free boolean algebras nor are they introduced as quotients of such algebras. Most frequently they are introduced as set algebras (fields), and it may be that some people feel justified in looking at the implication in question as logical because it is a set inclusion: one proposition implies another one if and only if the second is true in every state in which the first is true. This sounds very much like a semantical definition of logical implication. However, on this view we would completely relativize logic. This is evident from Stone's representation theorem: every boolean algebra is isomorphic to a set algebra.8 But it may also be illustrated in physics. A fairly complete formulation of Newtonian mechanics gives rise to as many boolean set algebras of propositions as one wishes. From a logical point of view the phase space algebra is in no way distinguished, and every one of those algebras could be declared to be a logic with equal right.
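The difference between implication in a free boolean algebra and implication in a quotient can be made concrete in the smallest case. The following is a minimal sketch, assuming two propositional letters p and q, with the free boolean algebra on two generators realized as the sets of truth-value assignments; the extra axiom `RAVENS` is a hypothetical stand-in for an empirical statement of the 'all ravens are black' type.

```python
from itertools import product

# All truth-value assignments to two propositional letters p and q.
VALUATIONS = list(product([False, True], repeat=2))

def element(formula, axiom=lambda p, q: True):
    """Element of the algebra: the valuations admitted by the axiom
    in which the formula holds. With the default (trivial) axiom this
    is an element of the free boolean algebra on p and q."""
    return frozenset(v for v in VALUATIONS if axiom(*v) and formula(*v))

def implies(a, b):
    """Lattice implication: a lies below b."""
    return a <= b

P = lambda p, q: p
Q = lambda p, q: q
P_AND_Q = lambda p, q: p and q
RAVENS = lambda p, q: (not p) or q  # hypothetical non-logical axiom "p -> q"

# Free algebra: the lattice relation coincides with logical implication.
free_p_and_q_below_p = implies(element(P_AND_Q), element(P))
free_p_below_q = implies(element(P), element(Q))

# Quotient modulo the axiom: p now lies below q, but only by the axiom.
quotient_p_below_q = implies(element(P, RAVENS), element(Q, RAVENS))

print(free_p_and_q_below_p, free_p_below_q, quotient_p_below_q)
```

In the free algebra p does not lie below q, since q cannot be inferred from p by logic alone. In the quotient it does, and the inference needs the additional axiom.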
It is now easy to see that the habit of calling any propositional structure a logic is even more misleading with respect to quantum logic than it is in the classical case. This time we only need to think of a fairly complete formulation of quantum mechanics. Again we would find ourselves in the midst of an infinity of propositions that everybody - even quantum logicists - would subject to classical logic. At the same time, among all the resulting boolean propositional algebras there would appear one very perspicuous singularity: the Hilbert space algebra. And since we don't yet know how to derive it from a proper quantum logic established pari passu with classical logic at the outset we are likely to identify the algebra itself as a logic.

8 Sikorski 1969, § 8.

A moment ago I said something to the effect that in quantum mechanics, even if it were infected by a genuine quantum logic, classical logic would prevail. In the last part of my paper let me give some arguments in favor of this claim. I think there is general agreement that the propositions submitted to quantum logic, whatever their meaning may be, would be mathematically represented by the closed subspaces of Hilbert space and that the quantum logical operations and relations would be realized by the well known operations on and relations between those subspaces. As I argued before, in my opinion these operations and relations are not themselves of a logical nature. Quantum logicists maintaining the opposite would at least have to make quantum logic infinitary in order to account for the fact that there are infinitely many incompatible propositions which, in their eyes, would be logically incompatible.9 But whatever stand one may take in this respect, statements to the effect that one contingent proposition A implies another one B or that A is compatible with B or that A is commensurable with B - statements of this kind, I suppose, obey classical logic. I don't know how quantum logicists would prefer to express this fact. If the rules of quantum logic are construed as universal statements then one could perhaps say that the metalogic of quantum logic is classical logic. The transitivity of implication, for instance, could be asserted by the classical universal statement
$$A < B \wedge B < C \rightarrow A < C,$$

the distributivity of implication by the statement

$$A \cap (B \cup C) < (A \cap B) \cup (A \cap C).$$

The first would be called true, the second false. At any rate we here come across a first class of propositions that, although intimately connected with quantum logic, obey classical logic.

There was a time when the creators of the Copenhagen interpretation were anxious to emphasize that the uncertainty relations do not express any ordinary disturbance of an object by a measuring instrument. We were warned not to think of an electron as being in such and such a position or as having such and such a momentum, and of a measurement as changing those values in a way that would only be too complicated to survey.10 This warning was the physicists' insight into what only thirty years later was given a definite mathematical proof: There exist no 2-valued homomorphisms on a Hilbert space algebra of dimension > 2.11 This result made it quite clear that we cannot express ourselves in - as they might be called - ontic propositions in talking about the state of a single quantum mechanical system. Now suppose that I ask a follower of the Copenhagen interpretation: Given the projection operator of the spectral resolution of the 1-dimensional position operator, admitting values in the unit interval, what proposition corresponds to this projection operator as the precise analogue and successor of the classical, ontic proposition that the particle has a position located in the unit interval? I am afraid the answer would be either that there simply is no such proposition or - in a somewhat weaker version - that, although there may be one, no considerable portion of quantum mechanics could be formulated using exclusively these - as they might be called - quasi-ontic propositions. Since these propositions would be the very propositions to be subjected to quantum logic the answer implies a rejection or almost a rejection of this logic on the part of the Copenhagen school.

9 For infinitary classical propositional logic see Karp 1964
10 So Bohr in all his relevant papers after 1935. See the presentation in Scheibe 1973, Ch. I.
11 Kamber 1965, p. 167, Satz III

As is well known the alternative chosen by this school was the proposal, as far as single particles are concerned, to describe their contingent behavior in terms of propositions about measurements. According to Bohr's view, although a measuring apparatus, like any other piece of matter, submits to the laws of quantum mechanics, if it is used as a measuring device it has to be and, at the same time, can be described in terms of classical physics.12 It is, therefore, in the spirit of this view when we assume that propositions about measurements, even if these are performed on quantum mechanical objects, obey the laws of classical logic. Realizing this we may look at the closed subspaces of Hilbert space as representing possible measuring results, and our contingent propositions would just spell out such results. As it turns out it is convenient to introduce also the somewhat weaker propositions saying only that a measurement with A as a possible result has been made but leaving open whether A or its opposite has been obtained. So here we come across a second class of propositions obeying classical logic. And this time these propositions can be used to characterize the state of a quantum mechanical system. Moreover, as I have shown in the book mentioned at the beginning of my paper, these epistemic propositions, as opposed to the quasi-ontic ones, permit a reformulation of the quantum mechanics of a single object that is worth mentioning.13 As an example I give you the dynamical law of this formulation
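$$\bigwedge_{\nu=1}^{n} fe(t_\nu, A_\nu) \;\wedge\; \bigwedge_{\nu=1}^{n-1} \big( t_\nu \le t_{\nu+1} \big) \;\wedge\; \bigwedge_{t,A} \Big( t_1 \le t \le t_n \wedge fe(t, A) \rightarrow \bigvee_{\nu=1}^{n} \big( t = t_\nu \wedge A = A_\nu \big) \Big) \;\rightarrow\; E_{A_n} U_{t_n - t_{n-1}} \cdots E_{A_2} U_{t_2 - t_1} E_{A_1} \neq 0 \tag{1}$$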
roughly saying that if certain measurement results have been obtained at different times and if these are the only results that have been obtained in the relevant time interval then a certain operator, involving the dynamical group of the system, is different from zero.14 From this axiom, together with some other axioms of minor importance, an amazingly large corpus of consequences can be inferred, including predictions of possible results of measurements on the basis of results already obtained. No comparable formulation can be given on the sole basis of quasi-ontic propositions.

12 cf. n. 10.
13 Scheibe 1964, Ch. II.1
14 ibid. p. 90. Here fe(t, A) means that the result A has been obtained at time t by a suitable measurement. U_τ is an arbitrary element of the dynamical group, and E_A is the projection operator associated with the result A.

The epistemic approach is further supported by the possibility of giving interpretations of the basic relations in the Hilbert space algebra in terms of epistemic contingent propositions. To give but one example the implication relation can be characterized by
$$A < B \;\leftrightarrow\; \bigwedge_{M \in E} \big( ep(M; A) \rightarrow ep(M; B) \big). \tag{2}$$
Here M runs over a class E of epistemic situations roughly characterized by some measurements (and no others) being performed and some results (and no others) being obtained. 15 The propositions ep(M; A) express that at a given time, later than all the times at which measurements have been made, the material implication
$$ob(A) \rightarrow fe(A) \tag{3}$$
follows from M. Here ob(A) and fe(A) are the elementary epistemic propositions introduced earlier. On account of their meaning the implication says: if a measurement is made, suitable to yield the result A, then A is actually found. Obviously these implications are a substitute for the classical ontic propositions saying that the system in question has the property A, as well as for the quasi-ontic propositions submitted to quantum logic. Our elementary implications (3) obey classical logic, and it may be added that one has no trouble with the ex falso quodlibet of the material implication as long as the epistemic situation M that is our premise leaves us ignorant as to the question whether a measurement with A as a possible result is made or not. Under these circumstances the elementary epistemic implications have precisely the same assertive force as their ontic counterparts. Interpretations like (2) seem to show that the Hilbert space algebra is made for epistemic propositions. This can be seen anew by similar interpretations using the probability propositions of quantum mechanics. Since these interpretations are well known I satisfy myself by reminding us that probability propositions constitute a third class of propositions occurring in quantum mechanics and obeying classical logic. These propositions being by far the most important ones their classical behavior is the main reason why most physicists don't bother about logical problems. Strangely enough, the boolean algebra generated by the possible probability statements of quantum mechanics is seldom, if ever, mentioned in publications on the foundations of quantum mechanics. Incidentally, this algebra is an excellent example of a boolean propositional algebra that can be most naturally introduced as the quotient of a free boolean algebra modulo some extra laws, in our case the usual requirements for a probability distribution on a Hilbert space algebra. 
My last remark concerns the question how a quantum logicist would explain the ordinary logical behavior of quantum mechanical probability statements. In a quantum logical context the wording of such a statement would presumably have to be: "the probability that α is p" where α is a quasi-ontic proposition submitted to quantum logic. How does it come about that the whole proposition is true or false, as the case may be, without anything of the kind being the case for α? The difficulty becomes particularly evident when we consider statistical ensembles. Whether or not our original formulation of quantum mechanics makes it a theory of statistical ensembles and nothing else, every formulation must provide means for dealing with statistical ensembles at least derivatively. The question then becomes: What makes the statement true or false that 800 out of 1000 members of a statistical ensemble 'satisfy α'? I don't mean to ask this question in any ontological sense. The question rather is: How do the truth values of such a statement come about in approximately the sense in which the truth values of a conjunction somehow emerge from the truth values of its components? In the epistemic approach we could construe our statement to mean: If a measurement deciding upon A (corresponding to α) is made on every one of the 1000 members of the ensemble then in 800 cases A will actually be found. This is a truth functional construction. But in quantum logic?

15 ibid. p. 113, formula (25)
VI.26 What Kind of Hidden Variables Are Excluded by Bell's Inequality?*

In my paper I am going to give an elementary axiomatic treatment of the question what kind of hidden variables are excluded by quantum mechanics on the basis of Bell's inequality.1 In this field the two perhaps best known facts are 1) that long ago v. Neumann gave a proof that quantum mechanics does not admit hidden variables and 2) that later on Bohm and others succeeded in showing that it does.2 Since then it has been a question of conceptual clarification what kind of hidden variables, or rather: what kind of theories of hidden variables, could indeed be excluded by proof and what other kind, if any, would have to be admitted.3 Taking the view that Bell's achievement essentially is a proof to the effect that a theory of local hidden variables is incompatible with quantum mechanics I wish to emphasize the following peculiarity of this proof: v. Neumann and all his followers had concepts of hidden states allowing them to show that the extensions of these concepts were empty. Consequently, the hidden states themselves being excluded, no further probabilistic reasoning was necessary.4 In Bell's case it is precisely the other way round. The class of local hidden states is not empty, and the burden of the proof concerns the probabilistic part of the theory. In view of this situation an analysis in two parts suggests itself, a non-probabilistic and a probabilistic one, and since the argument belonging to the latter is well known emphasis will be on the former.
* First published as Scheibe 1986a.
1 Bell 1964 and 1971.
2 v. Neumann 1932, III.2 and IV.2; Bohm 1952
3 Bell 1966; Bohm and Bub 1966; Gudder 1970; Belinfante 1973; Jammer 1974, Ch. 7.
4 An overview concerning these proofs is given in Scheibe 1981.

1. Introduction

By way of introduction I first want to touch upon the problem of interpretation. There is general agreement that the most important elementary statements of quantum mechanics are probability statements - statements about probabilities. A major difficulty, however, appears as soon as it is asked: Of what are the probabilities about which we make statements in quantum mechanics? Looking merely at how people actually express themselves in this respect the most frequent wordings to be found in the literature are of two sorts. Some people accept the classical way of speaking: They speak about the probability that an observable has (or takes on) a certain value. Others, more or less inspired by the Copenhagen interpretation, prefer a different formulation: For them a quantum mechanical probability is the probability that a measurement of an observable yields a certain value. In either case and independent of any interpretation there is the remarkable fact that the probabilities in question only depend on the values of the observables and not on the observables themselves if only the former are among the possible values of the latter. Using a notation the meaning of which will become clear later on, this remarkable independence is expressed by the equation
$$p(A, X) = p(B, X) \qquad (X \in A, B;\ A, B \in Ob) \tag{1}$$
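The independence expressed by (1) can be illustrated numerically. The following is a minimal sketch under assumptions not yet made in the text: a real three-dimensional space, observables taken as complete sets of mutually orthogonal projectors, and probabilities computed by the usual trace rule. The names `projector`, `rho` and `p` are illustrative.

```python
import numpy as np

def projector(*vectors):
    """Orthogonal projector onto the span of linearly independent vectors."""
    q, _ = np.linalg.qr(np.column_stack(vectors).astype(float))
    return q @ q.T

e1, e2, e3 = np.eye(3)

# Two different observables A and B sharing the possible value X.
X = projector(e1)
A = [X, projector(e2), projector(e3)]
B = [X, projector(e2 + e3), projector(e2 - e3)]

# An arbitrarily chosen pure state (assumption of this sketch).
psi = np.array([1.0, 2.0, 2.0]) / 3.0
rho = np.outer(psi, psi)

def p(observable, outcome, state):
    """Probability of `outcome` for `observable` by the trace rule."""
    if not any(np.allclose(outcome, e) for e in observable):
        raise ValueError("outcome is not a possible value of the observable")
    return float(np.trace(state @ outcome))

print(p(A, X, rho), p(B, X, rho))
```

The function `p` takes the observable as an argument but uses it only to check that X is among its possible values. The computed probability depends on X and the state alone, which is the content of (1) in this setting.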
The difference in formulating quantum mechanical probability statements does, of course, no harm as long as we 'think with the learned and speak with the vulgar.' But what is, in the case before us, the view of the learned? When a statement is made that the probability that α is, say, 1/3, then whatever this whole statement may be about, that part of it which is indicated by α refers to a single system or at any rate to a single case, e.g. a single measurement. Even in the case where probabilities are interpreted as relative frequencies the α's refer to single systems or cases of measurement in telling us what those frequencies count. If this view is correct and if the elementary statements of a theory of hidden variables are to be probability-free statements about single systems then the essential link between such a theory and quantum mechanics turns out to be the common domain of the quantum mechanical probability functions, regarded as a domain of propositions about single physical systems. As regards interpretation, if the classical way of speaking - observables have or possess certain values - were more than a façon de parler then the propositions in question would have to satisfy a condition of invariance similar to (1) but with ordinary truth-value functions instead of probability functions:
$$v(A, X) \leftrightarrow v(B, X) \qquad (X \in A, B;\ A, B \in Ob) \tag{2}$$
For this was the classical way of thinking: that things do have their properties irrespective of any measurements that could display them. However, as will be shown later on, it is but another version of v. Neumann's famous result that in quantum mechanics no truth-value functions of the kind required exist: In search of an interpretative basis for a theory of hidden variables the reinstallment of a classical interpretation of quantum mechanical 'properties' really seems to be a blind alley. Let us therefore resort to the alternative way of speaking mentioned a moment ago and look for a corresponding way of thinking. One possible suggestion is to specify the elementary propositions of a theory of hidden variables as the subjunctive conditionals

if the observable A were to be measured then the measurement would yield the result X.   (SC)
As in classical physics certain maximal sets of these propositions could then be taken as the possible subjunctive states of a single quantum mechanical system. As to the question in what sense the subjunctive conditionals (SC) are more than a façon de parler, apart from the usual associations connected with them, the only palpable assumption to be made about them in the following is that they are allowed to violate the invariance condition (2) that was implied by the meaning of their classical counterparts. This negative assumption is quite compatible with the less committing character of our new propositions: Being subjunctive with respect to the actual performance of measurements it may very well - and does indeed - happen that various subjunctive conditionals occurring in one subjunctive state involve incompatible outcomes of measurements for different and even commensurable observables. At first sight the admission of such possibilities seems to be very strange and even to contradict ordinary quantum mechanics. One immediate objection that comes to mind is: Whenever quantum mechanics allows the simultaneous measurement of several observables every hidden state must provide for a consistent and unique outcome of the measurement of these observables. However, as has been already indicated in connection with (2), the admission of at least some of the states in question is the only alternative to v. Neumann's result that there are no hidden states at all. Secondly, this way out is compatible with quantum mechanics if one assumes a theory of measurement5 according to which every measuring apparatus has a well determined separation power in the following sense: There is a unique observable A such that if a measurement is performed with the given apparatus there will be a unique outcome for A but for no observable finer than A. While this A is measured directly, any observable B coarser than A is measured derivatively by the same measurement. Under this assumption, saying that several observables are measured simultaneously is but another way of saying that their product is measured directly and each of them, taken by itself, is measured derivatively.
If we now assume that the hidden states give the outcomes of direct measurements, incompatible results for commensurable observables do not lead to any difficulty since in the sense of direct measurement no two or more observables are simultaneously measurable. This has repeatedly been expressed by saying that in such states the result of a measurement of one observable may depend on what other observables are measured simultaneously.6 Incompatible results for the same observable belong to measurements only one of which is direct, and commensurable observables are those for which compatible results can be inferred from direct measurements. The subjunctive character of the elementary statements (SC) takes account of the theory of measurement indicated, and the subjunctive ontology in question can very well be viewed as an extension of those variants of the Copenhagen interpretation which take a single quantum mechanical system to be a bundle of apparently incompatible potentialities.7

5 G. Süßmann 1958, III.6.
6 Bell 1966, §V; Ochs 1970, §3; 1971, §1.
7 Bohm 1951, Chs. 6, 8 and 22.
2. Non-probabilistic Theories of Hidden Variables

After these introductory remarks let me now enter the first part of my paper and begin by giving a more formal account of the concept of a non-probabilistic theory of hidden variables. Being formal, all subsequent considerations will be independent of the interpretation given in the introduction. Our starting point is a structure

$$(L, Ob) \tag{3}$$

that stands for the theory for which we want to find a theory of hidden variables. L is assumed to be a separable8 and σ-complete orthocomplemented lattice. Ob is the set of observables on L, i.e. the set of all (denumerable) complete sets of mutually orthogonal elements of L. This conception of an observable expresses the view that a measurement poses a question that in any given case will find exactly one real answer out of a set of mutually exclusive possible answers. With respect to the set of observables the elements of L thus appear as the possible outcomes of measurements. I shall distinguish two major special cases of (3): 1) the quantum mechanical case in which L is the orthocomplemented lattice of closed subspaces of a Hilbert space, and 2) the classical case where L is boolean (or even the quotient of the boolean lattice of Borel sets of a classical phase space modulo its subsets of Lebesgue measure zero9). (3) being the given theory, a theory of hidden variables for it is a threefold structure

$$(S; L', Ob') \tag{4}$$
where L' is a boolean lattice of subsets of 8 and Ob' a corresponding set of observables on L'. 8 is the space of hidden states. In the final analysis L' and Ob' will be defined by 8 and (3). Consequently, the following three axioms, fixing the connection between the two theories (3) and (4), should be viewed as axioms exclusively concerning the state space 8. The first axiom says that 8 is a set of subjunctive states of the theory (3): if s E 8, then s : Ob
>---+
Land s(A) E A for A E Ob.
(0:)
Thus every state is a mapping that assigns to every observable a possible outcome of its measurement. lO 8 9
10
Separability here means that any complete set of mutually orthogonal elements of L is denumerable. Because of the requirement of separability - cf. note 6 - the lattices of Borel sets of classical phase spaces are excluded. This corresponds to the quantum mechanical case 1) where we have assumed the v. Neumann version of quantum mechanics as opposed to its Dirac version. States may be 'hidden' in a sense that is not covered by (0:): The points of classical phase space represent states hidden with respect to the classical theories
VI.26 What Kind of Hidden Variables Are Excluded by Bell's Inequality?
395
To understand the second axiom let us look what happens to the subjunctive conditionals if they are represented by sets of subjunctive states from S. Knowing already that the subjunctive conditionals depend on the observables this dependence is now made explicit by considering the representations iA : LA >---> Pow(S) } iA(a) = {sis E S 1\ s(A) :::; a}
(4a)
Here $A \in Ob$ is an observable, $L_A$ is the boolean sublattice of $L$ generated by $A$, and the mapping $i_A$, which is always a homomorphism, represents the possible outcomes of a measurement of $A$ (outcomes in a wider sense) by sets of subjunctive states in which these outcomes would be certain. It goes without saying that in a theory with the state space $S$ two subjunctive conditionals have to be identified if they are represented by the same set of states. In general this identification will destroy the unique assignment of a possible measurement result from $L$. To prevent this our second axiom requires that

if $i_A(a) = i_B(b)$ then $a = b$.    (β)

In the presence of this axiom the original function of the outcomes of measurement is retained. At the same time (β) guarantees that there are sufficiently many states with respect to $L$ and that, in particular, $S$ is not empty. Our third axiom works in the opposite direction. The two axioms (α) and (β) still admit $S$ to be the set of all mappings satisfying the conclusion of (α).¹¹ But in the classical case we do not want to have hidden states and propositions other than the ordinary classical ones. We must therefore reduce the set of admitted subjunctive states in proportion to the classical behavior of our original theory (3). To this end let me call an observable $A \in Ob$ objective in a state $s \in S$ if for every $B \in Ob$ finer than $A$:

$$s(B) \le s(A) \tag{4b}$$
In such a state the measurement of any observable $B$ answering derivatively the question posed by $A$ would lead to the same result as would a direct measurement of $A$. Now, as earlier investigations⁶ suggest, what really endangers the objectivity of an observable $A$ is the existence of incommensurable observables finer than $A$. Calling $A$ innocent if this situation does not arise, we arrive at our third axiom

if $s \in S$ then every innocent observable $A \in Ob$ is objective in $s$.    (γ)

admitted in 2), and the same is the case with Dirac's δ-distributions with respect to v. Neumann's version of quantum mechanics admitted in 1). To include these cases of hidden states, (α) would have to be generalized.
11 In fact this set always satisfies (α) and (β).
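The first two axioms can be tried out on a very small finite model. The following is only a sketch under stated assumptions: the lattice is a hypothetical toy with two maximal observables of a spin-1/2 system, the labels `z+`, `z-`, `x+`, `x-` are invented for illustration, and the second axiom is read as "equal representing sets imply equal outcomes". The state space is taken to be the set of all mappings satisfying (α), which is the largest choice the text allows.

```python
from itertools import product

# Hypothetical toy model: two observables, each a complete set of two
# mutually orthogonal atoms. The labels are illustrative assumptions.
OBSERVABLES = {
    "A": ("z+", "z-"),
    "B": ("x+", "x-"),
}
ZERO, ONE = "0", "1"


def leq(x, y):
    """Order of the toy lattice: 0 <= x <= 1; atoms are only below themselves."""
    return x == ZERO or y == ONE or x == y


def sublattice(name):
    """Boolean sublattice L_A generated by an observable: {0, a, a-perp, 1}."""
    return (ZERO, *OBSERVABLES[name], ONE)


# Axiom (alpha): a subjunctive state assigns to every observable A one of
# its possible outcomes s(A) in A. Here S is the set of ALL such mappings.
NAMES = sorted(OBSERVABLES)
STATES = [dict(zip(NAMES, choice))
          for choice in product(*(OBSERVABLES[n] for n in NAMES))]


def rep(name, a):
    """i_A(a) from (4a): indices of the states s with s(A) <= a."""
    return frozenset(k for k, s in enumerate(STATES) if leq(s[name], a))


def beta_holds():
    """Assumed reading of the second axiom: i_A(a) = i_B(b) implies a = b."""
    return all(a == b
               for m in NAMES for a in sublattice(m)
               for n in NAMES for b in sublattice(n)
               if rep(m, a) == rep(n, b))
```

In this toy model `rep("A", "z+")` collects exactly the two states in which the outcome `z+` would be certain, and `beta_holds()` confirms that no two different outcomes share a representing set.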
In the quantum mechanical case the only innocent observables are the maximal ones. Since every maximal observable is objective in any state, (γ) becomes vacuously true in the quantum mechanical case. On the other hand, in the classical case (γ) does the job it was designed for: Here all observables are innocent, and (γ) becomes equivalent to the invariance condition (2) that was already used to characterize the classical case. More precisely, (γ) reduces to (5) below and $S$ turns out to be essentially a set of σ-additive truth-value functions.¹² Having stated the axioms as requirements on the state space $S$, our concept of a theory of hidden variables (4) can now be completed by defining $L'$ to be the boolean lattice of subsets of the state space $S$ (finitely) generated by the representations $i_A(a)$ from (4a) of the subjunctive conditionals. $Ob'$ is defined as the set of all denumerable, complete sets of mutually orthogonal elements of $L'$. By means of (4a) $Ob$ is canonically represented in $Ob'$. But in general our construction will lead to new subjunctive propositions and new observables having no counterpart in the theory (3) from which we started.

Having at hand a general concept of non-probabilistic theories of hidden variables let us now see what can be proved about it in some special cases. The first is v. Neumann's case. To reproduce his result I call a theory of hidden variables (4) for quantum mechanics to be of the v. Neumann type if

if $s \in S$ then every $A \in Ob$ is objective in $s$.    (5)
This definition is in need of justification since the hidden variables envisaged by v. Neumann were defined to be dispersion-free states. Since we have not yet introduced probabilities we must look for a non-probabilistic equivalent to the concept of dispersion-free states. An immediate equivalent is the concept of a σ-additive truth-value function

$v : L \to \{0, 1\}$ with $v(\vee) = 1$, $v(\wedge) = 0$ and for any sequence $a_n$ with $a_m \perp a_n$: if $v(\bigvee_n a_n) = 1$ then $v(a_n) = 1$ for exactly one of the $a_n$.    (5a)
Obviously $v$ satisfies (5a) iff $v$ is a two-valued, σ-additive measure on $L$¹³ and in this sense a dispersion-free state. On the other hand, there is a natural (injective) embedding of the state space $S$ with (α), (β) and (5) into the set of all σ-additive truth-value functions on $L$: Given $s \in S$ we define

$$v(a) = \begin{cases} 1 & \text{if } s(\{a, a^\perp\}) = a \\ 0 & \text{if } s(\{a, a^\perp\}) = a^\perp. \end{cases}$$

12 It has to be noted that in the narrower classical case 2) no σ-additive truth-value functions exist. Consequently, there is no state space $S$ satisfying (α), (β) and (5) and, therefore, no hidden-variables theory in our sense, cf. note 10. However, this somewhat disagreeable consequence only reflects a 'continuum problem' and does not in the least affect the point to be made for quantum mechanics. Cf. Mackie 1963, p. 67.
13 Mind that a σ-additive truth-value function in the sense of (5a) is not necessarily a homomorphism of $L$ into the boolean lattice $\{0, 1\}$. Cf. Kamber 1965, §5, note 6.
(5a) then follows from (5). Now, by Gleason's theorem there are no dispersion-free states for a quantum mechanical $L$.¹⁴ Therefore $S$ would have to be empty. But this is excluded by (β). Therefore, no space of hidden states exists, and v. Neumann's result reappears as

Theorem 1: Quantum mechanics does not admit of a non-probabilistic theory of hidden variables of the v. Neumann type.

As opposed to this wholly negative result the non-probabilistic part of our enterprise leads to a positive solution in the case treated by Bell. Again for quantum mechanics we have here the additional assumption that the underlying Hilbert space $\mathcal{H}$ is the tensor product

$$\mathcal{H} = \mathcal{H}_I \otimes \mathcal{H}_{II} \tag{6}$$

of two Hilbert spaces $\mathcal{H}_I$ and $\mathcal{H}_{II}$, expressing the situation that we are dealing with a quantum mechanical system consisting of two subsystems I and II. A theory of hidden variables is now called to be of the Bell type if

if $s \in S$ then for any two observables $A_I, A_{II} \in Ob$: $s(A_I \otimes A_{II}) \subseteq s(A_I \otimes \mathcal{H}_{II}),\ s(\mathcal{H}_I \otimes A_{II})$.    (6a)
In other words, the subjunctive states of such a theory respect the product structure of $\mathcal{H}$: If instead of measuring directly any observable $A_I$ pertaining to system I we would directly measure $A_I$ together with any observable of system II the outcome for $A_I$ inferred from the result of this measurement would always be the same in such a state, and vice versa. Condition (6a) is all that is used of locality in the subsequent argument, including the probabilistic part.¹⁵ In comparing (6a) with its counterpart (5) in the v. Neumann case we immediately see that (6a) is much weaker than (5): Whereas in (5) the inequality $s(B) \le s(A)$ is required for every pair of observables with $B$ finer than $A$, in (6a) it is only required for observables of the subsystems (in the place of $A$) and for their products (in the place of $B$). It is, therefore, no wonder that we now find

Theorem 2: Quantum mechanics admits non-probabilistic theories of hidden variables of the Bell type.

Using the equivalence of (6a) with

if $s \in S$ then there are subjunctive states $s_i$ over $\mathcal{H}_i$ such that $s(A_I \otimes A_{II}) = s_I(A_I) \otimes s_{II}(A_{II})$    (6b)

a proof of Theorem 2 is obtained by taking $S$ to be the set of all subjunctive states satisfying (6a).¹⁶

14 Gleason 1957. The application of Gleason's theorem to two-valued σ-additive measures on an infinite-dimensional Hilbert space is somewhat trivialized by the fact that in this case $L$ has already boolean sublattices not admitting any two-valued σ-additive measures. A recent stronger result in Krips 1977 allows the following modification of our argument: For a quantum mechanical $L$ let $Ob$ be any set of observables, containing all maximal ones. Let $L_0 = \bigcup Ob$ and replace (5) by the weaker assumption that every $s \in S$ induces a unique function $v : L_0 \to \{1, 0\}$ in an obvious manner. Then (5a) follows for these functions. But it follows from Krips' theorem that even for the smallest $L_0$ possible, namely the set of all 1-dimensional subspaces, no such truth-value functions exist.
15 The original wording of the locality condition in Bell 1964, p. 196, was: "The vital assumption is that the result B for particle II does not depend on the setting $\vec{a}$ of the magnet for particle I, nor A on $\vec{b}$." In (6a) this assumption is reformulated as a condition on a given hidden state $s$: Given $A_I$ the result $s(\mathcal{H}_I \otimes A_{II})$ predicted by $s$ for a direct measurement of $A_{II}$ does not depend on the result $s(A_I \otimes A_{II})$ predicted by $s$ for a direct measurement of $A_I$ however $A_I$ is chosen, and vice versa.
3. Probabilistic Theories of Hidden Variables

In the last part of my paper probabilistic theories of hidden variables are introduced. Assuming that our original theory (3) is provided with a set $P$ of σ-additive probability measures on $L$, leading to a structure

$$\langle L, Ob, P \rangle \tag{7}$$

essentially the same must be required for a theory (4) of hidden variables for it. This gives us a structure

$$\langle S, L', Ob', P' \rangle \tag{8}$$

with a set $P'$ of probability measures on $L'$,¹⁷ and the question will now be how the structure (8) has to be related to (7) in order to become a theory of hidden variables for it also with respect to the probabilities. The obvious answer is given in axiom

To every $p \in P$ there exists a $p' \in P'$ such that for every $A \in Ob$, $a \in L_A$: $p(a) = p'(i_A(a))$    (δ)
where the mappings $i_A$ from (4a) effect the representations of the subjunctive conditionals as sets of hidden states. Turning again to the special cases of the previous section it is natural to call a probabilistic hidden-variables theory to be of the v. Neumann (resp. Bell) type if its non-probabilistic part is of this type. From Theorem 1 it follows immediately that there are no probabilistic theories of the v. Neumann type. As opposed to this, Theorem 2 invites us to look for corresponding theories of the Bell type. However, in fact we find the negative result

Theorem 3: Quantum mechanics does not admit of probabilistic hidden-variables theories of the Bell type (i.e. local ones).

The standard proof of this theorem works with expectation values for quantities.¹ To adapt the proof to the present setting let the theory (7) be quantum mechanics together with the special assumption (6), and let (8) be a Bell type theory of hidden variables for it. Let $Q$ be the set of quantum mechanical discrete quantities, identified with the set of self-adjoint operators on $\mathcal{H}$ having a discrete spectrum. We then have the mapping

$$\delta : Q \to Ob, \qquad \delta(H) = \text{the set of eigenspaces of } H \text{ in } \mathcal{H}. \tag{9}$$

For each $H \in Q$ we define

$$\varepsilon_H : \delta(H) \to \mathbb{R}, \qquad \varepsilon_H(a) = \text{eigenvalue of } H \text{ for the eigenspace } a \tag{10a}$$

and

$$\sigma_H : S \to \mathbb{R}, \qquad \sigma_H(s) = \varepsilon_H(s(\delta(H))). \tag{10b}$$

$\sigma_H$ is the representation of the quantum mechanical quantity as a function on the space of hidden states. This representation exactly corresponds to the representation (4a) of observables in the sense that

$$\sigma_H^{-1}(\alpha) = i_{\delta(H)}(a) \tag{11}$$

if $a \in \delta(H)$ and $\alpha$ is the corresponding eigenvalue. Finally, let $p$ and $p'$ be probability functions according to axiom (δ) and let $E$ resp. $E'$ be the corresponding expectation value functions. Then the crucial equation which renders the application of Bell's inequality possible is

$$E(F \otimes G) = E'(\sigma_{F \otimes 1} \cdot \sigma_{1 \otimes G}) \tag{12}$$

for any two operators $F$ and $G$ on $\mathcal{H}_I$ and $\mathcal{H}_{II}$ respectively. Once this equation is obtained the rest of the argument is the usual one: The right side of (12) being an expectation of the product of two functions on a classical probability space, Bell's inequality

$$|E'(\sigma_{F \otimes 1} \cdot \sigma_{1 \otimes G}) - E'(\sigma_{F \otimes 1} \cdot \sigma_{1 \otimes G'})| + |E'(\sigma_{F' \otimes 1} \cdot \sigma_{1 \otimes G}) + E'(\sigma_{F' \otimes 1} \cdot \sigma_{1 \otimes G'})| \le 2 \tag{13}$$

holds whenever the four functions $\sigma_{\ldots}$ involved are absolutely bounded by 1. Because of (12) the same would have to hold for the corresponding quantum mechanical expectation values. However, there are quantum mechanical probability functions for which the latter is not the case.

As regards (12), the equation in (δ) for the probability functions $p$ and $p'$ immediately leads to the corresponding equation

$$E(F \otimes G) = E'(\sigma_{F \otimes G}) \tag{12a}$$

for the expectation values. This, however, is only the first step in getting at (12). The really important step is done by proving

$$\sigma_{F \otimes G} = \sigma_{F \otimes 1} \cdot \sigma_{1 \otimes G} \tag{12b}$$

and it is here where the locality condition (6b) comes in. It is used for obtaining the third line of the following computation whose remaining steps are general and routine:

$$\begin{aligned}
\sigma_{F \otimes G}(s) &= \varepsilon_{F \otimes G}(s(\delta(F \otimes G))) \\
&= \varepsilon_{F \otimes G}(s(\delta(F \otimes 1) \otimes \delta(1 \otimes G))) \\
&= \varepsilon_{F \otimes G}(s(\delta(F \otimes 1)) \otimes s(\delta(1 \otimes G))) \\
&= \varepsilon_{F \otimes 1}(s(\delta(F \otimes 1))) \cdot \varepsilon_{1 \otimes G}(s(\delta(1 \otimes G))) \\
&= \sigma_{F \otimes 1}(s) \cdot \sigma_{1 \otimes G}(s)
\end{aligned}$$

16 If we took the set of all subjunctive states, i.e. all functions satisfying (α), then (β) would be an immediate consequence. It requires some consideration, however, to obtain the same result for the smaller state space submitted to (6a). It may be recalled that (γ) is automatically satisfied in the quantum mechanical case.
17 Not having required σ-completeness for $L'$ we do not require σ-additivity for the probability measures in $P'$.
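The two halves of this argument can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: the two subsystems are taken to be spin-1/2 particles in the singlet state, the four quantities are spin components in one plane at hypothetical angles, and all function names are invented. It first confirms that products of functions with values ±1 on a classical space never push the left side of Bell's inequality above 2, and then evaluates the same expression with quantum mechanical expectation values.

```python
import itertools
import math


def chsh(e_fg, e_fg2, e_f2g, e_f2g2):
    """Left side of Bell's inequality: |E(FG) - E(FG')| + |E(F'G) + E(F'G')|."""
    return abs(e_fg - e_fg2) + abs(e_f2g + e_f2g2)


def classical_maximum():
    """Largest value over all deterministic assignments of +1/-1 to F, F', G, G'.

    Each assignment plays the role of one hidden state in which the value of
    a product is the product of the values; mixtures cannot exceed this maximum.
    """
    return max(chsh(f * g, f * g2, f2 * g, f2 * g2)
               for f, f2, g, g2 in itertools.product((-1, 1), repeat=4))


def spin(theta):
    """Spin component at angle theta in the x-z plane (eigenvalues +1 and -1)."""
    return [[math.cos(theta), math.sin(theta)],
            [math.sin(theta), -math.cos(theta)]]


def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


# Singlet state (|01> - |10>) / sqrt(2), an assumed example state.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]


def quantum_expectation(f, g):
    """Expectation value of the product operator F (x) G in the singlet state."""
    m = kron(f, g)
    return sum(SINGLET[r] * m[r][c] * SINGLET[c]
               for r in range(4) for c in range(4))


def quantum_chsh():
    """Same expression with quantum expectation values at assumed angles."""
    f, f2 = spin(0.0), spin(math.pi / 2)
    g, g2 = spin(math.pi / 4), spin(3 * math.pi / 4)
    return chsh(quantum_expectation(f, g), quantum_expectation(f, g2),
                quantum_expectation(f2, g), quantum_expectation(f2, g2))
```

With these choices `classical_maximum()` stays at the bound 2, while `quantum_chsh()` comes out at about 2.83, so the quantum mechanical expectation values cannot all be reproduced by a product of bounded functions on one classical probability space.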
The remarkable thing about Theorem 3 is that in the Bell case we come quite close to constructing the probability measures required by (8). First, it was shown by Theorem 2 that there are non-probabilistic Bell theories of hidden variables. Moreover, on account of axiom (β) and the invariance condition (1) for probabilities of the original theory we get unique probabilities

$$p'(i_A(a)) =: p(a) \tag{14}$$
for all counterparts of subjunctive conditionals in $L'$. It may even be the case that for some probability measures on $L$ an extension to $L'$ is possible. It has been proven that at least Bell's inequality would be no obstacle to this in some cases.¹⁸ But in others it is: There are probability measures on $L$ that cannot be extended in the sense of (δ) because they do not satisfy the inequality. It remains to see that the result of Theorem 3 is specific in the sense that the additional assumption of locality cannot be dismissed. This is shown by our last

Theorem 4: Quantum mechanics admits of (non-local) probabilistic theories of hidden variables.

To prove this we take $S$ in (8) to be the set of all subjunctive states, i.e. all functions satisfying (α). Then (β) follows in the strong sense that no identifications of subjunctive conditionals occur, i.e. from the premise of (β) not only $a = b$ but also $A = B$ follows. As we already know (γ) is vacuously true for quantum mechanics. To obtain (δ) we remark that for the state space $S$ chosen $L'$ turns out to be the boolean product of the indexed set $\{L_A\}_{A \in Ob}$ of boolean lattices $L_A$ (or even the field product if the $L_A$ are identified with fields of subsets of $A$). This is immediately seen by remarking that $S$ is simply the Cartesian product of all observables $A \in Ob$. If now a quantum mechanical probability measure $p$ on $L$ is given it induces a probability measure on $L_A$ for every $A \in Ob$. To satisfy (δ) for this $p$ we take as $p'$ on $L'$ simply the product measure, i.e. the uniquely determined probability measure $p'$ on $L'$ for which

$$p'\Bigl(\bigcap_{\mu} i_{A_\mu}(a_\mu)\Bigr) = \prod_{\mu} p(a_\mu)$$

for every $A_\mu \in Ob$ and $a_\mu \in A_\mu$.¹⁹ This completes the proof of Theorem 4.²⁰

18 Mixtures of product states $\varphi \otimes \psi$ are cases in point, cf. Selleri and Tarozzi 1981, §5.
19 Cf. Sikorski 1969, §13.
20 In the discussion Prof. Suppes raised doubts as to the validity of Theorem 4, this theorem being at variance with his 'Corollary on Hidden Variables' in Suppes and Zanotti 1981, p. 198. According to the Corollary a pair of Bell-type inequalities concerning three random variables is a necessary (and sufficient) condition for the existence of 'a hidden variable ... with respect to which the three given random variables are conditionally independent.' I do not think that the literal incompatibility of this result with Theorem 4 is also one in fact. Suppes' approach to the problem of hidden variables is quite different from the present one. Indeed, it is quite different from any other conception of hidden variables that I know of. This suggests a thoroughgoing comparison that I shall entertain in the near future, and I am sure that this will clear up also the seeming incompatibility in question. For the moment I only want to point out that no theory of hidden variables in my sense would satisfy equation (12), and it is for this reason that Bell-type inequalities are of no harm: they simply cannot be transmitted to the given quantum mechanical theory.
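The product-measure construction used in the proof of Theorem 4 can be followed on a finite toy case. What follows is a minimal sketch, assuming a single spin-1/2 system with only two observables and a hypothetical pure state with amplitudes (cos(t/2), sin(t/2)); the names `born_probabilities`, `product_measure`, `marginal` and `delta_holds` are invented for this illustration. The state space is the Cartesian product of the observables, each hidden state receives the product of the quantum probabilities of its components, and the check confirms that the measure of the set representing an outcome equals the quantum probability of that outcome.

```python
from itertools import product
from math import cos, sin, isclose, pi, prod


def born_probabilities(t):
    """Quantum probabilities for the assumed state (cos(t/2), sin(t/2)).

    Observable "A" is spin along z, observable "B" is spin along x.
    """
    return {
        "A": {"z+": cos(t / 2) ** 2, "z-": sin(t / 2) ** 2},
        "B": {"x+": (1 + sin(t)) / 2, "x-": (1 - sin(t)) / 2},
    }


def product_measure(p):
    """Weights of the product measure on S = Cartesian product of the observables."""
    names = sorted(p)
    states = list(product(*(p[n] for n in names)))  # one outcome per observable
    weights = {s: prod(p[n][a] for n, a in zip(names, s)) for s in states}
    return names, weights


def marginal(names, weights, name, a):
    """Measure of the set of hidden states s with s(A) = a, for an outcome a of A."""
    k = names.index(name)
    return sum(w for s, w in weights.items() if s[k] == a)


def delta_holds(t):
    """Check that the product measure reproduces every quantum probability p(a)."""
    p = born_probabilities(t)
    names, weights = product_measure(p)
    return all(isclose(marginal(names, weights, n, a), p[n][a], abs_tol=1e-12)
               for n in names for a in p[n])
```

The check succeeds for every value of `t`, because summing a product of probabilities over all other factors leaves the single factor `p(a)`. Nothing in this sketch imposes a locality condition on the hidden states.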
VI.27 The Copenhagen School and Its Opponents*

1 The Sins of the Physicists

In this contribution, I shall speak about the so-called Copenhagen interpretation of quantum mechanics and about some objections that have been raised against it. The title of this lecture expresses this topic in a somewhat personalized way, but it does so not merely in order to sound more interesting and perhaps attract a larger audience. The title also points to the fact that we are dealing with an emotional controversy about a fundamental theory of the new physics in which, besides factual arguments, a few poisonous arrows have been exchanged as well. Thus, for example, Rosenfeld, an unconditional partisan of the Copenhagen School, concludes a review of David Bohm's "Causality and Chance in Modern Physics" with the words: "That such irrational dogmatists should hurl the very accusation of irrationality and dogmatism at the defenders of the common sense, uncommitted attitude of other scientists is the crowning paradox which gives a touch of comedy to a controversy so distressingly pointless and untimely."¹ But the other side reaches for rhetorical weapons as well. John Bell, for example, exclaims in defense of de Broglie's theory of the guide wave: "why ... had Born not told me of this 'pilot wave'? ... why did people go on producing 'impossibility' proofs, after 1952, and as recently as 1978? When even Pauli, Rosenfeld, and Heisenberg, could produce no more devastating criticism of Bohm's version than to brand it as 'metaphysical' and 'ideological'? .... long may Louis de Broglie continue to inspire those who suspect that what is proved by impossibility proofs is lack of imagination."²

It will not surprise you to hear that the situation illustrated by such quotations has, in the meantime, also received a diagnosis from the sociology of science, although perhaps one would not have expected an author to characterize Bohm's action as an investment strategy with which he allegedly gathered social capital before his "first strike" against the Copenhagen view so that with this cushion in his back he could then pass over to a high risk strategy of subversion.³ Whatever may be the case with regard to this psychosocial aspect of the matter, inasmuch as we are also dealing with a scientific controversy, one will expect that at this level too certain tensions can be diagnosed and that through their resolution one can even learn something about the subject itself. And since in truth this is much more interesting than those ubiquitous quarrels, in what follows, I want to limit myself completely to what one could call the science-theoretic side of the matter. Remaining within the framework of my introduction, I want to make clear right away what I mean by these tensions internal to science. In their attempts

* First published as Scheibe 1990a, translated by Hans-Jakob Wilhelm
1 Bohm 1957; Rosenfeld 1958
2 Bell 1987; here pp. 160 and 167
3 Pinch 1977
to formulate the general content of quantum mechanics, the representatives of the Copenhagen School often used formulations with which they do not merely say how things are in their opinion, but beyond that, they say that things must be thus and so. And they did this - mind you - not by first delivering a simple statement of the content in order then separately to add that things were necessarily as stated. Rather, they said both in one and the same proposition - in the same breath, as it were. They chose formulations for the mere communication of an item in which at the same time the inevitability of what is communicated is asserted. Thus Bohr, for example, in order to communicate the idea that a quantum phenomenon contains besides the object also an experimental design, likes to say things like: "... there can be no question of any unambiguous interpretation of the symbols of quantum mechanics other than that embodied in the well-known rules which allow to predict the results to be obtained by a given experimental arrangement ...."⁴ And in order to communicate the idea that the experimental design is described classically, Bohr says, for example: "However far the phenomena transcend the scope of classical physical explanation, the account of all evidence must be expressed in classical terms ...."⁵ Please note that the presentation of these quotations is not supposed to show that Bohr did not present good reasons for his requirements or that he did not have any. I am merely concerned to demonstrate a somewhat unfortunate amalgamation of a simple statement of content with a modal paraphrase of the same. According to our usual understanding, the assertion of the necessity of a proposition adds nothing to its content. Yet, one could also use an expression of necessity in order to point to those contents which are responsible for the necessity in the previously intended, usual sense.
And it seems to me that such a use can indeed be found in the early presentations of the Copenhagen view. This is extremely misleading, however, and it takes on the burden of the claim that here finally the physical theory has been found which proves its own necessity. Bohr's constant contamination of communication and justification, which gives his works an imploring tone from which one cannot escape, may in the end be explained as a matter of style, although perhaps it cannot be justified. Things get worse when the matter gets into the hands of re-interpreters, where one has no special reasons to read another motivation into their choice of expression. As an example, I want to cite a work by Ballentine which, by the way, is meritorious in many respects.⁶ Fully aware of a whole spectrum of interpretations of quantum mechanics, the author wishes, above all, to distinguish two main classes. On the one hand, we have (I) the statistical interpretation according to which a pure state ... leads to a description of an ensemble of systems prepared in an equal manner ...

4 Bohr 1935, p. 701; emphasis mine
5 Bohr 1958, p. 39; emphasis mine
6 Ballentine 1970, p. 360
and distinct from this (II) interpretations which claim that a pure state furnishes a complete and exhaustive description of a single system ... Now, why is the second manner of interpretation - to which, according to Ballentine, the Copenhagen reading belongs - so different from the first? The two certainly differ in that the ψ-function states probabilities about the results of measurements which in the first case refer to a statistical totality and in the second case to an individual system. In no way, however, is this the reason why the opposition leaves such an indelible impression on us. This impression only arises because of the fact that in (II), unlike in (I), we are not simply dealing with a description, but rather with a complete and exhaustive description. It sounds as though this is an essential difference between two interpretations of quantum mechanics: one is complete, the other is not. But this is out of the question: According to the usual understanding, to say that a description is complete simply means that as a description nothing more can be added to it. This completeness is thus not a part, but rather a property of a description, and it would not add one jot to it even if this description were incomplete. The proposition with which we articulate the completeness of a quantum-mechanical description is itself not a proposition of quantum mechanics at all. The logical situation is thus completely analogous to the one identified earlier for Bohr's statements, except that in the present case there is the additional disfigurement of the Copenhagen interpretation in that a fundamental difference to the so-called statistical interpretation is feigned through a basic category mistake. Thus, we see already from a few textual examples that there is reason enough to distinguish clearly between the attempt at a clean formulation of a theory, on the one hand, and the possibility of substituting one theory through another (e.g.
more complete) theory, or similar meta-theoretical questions, on the other hand. If this is not distinguished and if, as in the quotations, these things are run together, then the mentioned tensions arise and lead, just because readers are usually not fully conscious of them, to those endless misunderstandings which fill the discussion around the foundations of quantum mechanics. In what follows, I understand the Copenhagen interpretation of the quantum mechanical formalism as the attempt at a formulation of a theory and I distinguish from this, as far as possible, a Copenhagen philosophy which goes beyond the mere formulation of the theory. Both - the Copenhagen version of quantum theory and the philosophy behind it - have been widely criticized, and this criticism has partly grown into suggestions of theories in competition with the Copenhagen version of quantum theory or at least into attempts at such theories. Only at this stage do we then also encounter meta-theoretical propositions, in our case: propositions dealing with the comparison of theories. This, at least, is how I want to reconstruct the opposition in question from the perspective of the theory of science, without thereby making a claim to historical accuracy. After a recapitulation of the
Copenhagen interpretation, I then want to limit myself to the context of a discussion which for the moment I only want to indicate by the names of David Bohm and Johann von Neumann. According to my impression, the opposition indicated by these names is one of the few, through the analysis of which, one can also learn something about quantum theory itself and the understanding of reality that it achieves. And more than that - to add my personal opinion - one cannot expect anyway.
2 The Copenhagen Interpretation

In the interpretation worked out by Bohr, Heisenberg, and Pauli,⁷ quantum mechanics breaks with certain principles of classical physics. In classical, pre-quantum-theoretical physics, a physical system is described 1) observation free and 2) probability free.⁸ Turned around positively, the description occurs by means of - as I want to put it - ontic propositions with which we ascribe or deny properties to the system. Thus, for example, we would say of a particle system that its particle $i$ at time $t$ is or is not at the location $x$, or we would say of a field that at time $t$ and location $x$, it does or does not have the field strength $\varphi$. Such ontic propositions are observation free in the sense that they neither say anything about how we procure the system for the purpose of its observation nor about what measurements we have made or still intend to make. Rather, we are dealing exclusively with the system itself. A description composed of ontic propositions does not include any probabilities either. These propositions neither say how probable it is that the system has a certain property nor that the system with certainty has this property, but merely that it has this property. Thus, as observation free, descriptions of systems in classical physics are at the same time objectified and, as probability free, at the same time determined. This strategy of description, of course, does not entail a denial either of the fundamental importance of the possibility of measurements or the fact that with respect to a system we may have such a deficient state of knowledge that we can only represent it in propositions of probability. Indeed, without changing its content, we can easily give an ontically formulated physical theory an epistemic formulation by replacing the propositions that our system has these and these properties with propositions stating that these properties have been ascertained in the system by means of measurement or, at least, would be ascertained, if a suitable measurement were made. In an analogous way, we can replace the categorical ascription of properties with a statement of probabilities for their occurrence, or for their establishment in case of a measurement, and would thereby again not modify the content of the relevant

7 Concerning the contributions by Bohr and Heisenberg, see Scheibe 1989b; the contribution by Pauli is appreciated in detail in Laurikainen 1988
8 For the following see Scheibe 1964
theory about the system, but rather merely express our epistemic relation to the latter. According to a defensible, although not generally accepted, view, this is precisely what we do in (classical) statistical mechanics. These are thus possibilities of the re-formulation of classical theories with an equivalent content. Yet, if we had pointed this out to a physicist of the classical era, his reaction would have probably only been: Certainly, one can do all that. But one can also leave things as they are. And as long as we are concerned with the fundamental question regarding the nature of physics, the latter alternative is by far the more appropriate one. In contrast with this classical situation, the Copenhagen message for quantum theory is precisely that in this case an interpretation appropriate to the formalism must refer to the experimental design and that probabilities are required for the description of state. Here we are not dealing with possibilities which one can seize or leave as one pleases. In the introduction we have already heard this from Bohr⁹ with reference to experimental design, and Pauli, for example, expresses the same idea with reference to probabilities when he says: "The statistical behavior of the many like individual systems [of an ensemble] ... is regarded in quantum mechanics as the final irreducible fact of lawfulness."¹⁰ But not only do we repeatedly have such statements which directly express the definite abandonment of the two classical principles. We also have what one could call the Copenhagen philosophy (in distinction from a Copenhagen interpretation abstracted from it), and in typical rationalist manner, this is larded with concepts which contain modal connotations and as such likewise constantly imply the abandonment of the said principles. This starts with Bohr's quantum postulate¹¹ which expresses, in the most prominent place, the characterization of a quantum phenomenon as a new kind of wholeness of object and experimental design. In a typical formulation, Bohr states that "The essential wholeness of a proper quantum phenomenon finds indeed logical expression in the circumstance that any attempt at its well-defined subdivision would require a change of the experimental arrangement incompatible with the appearance of the phenomenon itself."¹² This remark, the modal character of which is obvious in words such as "attempt", "require", and "incompatible", gives rise to two lines of thought. At first, these lines are still united in an explication of the mentioned indivisibility of a quantum phenomenon by the fact that the interaction of an object with an apparatus for the purpose of its preparation or measurement within the order of magnitude of Planck's quantum of action is no longer controllable. "The element of wholeness - says Bohr -, symbolized by the quantum of action and completely foreign to classical physical principles has ... the consequence that in the study of quantum processes any experimental inquiry implies an

9 See note 4
10 Laurikainen 1988, p. 161
11 Bohr's view is treated in Scheibe 1973c, Ch. I
12 Bohr 1958, p. 72
interaction between the atomic object and the measuring tools which ... evades a separate account if the experiment is to serve its purpose ..."¹³ Thus, the unverifiability of the interaction merges, so to speak, object and apparatus into a new indissoluble unity. From this point on, the train of thought branches out. One line now leads directly to indeterminism: "According to quantum theory - says Bohr whom I continue to cite in order to illustrate his peculiar mode of expression - just the impossibility of neglecting the interaction with the agency of measurement means that every observation introduces a new uncontrollable element. Indeed, it follows ... that the measurement of the positional co-ordinates of a particle ... means a complete rupture in the causal description of its dynamical behavior, while the determination of its momentum always implies a gap in the knowledge of its spatial propagation."¹⁴ In light of this situation, one should consider it fortunate that this indeterminism can at least be described by means of probabilities which are then, of course, irreducible. This indivisibility of a quantum phenomenon, however, also has the other consequence that we are able to witness the totality of the properties of a quantum object only in phenomena which are mutually complementary, i.e. in particular mutually exclusive. This has just been stated already for the measurement of position and momentum, and quite generally Bohr writes: "the renunciation in each experimental arrangement of the one or the other of two aspects of the description of physical phenomena - the combination of which characterizes the method of classical physics ... - depends essentially on the impossibility, in the field of quantum theory, of accurately controlling the reaction of the object on the measuring instruments ...".¹⁵ Obvious examples of complementary phenomena are the various realizations of the so-called wave-particle dualism, and these are at the same time supposed to show that complementarity, as a positive counterpart to the classical ideal of observation-free objectification, through the unification of the wave and particle images in one theory, also makes something possible which from a classical perspective seemed impossible. So much for the sketch of Bohr's holism, a holism which is essentially characterized by the concept of complementarity and in the formulation of which Bohr constantly expresses the unavoidability of a break with the classical principles of the objectification and the determination of events. And in light of the significance of the issue, this connotation was a very natural process. For should one have introduced measuring apparatuses only to declare in the next moment that this is not essential? We don't want to be taken for fools! And yet we must now take to heart the science-theoretic principle already stated in the introduction and ask ourselves: What becomes of this holistic philosophy when we project it onto quantum theory as a physical theory?
¹³ Bohr 1963, p. 60
¹⁴ Bohr 1934, p. 68
¹⁵ Bohr 1935, p. 699
Do we perhaps expect that propositions expressing necessity or impossibility become, as propositions of the theory itself, inevitable? This is hardly possible. It could very well be the case, however, that we find (non-modal) propositions in the theory which could assume a key role in proofs to the effect that substitute theories for quantum theory which again follow certain classical ideas do not exist. Bohr's mode of expression could then be justified as a paraphrase of such theoretical propositions with regard to the role they play in proofs of impossibility. In order to find such propositions, let us remember that, according to the Copenhagen interpretation¹⁶, a quantum phenomenon consists of an object to be described in terms of quantum theory and an experimental design to be described in terms of classical theory. By means of this design, the initial state $\psi_0$ of the object is first prepared. This state then develops, according to the Schrödinger equation (1a) (where $H$ is the Hamilton operator), in order finally to be subjected to the measurement of an observable $A$ by means of which one can (statistically) verify the predicted probability for a result in $a$ (1b) (where $P^A_a$ is the spectral decomposition of the operator representing $A$) in the state $\psi_1$. The two formulae (1) form the core of quantum theory to the extent to which it is of interest to the physicist. Consequences of the theory that are of fundamental interest are the well-known Heisenberg relations of indeterminacy
$$\Delta q \cdot \Delta p \;\geq\; \frac{\hbar}{2} \tag{2}$$

They imply that in every state $\psi$ of the object, the standard deviations of position and momentum cannot both be made arbitrarily small. This proposition limiting the preparability of a state unites in an obvious way the aspects of complementarity and indeterminism. By contrast, the proposition of the so-called reduction of state, which does not yet follow from (1), illuminates most clearly only the indeterminism: If the object, after a measurement (of the observable $A$), continues to remain available, then the measurement can be used for the preparation of the new state

$$\frac{P^A_a\psi_1}{\lVert P^A_a\psi_1\rVert} \tag{3}$$

where $a$ represents the result of measurement. The reduction of state illustrates the quantum-theoretical indeterminism not only through the fact
¹⁶ Compare the detailed presentation in Heisenberg 1959a, pp. 27 ff
that the new state is known before the measurement only with the probability (1b). The real anomaly (from a classical perspective) consists in the fact that the measurement always brings with it - besides the gain in information brought about through its result - also a loss of information, since all states are (in an intra-theoretical sense) maximal. Besides indeterminism, however, one can also see the aspect of complementarity in its pure form in quantum theory itself. It is found in the pre-probabilistic theory of observables¹⁷, that is, of the totality of the quantities of an object which the theory holds to be measurable. The central point of this issue, which will be especially important in what follows, is that the formulation of quantum mechanics soon revealed that for the first time in the history of physics, physical quantities - in this case the quantum-mechanical observables - were represented in a non-commutative domain of calculation. Moreover, it became obvious that the multiplicative structure of this domain of calculation could be used to express the phenomenon of the non-simultaneous measurability of two observables: commuting operators represent simultaneously measurable observables, while non-commuting operators represent observables that are not simultaneously measurable. Although we are thus approaching a physical interpretation, we are now already encountering once again modal formulations: non-simultaneous measurability surely means as much as the impossibility of a common measurement of the relevant observables. But what does "impossible" mean in this context? Have we now arrived at the point where we must finally surrender? The answer to this question provides a nice lesson about the value of internal theory analyses.
For a more detailed analysis shows that we can define the impossibility in question intra-theoretically, as a non-existence: The measurement of an observable leads to a decision regarding an alternative of properties of the system under investigation. An alternative is thus a totality of possible properties such that in a measurement exactly one is obtained as a result. In the famous de Broglie paradox, through the closing of a shutter, the alternative is decided, whether an electron was in one or the other of the two parts of the box separated by the shutter. Now, the most important relation between the alternatives of a quantum-theoretical system is that one alternative is more fine-grained than the other. An alternative A is more fine-grained than an alternative B, if every property of A has a property of B as a consequence. By means of two additional shutters, de Broglie's box can obviously be developed into a more fine-grained alternative than the one just considered. Through the measurement of more fine-grained alternatives, more coarse-grained alternatives are, so to speak, measured along with it, and it now makes sense to declare two alternatives to be commensurable, if they have a common refinement. The possibility of a common measurement of A and B then simply means that the theory provides for an alternative,
¹⁷ This theory and its consequences are treated in detail in many articles of the collection Hooker 1975 and 1979
the decision of which at the same time brings a decision regarding A and B along with it. The fundamental situation for common quantum mechanics, however, is that there are not only incommensurable alternatives (and thus observables), but rather that

for every alternative, there is a corresponding incommensurable alternative. (4)
The impossibility entering at this point, however, again only states that the catalogue of properties of the theory provides, for example, for one entry for the possible measurement of a coordinate of position or of momentum in this or in that interval, but that it contains no entry which would correspond to the measurement of the position being here and the momentum there.
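The three traits just discussed can be illustrated on the smallest possible example. The following is a minimal sketch, assuming a spin-1/2 system with the two alternatives "spin up or down along z" and "spin up or down along x"; the system, the helper names and the chosen state are illustrative assumptions and are not taken from the text. It computes a probability in the usual textbook form ‖Pψ‖², which is presumably what (1b) denotes, forms the reduced state as in (3), and checks that the projectors belonging to the two alternatives do not commute.

```python
import math

# States are lists of two complex amplitudes; operators are 2x2 nested lists.

def apply(op, vec):
    """Apply a 2x2 operator to a 2-component state vector."""
    return [sum(op[i][j] * vec[j] for j in range(2)) for i in range(2)]

def norm(vec):
    """Hilbert-space norm of a state vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in vec))

def matmul(a, b):
    """Product of two 2x2 operators."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def probability(projector, state):
    """Textbook probability ||P psi||^2 of the property represented by P."""
    return norm(apply(projector, state)) ** 2

def reduce_state(projector, state):
    """Reduced state P psi / ||P psi||, cf. (3)."""
    projected = apply(projector, state)
    n = norm(projected)
    return [c / n for c in projected]

# Projectors onto "spin up along z" and "spin up along x" (illustrative choice).
P_Z_UP = [[1, 0], [0, 0]]
P_X_UP = [[0.5, 0.5], [0.5, 0.5]]

psi = [1, 0]  # state prepared as "spin up along z"

p_x_up = probability(P_X_UP, psi)              # outcome of the x-measurement is open
psi_after = reduce_state(P_X_UP, psi)          # state prepared by the x-measurement
p_z_up_after = probability(P_Z_UP, psi_after)  # the earlier z-information is lost

zx = matmul(P_Z_UP, P_X_UP)
xz = matmul(P_X_UP, P_Z_UP)
commute = all(abs(zx[i][j] - xz[i][j]) < 1e-12
              for i in range(2) for j in range(2))

print(p_x_up, p_z_up_after, commute)
```

In this toy case the state that was sharp with respect to the z-alternative yields an undetermined result for the x-alternative, and after the reduction it is no longer sharp with respect to z, which mirrors the "loss of information" mentioned above.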
3 Von Neumann's Proof and Bohm's Theory

With the theory of quantum-mechanical observables, the reduction of state, and the Heisenberg relations of indeterminacy, we have highlighted three basic traits of quantum mechanics which constitute extreme deviations from the classical way of thinking ruled by the principles of objectification and the determinacy of natural events. Indeed, we are dealing here with peculiarities which have not even found a unanimous philosophical interpretation within the Copenhagen School: While Bohr, as indicated, emphasized the new totality of a quantum phenomenon, Heisenberg attempted to conceive the indeterminacy of events between preparation and measurement along the lines of an Aristotelian potentiality¹⁸, and Pauli sought - as did von Neumann and Wigner - to connect the completion of a measurement essentially with human consciousness.¹⁹ For our purposes it is important that we have here a fairly clearly defined stock of propositions of quantum theory which, without having itself a modal character, could occupy a key position in the meta-theoretical question regarding the possibility or impossibility of an "explanation" of quantum mechanics by means of a theory which is again set up on classical ground. There are scores of attempts to prove, or at least make plausible, such a possibility or impossibility. Nevertheless, there is to date no general investigation concerning the principles on the basis of which these proofs are attempted.²⁰ We shall at least want to gain an approximate understanding of what kind of adventures we are dealing with here, before we tackle concrete cases. At first glance, it seems to be a more difficult task to defend an irreducibility of quantum mechanics than to attack it. For the defender would
¹⁸ See no. 15 and Heisenberg 1959b, p. 140
¹⁹ See Laurikainen 1988, pp. 57ff, 144f., 176f
²⁰ The following presentations may be mentioned: Bell 1966; Capasso/Fortunato/Selleri 1970; Belinfante 1973; Jammer 1974, Ch. 7
have to prove that no conceivable classical theory could in any conceivable way explain quantum theory. We would thus be dealing with the proof of a proposition of almost fantastic indeterminacy, whereas the counter-proof would simply consist in the presentation of one classical theory which, in a manner to be demonstrated as well, furnishes the explanation. To this extent, of course, the asymmetry is simply the purely logical difference between the denial of an existential proposition in contrast with its assertion. In any more concrete case, things can be the other way around, and the existential proof can pose the greater difficulties. Whatever the case, proof and counter-proof must minimally rest on the following three preliminary conceptual clarifications:

I) It must be roughly clear, when a theory is called "classical", that is, for example, when it fulfills the two principles of objectification and determinacy.
II) Within the stock of propositions to be explained, quantum theory must be sufficiently delimited.
III) The question must be (roughly) settled, when we may regard a relation between the theories in question as an explanation of one through the other.

Every one of these requirements has its problems, of which requirement III presents perhaps the greatest. The concept of an explanation can be taken in a very narrow, but also in a very wide sense. Should we, for example, require that all concepts or axioms of quantum theory be defined or proven from those of the classical theory, in order to be able to speak of an explanation? In that case, woe to the opponents of quantum theory! As another extreme case, one could imagine a concept of explanation which only requires that the relevant classical theory allows for the reproduction of all the empirical achievements of quantum theory. In that case, this classical theory could even contradict quantum theory outside of its domain of application and thus represent a genuine alternative to it.
Down with the dogmatism of the Copenhagen physicists! Indeed, so many questions are lurking here, which are all expressions of the mentioned indeterminacy of the problem of irreducibility, that the citing of a few examples is almost misleading. In order to judge this situation, we must, for the time being, rely on the experience which the theory of science has gathered, even independently of the present case, in matters of the concept and explanation of a theory. And as we do this, we shall have to begin by admitting that no generally satisfying classical explanation of quantum theory in its Copenhagen version has been delivered as yet: One or the other of the three requirements I-III is violated, and, usually, it is all three. Quantum-theoretical elements are introduced in an ad hoc fashion into the explaining theory, the claim of explanation remains unclear because decisive parts of quantum theory are passed over in silence, and the explanation itself most often does not proceed according to a discernible overall concept. The
proofs of impossibility, by contrast, reveal a certain superiority - I do not mean in terms of their relevance or their scope, but in terms of clarity and stringency. This at first glance seemingly opposite judgment, however, is also easy to explain. A proof of impossibility can argue that, for the purpose of the sought-after explanation, this part of a classical theory would have to relate with that part of quantum theory in such and such a way. On the basis of this partial definition, it is then shown that this is already impossible. The existential proof, by contrast, necessarily deals with the entire theories and also with their relation as a whole. Thus, I make this judgment while acknowledging the difficulties involved, and in particular, I am conscious of the fact that the protagonists of an anti-Copenhagen atomic mechanics had more immediate worries than that of fulfilling the standards of the theory of science. But these difficulties existed for the pioneers of quantum mechanics as well, and it must be permitted, already for the sake of what has been achieved, to measure the theoretical products of physics, from time to time, by the standards of the theory of science. Now we shall want to take a closer look at the situation in what is probably the most impressive case of a confrontation between the two positions. On the one hand, we have the tradition of proofs, beginning with the so-called von Neumann proof, of the impossibility of - as it is called since von Neumann - a theory of hidden parameters.²¹ The aim in this tradition was to extend the scope of the original proof further and further, that is, to exclude more and more possibilities of theories of hidden parameters.
On the other hand, we have the life's work of David Bohm who, since the beginning of the 1950s, proceeded to attack the Copenhagen version of quantum theory (which had, by now, become an orthodoxy) and who, through the attempt at setting up a classical theory of hidden parameters, fought in particular against the tradition following von Neumann. Bohm's main motives for his attack were to break the Copenhagen monopoly of interpretation through the demonstration of alternatives and to explain the anomalies of quantum theories on the basis of a physics that was again classically and, in particular, deterministically oriented. At first, the intention of also making progress in physics in this manner remained in the background. In the course of his efforts, Bohm clearly underwent a development. In the mid-1960s, he abandoned an initial theory of hidden parameters²² in favor of another theory,²³ only to return, in the mid-1970s, to the first theory²⁴. The middle period is also the time of the greatest convergence towards Bohr. Indeed, one would have to say that really only the first theory attempts to re-install both of our guiding principles. The second theory was probably more the attempt
²¹ An overview of this tradition is given in Scheibe 1981a
²² Bohm 1952; Bohm 1953; Bohm/Vigier 1954; this period together with related attempts is presented in Freistadt 1957
²³ Bohm/Bub 1966b; Bub 1968; Bub 1969
²⁴ Bohm/Hiley 1975; Bohm/Hiley 1984; Bohm/Hiley/Kaloyerou 1987
to show that the Copenhagen conception of the role of the measuring instruments in the description of the object is compatible with a determinism through hidden parameters. The proximity to Bohr also becomes intelligible through the fact that Bohm's general views in natural philosophy always had a marked holistic tendency²⁵. In recent times, however, his holism is again more closely connected to the extreme non-local interaction which dominates his first theory of hidden parameters. For some more detailed remarks on Bohm's theories, I want to concentrate on those parts of the theories which are indicated with von Neumann's term of "hidden parameters". The probabilistic character of quantum theory makes plausible the idea, especially cultivated by Einstein, that the quantum-theoretical description of an object is incomplete²⁶. We know already from the introduction that this idea is not an element of quantum theory itself. Rather, at first, it is very vaguely the idea that there is another classical theory which delivers a probability-free and to that extent more complete description of the object. Von Neumann now imagined²⁷ that this would occur in such a way that in this theory there would be in addition to the quantum-mechanical $\psi$-function so-called hidden parameters $\lambda$ which would end the indeterminacy still left by $\psi$ and together with $\psi$ determine the respective (probability-free) state $s$ of the object:

$$(\psi, \lambda) \mapsto s \tag{5}$$
One of the main problems of a theory of hidden parameters is the question, how the new objective states $s$ relate to the observables of quantum mechanics. Von Neumann as well as his successors thought that each of these states would have to decide each quantum-mechanical alternative and this in such a way that the respective results are independent of the alternative to which they belong. Thus, if we conceive $s$ as a function which assigns to each alternative exactly one of its properties,

$$s(A) \in A \tag{6a}$$

then

$$s(A) = s(B) \quad \text{for } s(A) \in B \text{ or } s(B) \in A \tag{6b}$$

would have to be valid as well. In the beginning of the 1980s, the von Neumann tradition of proof arrived at the conclusion in its full generality that such hidden states do not exist.²⁸ Von Neumann already remarked about his own somewhat more particular result that as far as it was concerned, he "did not have to go into the details of the mechanism of 'hidden parameters'."²⁹
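The non-existence claim can be made tangible in a small special case. The sketch below reproduces neither von Neumann's argument nor the general result cited in the text; as an illustrative assumption it uses the Peres-Mermin square, a later textbook arrangement of nine two-qubit observables with values ±1. In one common arrangement, the observables in each row and each column commute, the operator product along every row and along the first two columns is +1, and along the third column it is -1. A hidden state satisfying both (6a) and (6b) would have to give each observable one value independently of the row or column in which it is measured, and these values would have to respect all six product constraints. A brute-force search over the 2^9 candidate assignments counts how many do.

```python
from itertools import product

# Products that the assigned values must reproduce along the rows and
# columns of the square (they mirror the operator identities assumed above).
ROW_PRODUCTS = (+1, +1, +1)
COLUMN_PRODUCTS = (+1, +1, -1)

def is_consistent(values):
    """Check a context-independent assignment of +1/-1 to the 9 observables.

    `values` is a tuple of length 9, read row by row as a 3x3 square.
    """
    grid = [values[0:3], values[3:6], values[6:9]]
    rows_ok = all(
        grid[r][0] * grid[r][1] * grid[r][2] == ROW_PRODUCTS[r]
        for r in range(3)
    )
    cols_ok = all(
        grid[0][c] * grid[1][c] * grid[2][c] == COLUMN_PRODUCTS[c]
        for c in range(3)
    )
    return rows_ok and cols_ok

def count_hidden_states():
    """Count the assignments that satisfy all six product constraints."""
    return sum(1 for v in product((+1, -1), repeat=9) if is_consistent(v))

n_hidden_states = count_hidden_states()
print(n_hidden_states)
```

The reason no assignment survives is a parity argument: multiplying the three row constraints gives +1 for the product of all nine values, multiplying the three column constraints gives -1.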
26
27 28
29
Bohm 1980 Einstein/Podolsky/Rosen 1935 v. Neumann 1932 (1955), IIl.2 and IV.2 An overview of definite results is given in Kruszinsky 1984 See v. Neumann 1932, p. 171
Indeed, his result has the same degree of relevance even when one does not attempt to obtain the hidden states $s$ through a "mechanism" (5) at all. The result is typical for the selective character of proofs of impossibility mentioned earlier. In particular, the question regarding the explainability of quantum-mechanical probabilities does not come into play at all. For the states do not even exist of which this explanation could give a probabilistic evaluation. Their non-existence is based alone on the constitution of incommensurability (4) which the theory of quantum-mechanical observables ascribes to an object, and von Neumann's result simply states (even as a theoretical consequence) the impossibility of an ontic description of an object - an impossibility which, in a more intuitive way, the Copenhagen interpretation asserted from the beginning in its abandonment of objectification.

Now, how does Bohm deal with this situation? His first theory is a classical field-particle theory clearly formulated in its central aspects. The $\psi$-function is conceived as a real $\psi$-field, as I want to call it, which is defined in the configurational space of the particles and satisfies the Schrödinger equation. The particles move according to classical mechanics under the influence of the classical potential from the Schrödinger equation and the so-called quantum potential (7a) which originates from the $\psi$-field. In conjunction with the additional initial condition (7b), the grounds for which I shall skip, this is already the entire theory of hidden parameters. It is designed following von Neumann's scheme contained in (5), and the hidden parameters $\lambda$ are locations of particles. In the case of multiple particles, the $\psi$-field is, of course, not a field in space. But this does not take away from its reality, since it leads, in any case, to well-determined forces acting on the particles. In this respect, the theory resembles the Newtonian theory of gravitating mass points.
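The quantum potential and the initial condition are cited here only by their labels (7a) and (7b). In the standard presentation of Bohm's 1952 theory, and on the assumption that this is the form intended, one writes the ψ-field in polar form and obtains for a single particle of mass m the relations below. The symbols S, m, Q, V and x are notational assumptions; R is the amplitude that reappears in (7c).

```latex
\begin{align*}
\psi &= R\, e^{iS/\hbar}, \qquad R \ge 0,\ S \text{ real},\\
Q &= -\frac{\hbar^2}{2m}\,\frac{\nabla^2 R}{R}
   && \text{(quantum potential, cf. (7a))},\\
m\,\ddot{\mathbf{x}} &= -\nabla\,(V + Q)
   && \text{(classical motion under both potentials)},\\
m\,\dot{\mathbf{x}} &= \nabla S \quad (\text{for } t = 0)
   && \text{(initial condition, cf. (7b))}.
\end{align*}
```

For several particles the same expressions are summed over the particles, with R and S defined on configuration space, which is why the resulting force on one particle can depend on the positions of all the others.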
It gains its peculiar appeal through some exotic properties of the quantum potential. Like gravitation, the corresponding force is a force at a distance whose non-locality, unlike in the case of gravitation, is additionally underlined by the fact that it can also grow for great distances between particles. In order to establish the link to quantum mechanics, a statistical mechanics is simply joined to the theory of hidden parameters, whereby, analogous to (7b), the initial condition

$$\rho = R^2 \quad (\text{for } t = 0) \tag{7c}$$

is assumed in an ad hoc manner for the distribution of locations (!) $\rho$. Thus, the usual classical mechanical states are determined through the core theory
via (5), and the question of what it means that a classical mechanical particle quantity has this and this value with this and this probability is settled by means of the probabilistic extension (7c). Up to this point, the theory may be peculiar, but it is clearly formulated in its concepts and propositions. It should also be mentioned right away that it has achieved a few impressive explanatory successes. The explanation of the hole experiments³⁰ should be mentioned in the first instance. Today, computer images show in an impressive manner how the quantum potential of a suitable $\psi$-field guides the particles passing through the holes on well-defined paths to precisely those locations on the collector screen which are possible locations of impact due to the simultaneously occurring interference of the partial waves of the $\psi$-field. Complementarity in Bohr's sense follows as little from these experiments as any general idea has ever followed from our individual experiences. Matters do not look as favorable for Bohm's theory when we consider how it fulfills the task of a general explanation. First, there is the fact that the probabilities calculated according to Bohm's theory do not generally agree with the quantum-theoretical probabilities. For the positions we have agreement because of (7c). But the calculation of the momentum distribution in the ground state of the hydrogen atom, for example, always yields the momentum 0, according to Bohm. And this contradicts not only the quantum-mechanical distribution, but also the relations of indeterminacy. Bohm's way out of this difficulty is to beat the opponent at his own game, as it were. He argues that his probabilities concern the objective existence of a quantitative value, while the quantum-mechanical probabilities are probabilities of finding this and this result when measuring a quantity - i.e. just as Bohr has always proclaimed.
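The hydrogen example can be checked directly. The following sketch assumes the standard form of the quantum potential, Q = -(ħ²/2m) ∇²R/R for ψ = R e^{iS/ħ}, atomic units (ħ = m = e = 1), and the textbook ground-state amplitude R(r) = e^{-r}; none of these formulae is spelled out in the text above, and the function names are illustrative. Since the ground-state wave function is real, S is constant and the particle momentum ∇S vanishes. The code checks numerically that Q + V is the same constant at every radius, namely the ground-state energy -1/2, so that the total force on the particle is zero and the electron stays at rest.

```python
import math

def amplitude(r):
    """Ground-state amplitude R(r) = exp(-r) in atomic units (normalisation omitted)."""
    return math.exp(-r)

def radial_laplacian(f, r, h=1e-4):
    """Laplacian of a spherically symmetric function: f'' + (2/r) f'.

    Both derivatives are approximated by central finite differences.
    """
    first = (f(r + h) - f(r - h)) / (2 * h)
    second = (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)
    return second + 2.0 * first / r

def quantum_potential(r):
    """Q = -(1/2) * laplacian(R) / R, with hbar = m = 1."""
    return -0.5 * radial_laplacian(amplitude, r) / amplitude(r)

def coulomb_potential(r):
    """Classical potential V = -1/r of the proton."""
    return -1.0 / r

radii = [0.5, 1.0, 2.0, 4.0]
totals = [quantum_potential(r) + coulomb_potential(r) for r in radii]
for r, total in zip(radii, totals):
    print(f"r = {r}: Q + V = {total:.6f}")
```

The quantum potential thus exactly compensates the Coulomb attraction in this state, which is the mechanism behind the vanishing "true" momentum that the text contrasts with the quantum-mechanical momentum distribution.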
Bohm must go beyond the Copenhagen interpretation, however, in his explanation of the fact that he has two probabilities where there always used to be only one. In the example just mentioned, this explanation states that we do not measure the "true" momentum - which here is always 0 - but rather a momentum which only arises through the measurement: the quantum-mechanical probability is distributed according to the shoves, so to speak, which the electron receives during the measurement. Thus, here we have exactly the view which the Copenhagen School at first also considered, but which in the end it categorically rejected: we measure a statistically disturbed momentum, and we hold a disturbing parameter in the measuring apparatus responsible for the distribution of the momentum - a parameter which at the time is not controllable, although in principle it is. And from this direction, we now get Bohm's answer to von Neumann: A theory of hidden parameters does not have to be constructed on states of the object which ascribe to all observables precise values. For the indeterminacy of the observables, with the exception of the particle locations, is to be found in the environment of the object, and not in the object itself. Thus, the von
³⁰ Philippidis et al. 1979
Neumann type of proof of impossibility is criticized, not for an error in its manner of inference, but through the rejection of one of its premises: that is, a rejection already of the assumption (6a). The troublesome aspect of Bohm's first theory remains, however, that it does not deliver a plausible explanation for the unitary symmetry of the totality of quantum-mechanical observables. If one turns a blind eye to the problem of quantization, one can say: For those observables corresponding to a particle quantity along the lines of Bohm's theory, it makes sense to say that in general they cannot be precisely (in the sense of the quantum of action) measured. For in this case, the theory states that this indeterminacy refers to the actual values of these quantities. But what about all the remaining observables which quantum mechanics typically introduces? In their case, it remains completely unclear what is measured - precisely or imprecisely - in the object at all. The theory does not provide them with any ontological basis in the object.³¹ Bohm's second theory makes a virtue of this necessity. The crucial idea for the first theory of a classical multi-particle mechanics on the basis of the quantum potential is abandoned in favor of the view that "the quantum 'observable' is no longer identified with any physical quantity or measurable property of the system in the usual (classical) sense .... Instead, each quantum observable is associated with a specific process of interaction between the system and a certain 'apparatus' ..."³². This holistic view, of not introducing a system into a theory in isolation, but only in its interaction with its environment, an environment which in the end can include the whole universe, finds its expression in the fact that the dynamic basic equation of the theory is no longer merely the Schrödinger equation.
Rather, the latter equation is modified by a non-linear (and in a certain sense non-local) additional term which, under certain circumstances, explicitly provides the dynamics of a quantum-mechanical reduction of state (3). The part of the new basic equation that provides the reduction also contains the hidden parameters and thereby, and in contradiction to the Copenhagen view, leads to a determined reduction of state. Furthermore, on the assumption that the hidden parameters are completely unknown, one obtains the correct quantum-mechanical probabilities (1b). In conclusion, we want to see that the hidden parameters introduced in Bohm's second theory - although contradicting the Copenhagen interpretation - nevertheless represent what is perhaps the greatest possible concession to that interpretation³³. The best way to see this is to ask, how, in this case, a von Neumann type proof of impossibility is circumvented. For, in this case,
³¹ Critical comments by members of the Copenhagen school are: Pauli 1955; Rosenfeld 1955; Heisenberg 1959a, pp. 119 ff; informative is also the controversy between Bohm and the Jauch school: Jauch/Piron 1963; Bohm/Bub 1966a; Jauch/Piron 1968; Gudder 1968; Bohm/Bub 1968
³² Bohm/Bub 1966b, p. 465
³³ For the following, see Scheibe 1986a (this vol. ch. VI.26)
the theory does not follow only the von Neumann approach (5) by means of which the hidden states are determined through the $\psi$-functions and the hidden parameters. The theory also follows - by way of the determined reduction of state and in distinction to the first theory - the approach (6a), according to which a hidden state potentially decides every alternative. Consequently, one could conceive the state as a totality of subjunctive conditionals which state in each case for one alternative, which of its properties would result, if the alternative were measured. Thus, in such a hidden state, no properties would either be ascribed or denied to the object itself, and it would not be claimed of any alternative that it is measured. These two issues remain completely in the balance, and only the connection between alternative and result of measurement is determined in case a measurement takes place. Such a state would leave the structure of incommensurability of the observables intact at the price that the other von Neumann premise (6b) is not also fulfilled. For such states just do not exist. With this, however, one buys into a considerable anomaly: Now it will not only be possible, but it will be the rule that a property which is a possible result of distinct alternatives A and B - and every property is such - is in a state the result of a measurement of A, but not of B. This is how the Copenhagen spirit takes revenge on states which get too close for comfort! The anomaly described can possibly be understood as a non-locality. If we have a system composed of two systems I and II, then quantum mechanics permits, in suitable cases, secure inferences from possible measuring results in I to such results in II. And such cases are compatible with the fact that we know that both systems are light-years apart from each other.
This would be an extreme non-locality, if, as the Copenhagen quantum mechanics wants to have it, the measuring result is only factual when a measurement is taken. Einstein, who preferred locality, concluded from this that 1) the properties in question are very well objectively present before any measurement and that, since the mentioned inferences from I to II are possible for incommensurable observables, 2) quantum mechanics is incomplete³⁴. If we now ask, whether a hidden state in the previously considered sense as such guarantees an objective presence of properties in Einstein's sense, the answer will be: no. For, if such a state describes a composite system I + II, then it is possible that the result of a measurement in I depends on whether a measurement was taken in II and what was thus measured. Thus, in order to satisfy Einstein's ideas, we shall have to exclude these possibilities. With such, as one might say, local hidden states, which surely exist, one can explain Einstein's predictive cases, which for him guaranteed reality: The correlations between observables in I and II occurring in a quantum-mechanical state $\phi$ of the total system are already given with every local hidden state $s$ compatible with $\phi$. If one attempts, however, to reconstruct quantum mechanics also probabilistically by means of local hidden states, it will become apparent, with reference to Bell's
³⁴ Cf. Einstein/Podolsky/Rosen 1935
inequality, that in this case the task is impossible. This seems to fit well with the mentioned non-locality of Bohm's theory. One must observe, however, that the non-locality always merely concerned the interactions introduced, while now we are concerned with hidden states. Since Bohm's second theory was only worked out for maximal alternatives, its states are neutral with respect to the dichotomy of locality and non-locality³⁵. But now we know: an extension to cover all alternatives will have to make use of non-local states.
³⁵ Bohm is clear on this, cf. Bohm/Bub 1966b, p. 467.
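The reference to Bell's inequality can be illustrated with its best-known variant. The sketch below uses the CHSH form of the inequality and the singlet-state correlation E(a, b) = -cos(a - b) with one standard choice of measuring angles; both are textbook assumptions rather than material from this text. A local hidden state fixes in advance the outcomes ±1 of two alternative measurements in I and of two in II, so the CHSH combination of any such state, and hence of any probabilistic mixture of them, is bounded by 2 in absolute value, while the quantum-mechanical value exceeds this bound.

```python
import math
from itertools import product

def chsh_local(a0, a1, b0, b1):
    """CHSH combination for one local hidden state with fixed outcomes +1/-1."""
    return a0 * b0 + a0 * b1 + a1 * b0 - a1 * b1

def max_local_chsh():
    """Largest |CHSH| over all 16 deterministic local hidden states."""
    return max(abs(chsh_local(*outcomes))
               for outcomes in product((+1, -1), repeat=4))

def singlet_correlation(angle_a, angle_b):
    """Quantum correlation of spin measurements on the singlet state."""
    return -math.cos(angle_a - angle_b)

def quantum_chsh():
    """|CHSH| for the singlet state at a standard choice of angles."""
    a0, a1 = 0.0, math.pi / 2
    b0, b1 = math.pi / 4, -math.pi / 4
    value = (singlet_correlation(a0, b0) + singlet_correlation(a0, b1)
             + singlet_correlation(a1, b0) - singlet_correlation(a1, b1))
    return abs(value)

local_bound = max_local_chsh()
quantum_value = quantum_chsh()
print(local_bound, quantum_value)
```

Because averaging over local hidden states can never raise the value above the bound attained by the individual states, no probability distribution over them reproduces the quantum-mechanical correlations at these angles.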
VI.28 J. von Neumann's and J. S. Bell's Theorem. A Comparison*

I
Johann von Neumann was the first to prove the impossibility of hidden parameters in quantum mechanics¹. Of course: only of a certain kind of hidden parameter. At first, J. von Neumann imagined that, in addition to the quantum-mechanical $\psi$-function, there could be further parameters $\lambda$ which would end the indeterminacy still left by $\psi$ and together with $\psi$ definitely determine the respective probability-free state $s$ of the object. He immediately commented on his actual proof, however, that in it, he "did not need to go into the details of the mechanism of the 'hidden parameters' at all". Indeed, he was concerned directly with the question of the existence of hidden states which he sought to obtain simply through the usual re-interpretation of dispersion-free, probabilistic states (see below). To this end, he set physically plausible requirements for the expectation value function in Hilbert space: He stipulated positive linear forms that were to be, in a certain sense, continuous. When it was revealed, however, that it was only the long familiar expectation value functions which satisfied this set of axioms, it followed in particular that there are no dispersion-free and thus also no hidden states. Now, there is always something precarious about considering things which do not exist. Thus, in order to give a clearer impression of the (hidden) states of the von Neumann type, we generalize the particular conditions in Hilbert space and consider, to a large extent, arbitrary ortho-complementary lattices $L$.² For the purposes of physics, however, it makes sense to limit oneself to $\sigma$-complete and separable $L$. In that case, there are in $L$ no more than a denumerable number of elements orthogonal in pairs. On the other hand, without further ado, one can form (even infinitely) denumerable conjunctions and disjunctions. We interpret the elements of $L$ as the totality of possible contingent properties of a physical system.
The two physically important special cases are: 1) for quantum mechanics, the lattice $L_q$ of the closed subspaces of a separable complex Hilbert space, 2) for classical mechanics, the lattice $L_c$ of the Borel sets of a phase space modulo sets of Lebesgue measure 0. In the classical case, we must form the mentioned equivalence classes in order to produce denumerable conditions. But this corresponds exactly to von Neumann's Hilbert space representation of quantum mechanics in which continuous observables such as position and momentum do not have sharp eigenvalues.
* First published as Scheibe 1991d, translated by Hans-Jakob Wilhelm.
¹ v. Neumann 1932 (1955), III.2 and IV.2
² For the following, see Scheibe 1986a (this vol. VI.26)
In an $L$, we now introduce the totality $A(L)$ of the alternatives, i.e. of the subsets of $L$ the elements of which are pairwise orthogonal and have the one-element of $L$ as disjunction. If we interpret the elements of $L$ as properties of a physical system, the alternatives attain the significance of measurable parts of the observables of the system - aside from their numerical values. After all, by means of the measurement of an observable, the decision is to be brought about regarding which of its values "is present". In the continuous case, the values now have to be replaced by intervals at any rate, in order to produce denumerable conditions. Whatever the case may be, the values (or the intervals) stand for contingent properties of which exactly one is ascertained by measurement as its result. This is just why the proper object of a measurement is an alternative in the sense defined. We now turn to von Neumann's concept of a state. As mentioned, this concept is to be obtained from the concept of probabilistic states. We want to take the liberty, however, of characterizing these states not, as von Neumann does, by means of expectation functions, but rather by means of probabilities, and that means in this case: through $\sigma$-additive normed measures on $L$. Among these then, the non-dispersive ones are the candidates for hidden states. That is to say, in the case of probabilities, they are simply the 2-valued ones. These would thus be characterized by
($\alpha$) functions $u : L \to \{0, 1\}$ such that $u(V) = 1$, $u(\Lambda) = 0$ and for every alternative $A \in A(L)$, there exists exactly one $a \in A$ with $u(a) = 1$ (thus with $u(b) = 0$ for all the remaining $b \in A$). It is obvious that the $\sigma$-additivity for 2-valued probability functions leads to ($\alpha$). The following, of course, is another matter: As (even if) 2-valued probability functions, they are, with regard to the interpretation, still probability functions. In the so-called statistical interpretation, they would then only be defined for an ensemble of systems. Thus, an express re-interpretation is required, if one now wants to understand the $u$ with ($\alpha$) as descriptions of state in the classical sense such that $u$ states which properties $a$ a system has and which it does not have, that is, depending on whether $u(a) = 1$ or $0$. What is special about the 2-valued functions among all probability functions is the fact that only they could possibly allow for this re-interpretation. This should be the first point of orientation. This limitation, of course, remains bound to the concept of probability assumed as the basis, and it would not be valid, if this concept were extended. Leaving this option aside for the moment, we want to explicate the concept of state characterized in ($\alpha$) by providing several equivalent versions. One should bear in mind, however, that - just because of the equivalence - the states according to the following characterizations, just as the states according to ($\alpha$), cannot possibly count as hidden states for quantum mechanics. A first possibility is given by
($\beta$) functions $s : A(L) \to L$ with $s(A) \in A$ as well as $s(A) = s(B)$, insofar as $s(A) \in B$ or $s(B) \in A$, for any two $A, B \in A(L)$.
Accordingly, a state obtains when in every alternative exactly one property is distinguished (as the result of measurement in this state), and distinguished (of course) in such a way that the distinction is the same for all alternatives which likewise bear a property distinguished in a given alternative. The connection with ($\alpha$) is evident: If a $u$ is given with ($\alpha$), then one obtains an $s$ with ($\beta$) in the following way: With a given $A \in A(L)$, $s(A)$ is that $a \in A$ for which $u(a) = 1$. Then, if $s(A) \in B$ for $B \in A(L)$, it follows, as required in ($\beta$), that $s(B) = s(A)$. For otherwise $u$ would assign the 1 to two different properties in $B$, which contradicts ($\alpha$). It is also readily seen that this assignment of an $s$ with ($\beta$) to a $u$ with ($\alpha$) is a one-one assignment and comprehends every $s$. For if $s$ is given with ($\beta$), then that $u$ corresponds to it for which $u(a) = 1$ for $a = s(A)$, where this assumption does not depend on $A$ because of ($\beta$) if only $a \in B$.
We obtain a further characterization of our states with the help of the important relation between two $A, B \in A(L)$ that $B$ is more fine-grained than $A$. That is to say that every property $a \in A$ is a disjunction of properties $b \in B$. Together, all these disjunctions provide a more fine-grained subdivision of the properties of $B$ as contrasted with $A$. The relation of (possibly) greater refinement of $B$ as contrasted with $A$ is an ordering relation in $A(L)$ which is akin to the ordering relation of the lattice $L$ itself. Thus, without risking misunderstandings, we want to use the same designation $\leq$ for both. In the next characterization, this brings us to
($\gamma$) functions $s : A(L) \to L$ with $s(A) \in A$ as well as $s(B) \leq s(A)$ if $B \leq A$ for any two $A, B \in A(L)$. Thus, in the transition to a more coarse-grained alternative $A$ - as one could say, paraphrasing ($\gamma$) - the state must distinguish that property $s(A)$ which is implied in the property $s(B)$ distinguished in the more fine-grained $B$. ($\gamma$) too is just another version of the same concept of state. The equivalence with ($\beta$) can be seen as follows: Let ($\beta$) be valid and let $B \leq A$. Then there exists exactly one $a \in A$ with $s(B) \leq a$. We now form the alternative $A' = \{a\} \cup \{b \in B \mid b \perp a\}$. Then $s(A') = a$. Otherwise it would be the case that $s(A') \in B$ and thus, because of ($\beta$), $s(A') = s(B)$. In this case, however, $s(A') \perp a$, and $s(B) \perp a$ would thus be in contradiction with $s(B) \leq a$. Consequently, $s(A') = a$ and thus $s(A') \in A$, that is, $s(A') = s(A) = a$, thus in the end $s(B) \leq s(A)$. Conversely, let now ($\gamma$) be valid and let $s(A) \in B$ for $A, B \in A(L)$. If $s(B) \neq s(A)$, then with $P = \{s(A), s(A)^{\perp}\}$ we would necessarily have $s(P) = s(A)^{\perp}$. But now $A \leq P$ and thus $s(A) \leq s(P)$, because of ($\gamma$), and thus $s(A) \leq s(A)^{\perp}$ which is impossible. Thus $s(B) = s(A)$, and we have ($\beta$). We arrive at our penultimate characterization with a partial operation on $A(L)$. First, we call two alternatives commensurable, if they share a common
refinement, that is, if with a given $A, B \in A(L)$ there is a $C \in A(L)$ such that $C \leq A$ and $C \leq B$. Among the shared refinements of $A$ and $B$ there exists one (if there exist any at all) which is the most coarse-grained: the conjunction of $A$ and $B$. Its elements are all conjunctions $a \wedge b \neq \Lambda$ with $a \in A$ and $b \in B$. Without risking misunderstandings, we designate this alternative likewise by $A \wedge B$. Then, we consider
($\delta$) functions $s : A(L) \to L$ with $s(A) \in A$ and $s(A \wedge B) = s(A) \wedge s(B)$ for commensurable $A, B \in A(L)$. Again, this is a self-evident requirement from the classical perspective, and further it is one which will be the focus of our attention in Part III. For the moment, it remains to be shown that ($\delta$) likewise is only a reformulation of ($\beta$). We demonstrate this via ($\gamma$). Let ($\delta$) be valid and $B \leq A$. Then $B \wedge A = B$, thus with ($\delta$) also $s(B) \wedge s(A) = s(B)$ and hence $s(B) \leq s(A)$, as required. If conversely ($\gamma$) is valid, then we immediately obtain $s(A \wedge B) \leq s(A) \wedge s(B)$ for commensurable $A, B$, since $A \wedge B$ is more fine-grained than $A$ and $B$. Further, $s(A) \in A$ and $s(B) \in B$, thus $s(A) \wedge s(B) \in A \wedge B$, if $s(A) \wedge s(B) \neq \Lambda$. But this is the case, since otherwise also $s(A \wedge B) = \Lambda$ which is impossible. Thus, $s(A) \wedge s(B) \in A \wedge B$. In any case, however, $s(A \wedge B) \in A \wedge B$. Because of $s(A \wedge B) \leq s(A) \wedge s(B)$, this is only possible if there is equality.
Finally, we must still consider the special case where $L$ is the lattice of the idempotent self-adjoint elements of a $C^*$-algebra $\mathcal{L}^*$, the operations of which for commutable (!) elements $P, Q$ are given in the familiar way by
$$P \wedge Q = P \cdot Q, \qquad P \vee Q = P + Q - P \cdot Q, \qquad P^{\perp} = 1 - P.$$
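The three operations can be tried out on small matrices. The following is a minimal sketch, assuming NumPy and a handful of projections chosen purely for illustration; the helper names `meet`, `join`, `complement` and `is_projection` are hypothetical and do not come from the text. It also indicates why the restriction to commutable elements matters: for the non-commuting pair at the end, the product is no longer a projection.

```python
import numpy as np

def meet(P, Q):
    """Conjunction of two commuting projections: the product P.Q."""
    return P @ Q

def join(P, Q):
    """Disjunction of two commuting projections: P + Q - P.Q."""
    return P + Q - P @ Q

def complement(P):
    """Orthocomplement of a projection: 1 - P."""
    return np.eye(P.shape[0]) - P

def is_projection(P, tol=1e-12):
    """True if P is idempotent and self-adjoint (within tol)."""
    return bool(np.allclose(P @ P, P, atol=tol)
                and np.allclose(P, P.conj().T, atol=tol))

# Two commuting projections on C^3 (diagonal in the same basis).
P = np.diag([1.0, 1.0, 0.0])
Q = np.diag([0.0, 1.0, 1.0])

results = {
    "meet": meet(P, Q),
    "join": join(P, Q),
    "complement": complement(P),
}

# A non-commuting pair on C^2: here the product is not a projection,
# so the formulas above do not define lattice operations.
P_nc = np.diag([1.0, 0.0])
Q_nc = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])
product_is_projection = is_projection(meet(P_nc, Q_nc))
```

In the diagonal (commuting) case the matrices behave exactly like characteristic functions of subsets of a three-point set, which is the finite analogue of the classical case $L_c$.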
For $L_q$ (quantum-mechanical case), this concerns the projection operators in Hilbert space, while for $L_c$ (classical case), it concerns the characteristic functions of Borel sets. Since we have denumerable conditions in $L$, we want to assume that this transfers to $\mathcal{L}^*$ and that every element $H \in \mathcal{L}^*$ has a unique spectral representation
$$H = \sum_i a_i P_i$$
with $\sum_i |a_i|^2 < \infty$. We then interpret the self-adjoint elements $H^* = H$ from $\mathcal{L}^*$ as observables of a physical system. An alternative $A \in A(L)$ is then definitely assigned to every $H \in \mathcal{L}$ (= totality of the self-adjoint elements from $\mathcal{L}^*$), and the observables can be characterized, in accordance with the spectral representation, as pairs
$$H = (\{P_i\}_i, \{a_i\}_i)$$
of alternatives and corresponding spectra (of eigenvalues). The observables are thus somewhat more fine-grained than the alternatives, and we want to
call two observables equivalent if they have the same alternative. We shall assume states of a system relative to the observables as functions
$$s' : \mathcal{L} \to \mathbb{R}$$
in such a way, however, that for equivalent observables
$$s'((\{P_i\}_i, \{a_i\}_i)) = a_{i_0} \quad \text{and} \quad s'((\{P_i\}_i, \{b_i\}_i)) = b_{i_0} \quad \text{with the same index } i_0,$$
i.e. that what ultimately matters in a value assignment is only the question, which property $P_i$ belongs to the alternative $\{P_i\}_i$. Under these conditions, we then have a one-one relation between all the functions $s$ considered above, which assign properties $s(A) \in A$ to alternatives $A$, and all the functions $s'$ introduced as follows: For a given $s$, $s'((\{P_i\}_i, \{a_i\}_i)) = a_{i_0}$, if $s(\{P_i\}_i) = P_{i_0}$. The question still remains, what effect in this case our restrictive conditions have for classical states. A glance at ($\delta$) suggests that this time classical states are given by
($\varepsilon$) functions $s' : \mathcal{L} \to \mathbb{R}$ with $s'(H \cdot K) = s'(H) \cdot s'(K)$ for commutable $H, K \in \mathcal{L}$. One must keep in mind in the proof that the product of two observables $H$ and $K$ has an alternative which in general is not the conjunction of the alternatives of $H$ and $K$, but rather a coarser version of the same. This is due to the fact that in
$$H \cdot K = \sum_{ik} a_i \beta_k P_i Q_k$$
(with commutable $H, K$) in general not all $a_i \beta_k$ differ from one another, which means that there are occurrences of combinations of the $P_i Q_k$. Nevertheless, ($\varepsilon$) is equivalent to ($\delta$). For if ($\delta$) is valid, then $s'(H \cdot K)$ must be a value which belongs to a property above $s(A \wedge B)$, if $A$ and $B$ are the alternatives to $H$ or $K$. Because of ($\delta$), we then obtain the correct product. If conversely ($\varepsilon$) is valid, then one returns to ($\delta$) by taking the liberty of choosing, for given alternatives $A, B$, the observables $H, K$ with respect to their values in such a way that all products $a_i \beta_k$ differ.
With ($\varepsilon$), we have reached a characterization of classical states which we, as mentioned, shall pick up again in Part III. The equivalence of ($\varepsilon$) and ($\alpha$) will then allow us to demonstrate in what sense Bell's theorem is a generalization of von Neumann's theorem. And we shall understand the latter in the broader sense in which it also includes the whole tradition of theorems of impossibility that followed von Neumann.³ The step beyond von Neumann's result in this tradition was the theorem by Gleason which, without taking the route of expectation value functions, shows directly for the case 1) of quantum mechanics that the only $\sigma$-additive probability functions in Hilbert space (with dimension > 2) are the usual ones. The step beyond von Neumann results from a reformulation of Gleason's result for expectation value functions, a reformulation which now requires the linearity of these functions only for commensurable observables. It must be pointed out, however, that the inference, especially relevant for the question of hidden parameters, of the non-existence of non-dispersive states in the sense of ($\alpha$) is somewhat trivialized by the following fact. In the classical case 2) there exist no 2-valued $\sigma$-additive (!) probability functions⁴ either, and there is a reason for this which applies to quantum mechanics as well. In quantum mechanics, 2-valued $\sigma$-additive probability functions do not exist in the infinite-dimensional case for the simple reason that they do not exist on the Boolean (!) property lattices (as sublattices) belonging to the continuous observables.⁵ ⁶
³ An overview is given in Scheibe 1981a; see also Kruszynski 1984
II
Now that the situation in the range of action of von Neumann's theorem has been sufficiently clarified, we want to turn to Bell's theorem. The proof of this theorem, however, requires a prior clarification of the status of Bell's inequality which is used in the proof.⁷ In many of the countless works that have been dedicated to this inequality, the latter is "deduced" without clarification on what assumptions and for what purpose such a deduction is made. In the proof of Bell's theorem, Bell's inequality appears as a consequence of classical probability theory - or rather: it should appear as such. The point is that the relevant inequality is not valid in quantum mechanics. An only marginally weaker inequality, however, is valid in this domain as well. What matters is thus only this small difference. Moreover, on one or the other stronger assumption, Bell's inequality is valid in quantum mechanics as well. We are able to gain a general access to the present issue by proceeding from a *-algebra $\mathcal{C}^*$ and defining with a positive, normed, linear form $E$ on $\mathcal{C}^*$
$$\Delta_E(A, A'; B, B') = |E(AB) - E(AB')| + |E(A'B) + E(A'B')| \qquad (1)$$
where the $A, A', B, B' \in \mathcal{C}$ shall be self-adjoint and commutable inasmuch as they appear as factors of the same product in $E$. In the physical interpretation, the self-adjoint elements of $\mathcal{C}^*$ are observables, and $E$ is an expectation value function. We now distinguish two principal cases: In the classical case, $\mathcal{C}^*$ is the *-algebra of all integrable functions on a classical probability space, and with the measure of probability $\mu$
$$E^c_\mu(A) = \int A \, d\mu \qquad (2a)$$
⁴ See Kamber 1965, sect. 6, Example 3), and sect. 13.2
⁵ Besides Kamber 1965 see also Kamber 1964 and Dombrowski/Horneffer 1964
⁶ At this point H.-J. Schmidt (Osnabrück) has saved me from a mistake concerning a strengthening of Gleason's theorem.
⁷ First publication in Bell 1964
In the quantum-mechanical case, $\mathcal{C}^*$ is an irreducible *-algebra of bounded linear operators of a Hilbert space, and with a statistical operator $W$
$$E^q_W(A) = \mathrm{Tr}(W A) \qquad (2b)$$
If, in that case, the spectra of $A, A', B, B'$ or - equivalently - the operators themselves are bounded absolutely or by a norm through 1, then we have in the classical case Bell's inequality⁸
$$\Delta^c_\mu(A, A'; B, B') \leq 2 \qquad (3a)$$
and in the quantum-mechanical case,⁹ we accordingly have
$$\Delta^q_W(A, A'; B, B') \leq 2\sqrt{2} \qquad (3b)$$
The proof of (3a) runs as follows:
$$\begin{aligned}
\Delta^c_\mu(A, A'; B, B') &= |E^c_\mu(A(B - B'))| + |E^c_\mu(A'(B + B'))| \\
&\leq E^c_\mu(|A||B - B'|) + E^c_\mu(|A'||B + B'|) \\
&\leq E^c_\mu(|B - B'|) + E^c_\mu(|B + B'|) \\
&= E^c_\mu(|B - B'| + |B + B'|) \\
&\leq 2
\end{aligned}$$
where in the final step we make use of the fact that
$$|B - B'| + |B + B'| \leq 2$$
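for $|B|, |B'| \leq 1$. In a similarly simple way, we have in the quantum-mechanical case (3b), beginning with Schwarz's inequality and at first only for pure states,
$$\begin{aligned}
\Delta^q_\varphi(A, A'; B, B') &= |(\varphi, A(B - B')\varphi)| + |(\varphi, A'(B + B')\varphi)| \\
&= |(A\varphi, (B - B')\varphi)| + |(A'\varphi, (B + B')\varphi)| \\
&\leq \|A\varphi\| \cdot \|(B - B')\varphi\| + \|A'\varphi\| \cdot \|(B + B')\varphi\| \\
&\leq \|(B - B')\varphi\| + \|(B + B')\varphi\| \\
&\leq \{2(\|(B - B')\varphi\|^2 + \|(B + B')\varphi\|^2)\}^{1/2} \\
&= \{2 \cdot 2(\|B\varphi\|^2 + \|B'\varphi\|^2)\}^{1/2} \\
&\leq 2\sqrt{2}.
\end{aligned}$$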
In this case, the assumptions $\|A\|, \|A'\| \leq 1$ and $\|B\|, \|B'\| \leq 1$ are each drawn upon once, and the equation in the penultimate step results through the mere computation of $\|(B \mp B')\varphi\|^2 = ((B \mp B')\varphi, (B \mp B')\varphi)$. For arbitrary statistical operators $W$, (3b) eventually results from the fact that the expected values are convex sums of expected values with pure states (see 2) below). Thus, here too, everything is very simple. In principle, the classical inequality could have been discovered in the 18th century, the quantum-mechanical one at the end of the 1920s. This remark in no way makes light of the recent discovery, the significance of which lies, of course, in its physical application. Without additional assumptions, the two inequalities cannot be strengthened. The strengthening already mentioned from (3b) to (3a) is possible in a series of cases, which shall now be briefly listed.
⁸ In this form the inequality occurs first in Bell 1971; see also Selleri 1972
⁹ The origin of this inequality is unknown to me. The following proof I owe to a suggestion of Prof. Stamatescu (Heidelberg). For a proof under more general assumptions see Summers/Werner 1987, p. 2442
1) $A$ and $A'$ are also commutable as well as $B$ and $B'$. Since then all of the operators involved are commutable, we have classical conditions such that a strengthening is to be expected. For discrete observables, we introduce a common orthonormal basis which then becomes a classical probability space on which the operators appear as real-valued functions.
2) In the second case, we make the relative assumption that (3a) is valid for statistical operators $W_i$. Then (3a) is also valid for the convex sum $\sum_i p_i W_i$ with $p_i \geq 0$ and $\sum_i p_i = 1$. For this purpose, the classical proof works from the very beginning.
3) In the next case,¹⁰ we assume that we are dealing with a quantum-mechanical system consisting of two subsystems I and II. Further, let the observables $A, A'$ and $B, B'$ belong to I and II respectively, more precisely
$$A = F \otimes 1, \qquad A' = F' \otimes 1, \qquad B = 1 \otimes G, \qquad B' = 1 \otimes G'.$$
In that case, the (general) commutability condition is automatically satisfied. Here, (3a) is likewise valid, if $W = U \otimes V$ where $U$ belongs to I and $V$ to II. This follows directly with the separation
$$\mathrm{Tr}((U \otimes V)(F \otimes G)) = \mathrm{Tr}(U \cdot F) \cdot \mathrm{Tr}(V \cdot G).$$
4) With 2) and 3), (3a) now follows as well for $W = \sum_i p_i P_{\varphi_i \otimes \psi_i}$ which we shall encounter in the next case.
We shall describe this last case in a little more detail. As in 3), we begin with a composite system I $\otimes$ II. Our special assumption then is one which combines the observables $F, F', G, G'$ with the state $W$ for which we form the expected values in (1). Individually, we assume that the observables have discrete spectra and that $W$ is a pure state $\Phi$. These assumptions are not essential. An essential assumption is that $F$ and $G$ as well as $F'$ and $G'$ are EPR-correlated in $\Phi$.¹¹ An explicit formulation of this for $F$ and $G$ (for example) is the following: With $P_i^F$ or $P_i^G$ as the spectral projections corresponding to the proper values $f_i$ of $F$ or $g_i$ of $G$, let
$$\|(P_i^F \otimes P_i^G)\Phi\|^2 = \|(1 \otimes P_i^G)\Phi\|^2$$
be valid as well as the same equation with the commutation of $F$ and $G$. This means that we have a one-one correspondence between the possible measurement results $f_i$ and $g_i$ in a measurement of $F$ or of $G$ (here it is simply the identity of the indices $i$) such that in the state $\Phi$ the conditional probability for the result $f_i$ in a measurement of $F$ after receipt of the result $g_i$ in a measurement of $G$ (and likewise that of $g_i$ after receipt of $f_i$) is equal to 1. In other words, we have here the situation that, as EPR imagined in their thought experiment, on the basis of the peculiarity of the state $\Phi$, the observables $F$ and $G$ can, so to speak, be measured for each other: We know the result of a measurement of $F$ (or $G$) on the basis of the result of the measurement of $G$ (or $F$). Our additional case regarding the validity of Bell's inequality in quantum mechanics is now:
5) If $F$ is EPR-correlated in state $\Phi$ with $G$ and $F'$ with $G'$, then $\Delta^q_\Phi \leq 2$, i.e. (3a) is valid here as well. In this generality, we are so far only dealing with a conjecture. But there exists a proof for the special case that the Hilbert spaces of the subsystems are 2-dimensional.¹²
¹⁰ For the cases 3) and 4), see Selleri 1988, pp. 28 f
¹¹ Einstein/Podolsky/Rosen 1935
III
The essential feature of von Neumann's theorem (and proof) from I is that the level of probabilities is not entered at all. There is no confrontation between quantum-mechanical and classical probability theory because there are no hidden states which could create a classical probability space in the first place. And it is just this non-existence which can be proven for states of the von Neumann type. The essential feature of Bell's theorem (and proof) is, as will now be shown, that in this case there exist hidden states in sufficient number and that therefore one can no longer base the non-existence of a theory of hidden parameters for quantum mechanics on the non-existence of those states.
Rather, one must make an assumption regarding the connection between the quantum-mechanical and classical probabilities according to the idea of a theory of hidden parameters, and this time one must base the proof of the non-existence of such a theory just on that connection. The basic structure of such a proof itself fundamentally brings Bell's inequality into play: This inequality appears as a theorem of classical probability theory. Its connection with the corresponding quantum-mechanical theory expressing the idea of hidden parameters permits the inference that the inequality should also be valid quantum-mechanically. Since this is not the case, the non-existence of a theory of hidden parameters follows (on the assumptions made in this respect). Thus, the main burden of proof is taken on by the inference according to which Bell's inequality should also be valid quantum-mechanically. At this point, we should already mention that it is this inference which is made possible by the condition often called Bell's locality. In our reconstruction, however, this condition is nothing but the condition ($\varepsilon$) from I limited to the observables $F$ and $G$ which belong to different quantum-mechanical systems.
¹² Scheibe 1991e (this vol. VI.29)
For the sake of clarification, we want to compare Bell's theorem and its proof with an analogous theorem together with its proof. Before we proceed to do this, however, we must set up, as a common basis of both undertakings, a canon of minimal conditions for a theory of hidden parameters for quantum mechanics.¹³ This theory itself is essentially based on classical principles: We have a state space $S$ and a set $\mathcal{G}$ of quantities
$$f : S \to \mathbb{R},$$
the states being present in sufficient number:
(a) If $f(s) = g(s)$ for all $s \in S$, then $f = g$.
Since the quantities are functions on $S$, $\mathcal{G}$ naturally turns into an algebra. For multiplication, for example, we have the definition
$$(f \cdot g)(s) = f(s) \cdot g(s). \qquad (4a)$$
Through the dualization
$$s(f) = f(s), \qquad (5)$$
a given state $s \in S$ in turn becomes a function
$$s : \mathcal{G} \to \mathbb{R}.$$
This allows us to express that there shall be sufficiently many quantities:
(b) If $s(f) = t(f)$ for all $f \in \mathcal{G}$, then $s = t$.
A further consequence of (4a) is
$$s(f \cdot g) = s(f) \cdot s(g). \qquad (4b)$$
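The passage from (4a) via (5) to (4b) can be made concrete on a toy model. The following is a minimal sketch, assuming a hypothetical three-element state space and two arbitrarily chosen quantities; none of the names (`S`, `f`, `g`, `product`, `dual`, `product_rule_holds`) come from the text.

```python
# Hypothetical finite state space and two quantities f, g : S -> R,
# stored as dictionaries from states to real values.
S = ["s1", "s2", "s3"]
f = {"s1": 1.0, "s2": -1.0, "s3": 1.0}
g = {"s1": 2.0, "s2": 0.5, "s3": -3.0}

def product(f, g):
    """Pointwise product as in (4a): (f.g)(s) = f(s).g(s)."""
    return {s: f[s] * g[s] for s in f}

def dual(s):
    """Dualization as in (5): the state s becomes the map f -> f(s)."""
    return lambda quantity: quantity[s]

def product_rule_holds(s, f, g):
    """Check (4b) for one state: s(f.g) = s(f).s(g)."""
    state = dual(s)
    return state(product(f, g)) == state(f) * state(g)

all_states_multiplicative = all(product_rule_holds(s, f, g) for s in S)
```

In this reading a classical state is nothing but an evaluation functional, which is why the product rule holds without any further assumption.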
With some further assumptions which are not important to us, we finally also arrive at the classical probability measures and their corresponding expectation value functions
$$E^c_\mu : \mathcal{G} \to \mathbb{R}.$$
So far, we have only talked of the theory of hidden states $s \in S$. Its connection with quantum mechanics is now essentially established through a bijection
$$\chi : \mathcal{O}_b \to \mathcal{G}, \qquad (6a)$$
where $\mathcal{O}_b$ is the totality of the bounded and discrete observables of quantum mechanics. In any case, we demand
(c) $\chi$ conserves the spectrum of every observable.
Thus, the quantities from $\mathcal{G}$ too can have only a discrete spectrum.
¹³ Scheibe 1981a
Accompanying (6a) is the mapping
$$s'(H) = s(\chi(H)) \qquad (6b)$$
which makes the classical states into functions of the quantum-mechanical observables. Because of (b), (6b) is an injection. This makes the classical states into hidden states of quantum mechanics, and we achieve the connection with our consideration in Part I, in particular with the characterization ($\varepsilon$) of states of the von Neumann type. By means of $\chi$, however, we also obtain functions $E^c_\mu(\chi(H))$, and with respect to these, the most important requirement is valid:
(d) For every classical measure of probability $\mu$ there is a quantum-mechanical statistical operator $W$ such that
$$E^q_W(H) = E^c_\mu(\chi(H)),$$
for all $H \in \mathcal{O}_b$, and every $W$ is obtained in this manner.
In preparation of our reconstruction of Bell's theorem and its proof, we now first want to present the almost completely analogous case in which a proof of the non-existence of a theory of hidden parameters is based on the Heisenberg indeterminacy relations. It is remarkable that something like this has never been explicitly undertaken, although, of course, it has frequently been said that the reason for the impossibility of returning to a classical theory lies just in the Heisenberg indeterminacy relations, and although every physicist would believe an explicit proof of this kind unseen.¹⁴ But just because of all the fuss that has been made over Bell's theorem, it is instructive to be acquainted also with the following case. In this case, the role of Bell's inequality is assumed by another theorem of classical probability theory by means of which the existential weakening of the Heisenberg indeterminacy relations is negated. The latter states for quantum mechanics:
(HU) There exist observables $H$ and $K$ as well as $\varepsilon > 0$ such that for all (quantum-mechanical) states $W$
$$\sigma_W(H) \cdot \sigma_W(K) \geq \varepsilon,$$
where, as usual,
$$\sigma_W(H) = E_W(H^2) - E_W(H)^2$$
is the square of the dispersion. This existential weakening results from position and momentum for $H$ and $K$ resp., and $\hbar/2$ for $\varepsilon$. The negation then of (HU) is (in classical notation):
¹⁴ See, for instance, de Broglie 1957, p. 26
(HU') For any two quantities $f$ and $g$ and arbitrary $\varepsilon > 0$, there exists a (statistical) state $\mu$ such that
$$\sigma_\mu(f) \cdot \sigma_\mu(g) < \varepsilon$$
for the dispersions $\sigma_\mu$.
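One way to see why a statement like (HU') is unproblematic classically can be sketched for a finite state space, which is an assumption made here only for illustration: a probability measure concentrated on a single state has vanishing dispersion for every quantity, so the product of dispersions falls below any $\varepsilon > 0$. The state space, the quantities and the helper names below are hypothetical.

```python
# Hypothetical finite state space with two quantities f, g : S -> R.
S = ["s1", "s2", "s3"]
f = {"s1": 2.5, "s2": -1.0, "s3": 4.0}
g = {"s1": 0.0, "s2": 3.0, "s3": -2.0}

def expectation(mu, quantity):
    """Expectation value of a quantity under the probability weights mu."""
    return sum(mu[s] * quantity[s] for s in mu)

def sigma(mu, quantity):
    """Square of the dispersion: E(f^2) - E(f)^2."""
    squared = {s: quantity[s] ** 2 for s in quantity}
    return expectation(mu, squared) - expectation(mu, quantity) ** 2

def point_measure(states, s0):
    """Probability measure concentrated on the single state s0."""
    return {s: (1.0 if s == s0 else 0.0) for s in states}

mu_point = point_measure(S, "s2")
mu_uniform = {s: 1.0 / len(S) for s in S}

dispersion_product_point = sigma(mu_point, f) * sigma(mu_point, g)
dispersion_product_uniform = sigma(mu_uniform, f) * sigma(mu_uniform, g)
```

For the point measure the product of dispersions is zero, while a spread-out measure such as the uniform one gives a strictly positive value; (HU') only asks for the existence of a suitable state.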
(HU') is a theorem of classical probability theory which is violated by quantum mechanics, and the Heisenberg indeterminacy relations negating (HU') represent the earliest and to this day (in spite of Bell) the most famous violation of a probability-theoretic kind. Now, how does this fact become a basis of a proof of the non-existence of a theory of hidden parameters? One tries to infer that the assumption of such a theory would entail (HU') also for quantum mechanics. Characteristically, this proof requires something beyond what we have stated so far concerning the connection between the two theories. For the inference would look as follows: Let the quantum-mechanical observables $H$ and $K$ as well as $\varepsilon > 0$ be given. With (6a), we move over to the classical quantities $\chi(H)$ or $\chi(K)$. Because of (HU'), there is a classical (statistical) state $\mu$ corresponding to these quantities such that
$$\sigma_\mu(\chi(H)) \cdot \sigma_\mu(\chi(K)) < \varepsilon.$$
Corresponding to $\mu$ in turn, there exists a quantum-mechanical state $W$, in accordance with (d), such that
$$E_W(M) = E_\mu(\chi(M))$$
for all observables $M$. At this point, one would now like to infer that then also
$$\sigma_W(M) = \sigma_\mu(\chi(M))$$
for all $M$. And this is the place where we need the additional assumption
$$\chi(H^2) = \chi(H)^2. \qquad (7)$$
Only with the help of this are we able to infer
$$\sigma_\mu(\chi(M)) = E_\mu(\chi(M)^2) - E_\mu(\chi(M))^2 = E_\mu(\chi(M^2)) - E_\mu(\chi(M))^2 = E_W(M^2) - E_W(M)^2 = \sigma_W(M).$$
The above inequality thus provides us with
$$\sigma_W(H) \cdot \sigma_W(K) < \varepsilon,$$
and we would have (HU') in quantum mechanics as well. Thus, with (7), the proof of impossibility can be furnished. (7) is a typical condition of isomorphism, and hence one could leave the question regarding its origin at that. For, of course, it is conditions of isomorphism of which one must require more or less in order to make a reduction of quantum mechanics to a classical theory as such plausible. For the clarification of Bell's theorem, however, it is important to show that conditions of isomorphism can originate from properties of the hidden states. This is true of our preparing case: (7) is - ceteris paribus - equivalent to
$$s'(H^2) = s'(H)^2 \qquad (8)$$
for hidden states $s'$. If we have (7), then with (6b)
$$s'(H^2) = s(\chi(H^2)) = s(\chi(H)^2) = s(\chi(H))^2 = s'(H)^2,$$
thus, (8). Conversely, if the latter is valid, then
$$(\chi(H))^2(s) = (\chi(H)(s))^2 = s(\chi(H))^2 = s'(H)^2 = s(\chi(H^2)) = (\chi(H^2))(s),$$
thus again (7). In these inferences, we have made use of (4)-(6) and (a). This demonstrates how the satisfaction of the condition of isomorphism (7) and hence the feasibility of the proof of non-existence depends on the existence of sufficiently many hidden states (8). Something along these lines, however, was to be expected, since the hidden states, after all, constitute the state space of classical probability theory and thus provide the ground, as it were, on which the entire argument rests. Yet, here we do not want to pursue any further the question of how large, for example, the set of all functions $s'$ with (8) is.
Like the non-existence theorem advanced for the purpose of analogy, Bell's theorem also claims (in our reconstruction) the non-existence of a theory of hidden parameters. From the start, it refers to a system I $\otimes$ II composed of two systems I and II. The role of (HU') is now assumed by Bell's inequality as a theorem of classical probability theory in its application to the theory of hidden parameters for I $\otimes$ II. In accordance with (3a) we would have
$$\Delta^c_\mu(f, f'; g, g') \leq 2, \qquad (9)$$
where the application is limited to the case in which $f, f'$ are quantities of I and $g, g'$ are quantities of II. The role of (HU) is now assumed by the negation of (9) for quantum mechanics, i.e.
$$\Delta^q_W(F \otimes 1, F' \otimes 1; 1 \otimes G, 1 \otimes G') > 2 \qquad (10)$$
for certain observables $F, F'$ and $G, G'$ of I or II and states $W$ of I $\otimes$ II.¹⁵ This time
$$\chi(F \otimes G) = \chi(F \otimes 1) \cdot \chi(1 \otimes G) \qquad (11)$$
corresponds to the condition of isomorphism (7), and the equivalent condition for the hidden states is
$$s'(F \otimes G) = s'(F \otimes 1) \cdot s'(1 \otimes G). \qquad (12)$$
¹⁵ Mind that these existence statements are stronger than those which result from negating Bell's inequality in general. For we are here confronted with the special 'product situation'. In the 2-dimensional case, it is true, the violation of Bell's inequality has been proven for this case first, see Bell 1964
It is immediately evident that (12) is a generalization of ($\varepsilon$) from Part I: While there it was the product rule for any two commutable operators $H$ and $K$ that was in question, we are now only dealing with such products, the factors of which belong to system I or II. This generalization makes Bell's theorem possible: While hidden states of von Neumann's type ($\varepsilon$) do not exist at all, certainly states exist which satisfy only the weaker condition (12) - hidden states of the Bellian type. Even in this case, we shall not delve deeper into the issue, but it should be noted that (12) attaches no conditions to $s'$ limited to the sets of observables of I and II. The proof of the equivalence of (11) and (12) is again very simple. If (11) is valid, we infer (for any two operators $H$ and $K$)
$$s'(H \cdot K) = s(\chi(H \cdot K)) = s(\chi(H) \cdot \chi(K)) = s(\chi(H)) \cdot s(\chi(K)) = s'(H) \cdot s'(K),$$
i.e. (12). Conversely, if the latter is valid, then
$$(\chi(H) \cdot \chi(K))(s) = (\chi(H))(s) \cdot (\chi(K))(s) = s(\chi(H)) \cdot s(\chi(K)) = s'(H) \cdot s'(K) = s'(H \cdot K) = s(\chi(HK)) = (\chi(HK))(s)$$
and thus with (a) again (11). Here we have again made use of the classical equations (4)-(6). It should be noted that (12) is simply the equation corresponding to (4b), as one moves on to quantum mechanics.
In conclusion, we deliver the proof of Bell's theorem itself. Again, the inference is indirect. If a theory of hidden parameters exists, Bell's inequality would be found in quantum mechanics as well. For: Let the observables $F, F'$ of I and $G, G'$ of II as well as a (statistical) state $W$ be given, then there would be a classical (statistical) state $\mu$ such that
$$E_W(F \otimes G) = E_\mu(X(F \otimes G))$$

and accordingly for the other pairs.¹⁶ Again, we have come to the place where it must be shown that in that case

$$\Delta_W(F, F'; G, G') = \Delta_\mu(X(F), X(F'); X(G), X(G'))$$

is valid as well. And indeed, we obtain this step immediately with (11) because of

$$E_W(F \otimes G) = E_\mu(X(F) \cdot X(G)).$$

Since X also conserves the spectra, Bell's inequality would follow also for quantum mechanics in contradiction to (10). Thus it is condition (12) as the (only) condition of the hidden states on which, aside from Bell's inequality, our proof essentially rests. (12) or related conditions have been interpreted as locality conditions (the so-called Bell-locality), and the attempt has been made to bring this into connection with the Einstein-locality. In this context and against this tendency, it has also been claimed that Bell's inequality is irrelevant to the problem of a local theory of hidden parameters.¹⁷ Our remarks have nothing to do with these quarrels which are marked (and conditioned) by numerous conceptual confusions. These remarks merely demonstrate how one can indeed base an unobjectionable proof of the non-existence of a theory of hidden parameters of quantum mechanics on Bell's inequality and the factorization (12). From the technical perspective on the proof, it is evident how we arrive at (12), if the proof is to use Bell's inequality, just as we were led earlier to the condition (8) when we wanted to base the proof on the negation of Heisenberg's inequality. And this is valid independently of the physical interpretations of those conditions. It is just because of this that here we are able to speak of a proof.

¹⁶ Mind that in our foregoing model case the reverse direction of (d) had been employed.
¹⁷ For attempts at greater precision see Hellman 1982 and Redei 1991.
VI.29 EPR-Situation and Bell's Inequality*

I

Following the original paper of Bell¹ the connection between the EPR-argument for the incompleteness of quantum mechanics² and Bell's inequality is usually established by saying that the former suggests the existence of a local hidden-variable theory for quantum mechanics whereas the latter then allows one to show that such a theory does not exist³. More specifically the following account is given. The EPR-argument is based on two premises:

The reality criterion: If, without in any way disturbing a system I, we can predict with certainty (i.e., with probability equal to unity) the value of a physical quantity F of I, then there exists an element of physical reality corresponding to F.⁴

The locality condition: If the prediction for F of I can be made by measuring a quantity G of a second system II, spatially separated from and not interacting with system I, then that prediction can be made without disturbing I.⁵

To get the argument started the following situation is envisaged:

EPR-situation: The total system $I \otimes II$ is in a state $\Phi$ in which every pair out of a set M of pairs (F, G) of quantities of I and II respectively is EPR-correlated.

Here F (of I) and G (of II) are said to be EPR-correlated in state $\Phi$ (of $I \otimes II$) if there is a one-to-one correlation $\alpha \leftrightarrow \beta$ between the possible values of F and G such that, if a measurement of F yields $\alpha$, an immediately following measurement of G would, on account of $\Phi$, certainly yield $\beta$, and vice versa. Given an EPR-situation characterized by $\Phi$ and M we can now argue, by first using the locality axiom and subsequently the reality criterion, that quantities F of I and G of II occurring in M have 'real' values, i.e. values that are not 'created' by a measurement. In other words, there is a state hidden behind $\Phi$ in which all the quantities of I and II occurring in M have definite values.
* First published as Scheibe 1991e.
¹ Bell 1964.
² Einstein et al. 1935.
³ See, for instance, the review papers Clauser/Shimony 1978 and Selleri 1988.
⁴ This is essentially the famous wording of the criterion in the original EPR-paper Einstein et al. 1935.
⁵ The locality condition is more explicit in Einstein 1948 and 1949 than in the EPR-paper.

Now, although M can be very comprehensive, given $\Phi$, it can certainly not contain all quantities from I or II. Ignoring this it is usually concluded that, according to EPR, there should be hidden states covering all quantities from I and II. And, of course, one can argue that, given any
quantity F of I, there is always a state $\Phi$ of $I \otimes II$ and a quantity G of II such that F is EPR-correlated with G in $\Phi$ and vice versa. In this way, although no single EPR-situation allows the conclusion, their totality somehow may bring it about. At any rate, the second part of the account - the one bringing in the Bell inequality - starts with the assumption that there are (total) hidden states and that they are local in a certain sense that was first specified by Bell. There is, in my view, a problem connected with the manner in which this condition of locality, i.e. the one that is actually used in refuting the assumption of a local hidden-variable theory, comes out of the original locality condition formulated above and put into their argument by EPR. Since this is one of the major points of contact between EPR and Bell I shall comment on it in the last section of this paper. For the moment we may assume that, if the transformation of the original idea of locality into Bell's version of it is made, then the rest of the argument is straightforward: The assumption of a classical probabilistic and local hidden-variable theory that can reproduce all of the probabilistic predictions of quantum mechanics has the consequence that Bell's inequality would have to be generally valid also in quantum mechanics. This not being the case, that assumption is refuted.

Now my main concern in this paper will be to raise the following question: Since it is obvious that the assumption of an EPR-situation is vital for the EPR-part of the whole argument, what is its relation to Bell's inequality as the main tool in the Bell-part of the argument? The following is obvious: If the set M in an EPR-situation consists of at least three pairs of observables one has freedom enough to violate the inequality (in the version (3) below). But what if there are only two pairs?
In this case only two of the four pairs occurring in (3) below are not strictly correlated, and this suggests the conjecture that Bell's inequality holds without any further assumption. A proof is given in the simplest case where the Hilbert spaces of the subsystems are 2-dimensional.⁶

⁶ Profs. Bell, Selleri and Van Fraassen have confirmed my impression that this question has not been dealt with in the literature.

II

To prepare the ground let us first recapitulate what Bell's inequality says and what its status in quantum mechanics is. For any self-adjoint elements A, B, A', B' of a *-algebra $\mathcal{A}$, pairwise commutable as they appear jointly in the following expression, and for any positive linear form E on $\mathcal{A}$ we define

$$\Delta_E(A, B, A', B') \equiv |E(AB) - E(AB')| + |E(A'B) + E(A'B')|. \tag{1}$$

In the physical interpretation the self-adjoint elements of $\mathcal{A}$ are observables, and E is an expectation-value function. Two main cases are to be distinguished: The classical case where $\mathcal{A}$ is a *-algebra of integrable functions on a classical probability space and ($\mu$ being a measure on the space)

$$E_c(A) = \int A \, d\mu \tag{2a}$$
and the quantum mechanical case where $\mathcal{A}$ is an irreducible *-algebra of bounded linear operators on a Hilbert space and

$$E_q(A) = \mathrm{Tr}(WA) \tag{2b}$$

with a statistical operator W. If the spectra of the A, B, A', B' are absolutely bounded by 1, then in the classical case Bell's inequality

$$\Delta_\mu(A, B, A', B') \le 2 \tag{3a}$$

and in the quantum mechanical case

$$\Delta_W(A, B, A', B') \le 2\sqrt{2} \tag{3b}$$

holds.⁷ The proof of (3a) is straightforward:
$$\begin{aligned}
\Delta_E(A, B, A', B') &= |E_c(A(B - B'))| + |E_c(A'(B + B'))| \\
&\le E_c(|A||B - B'|) + E_c(|A'||B + B'|) \\
&\le E_c(|B - B'|) + E_c(|B + B'|) \\
&= E_c(|B - B'| + |B + B'|) \\
&\le 2
\end{aligned}$$

where the last step comes from the corresponding (trivial) inequality for the argument of $E_c$ in the last but one line. I have given this proof because in view of the literature it seems to be the simplest and most reliable proof existing. Moreover, in view of the host of 'derivations' of Bell's inequality from unclear premises that prevail in the literature⁸ I want to emphasize the status of the inequality as being just a theorem of classical probability theory. As such it could have been found in the 18th century.⁹ Saying this is not belittling the importance of its recent discovery which, of course, lies in the physical application. However, being involved in physics does not exempt us from trying to argue as clearly as possible. And those 'derivations' of Bell's inequality would gain much in clarity if they were reconstructed as so many proofs from the axioms of classical probability theory together with, but strictly separated from, considerations as to how and why these axioms come into play as the physical circumstances may require.

⁷ Equation (3a) is, then, the version of Bell's inequality that appears in Bell 1971. A proof of (3b) under very general assumptions is given in Summers/Werner 1987, p. 2442.
⁸ Besides the papers mentioned in no. 2 see also Selleri/Tarozzi 1981.
⁹ For precursors of Bell's inequality see the references in Pitowsky 1989, p. 49.
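Since (3a) is a theorem of classical probability theory, it can be checked numerically on finite probability spaces. The following Python sketch is an illustration only and not part of the original argument; all function names are hypothetical. It evaluates the expression (1), in the form with B + B' in the second term that the proof uses, on the sixteen deterministic assignments of ±1 and on randomly generated finite probability spaces with values in [-1, 1].

```python
import itertools
import random

def classical_delta(points, weights):
    """Expression (1) on a finite classical probability space.

    points  : list of tuples (a, b, a2, b2), the values of A, B, A', B'
              at each sample point (all assumed to lie in [-1, 1])
    weights : probabilities of the sample points (non-negative, sum 1)
    """
    def expectation(f):
        return sum(w * f(*p) for p, w in zip(points, weights))

    e_ab = expectation(lambda a, b, a2, b2: a * b)
    e_ab2 = expectation(lambda a, b, a2, b2: a * b2)
    e_a2b = expectation(lambda a, b, a2, b2: a2 * b)
    e_a2b2 = expectation(lambda a, b, a2, b2: a2 * b2)
    return abs(e_ab - e_ab2) + abs(e_a2b + e_a2b2)

def random_space(rng, n_points=8):
    """A random finite probability space with values in [-1, 1]."""
    points = [tuple(rng.uniform(-1, 1) for _ in range(4))
              for _ in range(n_points)]
    raw = [rng.random() + 1e-9 for _ in range(n_points)]
    total = sum(raw)
    return points, [r / total for r in raw]

# Deterministic (one-point) spaces with values +1 or -1.
extreme = max(classical_delta([p], [1.0])
              for p in itertools.product((-1, 1), repeat=4))

# Randomly generated mixed cases.
rng = random.Random(0)
worst_random = max(classical_delta(*random_space(rng)) for _ in range(2000))
```

In this sketch the deterministic assignments reach the bound 2 and no random space exceeds it, in line with the last two steps of the proof.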
In the quantum mechanical case the inequality (3a) has to be replaced by the weaker statement (3b) in order to be generally valid. However, as is well known there are several more special assumptions under which

$$\Delta_W(A, B; A', B') \le 2 \tag{3c}$$

also in quantum mechanics. Since I want to add another such case it may be worthwhile to review briefly the more important cases already known.

1) Equation (3c) holds if all observables A, B, A', B' pairwise commute.¹⁰

Obviously the additional assumption restores the classical situation so that the result is to be expected. Accordingly, the proof (for discrete observables) is given by using the additional assumption for introducing a common orthonormal basis of eigen-vectors for A, B, A', B'. This basis then is immediately turned into a classical probability space on which our operators reappear as real-valued functions. The rest of the proof is identical with the one given for (3a).

The next case to be mentioned is the case of a quantum mechanical system composed of two subsystems I and II. As we know from the introduction this is the case we shall resume for our main consideration. In it the four observables A, A' and B, B' are assumed to belong to the subsystems I and II respectively. More precisely we assume

$$A = F \otimes 1, \quad A' = F' \otimes 1, \quad B = 1 \otimes G, \quad B' = 1 \otimes G' \tag{4}$$

with F, F' and G, G' being observables of I and II respectively. Our general assumption about commutability is then satisfied automatically. For the moment the special case to be mentioned is

2) Equation (3c) holds if W is a product, i.e. if $W = U \otimes V$ where U belongs to I and V to II.

The proof is immediate if one considers that

$$\mathrm{Tr}((U \otimes V)(F \otimes G)) = \mathrm{Tr}(UF) \cdot \mathrm{Tr}(VG)$$

which reduces the situation to the classical one. Our third case shows how the classical behavior of $\Delta_W$ can be hereditary:

3) If (3c) holds for statistical operators $W_i$ then also for the convex sum $\sum_i p_i W_i$ where $p_i \ge 0$ and $\sum_i p_i = 1$.

Here the proof is the classical one ab ovo. From 2) and 3) it follows immediately that

4) Equation (3c) holds if $W = \sum_i p_i P_{\varphi_i \otimes \psi_i}$.

This result¹¹ is remarkable if compared with the EPR-situation to which we now turn.

¹⁰ I did not find this case in the literature.
¹¹ Selleri 1988, pp. 28f.
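The contrast between case 2) and the general bound (3b) can be made concrete for two spin-1/2 systems. The following Python sketch is a hedged illustration, not taken from the text: the helper names, the choice of spin observables in the x-z plane, and the four angles are all assumptions made for the example. It evaluates expression (1) with the observables (4) once in a product state and once in the singlet state.

```python
import math

def spin(theta):
    """Real 2x2 matrix cos(theta)*sigma_z + sin(theta)*sigma_x (spectrum +1, -1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def kron(F, G):
    """Matrix of F (x) G in the basis |00>, |01>, |10>, |11>."""
    return [[F[i][k] * G[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def expect(vec, M):
    """<vec| M vec> for a real unit vector vec."""
    n = len(vec)
    return sum(vec[r] * M[r][c] * vec[c] for r in range(n) for c in range(n))

def delta(vec, F, F2, G, G2):
    """Expression (1) for A = F (x) 1, A' = F' (x) 1, B = 1 (x) G, B' = 1 (x) G'."""
    E = lambda X, Y: expect(vec, kron(X, Y))
    return abs(E(F, G) - E(F, G2)) + abs(E(F2, G) + E(F2, G2))

# Illustrative choice of settings (an assumption of this sketch).
F, F2 = spin(0.0), spin(math.pi / 2)
G, G2 = spin(math.pi / 4), spin(3 * math.pi / 4)

s = 1 / math.sqrt(2)
product_state = [1.0, 0.0, 0.0, 0.0]   # |0> (x) |0>, a product as in case 2)
singlet_state = [0.0, s, -s, 0.0]      # (|01> - |10>) / sqrt(2)

product_delta = delta(product_state, F, F2, G, G2)
singlet_delta = delta(singlet_state, F, F2, G, G2)
```

With these settings the product state stays below 2, while the singlet state attains the quantum mechanical bound $2\sqrt{2}$ of (3b).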
III
In the EPR-situation we deal not with a mixture as in 4) but with the corresponding pure state
(5) with orthonormal bases {
(6)

Now it may happen that with respect to a pure state $\Phi$ of the compound system $I \otimes II$ the two observables F and G are correlated in the following sense: There is a one-to-one correspondence between the possible outcomes $f_i$ and $g_i$ of a measurement of F and G respectively (here assumed to be established by the identity of indices) such that in $\Phi$ the probability of the result $f_i$ for F and $g_j$ for G of a joint measurement is 0 if $i \neq j$:

$(i \neq j)$ (7a)

This assumption is equivalent with the requirement that the conditional probability of $f_i$ of F given $g_i$ of G (or vice versa) is equal to 1:

(7b)

So here we have the situation envisaged by EPR that the observable F of I is measured by a measurement of G of the other system II: Given the result $g_i$ of the latter (7b) allows us to predict the result $f_i$ for the observable F.¹²

¹² It should be remarked that at this point the EPR-argument uses Von Neumann's projection postulate. Interestingly enough, whereas there are countless attacks on this postulate to be found in the literature I have never seen anybody blaming EPR for using it.

The major conjecture now is:

Conjecture: If in the case of a compound quantum mechanical system $I \otimes II$ the observables F, F' of I are EPR-correlated with G, G' of II respectively with respect to a pure state $\Phi$ of $I \otimes II$ then (under the usual assumption concerning the spectra) Bell's inequality

$$\Delta_\Phi(F \otimes 1, F' \otimes 1; 1 \otimes G, 1 \otimes G') \le 2$$
holds. It may immediately be added that, as the examples given in the literature show, this conjecture cannot be strengthened by dropping one of the two correlation assumptions. I do not have a proof of this conjecture in general. But here is the sketch of a proof for the simplest case in which I and II have 2-dimensional Hilbert spaces. (This proof, therefore, covers the 'spin case' that is virtually the only one discussed in the literature.) Without loss of generality we can assume that F and F' (and consequently G and G') do not commute. (For if they did our conclusion would follow already according to case 1) above.) Because of the EPR-correlations $\Phi$ then has two essentially different decompositions (5). Using the polar decomposition theorem it follows that in any decomposition (5) of $\Phi$ the $a_i$ have equal absolute values.¹³ From this it can easily be concluded that there is also the decomposition

(8)

where we have re-named the $\psi$-vectors to achieve the familiar notation. If in this representation we choose

F (9a)

and correspondingly

(9b)

then a straightforward calculation yields

$$\langle \Phi | (F \otimes G) \Phi \rangle = \alpha\beta - UX \tag{10a}$$

where in vector form

$$U = (u, v, w), \quad X = (x, y, z), \quad UX = ux + vy + wz. \tag{10b}$$
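The 'straightforward calculation' leading to (10a) can be reproduced numerically under explicit assumptions. Equations (8) and (9) are not legible in the present text, so the following Python sketch assumes that (8) is the singlet decomposition and that (9a), (9b) have the standard form $F = \alpha 1 + U\sigma$, $G = \beta 1 + X\sigma$ with the Pauli matrices $\sigma$ (which is at least consistent with the eigenvalues $\alpha \pm |U|$ used in the proof). These assumptions and all function names are hypothetical and serve illustration only.

```python
import math
import random

def observable(alpha, vec3):
    """alpha*1 + u*sigma_x + v*sigma_y + w*sigma_z (assumed form of (9a), (9b))."""
    u, v, w = vec3
    return [[alpha + w, complex(u, -v)],
            [complex(u, v), alpha - w]]

def kron(F, G):
    """Matrix of F (x) G in the basis |00>, |01>, |10>, |11>."""
    return [[F[i][k] * G[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def expect(vec, M):
    """Real part of <vec| M vec> for a real unit vector vec."""
    n = len(vec)
    return sum(vec[r] * M[r][c] * vec[c]
               for r in range(n) for c in range(n)).real

S = 1 / math.sqrt(2)
SINGLET = [0.0, S, -S, 0.0]  # assumed form of the decomposition (8)

def both_sides_of_10a(alpha, U, beta, X):
    """Left- and right-hand side of (10a) for given parameters."""
    lhs = expect(SINGLET, kron(observable(alpha, U), observable(beta, X)))
    rhs = alpha * beta - sum(ui * xi for ui, xi in zip(U, X))
    return lhs, rhs

rng = random.Random(1)
samples = []
for _ in range(100):
    alpha, beta = rng.uniform(-1, 1), rng.uniform(-1, 1)
    U = [rng.uniform(-1, 1) for _ in range(3)]
    X = [rng.uniform(-1, 1) for _ in range(3)]
    samples.append(both_sides_of_10a(alpha, U, beta, X))

max_error = max(abs(lhs - rhs) for lhs, rhs in samples)
```

Under these assumptions both sides of (10a) agree up to rounding error for all sampled parameters.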
¹³ Of course, this can be established also directly. The use of the polar decomposition theorem was suggested to me by Frank Arntzenius.

In the next step we make essential use of the assumption that F and G are EPR-correlated. This means that

(11)

with eigen-vector bases $\{\tilde\varphi_i\}_i$ and $\{\tilde\psi_i\}_i$ of F and G respectively. It is not too difficult to infer from this that there is a (possibly different) decomposition but again with eigen-vector bases such that there is one unitary matrix $U_{ik}$ which transforms $\{\varphi_i\}_i$ into $\{\tilde\varphi_i\}_i$ and $\{\psi_i\}_i$ into $\{\tilde\psi_i\}_i$. Consequently, the matrices representing F and G according to (9) commute (as do the corresponding diagonal matrices in the representation (11)). This finally immediately leads to
$$U = \lambda X, \quad U' = \lambda' X' \tag{12}$$

where we have included the case of F' and G'. The inequality to be proven is thus

$$|(\alpha\beta - \lambda X^2) - (\alpha\beta' - \lambda XX')| + |(\alpha'\beta - \lambda' XX') + (\alpha'\beta' - \lambda' X'^2)| \le 2 \tag{13a}$$

Now the eigenvalues of F are $\alpha \pm |U|$, and similarly for the other operators. The spectrum condition under which (13a) has to be proven therefore is

$$|\alpha \pm |\lambda X||, \quad |\alpha' \pm |\lambda' X'||, \quad |\beta \pm |X||, \quad |\beta' \pm |X'|| \le 1. \tag{13b}$$

The rest of the proof is dominated by case distinctions which, however, can be easily reduced to the principal case in which the numbers within the $|\ldots|$ in (13a) are both non-negative. Call their sum $\Delta$. Using the Cauchy-Schwarz inequality we easily get

$$\Delta \le (\alpha\beta - \lambda|X|^2) - (\alpha\beta' \mp \lambda|X||X'|) + (\alpha'\beta \mp \lambda'|X||X'|) + (\alpha'\beta' - \lambda'|X'|^2)$$

according to whether $\lambda \gtrless \lambda'$. To simplify the notation we follow up only the case $\lambda > \lambda'$. Abbreviating the right-hand side of the last inequality by $\Delta_1$ the reader will easily verify the identity

$$2\Delta_1 = a_+(b_- - b'_-) + a'_+(b_- + b'_-) + a_-(b_+ - b'_+) + a'_-(b_+ + b'_+)$$

where $a_\pm = \alpha \pm \lambda|X|$ and similarly in the other cases. It follows

$$2\Delta_1 \le |a_+||b_- - b'_-| + |a'_+||b_- + b'_-| + |a_-||b_+ - b'_+| + |a'_-||b_+ + b'_+|.$$

Using (13b) we obtain

$$2\Delta_1 \le |b_- - b'_-| + |b_- + b'_-| + |b_+ - b'_+| + |b_+ + b'_+| \le 4$$
where the last step again uses (13b). The final result now is immediate.

What does our conjecture, if it holds in general, mean? As was already said in the Introduction, it does in general not mean that we have to go beyond an EPR-situation in order to violate Bell's inequality: If M consists of more than two pairs of observables violation is possible in principle. It is, therefore, not necessary to conceive of the hidden states as covering more observables than occur in an EPR-situation in order to show their non-existence.¹⁴ However, Bell's theorem cannot be extended to the lowest case: Of the four pairs of observables occurring in the inequality either none or one or two can be EPR-correlated. In the latter case violation of (3c) is not possible, and the case Bell vs. EPR was thus not the easiest.

¹⁴ Recall that Bell actually did confine his consideration to the spin observables, see his 1964.

IV

Einstein said:¹⁵ "But on one supposition we should, in my opinion, absolutely hold fast: the real factual situation of the system II is independent of what is done with the system I, which is spatially separated from the former." Bell translated:¹⁶ "The vital assumption is that the result B for particle II does not depend on the setting a, of the magnet for particle I, nor A on b." Was that a correct translation? I will approach the answer by starting out from the following question: Suppose we want to show the non-existence of any classical hidden-variables theory for quantum mechanics by using the fact that Bell's inequality is valid classically but not in quantum mechanics. How would we proceed? I think the key inference can hardly be other than the following: If the theory in question existed then, since the inequality holds classically, it would have to hold also in quantum mechanics. The question, therefore, is how to make this inference. Again one can hardly think of any other possibility than that the equation

$$E_q((F \otimes 1)(1 \otimes G)) = E_c(\sigma(F \otimes 1) \cdot \sigma(1 \otimes G)) \tag{14}$$

is the decisive link between the two theories. Here $E_c$ and $E_q$ are from (2a) and (2b) respectively, and it is assumed that

($\alpha$) to every $E_q$ (as in (2b)) there exists an $E_c$ (as in (2a)) such that (14) holds for all F and G.

$\sigma$ as in (14) assigns to every quantum mechanical observable its classical counterpart. If we assume that

($\beta_1$) A and $\sigma(A)$ have the same spectrum

the inference sought can now be made: Given $E_q$ we have $E_c$ with (14). Given observables F, G, F', G' with the usual spectrum premise for (3) the classical counterparts $\sigma(F)$, etc. will also satisfy this premise. Consequently, they satisfy (3a). By (14), the F, G, F' and G' would have to satisfy (3c).

¹⁵ Einstein 1949, p. 85.
¹⁶ Bell 1964, p. 196; see also ibid. no. 2, p. 200.

Now, though already (14) looks quite nice one is tempted to replace it by

$$E_q((F \otimes 1)(1 \otimes G)) = E_c(\sigma((F \otimes 1)(1 \otimes G))), \tag{14'}$$

($\alpha$) by

($\alpha'$) = ($\alpha$) with (14') instead of (14)

and to insert the additional postulate

($\beta_2$) $\sigma((F \otimes 1)(1 \otimes G)) = \sigma(F \otimes 1) \cdot \sigma(1 \otimes G)$

from which (14) obviously follows. ($\beta_2$) is nothing but a condition of isomorphism: $\sigma$ maps the observables on classical quantities in such a way that the image of the (quantum mechanical) product is the (classical) product of the images. If at this point it were asked whether a locality condition had already been put into the argument it would be hard to find one. And yet, as shown, the argument is perfect. Accordingly, we have to set out and find locality. To this end we must take into account that, according to the assumptions introducing (2a), the classical quantities by which the quantum mechanical observables are represented via $\sigma$ are functions on a state space, say, S. But this means, by duality, that the states are functions on the algebra of quantities:
$$s(f) = f(s) \qquad (s \in S) \tag{15a}$$

Moreover, by definition

$$(f \cdot g)(s) = f(s) \cdot g(s) \tag{15b}$$

and therefore on account of (15a)

$$s(f \cdot g) = s(f) \cdot s(g) \tag{15c}$$

Using $\sigma$ we can now define the hidden states also as functions on the algebra of quantum mechanical observables:

$$s(A) = s(\sigma(A)). \tag{16}$$

Now by the isomorphism ($\beta_2$) the product rule (15c) can immediately be transferred to the observables with the result

($\beta_2'$) $s((F \otimes 1)(1 \otimes G)) = s(F \otimes 1) \cdot s(1 \otimes G)$

We thus see that there is a substantial restriction on the hidden states if these are viewed as functions on the set of quantum mechanical observables: ($\beta_2'$) exactly corresponds to the classical equation (15c) and is equivalent to ($\beta_2$) if we assume

b) if $f(s) = g(s)$ for all $s \in S$ then $f = g$
as another obvious axiom about the classical states and quantities. ($\beta_2'$) is almost Bell's locality assumption as it is actually used. We only need to go one last step by observing the equivalence of ($\beta_2'$) with

($\beta_2''$) $s((F \otimes 1)(1 \otimes G))/s(1 \otimes G)$ is independent of G, and $s((F \otimes 1)(1 \otimes G))/s(F \otimes 1)$ of F.

Here independence means that the value of the first (second) quotient is the same for all G (F). In order to understand the meaning of ($\beta_2''$) it is useful to observe that this condition is obviously violated by most expectation value functions. This will become even more obvious if we generalize the equivalent ($\beta_2'$) of ($\beta_2''$) to become

(N) For all commutable observables A, B: $s(A \cdot B) = s(A) \cdot s(B)$.

For expectation-value functions (N) means that they are dispersion-free. Now already Von Neumann had proven that there are no dispersion-free expectation-value functions in quantum mechanics, and this fact was interpreted by him and many followers to the effect that there is no hidden-variables theory for quantum mechanics.¹⁷ Later on this interpretation was criticized with the argument that Von Neumann's conception of hidden states had been much too restrictive.¹⁸ Indeed Bell's 'locality' assumption ($\beta_2'$) is much weaker than (N) because the homomorphism is required to hold only for observables A and B that belong to different physical systems. It was in this way that hidden states were made logically possible again even if their lifetime turned out to be very short: They soon became the victims of a probabilistic argument, i.e. of Bell's theorem.

I think it is fair to say that already Von Neumann's famous proof did dispose of Einstein's original idea of a hidden state.¹⁹ For Einstein "there is such a thing as the 'real state' of a physical system - something that objectively exists independently of any observation or measurement ...".²⁰ But the hidden states of the post Von Neumann era actually became dependent on measurements. According to Bell the Von Neumann tradition "tacitly assumed that measurement of an observable must yield the same value independently of what other measurements may be made simultaneously".²¹ Indeed, what can it mean that the value of an observable A in a state s depends on another observable B? It does not mean that A is a function of B in the usual sense. It rather means the following: First there is no dependence of A on B insofar as the value of A is uniquely determined by the state s. But secondly, there is dependence in the sense that for observables B, commensurable with A, the quotient $s(A \cdot B)/s(B)$ in general has different values for different B. These two features can go together on the ground of the following interpretation in terms of measurements.²² It is assumed that every measuring apparatus has a well determined separation power in the following sense: There is a unique observable A such that, if a measurement is performed with this apparatus, there will be a unique outcome for A but for no observable finer than A. This outcome of a direct measurement (as it could be called) is the one that is uniquely assigned to A by any state. However, in general we can also (directly) measure observables finer than A and then infer the value of A from the direct measurement. But this very common practice hinges on a tacit assumption of independence: If, for instance, instead of measuring A (directly) we measure the finer observable AB then the inference in question would be impossible if $s(A \cdot B)/s(B)$ depended on B. The generalized hidden states showing just this dependence, we can 'understand' it by 1) viewing states as giving only the value of each observable that would come out if a (direct) measurement were made, and 2) giving up the idea that results of measurements have the independence described.

The foregoing considerations seem to show that Bell locality is a far cry from Einstein locality. I do not want to indulge in questions of terminology. But it may at least be mentioned that, whereas Einstein locality, being essentially the principle of action by contact, is obviously linked to space, it is hard to see any immediate relationship between Bell's condition and space. However, the main point is that Einstein locality has essentially to do with interaction, presupposing, I think, the common, classical way of describing a physical system in terms of states as quoted above. By contrast, Bell's condition concerns already the kinematical or state description of a system. It first accepts a very uncommon generalization of the concept of state by allowing the dependencies explained above and then mitigates the general situation for the case of systems consisting of subsystems. In accordance with the idea that the generalized states be used as hidden states of a quantum mechanical system the approach mirrors the situation brought about by the critical quantum mechanical states (5) which has also nothing to do with interaction.²³ I am afraid that physicists may be inclined to think that Bohm's 1951 hidden-variables theory illustrates Bell's theorem. But whereas the non-local interaction inherent in Bohm's quantum potential indeed violates Einstein locality²⁴ the theory has not yet been adjusted to the theoretical apparatus within which Bell's locality condition is meaningful.²⁵ Consequently, we cannot tell what it would mean that the theory is or is not 'local' in Bell's sense. Saying this is not denying that there is some (possibly deep) connection between the two localities. It is only to remind us that it still awaits being understood.

¹⁷ For a systematic presentation of this tradition see Scheibe 1981a.
¹⁸ Belinfante 1973, Bell 1966, and many others.
¹⁹ The belittling of Von Neumann's result in Belinfante 1979, Ch. I, is unjustified in view of this function.
²⁰ Einstein 1953, p. 14.
²¹ Bell 1966, p. 451.
²² For details of this part of the paper see Scheibe 1986a (this vol. VI.26).
²³ Bohr 1935, p. 700.
²⁴ See Bohm's own emphasis in Bohm/Hiley 1975.
²⁵ This is admitted in Belinfante 1973, p. 94.
VI.30 Three Remarks Concerning Bell's Inequality*

It is shown that in his 1964 paper Bell does not pay much attention to the EPR argument: (1) The most specific feature of the argument - the EPR situation - is not even mentioned. It turns out that the EPR situation is sufficient for Bell's inequality to hold. (2) The proof-effective role of locality is factorizability, because of its happy linkage to the form of Bell's inequality. (3) In an analogous way Heisenberg's inequality can be used to prove the non-existence of hidden variables.

Key words: EPR correlations, factorizability, Heisenberg's inequality

1. EPR Situation and Bell's Inequality

Bell's famous paper¹, in which he published his inequality, was given the title "On the Einstein Podolsky Rosen Paradox." In his Introduction Bell calls the paradox an argument that quantum mechanics could not be a complete theory but should be supplemented by additional variables restoring to the theory causality and locality. The goal of his paper then was to show this idea to be incompatible with the statistical predictions of quantum mechanics. However, even a superficial reading of Bell's paper is sufficient to become convinced that one of the main features of the EPR argument is not used in Bell's proof². In fact Bell does not even mention that EPR, in their "construction" of hidden states, make essential use of an EPR situation:

(EPR) The physical system in question, being composed of two subsystems I and II, is in a state $\Phi$ in which every pair out of a set M of incommensurable pairs (F, G) of quantities of I and II respectively is EPR correlated.

Here F (of I) and G (of II) are said to be EPR-correlated in state $\Phi$ (of $I \otimes II$) if there is a one-to-one correlation $f_i \leftrightarrow g_i$ between the possible values of F and G such that, if a measurement of F yields $f_i$, an immediately following measurement of G would, on account of $\Phi$, certainly yield $g_i$, and vice versa. A necessary and sufficient condition for F and G to be EPR correlated in $\Phi$ is that $\Phi$ makes the probability of the result $f_i$ for F and $g_j$ for G in a joint measurement 0 if $i \neq j$.

Now, an EPR situation is used in the EPR argument in the following way: The value $f_i$ of F of system I, where $(F, G) \in M$, is "real", because we can know it by a measurement of G of the other system II which in no way disturbs system I. I shall come back to the locality assumption lying behind this inference. But it is clear that, unless we have the correlations required by (EPR), the argument does not get off the ground. In fact it covers only the observables of I occurring in a pair of the given set M in (EPR). But, since M is such that any two of those observables are incommensurable, the virtual information about their "real" values would already be hypermaximal.

* First published as Scheibe 1993b.
¹ Bell 1964.
² Einstein/Podolsky/Rosen 1935.

Although, as I said, Bell does not mention (EPR) in the proof of his theorem there is a quite positive connection between it and the main vehicle of the proof: Bell's inequality. For the latter I adopt the version

$$\Delta_E(A, A'; B, B') \le 2, \tag{1a}$$

where

$$\Delta_E(A, A'; B, B') \equiv |E(AB) - E(AB')| + |E(A'B) + E(A'B')|. \tag{1b}$$

Here E is a normed positive linear form on a *-algebra $\mathcal{A}$ and A, B, A', B' are self-adjoint elements of $\mathcal{A}$, pairwise commutable as they appear in (1b) and with spectra absolutely bounded by 1. The inequality (1) holds in the classical case where $\mathcal{A}$ is a *-algebra of integrable functions on a classical probability space and ($\mu$ being a measure on the space)

$$E_c(A) = \int A \, d\mu. \tag{2a}$$

It does not hold generally in quantum mechanics where $\mathcal{A}$ is an irreducible *-algebra of bounded linear operators on a Hilbert space and

$$E_q(A) = \mathrm{Tr}(WA), \tag{2b}$$
with the statistical operator W. However, as is well known, even in quantum mechanics (1) holds under certain restrictive assumptions, e.g., if all observabIes A, B, A', B' pairwise commute, or in the case most important for us where we have a system composed of two subsystems and
A = F 0 1, A' = F' 0 1, B = 1 0 G, B' = 1 0 G'
(3)
if W is a product. Now, a new case of additional premises, sufficient to ensure Bell's inequality also in quantum mechanics, is precisely given by an EPR situation, i.e., we have the Conjecture: If in the case of a compound quantum mechanical system 10II the observables F, F' of I are EPR correlated with G, G' of II, respectively, in a state P of 10II, then (under the usual assumption concerning the spectra) Bell's inequality
Llp(F 01, F' 0 1,10 G, 1 0 G') ::; 2 holds. It may immediately be added that, as is shown by the examples given in the literature, this conjecture cannot be strengthened by dropping one of the two correlation assumptions. However, it is not required (as it was in (EPR))
that F and F' (or, equivalently, G and G') be incommensurable. It seems an open question whether the conjecture holds in general. But there is a proof for the 2-dimensional case.³ The conjecture, if it holds in general, does not mean that we have to go beyond an EPR situation in order to violate Bell's inequality: If M in (EPR) consists of more than two pairs of observables, violation is possible in principle. This remark, however, is neutral with respect to the question whether the non-existence of hidden variables in the sense of EPR can be shown by means of the inequality. In this respect our conjecture is but a vague indication that an EPR situation is classical in its role in constructing hidden variables of the EPR type. The question of a rigorous non-existence proof is resumed in the next section. In this section my remark concerning Bell's enterprise has been:

(I) In his alleged proof of the non-existence of hidden variables of the EPR type, Bell does not use the feature most typical for the EPR construction, namely the EPR situation. There is, however, a close connection between the latter and Bell's inequality: The lowest non-trivial case of an EPR situation seems a sufficient condition for Bell's inequality to hold also in quantum mechanics.
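The contrast drawn in this section, that (1) holds in quantum mechanics when W is a product but not in general, can be checked numerically. The following is only a minimal sketch and not part of the original argument: it assumes two spin-1/2 subsystems, takes spin observables in the x-z plane for F, F', G, G', and picks particular angles; all of these, as well as the helper names, are illustrative choices made here. It evaluates the left-hand side of (1b) with E(A) = Tr(WA) for a product state and for the singlet state.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spin(theta):
    """Assumed observable cos(theta)*sigma_z + sin(theta)*sigma_x; its spectrum is {-1, +1}."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def projector(psi):
    """Statistical operator W = |psi><psi| for a real, normalized vector psi."""
    return [[x * y for y in psi] for x in psi]

def expectation(w, observable):
    """E(A) = Tr(W A), the quantum mechanical expectation value."""
    prod = matmul(w, observable)
    return sum(prod[i][i] for i in range(len(prod)))

def delta(w, f, f2, g, g2):
    """Left-hand side of (1b) with A = F x 1, A' = F' x 1, B = 1 x G, B' = 1 x G'.
    Since (F x 1)(1 x G) = F x G, each product AB is a Kronecker product."""
    e = lambda x, y: expectation(w, kron(x, y))
    return abs(e(f, g) - e(f, g2)) + abs(e(f2, g) + e(f2, g2))

# Illustrative settings (an assumption of this sketch, not taken from the text).
SETTINGS = (spin(0.0), spin(math.pi / 2), spin(math.pi / 4), spin(3 * math.pi / 4))

r = 1 / math.sqrt(2)
PRODUCT_STATE = projector([1.0, 0.0, 0.0, 0.0])   # both spins "up": W is a product
SINGLET_STATE = projector([0.0, r, -r, 0.0])      # entangled singlet state

print("product state:", delta(PRODUCT_STATE, *SETTINGS))
print("singlet state:", delta(SINGLET_STATE, *SETTINGS))
```

With these assumed settings the product state stays below the bound 2 (the value is the square root of 2), whereas the singlet state reaches twice the square root of 2 and thus violates (1a).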
2. Factorizability in Bell's Theorem

If Bell does not refer to an EPR situation in his proof, what other assumption of EPR, if any, does he consider? Certainly the locality condition, which in the terminology of the preceding section would be: If the prediction for F of system I can be made by measuring a quantity G of a second system II, spatially separated from and not interacting with system I, then that prediction can be made without disturbing I. Now this condition is not very explicit in the original EPR paper, far less so than (EPR) from Sec. 1. It is only in later publications that Einstein stresses locality.⁴ Accordingly, it is one of them to which Bell refers by quotation. There Einstein said: "But on one supposition we should, in my opinion, absolutely hold fast: the real factual situation of the system II is independent of what is done with the system I, which is spatially separated from the former."⁵ Bell translated: "The vital assumption is that the result B for particle II does not depend on the setting a of the magnet for particle I, nor A on b."⁶ This may be a correct translation. But if we look, from an axiomatic point of view, at what assumption Bell actually uses in his proof, it turns out to be the following: He makes locality a condition of factorizability concerning the hidden states of the quantum
³ Scheibe 1991e (this vol. VI.29)
⁴ Einstein 1948; Einstein 1949a
⁵ Bell 1964, p. 200; Einstein 1949a, p. 85; italics mine
⁶ Bell 1964, p. 196
mechanical system $\mathrm{I} \otimes \mathrm{II}$. Any such state being a function assigning to every observable a real value, and $S$ being their totality, it is required:

(FAC) For every $s \in S$,

$$s(F \otimes G) = s(F \otimes 1) \cdot s(1 \otimes G)$$

for all observables F and G of I and II, respectively. This is indeed essentially a condition of independence in the following sense: Given G, the result $s(1 \otimes G)$ predicted by s for a direct measurement of G does not depend on the result $s(F \otimes G)$ predicted by s for a direct measurement of $F \otimes G$, however F is chosen, and vice versa.⁷ The question remains what is left in (FAC) of spatial separation and vanishing interaction. However this question may be answered, it is a fact that hidden states of the kind (FAC) can be excluded by the following Bell-type proof.

Given the usual quantum mechanical setting and the space $S$ of hidden states satisfying (FAC), we try to find a classical state space $\langle S_c, Q_c \rangle$, where $S_c$ and $Q_c$ are the sets of classical states and quantities, respectively, and an embedding $a$ of the set $Q$ of quantum mechanical observables into $Q_c$ such that:

(α) For every $s_c \in S_c$, the function s defined by

$$s(A) \equiv s_c(a(A))$$

belongs to $S$. (Note that both $S_c$ and $S$ are sets of real-valued functions, on $Q_c$ and $Q$ respectively!)

(β) For every quantum mechanical expectation value function E there exists a classical expectation value function $E_c$ such that for all $A \in Q$

$$E(A) \equiv E_c(a(A)).$$

From this we first prove, assuming the usual product structure of $\mathrm{I} \otimes \mathrm{II}$, the isomorphy

$$a(A \otimes B) = a(A) \cdot a(B) \tag{4}$$

of the embedding $a$. We have

$$(a(A)\,a(B))(s_c) = a(A)(s_c) \cdot a(B)(s_c) = s_c(a(A)) \cdot s_c(a(B)) = s(A) \cdot s(B) = s(A \otimes B) = s_c(a(A \otimes B)) = (a(A \otimes B))(s_c).$$

Assuming that there are sufficiently many classical states, (4) follows. The following comments on this chain of inferences are in order. In the first step we just use the definition of the product of two classical quantities if these are given as functions on the state space. In the second step the classical inversion
⁷ For a closer analysis see Scheibe 1986a, reprinted in this volume ch. VI.26
$f(s) = s(f)$ is used: Just as quantities may be viewed as functions on the state space, so states may be viewed as functions on the set of quantities. After using (α) in the third step we finally invoke the essential assumption (FAC).

The second part of the proof is founded on (β) and (4). Bell's inequality holds in the form

$$\Delta_{E_c}(a(F), a(F'), a(G), a(G')) \le 2 \tag{5}$$

for arbitrary classical expectation value functions $E_c$ and quantum mechanical observables F, F' of system I and G, G' of system II. (Note that $a(F)$ etc. are classical quantities!) Now let E be an arbitrary quantum mechanical expectation value function and F, F', G, G' as before. Because of (β) and (4), we have

$$E(F \otimes G) = E_c(a(F \otimes G)) = E_c(a(F) \cdot a(G))$$

and therefore

$$\Delta_E(F, F', G, G') = \Delta_{E_c}(a(F), a(F'), a(G), a(G')), \tag{6}$$

which would immediately yield the general validity of Bell's inequality also in the quantum mechanics of system $\mathrm{I} \otimes \mathrm{II}$, contrary to what is actually the case. This leads to my second remark:

(II) Bell's proof is sound if it is founded on (FAC), whatever this condition may have to do with the locality requirement of EPR or Einstein himself. What is evident is the proof-enabling role of (FAC) with respect to (6): There must be a link between the classical and the quantum mechanical expression Δ, and this is given by (4), which in turn rests on (FAC).
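That Bell's inequality holds for arbitrary classical expectation value functions, as used in the second part of the proof, can be made concrete on a finite classical state space. The sketch below is an illustration under assumptions made here and not in the text: the classical states are the sixteen assignments of values ±1 to four quantities (standing for a(F), a(F'), a(G), a(G')), and a classical expectation value function is a weighted average over these states. All names are hypothetical.

```python
import random
from itertools import product

# Hypothetical finite classical state space: each state fixes a value in {-1, +1}
# for the four classical quantities standing for a(F), a(F'), a(G), a(G').
STATES = list(product((-1, 1), repeat=4))

def expectation(weights, quantity):
    """Classical expectation value: weighted average of a function on the state space."""
    return sum(w * quantity(s) for w, s in zip(weights, STATES))

def delta(weights):
    """Left-hand side of Bell's inequality for the probability measure `weights`.
    Index convention: 0 -> a(F), 1 -> a(F'), 2 -> a(G), 3 -> a(G')."""
    e = lambda i, j: expectation(weights, lambda s: s[i] * s[j])
    return abs(e(0, 2) - e(0, 3)) + abs(e(1, 2) + e(1, 3))

def point_measure(k):
    """Dispersion-free classical state: all weight on the k-th state."""
    return [1.0 if i == k else 0.0 for i in range(len(STATES))]

def random_measure(rng):
    """A randomly drawn probability measure on the state space."""
    raw = [rng.random() for _ in STATES]
    total = sum(raw)
    return [x / total for x in raw]

rng = random.Random(0)
print("point measures:", {delta(point_measure(k)) for k in range(len(STATES))})
print("largest value over random mixtures:",
      max(delta(random_measure(rng)) for _ in range(1000)))
```

Every point measure gives exactly the value 2, and since each of the two absolute values is a convex function of the measure, no mixture can exceed it. This is the classical bound that (6) would transfer to quantum mechanics.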
3. Heisenberg's Inequality Reconsidered

The proof-technical aspect of a condition like (FAC) is further elucidated by the following parallel case of a no-hidden-variables proof, using, instead of Bell's inequality, the negation of Heisenberg's indeterminacy relations. Just as in the preceding section we argued that the existence of hidden variables of the type (FAC) would imply the unrestricted validity of Bell's inequality also in quantum mechanics, we can argue that the existence of hidden variables of another type would imply the (classically valid) negation of Heisenberg's indeterminacy relations. And, just as (FAC) was tailor-made to bring in Bell's inequality, so the new condition is dictated by the expression occurring in the indeterminacy relations, namely the standard deviations of an expectation value function. It turns out that (FAC) has to be replaced by:

(SQU) For every $s \in S$,

$$s(A^2) = (s(A))^2$$

for all observables A. The further assumptions about the classical hidden variables theory in its relation to quantum mechanics are the same as in Sec. 2, except for (β), which has to be replaced by:

(β') For every classical expectation value function $E_c$ the function

$$E(A) \equiv E_c(a(A))$$

is a quantum mechanical expectation value function. As we shall see, this change must be made because of the different logical structure of (the negation of) the indeterminacy relations vis-à-vis Bell's inequality. By a chain of reasoning completely analogous to the one leading to (4), conditions (α) and (SQU) lead to the isomorphy
$$a(A^2) = (a(A))^2 \tag{7}$$

corresponding to (4). If in Heisenberg's indeterminacy relations we eliminate position and momentum as well as Planck's constant by existential quantification, the negation of the resulting proposition is: Given any two quantities f and g and $\varepsilon > 0$, there is an expectation value function E such that

$$\delta_E(f) \cdot \delta_E(g) < \varepsilon \tag{8a}$$

for the standard deviations $\delta_E$ defined by

$$\delta_E^2(f) = E(f^2) - (E(f))^2. \tag{8b}$$

This, then, would hold for the classical hidden-variables theory, if it existed, just as it was the case with Bell's inequality. We now argue as follows. Let A and B be any quantum mechanical observables and $\varepsilon > 0$. There is a classical expectation value function $E_c$ such that

$$\delta_{E_c}(a(A)) \cdot \delta_{E_c}(a(B)) < \varepsilon. \tag{9}$$

Because of (β') there is a quantum mechanical expectation value function E such that

$$E(X) = E_c(a(X))$$

for all observables X. From this and (7), it follows that also

$$\delta_E(X) = \delta_{E_c}(a(X)), \tag{10}$$

which corresponds to (6) in the Bell proof. The calculation simply is
$$\begin{aligned}
\delta_{E_c}^2(a(X)) &= E_c\{(a(X))^2\} - (E_c\{a(X)\})^2 \\
&= E_c\{a(X^2)\} - (E_c\{a(X)\})^2 \\
&= E(X^2) - (E(X))^2 = \delta_E^2(X).
\end{aligned}$$

From (9) and (10) we finally get

$$\delta_E(A) \cdot \delta_E(B) < \varepsilon$$

for arbitrary observables A and B, which contradicts the indeterminacy relations. We have shown:

(III) A proof of the non-existence of classical hidden variables for quantum mechanics can be based on Heisenberg's indeterminacy relations in a far-reaching analogy to the procedure using Bell's inequality.

Of course, the kind of hidden variables excluded in the one case is different from that in the other: They are defined by (FAC) and (SQU), respectively. But they are not too far away from each other. A common generalization is the condition
s(A . B) = s(A) . s(B)
(11)
for any two commutable observables A and B. It can be shown that (11) is a very strong condition essentially equivalent to von Neumann's dispersion-free states.⁸ In other words, this condition cannot be satisfied at all: there simply are no such hidden states. For the much weaker conditions (FAC) and (SQU) there are plenty of states of the corresponding kind, and it is only by probabilistic considerations that such hidden variables can be excluded.

⁸ Scheibe 1991d (this vol. ch. VI.28)
VII. Spacetime, Invariance, Covariance
The papers collected in this chapter concern the physical structure of space, time and spacetime. Problems connected with invariance properties of physical theories based on invariance properties of their corresponding spacetime theories are well-known, e.g. the Galileo invariance of Newton's dynamics, the Lorentz invariance of classical electrodynamics and the general covariance of Einstein's field equations. In [31], [33] and [34] problems of the formulation, the meaning and the physical content of invariance and covariance statements are dealt with. Less well-known is the question of characterizing the structure and theory of spacetime by assumptions as plausible as possible from the physical point of view. The classical case is Helmholtz's characterization of the riemannian spaces of constant curvature by means of their group of free mobility of rigid bodies. Since today we believe in Einstein's general theory of relativity, a more appropriate characterization should refer to the Lorentz manifolds that are the infinitesimal version of Minkowski spacetime. No such characterization seems to be known. But Hermann Weyl succeeded in 1923 in characterizing a slightly more general class of manifolds which he named after Pythagoras ([32]). A pythagorean metric on an n-dimensional manifold is given by an otherwise arbitrary non-singular quadratic differential form

$$ds^2 = \sum_{ij} g_{ij}(x)\, dx^i\, dx^j, \qquad \det(g_{ij}) \neq 0. \tag{1}$$
As a matter of course the characterization Weyl was looking for is not concerned with one particular structure (1) but with their totality or with their nature, as Weyl decided to call it.¹ The nature of the metric (1) is its 'being a non-singular quadratic differential form' or, equivalently, it is an orthogonal group $O_k$ where k is the signature of the corresponding form (1). The switching from the quadratic form (1) to its automorphism group makes the following generalization possible: We replace $O_k$ (and thereby the $g_{ij}$) by an arbitrary matrix group G (= subgroup of the general linear group $GL(n, \mathbb{R})$ in n dimensions), G now being the nature of a generalized geometry $\Sigma_G$.

¹ "... the nature of the metric field ... is essentially one and therefore absolutely determined ... in it the aprioristic essence of the space-time structure is expressed" (Weyl 1923b, p. 45)
What is wanted is a differential geometric condition $\gamma$ for G that leads back to $O_k$:

$$\gamma(G) \leftrightarrow \exists k.\ G \simeq O_k, \tag{2}$$

where $\simeq$ is the equivalence $A \simeq B \equiv A = UBU^{-1}$ for some U. Weyl's choice for this condition was that for every model S of $\Sigma_G$ there exists exactly one affine connection for S that is compatible with the metric of S:

$$\gamma(G) \equiv \forall S.\ \Sigma_G(S) \Rightarrow c(S), \tag{3}$$
where c(S) says that there is one and only one such affine connection. (2) is indeed provable for the $\gamma$ defined by (3), and so one has a characterization of the desired kind. Theorem (2) is illustrated² by the (non-pythagorean) Galileo metric

$$g = \mathrm{diag}\{1, 0, 0, 0\}, \qquad h = \mathrm{diag}\{0, 1, 1, 1\}, \tag{4a}$$
which is compatible with both affine connections and
(4b)
where U is the potential of the gravitational field. Weyl's characterization is at the same time an explanation of the pythagorean geometry by means of a more general kind of geometry. Explanations (more precisely, reductions) are investigated in Ch. V, especially in V.23 and 24. There, however, explanations that are at the same time generalizations are of a simpler kind. The concept 'man', for instance, is explained by generalization to the concept 'animal being' in the sense that man is the only rational animal being. In the present context such an explanation would be the explanation of euclidean spaces by riemannian spaces with a vanishing curvature. In particular, all euclidean spaces are riemannian spaces, and the condition leading back from a riemannian to a euclidean space is a condition meaningful for one riemannian space, namely, that its curvature vanishes.³ In the Weyl characterization the geometry depends on the external parameter G, the nature of the metric, and it is G and not a manifold endowed with a G-metric which is the result of a generalization, namely of the special groups $O_k$. It is true that, as in the previous case, all manifolds of pythagorean nature are manifolds of the general group nature. The characterizing condition, however, under which a group G is a group $O_k$ is not meaningful as a condition on one manifold. Rather the more complex connections (2) and (3) dominate the scene.

An account of the treatment of invariance and related concepts in [31], [33] and [34] is most profitably started with (absolute) canonical invariance.
² Scheibe 1999, VIII.3 (Newton/Cartan theory of gravitation)
³ Scheibe 1997b, IV.1-3
It comes into play as soon as one accepts that physical systems are viewed as being set-theoretical structures and physical theories as statements about structures. For structures the concept of isomorphism is easily defined, and with respect to isomorphism canonical invariance amounts essentially to this: If S is a model (not a model) of a physical theory T, i.e. if S satisfies (does not satisfy) the axioms of T, and if S' is a second structure, isomorphic to S, then S' is also a model (not a model) of T. All physical theories are canonically invariant, but because of the fantastic generality of this invariance physicists are normally not aware of this fact. It is different with relative or conditional (canonical) invariance. If, for instance, Galileo invariance of Newton's dynamics is fully spelled out it turns out that this is, of course, not invariance under arbitrary isomorphisms but only under Galileo transformations. And the latter in turn are isomorphisms and even automorphisms leaving invariant a Galilean spacetime. The point, therefore, is that certain automorphisms, defined as leaving invariant a geometrical structure, happen to leave invariant also a dynamical theory. In general the full conditional invariance statement is of the form: If an isomorphism leaves invariant a fragment $S_0$ of a given structure S (and hence is an automorphism of $S_0$) then it leaves invariant also a partial statement $\alpha_1$ of the axioms $\alpha$ of the given theory. This concept of conditional canonical (transformation and) invariance already covers a considerable part of the usually mentioned invariance statements in physics. There are, however, important other invariances in certain physical theories, not as alternatives to canonical invariance, but existing alongside it: invariance under coordinate transformations and, most importantly, under gauge transformations. There is widespread confusion concerning covariance as something different from but similar to canonical invariance.
Roughly speaking covariance is invariance under coordinate transformations. Coordinates are introduced by definition in differentiable manifolds (X; F) where X is the 'space' and F a set of local coordinate systems on X, maximal with respect to the pseudogroup $G_\infty$ of arbitrary differentiable coordinate transformations. More generally the coordinate transformations may be confined to a sub-pseudogroup $G \subseteq G_\infty$. In physics G is, for instance, the group of Lorentz transformations of spacetime or the group of euclidean transformations of Newton's absolute space etc. Now, geometry and physics become interesting only if a manifold is endowed with further structures, e.g. a vector field, a tensor field, an affine connection etc. As geometrical objects these structures, however they may be introduced, can be characterized by coordinate representations. This characterization must, of course, include 'laws' telling us how the representation changes under coordinate transformations. And the proper laws of the theory, formulated in the coordinate representations of the geometrical objects, must be, under pain of inconsistency, invariant under those changes of the representations that are induced by the admitted coordinate transformations. This invariance, essentially dependent on G, is covariance.
In his basic paper of 1916 on the foundations of general relativity theory Einstein introduced a 'postulate of general covariance' saying that "the general laws of nature are to be expressed by equations valid for all coordinate systems, i.e., covariant under arbitrary substitutions" (cf. [33]). He thereby produced a controversy about the empirical or factual content of this postulate. In [33] it is argued that however the correct answer to this question may come out, the postulate that Einstein wanted to introduce was the stronger requirement that the general laws of nature should not prefer any coordinate systems, i.e. they should not allow a distinguished subclass of coordinate systems. In more technical terms this means that in view of the laws of the theory it should not be possible to reduce the pseudogroup $G_\infty$ to a proper sub-pseudogroup $G \subset G_\infty$. The relation of this postulate of non-preference to that of covariance is nicely illustrated by the Newton/Cartan theory of gravitation which is a generally covariant formulation of Newton's theory and yet allows the distinguished subclass of Galileo coordinate systems.⁴ The general situation is still marked by conceptual difficulties awaiting their solution.
⁴ See no. 2
VII.31 Invariance and Covariance*

According to its title the present paper has a twofold aim. As far as invariance is concerned I want to argue that the numerous invariances of physical laws can essentially be reduced to one concept of invariance that is already part of the concept of kinds of mathematical structures that are used to formulate physical theories. The second aim of the paper is to present the concept of covariance not as any kind of invariance but rather as a concept of equivalence between two formulations of a physical theory that are already invariant, one of which, however, has a higher 'degree' of invariance than the other. As regards the actual use of the terms 'invariance' and 'covariance' in physics and the occasional discussions of it in the philosophy of physics I do not want to entertain any criticism of the literature although the somewhat deplorable state of affairs concerning the two concepts would justify this. The present paper will thus be self-contained, and I will pay attention to what usually is meant by invariance and covariance only to the extent that makes it sufficiently clear what I am talking about.

To give a brief outline of my argumentation I begin with calling attention to the fact (part I) that one of our fundamental metamathematical concepts, the concept of a species of structures, is essentially defined by a condition of invariance. On the other hand, the mathematical part of every physical theory (the 'formalism' that is peculiar to it) presumably can be reformulated as a species of structures. Consequently, to the extent to which this is true, an invariance condition would be built into a physical theory from the very outset. This naturally leads to the question what this kind of invariance, canonical invariance as it will be called, has to do with the well-known invariances appearing in physics as characteristic properties of its fundamental laws.
One proper context in which this question can be discussed is given by those physical theories whose mathematical formulation is based on species of structures called manifolds (part II). Most important examples are theories, e.g., Einstein's theory of gravitation, that are founded on a theory of spacetime as the basic manifold. But also the configuration spaces and phase spaces of classical mechanics are cases in point. For a large class of these theories (based on manifolds) it can be proved that canonical invariance is essentially equivalent to the common invariances of physical laws, and it seems plausible that complete generality can be achieved. As a consequence, no invariance requirements beyond the one that is already part of the definition of a species of structures would have to be imposed on a physical theory. This result can be viewed as a substantiation and justification of what is nowadays sometimes recommended as the 'coordinate-free method' in physics.

* First published as Scheibe 1982c. The completion of this paper was made possible by a Visiting Fellowship at the Center for Philosophy of Science at the University of Pittsburgh. The author wants to express his gratitude to the chairman and the director of the Center, Professor A. Grünbaum and Professor L. Laudan, for the kind invitation and generous hospitality.
Transcending the concept of invariance the concept of covariance (part III) points to the fact that two species of structures can be equivalent and as such can become the formalisms of one and the same physical theory although their 'degrees' of invariance are different. They are, however, supposed to be comparable in the sense that the groups underlying the invariances are related by inclusion. Covariance as equivalence takes up the old issue that came up with the general theory of relativity to increase at will the invariance of a physical law. Maxwell's equations in their Lorentz-invariant form, for instance, can be made invariant under arbitrary differentiable transformations by explicitly introducing the Minkowski metric as a tensor field. If this is done two new facts occur: the invariance of the new equations and their equivalence to the equations from which one started. But the new equations are invariant under the larger group in exactly the same sense as were the original equations with respect to the smaller group. Therefore, if there is any need for a new concept then it is with respect not to invariance but to equivalence. This can be shown even more convincingly in the context of the so-called 'principle of general covariance'. If this principle is taken to mean that physical laws should be invariant under arbitrary differentiable coordinate transformations then, as is well known and will here be confirmed anew, it can be trivially satisfied unless further requirements are included. With respect to the converse relation of covariance, however, the situation is altogether different: The reduction of the group of coordinate transformations is not always possible, and if it is possible this may very well be a nontrivial result.
In applying this relation of reduction I shall argue that the general theory of relativity has to be estimated on account of its irreducibility and not, as is frequently done, with regard to the possibility of finding covariant versions of other physical theories on the invariance level of general relativity.

The three concepts of species of structures, of the invariance inherent in them and of the equivalence between them are due to Bourbaki.¹ They are, however, founded on a particular formulation of logic and set theory that I do not want to take over. I will rather assume a modern formulation of the system of Zermelo-Fraenkel (ZF). This means that set theory is presented as a standard theory.² For convenience I will even work with an extension by definitions of ZF that is rich enough to contain all the auxiliary concepts that will constantly be needed (ZFT). The reader may or may not assume that ZFT (and hence ZF) is given an interpretation in the model-theoretical sense. Such an assumption would at least not be necessary since all metatheoretical concepts that will be introduced will concern only syntactical entities (formulas or terms).

¹ Bourbaki 1968, Ch. IV
² Shoenfield 1967, Ch. 9
I
Roughly speaking a species of structures is an axiom system of a special kind: There is a subdivision of its concepts into basic concepts $X_1, \ldots, X_n$ and typified concepts $s_1, \ldots, s_m$ as well as a corresponding subdivision of the axioms into proper axioms $\alpha_i$ and typifications $\tau_1, \ldots, \tau_m$ such that (1) $\tau_\mu$ determines the type of $s_\mu$ relative to the $X_\nu$, i.e., it determines the 'nature' of the entities falling under $s_\mu$ relative to those falling under any of the $X_\nu$, whereas (2) the $\tau_\mu$ by themselves and the $\alpha_i$ by fiat do not determine the 'nature' of the entities falling under the $X_\nu$ relative to each other or to any concept presupposed by the axioms. To give the details in set-theoretical terms,³ we think of an axiom system as being a finite set of formulas of ZFT depending on certain variables indicating the sets ('concepts') that are interrelated by the axioms. We then have their subdivision into the base sets X (short for: $X_1, \ldots, X_n$) and the typified sets s (short for: $s_1, \ldots, s_m$). The typifications $\tau_\mu$ of (1) are formulas

$$s_\mu \in \sigma_\mu(X), \tag{1}$$

where $\sigma_\mu$ is a scale term, i.e., a term constructed from its arguments by successively applying one of the operations that yield a power set (Pot) or a Cartesian product ($\times$). In the case before us the arguments are the X and possibly further constant (!) terms that are available in ZFT. To keep the notation simple the latter will not be made explicit but must be kept in mind. The 'nature' of the s (or, as above, of their elements) is determined by (1) relative to the X and constant sets from ZFT in the sense that the s are elements of the power set of ... the Cartesian product of ... the X and the constants. The remarkable thing about (1) is that it provides counterparts of all the predicates and terms of arbitrary arity and order as they appear in the various independent logical calculi. In the present context a scale term $\sigma$ is called a type of structures and a system of sets X and s satisfying (1) a structure of type $\sigma$.

³ Bourbaki 1968, Ch. IV

Having done with the determination of the s with respect to the X and constant sets announced in (1) we have now to prepare the ground for the invariance condition taking charge of (2). To do this we define the canonical $\sigma$-representation to be the following assignment: Given a scale term $\sigma$ and bijections f from sets X onto sets X' (i.e., according to our convention, bijections $f_\nu$ from $X_\nu$ to $X'_\nu$), a bijection $f^\sigma$ from $\sigma(X)$ onto $\sigma(X')$ is uniquely determined by the two recursive conditions, that for the power set operation

$$f^{\mathrm{Pot}(\sigma)}(u) = \{ f^\sigma(x) : x \in u \} \tag{2a}$$

and for the Cartesian product operation

$$f^{\sigma_1 \times \sigma_2}(\langle x_1, x_2 \rangle) = \langle f^{\sigma_1}(x_1), f^{\sigma_2}(x_2) \rangle. \tag{2b}$$

We go on to define f to be a $\sigma$-isomorphism from (X, s) onto (X', s') if and only if (X, s) and (X', s') are structures of type $\sigma$ and the f are bijections from the X onto the X' such that $f^\sigma(s) = s'$. It then follows immediately for the typifications (1) that

$$\mathrm{iso}_\sigma(X, s; X', s'; f) \rightarrow (s \in \sigma(X) \leftrightarrow s' \in \sigma(X')) \tag{3a}$$

is provable (in ZFT). By analogy we now require that the proper axioms are canonically $\sigma$-invariant in the sense that for their conjunction

$$\mathrm{iso}_\sigma(X, s; X', s'; f) \rightarrow (\alpha(X; s) \leftrightarrow \alpha(X'; s')) \tag{3b}$$
is provable (in ZFT). An axiom system consisting of typifications (1) and proper axioms satisfying (3b) is called a species of structures. In Bourbaki⁴ the crucial property (3b) of $\alpha$ is called 'transportability'. By using the term 'canonical invariance' instead I want to emphasize that, if anything, then (3) is a condition of invariance. The intuitive idea standing behind any definition of invariance is the idea of something that remains unaltered while something else on which it depends is changed. In the present case what remains the same is, expressed in semantical terms, the truth value of a proposition, and what is changed is that about which the proposition is a proposition: If the structure (X, s) is replaced by a structure (X', s') isomorphic to the first, then $\alpha$ is true about the second if it is true about the first and false about the second if false about the first. Extreme cases of invariance of truth values would be given by formulas provable or refutable in ZFT as, for instance, (3a) or (3b) themselves. In a model of ZFT such formulas would be true or false no matter what interpretations their variables are given, and this means that any replacement of the values of these interpretations would leave truth values invariant. This complete freedom of replacement is restricted in (3) in two ways: The base sets X may only be replaced by sets resulting from them by bijections f, and the typified sets s only by their images under the canonical representations of the f. This makes room for many more formulas becoming invariant in the sense of (3). On the other hand, the restrictions are still wide enough to prevent the axioms from saying anything about the 'nature' of the X in the sense that their elements were determined relatively to each other or to any constant set. Without bringing this idea under precise terms let us imagine some clear cases in which one would say that such a determination occurs.
Cases in point would be given by all typifications (1) with one of the base sets instead of $s_\mu$ and the rest of the base sets instead of X as our axiom $\alpha$. It is evident that such a relation can be destroyed by isomorphisms and therefore would contradict (3b). The same would be true for axioms $\alpha$ saying
⁴ ibid.
that one base set X has an empty (or nonempty) intersection with another base set or constant set. Thus even such negative determinations are ruled out by our invariance condition.

Examples for species of structures abound from mathematics: In point of fact all the well-known concepts of a group, ring, vector space, topological space, manifold, fibre bundle etc. are defined by axioms that can easily be reconstructed as so many typifications (1) and axioms proper satisfying (3b). It is likewise a fact that these mathematical concepts are frequently applied in theoretical physics, and at least in the mathematical treatment of higher level theories such as quantum theory and the theory of general relativity the use of species of structures has been generally acknowledged: "A general lesson to be drawn from the development of the theory of relativity is that it is desirable to analyze in detail the various structures inherent in the mathematical models used to describe physical phenomena".⁵ The multifarious application of species of structures in theoretical physics does not mean, however, that the common conceptual basis of this mathematical field is sufficiently understood. By now there exists only one thoroughgoing attempt at a general conceptual reconstruction of physics on the basis (as far as mathematics is concerned) of the concept of species of structures.⁶ There is, admittedly, also the set theoretical approach of Suppes, Sneed and others, and it has recently been claimed that this approach could be viewed as an extension of the Bourbaki program to science.⁷ Apart from several objections that could be made against this claim the set theoretical approach can be left out of consideration here because it never does focus on the defining conditions of a species of structures which, however, make all the difference with regard to the goal of this paper.
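The invariance condition (3b) and the kind of statement it rules out can be illustrated on a small finite structure. The following sketch is an illustration added here under stated assumptions, not a rendering of the formal ZFT setting: it checks truth values on concrete finite sets rather than provability, the base set is {0, 1, 2}, the typified set is the graph of addition modulo 3 (an element of the power set of X × X × X), and the function names are hypothetical.

```python
from itertools import product

def transport(f, s):
    """Image of a ternary relation s over X under the bijection f (a dict), i.e. the
    canonical extension of f to the scale Pot(X x X x X): triples are mapped
    componentwise, sets of triples elementwise."""
    return {(f[x], f[y], f[z]) for (x, y, z) in s}

def is_associative_operation(X, s):
    """Proper axiom alpha(X; s): s is the graph of an associative binary operation on X."""
    op = {(x, y): z for (x, y, z) in s}
    if len(op) != len(s) or set(op) != set(product(X, X)):
        return False                      # not the graph of a total function on X x X
    if not all(z in X for z in op.values()):
        return False                      # values leave the base set
    return all(op[op[x, y], z] == op[x, op[y, z]] for x, y, z in product(X, X, X))

def contains_zero(X):
    """A statement about the 'nature' of the elements of X: it mentions the constant 0."""
    return 0 in X

X = {0, 1, 2}
s = {(x, y, (x + y) % 3) for x in X for y in X}    # addition modulo 3
f = {0: "a", 1: "b", 2: "c"}                        # a bijection from X onto X2
X2 = set(f.values())
s2 = transport(f, s)

print(is_associative_operation(X, s), is_associative_operation(X2, s2))
print(contains_zero(X), contains_zero(X2))
```

The associativity axiom has the same truth value for (X, s) and for its isomorphic image, as canonical invariance demands. The statement that 0 belongs to X changes its truth value under the same bijection, so it could not serve as a proper axiom of a species of structures.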
As I have tried to argue in a previous paper⁸, species of structures occur in physical theories not only as incidental concomitants. Rather they can be used to characterize a physical theory as a whole as far as this is possible without regard to their empirical interpretation. In particular this characterization includes physical laws in the narrower sense and with them the sort of things in whose invariance properties we are primarily interested. However, the characterization cannot be achieved solely by the simple and general species of structures that are the building blocks of modern mathematics. Rather we must have recourse to richer species built up from the simpler ones: "As a rule, rich structures are used in physics; those of differential manifolds carrying additional geometric objects and of Hilbert spaces with preferred sets of operators"⁹.
5 Trautman 1972, p. 85
6 Ludwig 1978
7 Stegmüller 1979
8 Scheibe 1979 (this vol. ch. III.11)
9 Trautman 1972, p. 85
A general concept particularly helpful in making clear what is at issue here is the concept of an extension of a species of structures. Given a species $(\sigma_0; \alpha_0)$, it can be extended by adding a (not necessarily new) typification $s \in \sigma(X)$ and an axiom $\alpha(X; s_0, s)$, both referring to the old base sets, such that $(\sigma_0 \times \sigma; \alpha_0 \wedge \alpha)$ again is a species of structures. It follows from the assumption that $(\sigma_0; \alpha_0)$ already is a species of structures that the canonical $(\sigma_0 \times \sigma)$-invariance of $\alpha_0 \wedge \alpha$ is equivalent to the provability of

$$\mathrm{iso}_{\sigma_0 \times \sigma}(X, s_0, s; X', s_0', s'; f) \wedge \alpha_0(X; s_0) \rightarrow \bigl(\alpha(X; s_0, s) \leftrightarrow \alpha(X'; s_0', s')\bigr) \qquad (4)$$
In this way the 'invariance part' of (3b) can be isolated for the added axiom $\alpha$. Now, in physical practice one frequently proceeds from axioms that in a systematic account of the species of structures involved would be added only at the very end. It is this methodological inversion that can be elucidated by the concept of extension. Take ordinary quantum mechanics as an example. The core of it, the physical law that really matters, is Schrödinger's equation (with $\hbar = 1$)

$$i\,\frac{d\psi}{dt} = H\psi \qquad (5a)$$

Asking for the axiomatic background of this equation, it turns out that it is 'only' the last step in a series of extensions. The series would start with an abelian group X with addition A which then would be extended into a complex vector space by adding a scalar multiplication B. The next extension would lead to a Hilbert space by adding a metric C, and this is the point where the structural aspect usually comes to a stop. But in principle the extension could be continued with a self-adjoint linear operator H and an X-valued function $\psi$ of a real argument as two new structures that are related by (5a). The only requirement would be that the whole axiom that we have obtained is canonically invariant with respect to the total typification of the five structures imposed on X, and this is indeed the case. In this way not only is the crucial equation (5a) submitted to a species of structures. It also has received the 'right' invariance property: The invariance of (5a) under the transformations

$$\psi'(t) = U\psi(t), \qquad H' = UHU^{-1} \qquad (5b)$$

with a unitary transformation U of X that is usually considered is but a special case of canonical invariance. It occurs if in the canonical representation of U not only X but also A, B and C are left invariant, which is but another way of saying that U is unitary. All this comes out of (4) if $\alpha_0$ refers to the Hilbert space and $\alpha$ to the properties of H and $\psi$ including (5a) and if, finally, $X' = X$ and $s_0' = s_0$. Then $s' = f(s)$ turns out to be (5b), and the conclusion of (4) is the restricted invariance usually considered.
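The restricted invariance just described can be checked numerically in a small case. The following is only a sketch under stated assumptions: the two-dimensional space, the particular matrices U and H, the state vector and all helper names are illustrative choices that do not occur in the text. Since U does not depend on t, the left-hand side of Schrödinger's equation simply transforms with U, so it suffices to compare the right-hand sides, and to confirm that U preserves the inner product (the structure C).

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    """Apply the matrix a to the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def dagger(a):
    """Conjugate transpose; for a unitary matrix this is its inverse."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def inner(u, v):
    """Inner product on C^n, playing the role of the structure C."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def max_diff(u, v):
    """Largest componentwise distance between two vectors."""
    return max(abs(x - y) for x, y in zip(u, v))

theta = 0.7  # arbitrary illustrative angle (assumption)
U = [[math.cos(theta), -1j * math.sin(theta)],
     [-1j * math.sin(theta), math.cos(theta)]]   # unitary by construction
H = [[1.0, 2 - 1j],
     [2 + 1j, -3.0]]                              # self-adjoint
psi = [0.6, 0.8j]                                 # state at one fixed time t

H_prime = matmul(matmul(U, H), dagger(U))         # H' = U H U^{-1}
psi_prime = matvec(U, psi)                        # psi' = U psi

# H' psi' should coincide with U (H psi): the transformed equation holds
# exactly when the original one does.
rhs_gap = max_diff(matvec(H_prime, psi_prime), matvec(U, matvec(H, psi)))

# Unitarity of U shows up as preservation of the inner product.
norm_gap = abs(inner(psi_prime, psi_prime) - inner(psi, psi))

print(rhs_gap < 1e-12, norm_gap < 1e-12)
```

Both comparisons are made up to floating-point tolerance; replacing U by a non-unitary invertible matrix would in general spoil the second one while leaving the first intact if the true inverse were used in place of the conjugate transpose.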
II
The foregoing example is a particularly favorable case for my point that the invariance properties of physical laws can be reduced to the canonical invariance in species of structures. The reason is that general quantum mechanics is usually presented at a fairly high level of abstraction which sometimes comes very close to its presentation as a species of structures. Actually the situation is not altogether different in the field that I am going to enter now: Manifolds of this or that kind are frequently introduced in a manner that makes it not too difficult to reformulate the definitions in terms of species of structures. Sometimes the abstract presentation is even highly recommended as a 'coordinate free' or 'intrinsically invariant' method. But seldom, if ever, is it precisely stated in what sense we get rid of coordinates and by what features the invariances so characteristic of the old coordinate dependent formulations are replaced. In order to clear up the matter I start this investigation with a particularly simple class of species of manifolds which in turn will be introduced in two steps. In the first step species of global coordinate manifolds (gcm) are defined. In a gcm the intuitive idea of a set F of preferred coordinate systems used to label all (!) elements of an otherwise arbitrary set M is given a precise formulation. The species of gcm are parameterized by two external parameters (i.e., constant terms being available in ZFT): a natural number n and a group G of homeomorphisms of the n-dimensional real number space $\mathbb{R}^n$ with its usual topology. Typification and axiom defining the species gcm(n, G) are (6a) where $\alpha_{gcm}$ says that the $\varphi \in F$ are bijections from M onto $\mathbb{R}^n$, that $\psi\varphi^{-1} \in G$ for $\psi, \varphi \in F$ and that $g\varphi \in F$ for $\varphi \in F$ and $g \in G$. These requirements make F into a complete set of global coordinate systems on the space M with respect to the group G of coordinate transformations.
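The three requirements on F (bijectivity of the coordinate systems, transition maps lying in G, closure of F under G) can be made concrete in a finite toy model. This is a sketch under loud assumptions: the real number space is replaced by a four-element label set, G by the cyclic rotations of the labels, and the names `chart`, `apply_g`, `transition` and `is_gcm` are hypothetical helpers, not notation from the text. Topology plays no role in the toy.

```python
LABELS = (0, 1, 2, 3)          # finite stand-in for R^n (assumption)
M = ("a", "b", "c", "d")       # the otherwise arbitrary set M

# G: the cyclic rotations of LABELS; g[i] is the image of label i.
G = {tuple((i + k) % 4 for i in LABELS) for k in range(4)}

def chart(values):
    """A coordinate system as a hashable set of (point, label) pairs."""
    return frozenset(zip(M, values))

def apply_g(g, phi):
    """The coordinate system obtained by applying g after phi."""
    return frozenset((m, g[x]) for m, x in phi)

def transition(psi, phi):
    """The coordinate transformation psi after phi^{-1}, as a tuple."""
    inverse_phi = {x: m for m, x in phi}
    psi_map = dict(psi)
    return tuple(psi_map[inverse_phi[x]] for x in LABELS)

def is_bijection(phi):
    """Every point and every label occurs exactly once in phi."""
    return (sorted(m for m, _ in phi) == sorted(M)
            and sorted(x for _, x in phi) == sorted(LABELS))

def is_gcm(F, G):
    """Check the three requirements on F relative to G."""
    return (len(F) > 0
            and all(is_bijection(phi) for phi in F)
            and all(transition(psi, phi) in G for psi in F for phi in F)
            and all(apply_g(g, phi) in F for g in G for phi in F))

phi0 = chart((2, 0, 3, 1))
F = {apply_g(g, phi0) for g in G}            # complete with respect to G
F_incomplete = {phi0}                        # not closed under G
F_mixed = F | {chart((0, 1, 3, 2))}          # a transition map leaves G

print(is_gcm(F, G), is_gcm(F_incomplete, G), is_gcm(F_mixed, G))
```

The two failing examples separate the two halves of 'completeness': a single chart satisfies the transition condition but is not closed under G, whereas adding a foreign chart to a complete set violates the transition condition.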
It can easily be proved that $\alpha_{gcm}$ is canonically invariant with respect to the typification in (6a). The most important examples of gcm(n, G) applied in physics are those in which M is interpreted as spacetime (hence n = 4) and F is a class of preferred coordinate systems in spacetime with one of the following groups of coordinate transformations¹⁰:

$$G_{new} \subset G_{gal} \subset G_{ros} \subset G_{kin} \subset G_{dif} \subset G_{top}, \qquad G_{new} \subset G_{poi} \subset G_{aff} \subset G_{dif}, \qquad G_{gal} \subset G_{aff} \qquad (7)$$
10 Ehlers 1973

This hierarchy provides us with so many gcm(4, $G_{\ldots}$), as $G_{\ldots}$ is one of the following transformation groups of $\mathbb{R}^4$: the direct product of the Euclidean groups of space and time ($G_{new}$), the Galileo ($G_{gal}$) and Poincaré ($G_{poi}$) group, the affine group ($G_{aff}$), the enlargements of $G_{gal}$ allowing for accelerative translational ($G_{ros}$)¹¹ and completely arbitrary ($G_{kin}$) rigid motions, the group of diffeomorphisms ($G_{dif}$) and of homeomorphisms ($G_{top}$) of $\mathbb{R}^4$. Besides the direct spacetime applications of the species of gcm other applications of interest are the (global) configuration spaces and their cotangent bundles (phase spaces) in classical mechanics. All these cases, however, and in fact the species of gcm in general have still to be enriched with further structures that can be used for a more complete description of physical reality. This is achieved in a second step by defining a species of global manifolds (gm) to be any extension of a species of gcm. According to the definition of extensions in Section I this means that (6a) is supplemented by a formula

$$s \in \sigma(M) \wedge \alpha(M; F, s) \qquad (6b)$$

such that their conjunction is a species of structures. Depending, as it does, on the parameters n and G, belonging already to gcm(n, G), as well as on $\sigma$ and $\alpha$ in (6b), a species of gm will be designated by $\mathrm{gm}(n, G, \sigma, \alpha)$. With the introduction of the new structure s typified and axiomatized in (6b) all sorts of extensions of the previously mentioned physical applications of a gcm come within view. Outstanding examples are to be found in Newtonian particle mechanics (based on $G_{gal}$), nonrelativistic quantum mechanics (usually based on $G_{new}$, but generalizable to $G_{gal}$ and even $G_{ros}$¹²), classical electrodynamics (based on $G_{poi}$), relativistic quantum mechanics of the Dirac equation (based on $G_{poi}$) and Einstein's gravitational theory (based on $G_{dif}$), the latter being restricted to the global case. In all these cases the structure s in (6b) would stand for the physical objects that are under investigation, e.g., orbits of particles, spacetime extensions of fields, probability functions for quantities etc., and $\alpha$ of (6b) would, among other things, state the essential physical laws that the objects have to obey, e.g., a law of motion, a law of propagation etc. For the moment I do not want to go into any details about the question how the theories just mentioned actually can be reconstructed as so many species of global manifolds. The answer to this question is intimately connected with the main problem to be treated in this section. If, for instance, we were asked to give a formulation of Newton's law of gravitation for a system of mass points as a special case of (6b), then a very common answer would be to write down the well-known gravitational equations in Galilean coordinates, justifying this procedure by the remark that the equations are invariant under any change of this kind of (Galilean) coordinate systems.
11 Rosen 1972
12 ibid.

It is here that we meet the 'common invariance' of physical laws as invariance of certain 'concrete' mathematical equations representing the laws in question under an appropriate representation of a 'concrete' mathematical group. At the same time, the formulation of Newton's law mentioned before would not be a formulation in terms of species of gm although we feel that it may be equivalent to such a formulation and that the reason for this is just the 'common invariance' of the equations representative for the law. We may therefore try to solve our main problem by seeking a characterization of the species $\mathrm{gm}(n, G, \sigma, \alpha)$ by means of appropriate equivalents realizing the idea of a coordinate formulation of (6b). One thing, however, that must be clearly understood about such an attempt is that it has no unique, intrinsic solution but rather depends on a decision as to how an object $s \in \sigma(M)$ is going to be represented in a coordinate system and, accordingly, how the group G of coordinate transformations is going to be represented on the set of representatives of the s. In the first characterization to be given we decide to take canonical representations throughout. Then, given n, G and $\sigma$, a canonical coordinate formulation $\mathrm{ccf}(n, G, \sigma, \alpha')$ is a formula (8a) where $\alpha'$ is invariant in the sense that

$$g \in G \wedge s' = g^\sigma \cdot s \wedge s \in \sigma(\mathbb{R}^n) \rightarrow \bigl(\alpha'(s) \leftrightarrow \alpha'(s')\bigr) \qquad (8b)$$

is provable. (8b), with $g^\sigma$ as the canonical representation of g, is here offered as a condition of invariance in the usual sense, and (8a) will have to replace (6b) in the wanted characterization. This characterization can most conveniently be expressed in terms of a relation between two formulas $\alpha$ and $\alpha'$. Given n, G and $\sigma$, the relation is to hold if and only if
(c1) $\alpha$ extends the data to a species $\mathrm{gm}(n, G, \sigma, \alpha)$, or, equivalently, (4) holds for $\alpha$ with $\sigma_0$ and $\alpha_0$ being given by (6a),
(c2) $\alpha'$ extends the data to a canonical coordinate formulation, or, equivalently, $\alpha'$ satisfies (8b),
(c3) the formula

$$\varphi \in F \wedge s' = \varphi^\sigma \cdot s \wedge s \in \sigma(M) \rightarrow \bigl(\alpha(M; F, s) \leftrightarrow \alpha'(s')\bigr) \qquad (9a)$$

or, equivalently and slightly more to the point,

$$\varphi \in F \wedge s' = \varphi^\sigma \cdot s \rightarrow \bigl(s \in \sigma(M) \wedge \alpha(M; F, s) \leftrightarrow s' \in \sigma(\mathbb{R}^n) \wedge \alpha'(s')\bigr) \qquad (9b)$$

is provable from (6a). It turns out that this relation is itself invariant under two equivalence relations for $\alpha$ and $\alpha'$. For $\alpha$ it is given by the provability of

$$s \in \sigma(M) \rightarrow \bigl(\alpha(M; F, s) \leftrightarrow \alpha_1(M; F, s)\bigr) \qquad (10a)$$

from (6a), and for $\alpha'$ by the provability of (10b)
(from ZFT alone). The characterization will be possible only modulo these equivalence relations. It says (equivalence theorem 1) that the relation defined by (c) is one-to-one modulo (10) and that to every $\alpha$ satisfying (c1) there exists $\alpha'$ such that the relation holds and, conversely, that to every $\alpha'$ satisfying (c2) an $\alpha$ exists such that the relation holds. An explicit solution of (9) for $\alpha$, given $\alpha'$, is

$$\exists s', \varphi:\; \varphi \in F \wedge s' = \varphi^\sigma \cdot s \wedge s' \in \sigma(\mathbb{R}^n) \wedge \alpha'(s') \qquad (11a)$$

and for $\alpha'$, given $\alpha$,

$$\exists M, F, s, \varphi:\; \varphi \in F \wedge s' = \varphi^\sigma \cdot s \wedge s \in \sigma(M) \wedge \alpha(M; F, s) \qquad (11b)$$

In the present case the right side of (11b) even reduces to $\alpha(\mathbb{R}^n; G, s')$ and has been given the complicated formulation only for reasons that will become clear in the sequel. The proof of equivalence theorem 1 is, although a bit tedious, a straightforward matter and need not be given here. The theorem clearly shows in what sense our main problem has been solved (under the present restrictions): On account of the one-to-one correspondence established by the theorem the common invariances (8b) of coordinate formulations are exactly matched by the canonical invariances (4) in species of structures. At the same time the theorem provides a standard method to produce reformulations of the mathematical part of a physical theory as a species of structures if a canonical coordinate formulation of that theory is known. This is the case, for instance, with many subtheories of classical mechanics, Newton's theory of gravitating mass points being the most prominent example. On the other hand, the usual coordinate formulation of electrodynamics, for instance, is not canonical. The group of coordinate transformations ($G_{poi}$) gets a tensor representation instead. To include such important cases in our reduction program the equivalence theorem will have to be generalized. Before this is done one other remark should be made. Equivalence theorem 1 is, of course, nothing but a methodologically purified version of the transition from the old way of coordinate geometry to the modern, abstract and coordinate-free approach to geometry. Besides showing the ordinary invariances becoming canonical invariances in species of structures in this transition the theorem also shows in what sense we get rid of coordinates. A coordinate formulation 1) of necessity picks out one of the preferred coordinate systems, but 2) leaves it completely arbitrary which one is chosen.
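The interplay of the invariance condition (8b), the compatibility condition (9a) and the explicit solution (11a) can be illustrated in a finite toy model. Everything concrete here is an assumption made for illustration: four labels stand in for the real number space, G consists of the cyclic rotations, the scale term is taken to be the power set (so an object s is a subset of M), and the predicates `adjacent_pair` and `fixed_pair` as well as all helper names are hypothetical.

```python
from itertools import combinations

LABELS = (0, 1, 2, 3)                     # finite stand-in for R^n (assumption)
M = ("a", "b", "c", "d")
G = [tuple((i + k) % 4 for i in LABELS) for k in range(4)]   # rotations
phi0 = {"a": 2, "b": 0, "c": 3, "d": 1}
F = [{m: g[x] for m, x in phi0.items()} for g in G]          # complete w.r.t. G

def subsets(base):
    """All subsets of base, as frozensets."""
    return [frozenset(c) for r in range(len(base) + 1)
            for c in combinations(base, r)]

def image(f, s):
    """Canonical action of a mapping f (dict or tuple) on a subset s."""
    return frozenset(f[x] for x in s)

def adjacent_pair(s1):
    """A coordinate predicate that is invariant under rotations."""
    return any(s1 == frozenset({i, (i + 1) % 4}) for i in LABELS)

def fixed_pair(s1):
    """A coordinate predicate that is NOT invariant under rotations."""
    return s1 == frozenset({0, 1})

def invariant(pred):
    """Toy version of (8b): pred(s') agrees with pred(g . s')."""
    return all(pred(s1) == pred(image(g, s1))
               for g in G for s1 in subsets(LABELS))

def chart_independent(pred):
    """Toy version of (9a): the verdict does not depend on the chart."""
    return all(len({pred(image(phi, s)) for phi in F}) == 1
               for s in subsets(M))

def alpha(s, pred=adjacent_pair):
    """Toy version of (11a): some chart sends s to an object satisfying pred."""
    return any(pred(image(phi, s)) for phi in F)

print(invariant(adjacent_pair), chart_independent(adjacent_pair))
print(invariant(fixed_pair), chart_independent(fixed_pair))
print(alpha(frozenset({"b", "d"})), alpha(frozenset({"a", "b"})))
```

For the invariant predicate the existential quantifier over charts in `alpha` could equally be replaced by a universal one, which is the toy counterpart of the remark that a coordinate formulation picks out one preferred system but leaves the choice arbitrary; for the non-invariant predicate the two readings come apart.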
As can be most clearly seen in (9b), what we get rid of is the arbitrariness of saying what has to be said in terms of one coordinate system. What we do not get rid of is the very set of preferred coordinate systems out of which the choice has to be made. Far from becoming eliminated, this set rather is introduced explicitly as a structure in a coordinate manifold. This is sometimes blurred by an otherwise very convenient piecewise application of the standard method. In most physical theories the structure s as well as what is said about it in (6b) are both fairly complicated, and one may wish to break them up in order to get simpler units. This method consists of three parts: 1) a decomposition of (8b); 2) the transition of the pieces to (6b) by means of the equivalence theorem; and 3) their recomposition in accordance with their original connection. Since as a rule no reference to the preferred set F of coordinate systems is needed in the third step this may give the impression that F, too, has been eliminated. This, however, is not the case since the second step heavily depends on F.

The result of equivalence theorem 1 may not come as a surprise for the simple reason that (8b) is a special case of canonical invariance. Once we had decided to use the canonical representation of G in the coordinate formulations it was very suggestive to found the relation (c) on the canonical representation of F. But not all representations are canonical. In fact the most important representations used in physics (besides the canonical), namely the tensor and spinor representations of the classical groups, are not. The question therefore arises whether the coordinate formulations of physical theories based on these representations also have species of structures as equivalents and whether the invariances belonging to the former again have the canonical invariances in species of structures as their counterparts. To answer this question for almost completely general representations in coordinate formulations we introduce a new class of extensions of species gcm(n, G) depending on three additional external parameters. They are two scale terms $\sigma$ and $\sigma'$ (each with one principal argument) and an arbitrary representation r of G on $\sigma'(\mathbb{R}^n)$ in the usual sense. (Like G itself r is meant to be a term, available in ZFT, for which it can be proved that it has the property just required.)
A species of representations in a gcm, designated by $\mathrm{rp}(n, G, \sigma, \sigma', r)$, is then defined to be an extension of gcm(n, G) with

$$p \in \mathrm{Pot}(\mathrm{Pot}(M \times \mathbb{R}^n) \times \mathrm{Pot}(\sigma(M) \times \sigma'(\mathbb{R}^n))) \wedge \alpha_{rp}(M; F, p) \qquad (12)$$

as the new typification and axiom. The axiom $\alpha_{rp}$ says that p is actually a mapping (13a) from F into the set of bijections from a set $S_p \subseteq \sigma(M)$ (uniquely determined by p) onto $\sigma'(\mathbb{R}^n)$ compatible with the representation r in the sense that (13b) and with the canonical representation on $\sigma(M)$ in the sense that (13c) for any
Species of representations are applied in physics in a peculiar way: A representation, although it is an extension of a gcm just as, for instance, some physical field is, nevertheless does not have the latter's contingency. Rather there are some distinguished representations singled out on every gcm by means of a deduction. A deduction (here restricted to the case where base sets are kept fixed¹³) relates two species of structures $(\sigma; \alpha)$ and $(\tau; \beta)$ by a term P that deduces the latter from the former in the sense that

$$P(X; s) \in \tau(X) \wedge \beta(X; P(X; s)) \qquad (14a)$$

together with the invariance condition

$$\mathrm{iso}_{\sigma \times \tau}(X, s, t; X', s', t'; f) \rightarrow \bigl(t = P(X; s) \leftrightarrow t' = P(X'; s')\bigr) \qquad (14b)$$

is provable from $(\sigma; \alpha)$. The following is a deduction of $\mathrm{rp}(n, G, \sigma, \sigma')$ from gcm(n, G): $D_{can}(M; F)$ assigns to every $\varphi \in F$ its canonical image $\varphi^\sigma: \sigma(M) \to \sigma'(\mathbb{R}^n)$ defined by (2). This is the representation on which equivalence theorem 1 was founded. A second kind of distinguished representations deduced from gcm(n, G) presupposes that $G \subseteq G_{dif}$. They concern the tensors and tensor fields of any valence. Whereas in the case of the canonical representations we had $S_{D_{can}} = \sigma(M)$ and $\sigma' = \sigma$, in the tensor representations $S_{D_{can}}$ is restricted to the bundle of tensors of a given valence (k, l) and $\sigma'(\mathbb{R}^n) = \mathbb{R}^n \times \mathbb{R}^{n^{k+l}}$. For tensor fields the corresponding power sets have to be used. The representations themselves are defined in the usual way by giving the tensors and tensor fields their components in a given coordinate system. A third kind of distinguished representations concerns affine connections. As in the tensor case the representation is confined to a special kind of objects, and it is $\sigma'(\mathbb{R}^n) = \mathrm{Pot}(\mathbb{R}^n \times \mathbb{R}^{n^3})$. Starting now from species gcm(n, G) and $\mathrm{rp}(n, G, \sigma, \sigma', r)$ and a deduction D of the latter from the former we can generalize equivalence theorem 1 in the following way: Replace in the formulas (8)-(11)

$$\begin{array}{lcl} \sigma'(\mathbb{R}^n) & \text{for} & \sigma(\mathbb{R}^n) \\ r(g) & \text{for} & g^\sigma \\ S_{D(M;F)} & \text{for} & \sigma(M) \\ D(M; F) \cdot \varphi & \text{for} & \varphi^\sigma \end{array} \qquad (15)$$

Then the equivalence theorem holds ceteris paribus with respect to the new entities, i.e., given n, G, $\sigma$, $\sigma'$, r and D as above, the relation (c) establishes a one-to-one correspondence modulo (10) between all species $\mathrm{gm}(n, G, \sigma, \alpha)$ and all coordinate formulations $\mathrm{cf}(n, G, \sigma', \alpha')$ (equivalence theorem 2). The essential generalization that has been obtained is the step from canonical to (almost) arbitrary representations and corresponding invariances on the side of the coordinate formulations, retaining thereby canonical invariance on the side of the species of structures.
13 Bourbaki 1968, Ch. IV
III
Up to this point the considerations concerning the reduction of the ordinary invariances occurring in coordinate formulations of physical theories to canonical invariance as part of the definition of species of structures were confined to the topologically trivial, global case where the space M is homeomorphic to $\mathbb{R}^n$. In order to gain the full generality in which the theory of manifolds is usually developed in mathematics and even applied in physics we would now have to take a final step by introducing the local viewpoint. As will be seen in a moment this causes no problems as far as the concept of a manifold based on a coordinate manifold is concerned. However, as soon as one proceeds from this new generalized basis in order to get at a corresponding generalization of the equivalence theorem considerable difficulties come up. Already the further conceptual apparatus for a new formulation of the theorem would demand an amount of technical detail that would be beyond the scope of the present paper. I will therefore leave the matter at the stage that has been achieved in the previous section and now switch over to the concept of covariance.

The species of global coordinate manifolds gcm(n, G) can be generalized by replacing the requirement that G be a group of global homeomorphisms of $\mathbb{R}^n$ by the weaker assumption that G is only a pseudogroup of local homeomorphisms in $\mathbb{R}^n$¹⁴. The idea of a pseudogroup of homeomorphisms generalizes that of a group of homeomorphisms of $\mathbb{R}^n$ by allowing variable domains and ranges. The local viewpoint that is opened in this way sometimes is emphasized by the requirement that every restriction of an element of G to an arbitrary open subset of its domain also belongs to G¹⁵. Since this excludes groups of transformations as special cases of pseudogroups I adopt the more general concept mentioned first. The global case considered so far is recovered by the requirement that all transformations have a common domain (mostly $\mathbb{R}^n$ itself). It occurs if and only if the pseudogroup is a group. The groups $G_{\ldots}$ in (7) all have obvious counterparts $G^0_{\ldots}$ meeting the requirement of strict locality mentioned above. Given n and G as a pseudogroup the species of coordinate manifolds cm(n, G) is then defined in formal analogy to (6a). But the new axiom $\alpha_{cm}$ would only say that F is a maximal set of local coordinate systems compatible with the transformations of G. The species of gcm considered so far are those species of cm where the transformations of G have a common domain and hence the spaces M are homeomorphic to (an open subset of) $\mathbb{R}^n$. Again on the general level a species of manifolds $\mathrm{mf}(n, G, \sigma, \alpha)$ can then be introduced to be any extension (6b) of a species cm(n, G). This concept is sufficiently general to include all kinds of manifolds that are known from differential geometry such as Riemannian manifolds, manifolds with an affine connection and those that carry additional fields of all sorts that have been studied in theoretical physics.
14 Iyanaga/Kawada 1977, 92 D
15 ibid., 108 Z; Kobayashi/Nomizu 1963, p. 1

Being thus prepared to approach covariance on the most general differential geometric level there is even reason to step on the more abstract level of arbitrary species of structures and develop our concept on it. As was already announced in the introduction, our access to covariance will be guided by the idea of equivalent mathematical formulations of one and the same physical theory. Since species of structures lend themselves to such formulations we have to ask for an adequate concept of equivalence for them. This is readily at hand: Given two species $(\sigma; \alpha)$ and $(\sigma_1; \alpha_1)$ as well as deductions P and $P_1$ of $(\sigma_1; \alpha_1)$ from $(\sigma; \alpha)$ and vice versa such that

$$P(X; P_1(X; s_1)) = s_1, \qquad P_1(X; P(X; s)) = s \qquad (16)$$

are provable from $(\sigma; \alpha)$ and $(\sigma_1; \alpha_1)$ respectively, $(\sigma; \alpha)$ and $(\sigma_1; \alpha_1)$ are called equivalent with respect to P and $P_1$. That of two species of structures equivalent in this sense, either both or neither of them can be used to formulate a physical theory, hinges on questions of interpretation that cannot be discussed in this paper¹⁶. It is, however, safe to say that, given $(\sigma; \alpha)$ together with a physical interpretation, the move to an equivalent $(\sigma_1; \alpha_1)$ by means of equivalence terms P and $P_1$ cannot affect the content of the theory if the interpretation is retained, and it seems plausible that by using P and $P_1$ the interpretation can always be transferred from $(\sigma; \alpha)$ to $(\sigma_1; \alpha_1)$ to yield an interpretation of $(\sigma_1; \alpha_1)$ in the same sense as the original interpretation of $(\sigma; \alpha)$. In Ludwig¹⁷ such a transfer is discussed even for the weaker case where the base sets can be different and $P_1$ is replaced by the assumption that $P_1$ is conservative in the sense that every structure (X; t) of species $(\tau; \beta)$ is isomorphic to a structure (X; P(X; s)) with an (X; s) of species $(\sigma; \alpha)$. For the treatment of problems of covariance, however, the stronger and simpler concept of equivalence seems to be sufficient. Let now $\Sigma$ and $\Sigma_1$ be two species of structures and P a deduction of $\Sigma_1$ from $\Sigma$. Furthermore, let $\Sigma^*$ and $\Sigma_1^*$ be two extensions of $\Sigma$ and $\Sigma_1$ respectively and equivalent with respect to the equivalence transformations

$$\begin{array}{ll} F_1 = P(M; F), & F = P_1(M; F_1, s_1) \\ s_1 = Q(M; F, s), & s = Q_1(M; F_1, s_1) \end{array} \qquad (17)$$

where the given deduction P appears in the upper left corner. Here the notation is already adapted to species of mf but the actual assumptions are meant to be quite general. The following scheme with P* for (17) may help to apprehend the given data and their mutual relations:

$$\begin{array}{ccc} \Sigma & \overset{P}{\longrightarrow} & \Sigma_1 \\ \downarrow & & \downarrow \\ \Sigma^* & \overset{P^*}{\longleftrightarrow} & \Sigma_1^* \end{array} \qquad (18)$$

(upper row: deduction; vertical arrows: extension; lower row: equivalence)

16 Ludwig 1978
17 ibid.

We now call $\Sigma_1^*$ a covariant version of $\Sigma^*$ and, conversely, $\Sigma^*$ a reduced version of $\Sigma_1^*$, both with respect to the remaining data $\Sigma$, $\Sigma_1$, P and P*. It should be borne in mind that the concepts of covariance and reduction thus defined strictly speaking relate six syntactical entities to each other. In actual practice, however, we can think of the three data in the upper row of (18) as being 'kept fixed'. This will become clear if we now specialize the general setting in (18) by assuming $\Sigma$ and $\Sigma_1$ to be two species cm(n, G) and cm(n, $G_1$) with $G \subseteq G_1$. We then have the following natural deduction P of cm(n, $G_1$) from cm(n, G): Given a structure (M; F) of the species cm(n, G), the set $F_1 = P(M; F)$ is the uniquely determined complete set of coordinate systems on M with respect to $G_1$ satisfying $F \subseteq F_1$. More specifically, since the days when the general theory of relativity was born the cases of prevailing interest became those in which G was one of the classical groups and $G_1$ the pseudogroup $G^0_{dif}$. And since then it became a prevailing tendency in theoretical physics to show of as many spacetime founded physical theories as possible that they could be reformulated as certain extensions of the species of differentiable manifolds. Before extending the exemplification in this direction also to the lower row of (18), two problems shall be formulated that pose themselves in the situation when only the upper row of (18) is fixed. To be as precise as possible their wordings will again be given in general terms. They are:
(co) If in addition to $\Sigma$, $\Sigma_1$ and P of (18) an extension $\Sigma^*$ of $\Sigma$ is given, can we then always find $\Sigma_1^*$ and P* such that $\Sigma_1^*$ is a covariant version of $\Sigma^*$?
(re) If in addition to $\Sigma$, $\Sigma_1$ and P of (18) an extension $\Sigma_1^*$ of $\Sigma_1$ is given, can we then always find $\Sigma^*$ and P* such that $\Sigma^*$ is a reduced version of $\Sigma_1^*$?
The two problems will be called the problem of covariance and of reduction respectively. There is obviously a certain duality between them.
However, on account of the asymmetry of (18) in its upper row the two problems receive entirely different answers. The answer to the problem of covariance is positive and trivial. Assuming that the axiomatics of the given species $\Sigma$, $\Sigma_1$ and $\Sigma^*$ (typification and axiom proper) are

$$\Sigma: \alpha(M; F), \qquad \Sigma_1: \alpha_1(M; F_1), \qquad \Sigma^*: \alpha(M; F) \wedge \beta(M; F, s) \qquad (19a)$$

then with P(M; F) as the given deduction of $\Sigma_1$ from $\Sigma$ the species $\Sigma_1^*$ that we are looking for can be chosen to be

$$\Sigma_1^*: \alpha_1(M; F_1) \wedge \alpha(M; F) \wedge \beta(M; F, s) \wedge F_1 = P(M; F) \qquad (19b)$$

where the variables have already been chosen such that the equations of (17) become identities except, of course, the one that is prescribed by P. Let us exemplify this solution in the context of species of manifolds, taking up the settings from the preceding paragraph. Suppose then that we want to have a covariant version of ordinary electrodynamics. In this case $\alpha$ would axiomatize Minkowski spacetime, $\alpha_1$ the species of differentiable manifolds, and $\beta$ (as part of $\Sigma^*$) would be a reformulation of Maxwell's equations in terms of species of structures according to equivalence theorem 2. The trick of getting at a covariant version of $\Sigma^*$ now consists in adding s (the Maxwell field) as well as a set F of distinguished coordinate systems (the Lorentz frames) to the set $F_1$ of 'arbitrary' coordinate systems in $\Sigma_1^*$. Precisely this is expressed by $\alpha \wedge \beta$ and $F_1 = P(M; F)$, which in the present case turns out to be equivalent to $F \subseteq F_1$. In view of solution (19b) in all its triviality there may be two kinds of objections against our version (co) of the problem of covariance. It may, firstly, be objected that the nontrivial problem will be an analogue of (co) which exclusively deals with coordinate formulations. It may, for instance, seem a nontrivial problem to find a coordinate formulation of the ordinary Maxwell equations that is invariant under $G_{dif}$. However, given the results of section II the objection can easily be met by combining those results with (19b): Equivalence theorem 2 does provide for a coordinate formulation equivalent to (19b). In the Maxwell case this coordinate formulation turns out to say about (F', s') that F' is a right coset of $G_{poi}$ in $G_{dif}$ and s' a solution of the equations that result from the Maxwell equations by transforming them with any $g \in F'$, the former equations being uniquely determined by F' on account of the invariance of Maxwell's equations under $G_{poi}$.
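The natural deduction $F_1 = P(M; F)$ and the coset remark can be made tangible in a finite toy model. The sketch below rests on explicit assumptions: four labels replace the real number space, the cyclic rotations play the part of the smaller group G, the full permutation group plays the part of the larger group $G_1$, and charts are encoded as tuples giving the label of each of four points. The function names are hypothetical and the model ignores topology and locality altogether.

```python
from itertools import permutations

LABELS = (0, 1, 2, 3)   # finite stand-in for R^n (assumption)

def compose(g, h):
    """g after h; permutations and charts are tuples indexed by 0..3."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(g)
    for i, x in enumerate(g):
        inv[x] = i
    return tuple(inv)

G = {tuple((i + k) % 4 for i in LABELS) for k in range(4)}   # smaller group
G1 = set(permutations(LABELS))                               # larger group

def is_complete(F, group):
    """Transition maps lie in the group and F is closed under the group."""
    return (len(F) > 0
            and all(compose(psi, inverse(phi)) in group
                    for psi in F for phi in F)
            and all(compose(g, phi) in F for g in group for phi in F))

def P(F, group1):
    """Toy version of the deduction F1 = P(M; F)."""
    return {compose(g1, phi) for g1 in group1 for phi in F}

phi0 = (2, 0, 3, 1)                   # point number i carries label phi0[i]
F = {compose(g, phi0) for g in G}     # preferred charts, complete w.r.t. G
F1 = P(F, G1)                         # 'arbitrary' charts, complete w.r.t. G1

phi1 = (0, 1, 3, 2)                   # a chart in F1 that is not preferred
F_prime = {compose(phi, inverse(phi1)) for phi in F}   # F seen from phi1
h = compose(phi0, inverse(phi1))
right_coset = {compose(g, h) for g in G}

print(F <= F1, is_complete(F1, G1), F_prime == right_coset, F_prime == G)
```

Seen from a chart that is not itself preferred, the preferred set appears as a right coset of G in $G_1$ rather than as G itself; it reduces to G exactly when the reference chart belongs to F.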
According to the second objection it would be pointed out that (co) could be made nontrivial if only additional requirements were imposed on the species $\Sigma_1^*$ that is to be found. In the differential geometric context laid down above it could, for instance, be required that the new structures $s_1$ of mf(n, $G_1$) extending the coordinate manifold (M; F) of cm(n, $G_1$) have to be a Cartesian product of 1) curves in M, 2) tensor fields on M, 3) one affine connection on M, and nothing else. This would be a perfectly clear requirement, and the trivial solution (19b) would certainly not satisfy it. Moreover, the new problem (co) would be nontrivial in the twofold sense that it would no longer have a positive answer for every given $\Sigma^*$, and if it has, the answer may very well be nontrivial, at least not as trivial as (19b). With this objection I wholeheartedly agree. At the same time, the existence of the unrestricted formulation (co) will now be justified as the necessary background for it. It is hard to find out whether there is such a thing as the prevailing opinion about the concept of covariance and problems connected with it. But I have always had the impression that if there is, then part of it is the tendency to view covariance as something which can be had, and in a sense even trivially had, at all events. On the other hand, I have never seen any proof or even an attempt to prove that this view is correct. In order to explore under what circumstances a proof in the strict sense would be possible the version (co) and its solution (19b) were produced. And it may be repeated with regard to the first objection that together with the results of section II both the problem (co) and its solution cover corresponding questions concerning invariance as they are usually mixed up with covariance.

Now, adherents of the triviality thesis about covariance becoming aware of the solution (19b) might still not feel themselves confirmed in their intuitions. Regarding the additional requirement objection they would perhaps point out that, once differential geometry had become of principal physical interest as a consequence of the empirical success of Einstein's gravitational theory, covariant versions meeting the objection have been found for the formalisms of all classical geometrical and physical theories. And since these were theories of widely differing contents the property of covariance shared by all of them cannot be of much physical importance. In other words, the triviality of covariance consists in the lack of any contribution to the content of a physical theory. Before commenting on this argument something has to be said about the fact that it alludes to. There could indeed be made a long list of embeddings of geometrical and physical theories in the general differential geometric context mentioned in the additional requirement objection (or something similar to it). But in order to discuss the consequences of this fact with sufficient rigor the preliminary question had to be answered precisely in what sense covariance had been established in all these cases. It seems that the concept of covariance proposed above will cover all the relevant cases known from the literature¹⁸.
In all these cases species mf(n, $G_{\mathrm{dif}}$) submitted to the restrictions of our second objection are produced as equivalents of species gm(n, G) with $G \subseteq G_{\mathrm{dif}}$ that function as the 'original' formulation of a physical theory. The need for a separate concept of covariance, different from that of invariance, becomes particularly clear in a study of the examples: Since the invariance properties of the given species gm(n, G) as a rule are already exhausted by G, the covariant version mf(n, $G_{\mathrm{dif}}$) cannot be produced without drastic changes that even touch upon the type of the structures involved. As to the argument that covariance is too weak a property to make any considerable contribution to the content of a physical theory, it has to be admitted that even under the restrictions of the additional requirement objection the class of species gm(n, G) with covariant versions in $G_{\mathrm{dif}}$ is very comprehensive. But it has to be pointed out that in the light of the utterly trivial solution (19b) the class in question is already restricted, the answer to (co) is no longer positive throughout, and therefore the historical fact described in the last paragraph is not trivial in the sense that it could not have been foreseen a priori. Moreover, particularly with Einstein's postulate of general covariance in view, one should look at the whole matter also in the opposite direction that is provided by the problem of reduction (re). As has already been remarked, contrary to the problem (co) the problem (re) does not have a positive answer for every case of given data. Thus in the differential geometric context, given mf(n, $G_1$) (for $\Sigma_1$) and cm(n, G) with $G \subseteq G_1$ (for $\Sigma$) it may not be possible to find an extension mf(n, G) equivalent to mf(n, $G_1$). The latter species would then be irreducible to G, and it may even be absolutely irreducible in the sense that it is irreducible to G for every G included in but different from $G_1$. With this concept at hand the postulate of 'general covariance' can easily be made nonvacuous by reinterpreting it as the demand for species mf(n, $G_{\mathrm{dif}}$) that are absolutely irreducible. Questions of real frames of reference aside, this interpretation comes fairly close to what Einstein had in mind by looking for physical laws in 'arbitrary' coordinate systems: the laws were meant to be such that they would not prefer any subset of coordinate systems. Now, in a well-defined sense this is the case for absolutely irreducible species mf(n, $G_{\mathrm{dif}}$), and the impressive thing about Einstein's gravitational theory was that it was the first physical theory using such species. With respect to irreducibility 'arbitrary' coordinates really took over exactly the same role in the new gravitational theory (and its general extensions) as had the Lorentz frames (viewed merely as coordinate systems) in physical theories based on special relativity.
18 Havas 1964, Trautman 1967, Künzle 1972, Misner et al. 1973, Ch. 12.
VII.32 Hermann Weyl and the Nature of Spacetime*
Introduction
In the early twenties Hermann Weyl's work on the general theory of relativity led to a growing interest on his part in the philosophical problems of space and time. His most important contribution to this field was an answer to the question why the structure of spacetime is the one that physics teaches us. 1 With respect to space the question was not new, and others had answered it before Weyl. But he was the first to realize that Einstein's revolutionary ideas had made it necessary to make a fresh attack. Even if the replacement of space by spacetime was disregarded, the appropriate question to be answered no longer was the one that had presented itself to Helmholtz and Lie. It seemed that we could no longer meaningfully ask why space is euclidean or homogeneous, why spacetime is minkowskian or why these structures are of any such narrow, even monomorphic kind. Rather, the lesson of general relativity had been that the metric of space and spacetime is contingent upon the distribution of matter. Therefore, Weyl concluded, the only thing we can still try to understand is - as he called it - the pythagorean nature of the metric. For as formerly the euclidean metric itself was assumed to be independent of the distribution of matter in space, so now the nature of the metric of spacetime, by being assumed to be pythagorean, was still assumed to be - and now comes a quotation typical of Weyl's way of putting such things 2 - "essentially one and absolutely determined, not participating in the ineradicable vagueness of that which occupies a variable place on a continuous scale". By 1922 Weyl had given a definite answer to his problem by giving a characteristic equivalent of the property of a metric to be pythagorean in nature.
At the end of his final account of the matter 3 he expressed the following opinion about his own achievement: "While all deeper mathematical theories - such as, for instance, the marvellous theory of algebraic number fields - are of no great importance in the wider context of the philosophical problems of knowledge and ... what mathematics can contribute to enlighten those problems mostly comes from its surface, we here have the rare case that a problem fundamental for all knowledge of reality - as the space problem actually is - gives rise to thoroughgoing mathematical investigations". In hearing this one might already anticipate that besides philosophy and mathematics a third domain of research, somewhere between the two, is involved in Weyl's problem, and since it is mainly questions belonging to this domain with which I shall be concerned in this paper, I shall mention it right at the beginning and pari passu with the two others. When speaking about the problem shift concerning the characterization of space brought about by Einstein's theory, Weyl said: 4 "This [problem shift] leads to a totally new type of axiomatics". What did he mean by this "totally new type of axiomatics"? As is clear from the context the immediate meaning of the quotation was simply that the object of the axiomatics was no longer spacetime itself, but rather what Weyl called the nature of its metric. However, the question remains what this means when seen from the more abstract viewpoint of axiomatics in general. What logical status is given to the nature of the metric of spacetime in order to make it the object of an axiomatic characterization? By Weyl's "totally new type of axiomatics" more seems to be implied than an ordinary re-axiomatization of the geometry underlying general relativity. A closer inspection of these implications and a logical analysis of Weyl's enterprise in so far as it involves conceptual problems will also be useful from a historical point of view. Weyl was - if I may use this somewhat paradoxical expression - an intuitive thinker, and the verbal formulation he gave his ideas was not always free from ambiguities. As a consequence neither he, in referring to his own ideas, nor those who further developed them were always consistent in doing so. As regards the space problem, although it is quite clear that Weyl wanted to characterize the family of his own geometries - the Weyl geometries - what he actually does can easily be taken to be the characterization of another family of theories. Secondly, strangely enough, neither the family of theories Weyl wanted to characterize nor the one he could easily be taken to have characterized is the one he is usually said to have characterized. Thirdly, according to the two foregoing points there are corresponding differences between the respective families of theories within which the characterizations are obtained. Although the differences in question are hardly relevant to the mathematical part - in the narrow sense - of the enterprise, they do concern its differential geometric foundations and the very invention that Weyl introduced into the matter: the Weyl geometries. Therefore I shall take advantage of this centenary conference to do justice also to history and restore it where necessary.
* First published as Scheibe 1988e.
1 Weyl 1919, p. 25 ff; 1922 a; 1922 b; 1923 a; 1923 b, 7. und 8. Vorlesung; 1923 c, §19.
2 Weyl 1923 b, p. 45; 1949, p. 134.
3 Weyl 1923 b, p. 61.
Historical Matters: The Different Characterizations
Weyl introduces the principal idea of his characterization of the pythagorean metric of spacetime with a simile. 5 He compares a metric with a state, the nature of the metric with the constitution of the state and the contingency of the distribution of the metric over the spacetime points with the freedom of the citizens of that state. Weyl then argues that, although the distribution of the metric has to respect its nature, just as the freedom of the citizens has to respect the constitution of the state, this is a matter of course, and we still stand in need of a positive requirement saying something substantial about the latter.
4 Weyl 1923 b, p. 46.
5 Weyl 1923 b, p. 46.
For the mathematical case this requirement, fixing - as Weyl expresses himself - the "common weal", is introduced, not without some rhetoric, by the following words: "If one looks at differential geometry and its application in general relativity, it is obvious that the all-important fact rendering possible the whole development and presenting itself with inescapable uniqueness is that the metric field uniquely determines the affine connection; on this, it seems, the "common weal" is founded in the kingdom of space and time". So here we have the fundamental fact on which Weyl wants to found his characterization. In the passage quoted this fact is meant to be a fact for Weyl's own geometry. For reasons of historical comparison I am now going to give a more precise formulation referring also to three other geometries for which the fact by which Weyl shows himself so impressed is also a fact:
(F) Given any manifold equipped with a riemannian, pseudo-riemannian, weylian or strongly weylian metric there exists one and only one torsionfree linear connection compatible with the metric.
Here, compatibility of a linear connection with a metric means that the transport of tangent vectors along curves associated with the connection leaves the vectors congruent with themselves with respect to the metric. Let us now assume that we have suitable generalizations of our four kinds of metrics called i-metric of nature $Q_i$ (i = 1, 2, 3, 4). Then what have been facts with respect to those classical geometries may now be made requirements for their generalizations. And with these requirements as premises the converses of (F) may also hold:
(F') If for every manifold equipped with an i-metric of nature $Q_i$ there exists one and only one torsionfree linear connection compatible with the metric, then the underlying geometries are the ones mentioned in the premise of (F).
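For the riemannian and pseudo-riemannian cases of (F) the unique connection can be written down explicitly. The block below gives the standard textbook formula for the Levi-Civita connection; it is added as an illustration and is not taken from Weyl's texts or from the present paper. The symbols $g_{ij}$ (metric components in a coordinate frame), $g^{il}$ (inverse metric) and $\Gamma^i_{jk}$ (connection coefficients) are notational assumptions.

```latex
% Compatibility of the connection with the metric (parallel transport preserves g):
\partial_k g_{ij} - \Gamma^l_{ki}\, g_{lj} - \Gamma^l_{kj}\, g_{il} = 0 ,
% Torsionfreeness:
\Gamma^i_{jk} = \Gamma^i_{kj} ,
% The one and only solution of both conditions:
\Gamma^i_{jk} = \tfrac{1}{2}\, g^{il}
  \left( \partial_j g_{lk} + \partial_k g_{lj} - \partial_l g_{jk} \right) .
```

Only the non-degeneracy of $g_{ij}$ enters the derivation, so the formula applies to every signature.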
Obviously, (F) and (F') taken together would give us a characterization of geometries having a pythagorean metric. The precise logical status of this characterization I shall be giving only at the end of my paper. The question to be answered next is: What are those generalizations - those i-metrics - for which (F') holds?
I will be very brief about the first, the riemannian case. Here already in 1919 Weyl had suggested the following generalization of the concept of a riemannian metric: 6 The nature $Q_1$ of the generalized metric consists of a class of positive definite functions, homogeneous of the first order, on $\mathbb{R}^n$, equivalent with respect to linear transformations on $\mathbb{R}^n$. A 1-metric on an n-dimensional manifold, having the nature $Q_1$, is any function on the tangent bundle whose coordinate representations belong to $Q_1$. Congruence is defined on the whole tangent bundle in the obvious way. With this generalization in mind Weyl conjectured (F') for this case. He seems never to have come back to the question. The conjecture was proven by Laugwitz in 1958 assuming only the existence of a compatible torsionfree connection. 7
The second, the pseudo-riemannian, case seems to have had a curious fate. Let me first remind us of the following theorem: 8
(PR1) Given a non-degenerate symmetric matrix $\gamma$ of signature k ($= 1, \ldots, \frac{n}{2} + 1$ for even and $1, \ldots, \frac{n+1}{2}$ for odd dimension n) there is a canonical equivalence between the species of pseudo-riemannian manifolds of signature k and the species of linear $O_k$-bundles where $O_k$ is the full orthogonal group of signature k leaving $\gamma$ invariant.
This theorem suggests that we may generalize the concept of a pseudo-riemannian manifold by the concept of a G-structure (or: G-bundle) where G is any Lie subgroup of the general linear group. 9 The only thing we have to keep in mind is that the equivalence stated in (PR1) depends on the matrix $\gamma$. Therefore what invariantly corresponds to a species of pseudo-riemannian manifolds is a whole equivalence class of species of $O_k$-bundles with groups conjugate in the general linear group. Correspondingly, a species of G-structures is only representative for the species of 2-metrics of nature $Q_2$, the other representatives resulting in a canonical manner from groups conjugate with G in the general linear group. For the generalization of congruence that is yet to be obtained consider the theorem:
(PR2) Under the assumptions of (PR1) two tangent vectors in a pseudo-riemannian manifold are congruent if, and only if, they have identical representations in suitable frames of the corresponding $O_k$-bundle.
This theorem suggests an obvious generalization of the congruence relation in pseudo-riemannian manifolds to arbitrary G-structures: Two tangent vectors are congruent if they have identical representations in suitable frames of the G-structures. For vectors at the same point of the manifold this congruence is equivalent with the one given by the natural operations of G at this point. 10 It should be noticed, however, that the further development of the theory sometimes depends on the assumption that the group operations are characteristic for the congruence defined by them, i.e. that no group larger than G leads to the same congruence. For G itself this assumption is that if g is any matrix of the general linear group such that $g \cdot y$ is congruent with y for all $y \in \mathbb{R}^n$, then $g \in G$. 11
6 Weyl 1919, p. 27.
7 Laugwitz 1958. In this paper Laugwitz extends "Weyl's first space problem" to the pseudo-riemannian case and prepares the ground for a similar characterization also in this case. However, he did not succeed then and has recently informed me that nobody seems to have continued his work.
8 Scheibe 1957, p. 172, 176.
9 Kobayashi and Nomizu 1963, p. 288.
10 As we shall see below a G-structure naturally leads to a field of groups of linear transformations in the tangent vector spaces.
11 Scheibe 1957, p. 192.
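The requirement that G be characteristic for the congruence it defines can be made concrete in a small numerical sketch. The example is an illustration added under stated assumptions and is not part of the original argument: it takes n = 2, the positive definite case, and G = SO(2), and it reads "congruent" as "lying in the same orbit of G". It checks that a reflection carries every sample vector into a vector congruent with it under SO(2), although the reflection itself is not an element of SO(2). All names in the code are hypothetical.

```python
import math

# Assumption: G = SO(2), the rotations of the euclidean plane.
REFLECTION = ((1.0, 0.0), (0.0, -1.0))  # a matrix of the general linear group


def rotation(theta):
    """Return the element of SO(2) that rotates by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def apply(m, y):
    """Apply the 2x2 matrix m to the vector y."""
    return (m[0][0] * y[0] + m[0][1] * y[1],
            m[1][0] * y[0] + m[1][1] * y[1])


def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def connecting_rotation(y, z):
    """Rotation carrying y to z, assuming both have the same nonzero length."""
    return rotation(math.atan2(z[1], z[0]) - math.atan2(y[1], y[0]))


def close(u, v, tol=1e-12):
    """Componentwise comparison up to rounding error."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))


samples = [(1.0, 0.0), (0.3, -2.0), (-1.5, 0.7), (0.0, 4.0)]

# For each y there is a rotation taking y to its mirror image, so the
# reflection maps every sample vector into a vector congruent with it.
reflection_preserves_congruence = all(
    close(apply(connecting_rotation(y, apply(REFLECTION, y)), y),
          apply(REFLECTION, y))
    for y in samples
)

# The reflection has determinant -1 and therefore does not belong to SO(2).
reflection_in_so2 = det(REFLECTION) > 0
```

Under these assumptions SO(2) is not characteristic for its own congruence, because the larger group O(2) has the same orbits; this is the kind of group the condition is meant to rule out.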
Having fixed congruence in this manner, it turns out that a linear connection is compatible with this congruence if the connection is reducible to the G-structure. Theorem (F') then assumes the form:
$(F'_2)$ If every G-structure admits one and only one torsionfree linear connection reducible to G then G is a group $O_k$. 12
To the best of my knowledge, the theorem appears in this form for the first time in a paper by Klingenberg published in 1959. 13 Klingenberg seems to have learned his version from Cartan of whom he says that it had been "CARTAN's merit to have reduced WEYL's considerations, frequently difficult to understand, to their mathematical content". Indeed Cartan's reformulation of Weyl's achievement essentially is $(F'_2)$. 14 However, if anything is obvious in the whole matter I am talking about, then it is that the case before us - the pseudo-riemannian case - is not the one Weyl has considered. Rather we here have an historical myth well established by its frequent repetition. 15
As we mentioned already in the Introduction, it was his own geometries that Weyl wanted to characterize. Presenting their characterization as our third case let me first remind us that Weyl's geometries are themselves generalizations of the pseudo-riemannian geometries. The step from euclidean to riemannian and pseudo-riemannian geometry had made the comparison of tangent vectors at different points a curve-dependent affair. By contrast, the lengths of any two tangent vectors could still be compared in the new geometry. Slightly more generally, a unique relation of congruence can be defined on the whole tangent bundle. In Weyl's view the idea of a pure infinitesimal geometry was, therefore, not realized in pseudo-riemannian geometry. To realize it, Weyl suggested the following generalization, dividing the metric into two components: By the first one, we fix only the pseudo-euclidean (= pythagorean) congruence in each tangent vector space. So far there would then be no definite connection whatsoever between the metrics at different points. It is only the second component of the metric by which we re-introduce a congruence relation in the infinitesimal neighbourhood of every point, or - equivalently - a curve-dependent congruence relation between tangent vectors at different points. In modern terms, 16 the first component would be given by a pseudo-riemannian gauge bundle having the positive multiples of any pseudo-riemannian tensor as its typical fibre and, correspondingly, the multiplicative group of the positive reals as its group. The bundle itself is essentially generated by local pseudo-riemannian metrics differing from each other only by a positive factor that may vary from point to point. The second component of a Weyl metric is simply a connection - Weyl's metric connection - in the gauge bundle. Weyl's geometry is a generalization of pseudo-riemannian geometry in the natural sense that there is a canonical equivalence between the class of pseudo-riemannian manifolds and the subclass of those Weyl manifolds in which the metric connection is integrable. Obviously, this generalization does not concern the external parameter which is still the signature of the metric field as it was in the pseudo-riemannian case. It is, therefore, not the nature of the metric that is generalized but the contingent behaviour of the metric connection. Having arrived at the Weyl geometry we have only gained the starting point of its generalization and subsequent characterization corresponding to, but not identical with, that generalization and characterization of pseudo-riemannian geometry that was reviewed a moment ago.
The principal idea of the generalization of Weyl's geometry is again group-theoretical in character. But the group now functions in a manner slightly different from the one we have just seen in the pseudo-riemannian case. Weyl has repeatedly emphasised that in his view the basic concept of geometry is congruence. More specifically, what he had in mind was a congruence relation as it is generated by a transformation group in the obvious way. Adding the principles of differential geometry we are thus led to the concept of a field of groups of linear transformations in every tangent vector space that can be used to define congruence of tangent vectors in each point. In order that the group be characteristic for the congruence defined by it we again have to require that if any linear transformation maps each vector into a congruent one, then that transformation belongs to our group.
12 It is here that the condition on G mentioned a moment ago comes in for the first time: It serves to exclude the special orthogonal group $SO_k$.
13 Klingenberg 1959, p. 301.
14 Cartan 1923, esp. pp. 171-4.
15 Kobayashi and Nomizu 1963, p. 288 f; Kobayashi and Nagano 1965, Theorem 2, p. 86; Freudenthal 1960, p. 107; 1964, p. 157 ff; 1967, p. 326.
16 For a modern treatment of Weyl's geometries see Folland 1970. Our version is slightly more general than Folland's: the conformal viewpoint is applied only locally.
In order to guarantee a unique nature for this kind of metric we have to require that the groups of the field have coordinate representations conjugate to a given Lie subgroup G of the general linear group. This is the concept of a G-field generalizing the first component of a Weyl metric. 17 It is a concept different from and actually weaker than that of a G-bundle. This can already be seen from the fact that, if N is the normalizer of G in the general linear group, then the species of G-fields is canonically equivalent with the species, not of G-, but of N-bundles. 18 At the same time this fact gives us the possibility to generalize also the second component of a Weyl metric: the generalized Weyl connection. We now take this to be any connection in the principal bundle with the quotient group N/G as its group. Since N/G naturally operates on the congruence classes defined by G, the metric connection will also do the job in general that it was invented for by Weyl on the more special level of his geometry. More precisely, if we call the pair consisting of a G-field and a connection of the kind just introduced a G-metric, our generalization is justified by the theorem:
(W) There is a canonical equivalence between the species of Weyl manifolds of signature k and the species of $O_k$-metrics.
Finally, it is in the spirit of the enterprise when we sharpen the general idea of a geometrical compatibility of a connection with a metric for the present case by saying that a linear connection is compatible with a G-metric if it is reducible to N and induces the connection of the G-metric. With this understanding we arrive at the following form of the converse of the fundamental theorem (F):
$(F'_3)$ If every G-metric admits one and only one torsionfree linear connection compatible with it, then G is a group $O_k$.
Again this result is not the one that Weyl obtained. But before commenting on this, a brief comparison with the Cartan-Klingenberg version $(F'_2)$ will be in order. There the species of pseudo-riemannian manifolds of a given signature k was generalized by the species of G-structures. Now we have obtained manifolds equipped with a G-metric as generalizations of Weyl manifolds. These were the group-theoretical generalizations, extending the external parameter from the orthogonal groups $O_k$ to any Lie subgroup G. On the other hand, the Weyl manifolds in turn are generalizations of pseudo-riemannian manifolds - generalizations pointing, so to speak, in a different direction. Now, our group-theoretical generalizations would not be up to much if they did not stand in the relation corresponding to the one that relates their originals. And that they do. There is a natural equivalence between the species of G-structures and the species of manifolds equipped with G-metrics with an integrable metric connection. From this it follows immediately that $(F'_2)$ is a stronger statement than is $(F'_3)$.
Secondly, as regards the historical fact that Weyl in his characterization was not concerned with the pseudo-riemannian but rather with his own geometries, the evidence for this fact is so manifest that it is hardly understandable that anybody could ever come to a deviating opinion on this. As we shall see presently there are difficulties of interpretation in Weyl but they do not concern this point. To get clear about it no more is needed than an even superficial reading of the relevant texts. 19 However, in order to make this paper self-contained here is a quotation from Weyl 20 appropriate to set the case beyond doubt: "A space satisfying our requirements [i.e. essentially the premise of (F')], although it has a pythagorean metric at every point, is not necessarily a riemannian space. Rather, that generalized riemannian geometry holds in it which was established by me as 'pure infinitesimal geometry' and in which infinitesimal line segments can be compared by measurement only if they belong to the same or nearly the same points". Weyl then concludes, referring to his principal achievement: "The group-theoretical foundation therefore is a new support for my conviction that it is this geometry which has to be used for the interpretation of the physical field phenomena and not, as Einstein suggests, the narrower riemannian".
Taking it for granted that Weyl wanted to characterize his own geometries, we now have to ask what generalization he suggested for them. It is here that some uncertainty as to what he really did arises. First of all, Weyl expressed himself quite clearly about the first component of a general metric: This was to be a group field of a given nature in precisely the sense in which it was introduced a moment ago. At the same time the concept of a group field was his invention. Weyl is less clear, however, on the second component: the metric connection. Here, his suggestion was not the one used in the previous account. The reconstruction that comes closest to what he says about this as well as to the actual use he makes of it seems to be the following: Besides the G-field we have a full class of linear connections compatible with it and with each other, i.e. inducing a transport of congruence classes along curves and all inducing the same such transport. A typical formulation of Weyl which suggests this reads: 21 "If we are to give a complete description [1] of the metric at any point $P_0$ of the manifold and [2] of the metrical connection of this point with the points of its neighbourhood then we have to say [1] which of the linear mappings of the tangent space at $P_0$ onto itself and [2] which of its linear mappings onto the tangent spaces at the points P in an infinitesimal neighbourhood of $P_0$ are 'congruent' mappings".
17 More accurately stated, we have obtained the concept of a group field of a given nature $Q_3$ where $Q_3$ is a class of conjugate Lie subgroups of the general linear group, see Scheibe 1957, §2. No group within $Q_3$ is distinguished by our concept. By contrast, the concept of a G-structure requires such a distinction. However, the generalization of Weyl's metric connection, to be given presently, also requires it.
18 Scheibe 1957, p. 175 Satz 1. With respect to the "historical myth" I want to destroy it should also be emphasized that, whereas an $O_k$-structure uniquely determines a torsionfree linear connection compatible with it, an $O_k$-field does not do this.
19 Weyl 1922 a, §1; 1922 b, V; 1923 b, 7. und 8. Vorlesung; 1923 c, §§ 17 und 19.
20 Weyl 1922 b, p. 221 (p. 344 of the Ges. Abh. II).
Now, one way of saying which of those linear mappings of the second kind are "congruent" mappings certainly is given by fixing a class of linear connections as suggested. This would even be the most direct, if not the most elegant, way. Indeed in describing his own geometries Weyl did not use it. Instead, he introduced a connection in the pseudo-riemannian gauge bundle. In the previous account I have suggested a natural generalization of this method. But on both levels of generality the two ways of presentation are equivalent: In Weyl's geometries the class of linear connections $\Lambda^s_{kr}$ associated with the metric connection $\varphi_r$ is given by
$$g_{is}\,\Lambda^s_{kr} + g_{ks}\,\Lambda^s_{ir} = \frac{\partial g_{ik}}{\partial x^r} + g_{ik}\,\varphi_r .$$
In general the class consists of those linear connections that are reducible to N and induce the given generalized Weyl connection on the N/G-bundle. 22
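Within this class the torsionfree member can be singled out explicitly for Weyl's geometries. The following worked step is a sketch added for illustration. It assumes that the compatibility condition reads as in its first line, and the lowered symbol $\Gamma_{i,kr} = g_{is}\Gamma^s_{kr}$ is a notational assumption.

```latex
% Compatibility condition, assumed in Weyl's form with metric connection \varphi_r:
g_{is}\Gamma^s_{kr} + g_{ks}\Gamma^s_{ir}
  = \frac{\partial g_{ik}}{\partial x^r} + g_{ik}\,\varphi_r .
% Put \Gamma_{i,kr} := g_{is}\Gamma^s_{kr} and impose torsionfreeness,
% \Gamma_{i,kr} = \Gamma_{i,rk}. Writing the condition for the index
% arrangements (ik,r), (ir,k), (kr,i) and forming the combination
% first + second - third yields
\Gamma_{i,kr}
  = \tfrac{1}{2}\left( \frac{\partial g_{ik}}{\partial x^r}
      + \frac{\partial g_{ir}}{\partial x^k}
      - \frac{\partial g_{kr}}{\partial x^i} \right)
  + \tfrac{1}{2}\left( g_{ik}\,\varphi_r + g_{ir}\,\varphi_k - g_{kr}\,\varphi_i \right) .
```

For $\varphi_r = 0$ only the first bracket remains, which is the Levi-Civita connection of $g_{ik}$ with its upper index lowered.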
21 Weyl 1923 b, p. 47, my italics.
22 Of course, reasonable conditions have to be imposed on this connection such that the existence of a linear connection reducible to N and inducing the former can be proven.
Let us now look at the use Weyl makes of his conception. 23 The first consequence that he draws from the premise that every metric (with the new second component inserted) uniquely determines a torsionfree connection is:
(A) Every system of $n^3$ numbers $\Lambda^i_{jk}$ can be represented in one and only one way in the form
$$\Lambda^i_{jk} = A^i_{jk} + \Gamma^i_{jk}$$
where $\Gamma^i_{jk} = \Gamma^i_{kj}$ and the n matrices $(A^i_{j1}), \ldots, (A^i_{jn})$ belong to the Lie algebra of G.
Looking at this we can literally see how Weyl's conception is put to use: In any given class of connections of the kind in question there is one and only one torsionfree connection, and since any two connections in the class induce the same transport of congruence classes their difference must belong to the Lie algebra of G. 24
I can see only one alternative to the foregoing interpretation. According to this the second component of a G-metric would consist in one linear connection compatible with the G-field. Call this a strong G-metric. Then (F') would assume the form
$(F'_4)$ If every strong G-metric admits one and only one torsionfree linear connection compatible with the G-metric then G is a group $O_k$.
As opposed to the foregoing interpretation this would be a fourth case of characterization different from the previous ones: If Weyl should have meant this case then he would really have characterized, not his own, but somewhat stronger geometries dealing with more detailed structures. Since it is unlikely that he should not have recognized this, the variant of $(F'_3)$ just suggested, and not $(F'_4)$, now seems to me to be the correct interpretation. 25
The algebraization (A) opens that part of Weyl's enterprise that I called its mathematical part: Having obtained (A) we have done with the conceptual problems and the mathematical difficulties begin. Weyl's proof starting with (A) was elementary but very involved. It was considerably simplified by Cartan and, later on, by Freudenthal. 26 I do not want to enter into this part of Weyl's enterprise, and I make no claims whatsoever about it. What I wanted to make clear up to this point was that as a matter of historical fact we have, not one, but four different differential geometric characterizations founded on what Weyl had called the fundamental theorem of infinitesimal geometry. The differences concern the geometries to be characterized as well as the generalizations by which the characterizations are obtained. They are certainly not too important but, as I see it, at least worth mentioning: Weyl's geometry and the geometry of G-metrics are genuine generalizations of pseudo-riemannian geometry and the geometry of G-structures respectively. The relation between the third and the fourth case is different: Here the linear connection of a strong Weyl metric and a strong G-metric is a more detailed structure than is the metric connection of a Weyl metric and a G-metric respectively. The former induces the latter and widely differing linear connections may induce the same metric connection. 27
23 It should be noticed that the otherwise somewhat clumsy concept of a class of linear connections related to a group field in the manner indicated has the advantage that it avoids the explicit dependency on the group G: It thus makes the whole concept of a metric dependent only on a class of conjugate groups representing the nature of the metric. This is very much in the spirit of Weyl! The following lemma (A) is in Weyl (1923 b), p. 49 f.
24 A second postulate used in this argument - the so-called postulate of the freedom of the metric - can be proven, cf. Weyl 1923 b, p. 49, Scheibe 1957, p. 196, Hilfssatz 1.
25 In Scheibe 1957 I had chosen the strong case as the correct interpretation. The reader may go through the relevant passages in Weyl and convince himself that this, if a mistake, was an excusable one.
26 Cartan 1923; Freudenthal 1960.
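The algebraic statement (A) can be checked numerically in the simplest instance. The sketch below is an added illustration under explicit assumptions, not Weyl's proof: G is taken to be the orthogonal group of the standard euclidean form, so that its Lie algebra consists of the antisymmetric matrices and upper and lower indices need not be distinguished, and (A) is read as the decomposition of a system `lam[i, j, k]` into a part `gamma` symmetric in (j, k) and a part `a` antisymmetric in (i, j) for every fixed k. The function name `decompose` and the variable names are hypothetical.

```python
import itertools
import random


def decompose(lam, n):
    """Split lam[i, j, k] into (a, gamma) with lam = a + gamma, where gamma is
    symmetric in (j, k) and a is antisymmetric in (i, j) for every fixed k."""
    triples = list(itertools.product(range(n), repeat=3))
    # Symmetrizing in (i, j) removes the antisymmetric part a:
    # s[i, j, k] = gamma[i, j, k] + gamma[j, i, k].
    s = {(i, j, k): lam[i, j, k] + lam[j, i, k] for i, j, k in triples}
    # The same index shuffle that solves for the Christoffel symbols.
    gamma = {(i, j, k): 0.5 * (s[i, j, k] + s[i, k, j] - s[j, k, i])
             for i, j, k in triples}
    a = {t: lam[t] - gamma[t] for t in triples}
    return a, gamma


n = 3
rng = random.Random(0)  # fixed seed so that the run is reproducible
lam = {t: rng.uniform(-1.0, 1.0) for t in itertools.product(range(n), repeat=3)}
a, gamma = decompose(lam, n)
```

The decomposition is the only possible one in this case: a system that is antisymmetric in its first two indices and symmetric in its last two vanishes identically, and the two subspaces have dimensions $n \cdot n(n-1)/2$ and $n \cdot n(n+1)/2$, which add up to $n^3$.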
Logical Analysis: Two Types of Axiomatics

In the concluding part of my paper I come back to the question: In what sense does the new situation in physics, brought about by the general theory of relativity, lead - as I have quoted from Weyl - "to a totally new type of axiomatics"? I take it for granted that the four characterizations that were reviewed are all of the same type. Just to remind us of their common feature, here is a restatement of Weyl's postulate as starting point for the following discussion:

(WP) For every metrical manifold of nature Q there is one and only one torsionfree linear connection compatible with the metric.

In three of our four cases the nature Q has been identified as a linear group taken modulo an equivalence class of conjugate groups. In any of these cases the only free variable in (WP) is the nature Q, and, therefore, what has been characterized was - strictly speaking - a subclass of linear groups or rather of equivalence classes of groups: Asked what satisfies our postulate we would have to answer: the orthogonal groups and only these. This, however, would leave us unsatisfied. What we really want to know as being characterized is the kind of thing that our groups stand for. Now, what do they stand for? The first thing to be observed is that the type of axiomatics to which (WP) belongs is not the rather common type that may perhaps most aptly be called the type of re-axiomatizations. It is this which Weyl wants to express when he says that the new axiomatics "no longer have space equipped with a definite metric field as its object". 28 Indeed, suppose we are given a metrical manifold in the sense of (WP). Looking at this structure alone we can never decide whether it itself or anything

27 In completing this part of my paper I received valuable advice from Prof. Jürgen Ehlers (Garching).
28 Weyl 1923b, p. 46.
else related to it satisfies the postulate. By contrast, in the case of a mere re-axiomatization this would be possible. Take euclidean geometry as an example. A typical procedure of re-axiomatization was first its generalization as riemannian geometry and, second, the addition of a condition, formulated in terms of riemannian geometry and leading back to euclidean geometry, e.g. the condition that the curvature tensor vanishes. Here, the new axioms characterize the old ones if and only if they have the same models. In particular, it remains meaningful to investigate the structure our old theory was talking about as to whether it satisfies the new axioms. It remains to be seen whether this is a matter of principle in our case. But for the moment it may have become clear that (WP) as it stands is not a re-axiomatization of any of the geometries to be characterized by it. With respect to the structures being possible models of these geometries (WP) is rather a metastatement: It says something about a theory about structures, parametrized by the nature Q (here: by equivalence classes of linear groups), and the new kind of axiomatics consists of distinguishing a subfamily of theories within a given family: the subfamily consisting of those theories for which the postulate as a metastatement is true. Raising the new type of axiomatics to such a higher level status is, of course, not to dismiss re-axiomatization as some kind of inferior business. It is only to recognize a remarkable difference. Not only are there very important and highly nontrivial programs of re-axiomatization, e.g. the von Neumann program of re-axiomatizing quantum mechanics. In order to fulfill these programs we are even in need of guiding principles having the very status of metastatements. From a logical point of view the proposal to look at (WP) as a family of statements about theories is in need of explication.
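The contrast just drawn between a re-axiomatization and a metastatement can be put schematically. The notation below ($\Sigma_{\mathrm{old}}$, $\Sigma_{\mathrm{new}}$, $T_Q$, $P$) is introduced only for this sketch and is not the author's; the euclidean example is to be read locally and up to the obvious identification of structures.

```latex
% (a) Re-axiomatization: old and new axioms speak about the same structures S,
%     and success means sameness of models.
\forall S.\;\; \Sigma_{\mathrm{old}}[S] \;\leftrightarrow\; \Sigma_{\mathrm{new}}[S]

% Example from the text: euclidean geometry recovered inside riemannian geometry
% by the added condition that the curvature tensor R vanishes.
\forall S.\;\; \Sigma_{\mathrm{eucl}}[S] \;\leftrightarrow\;
   \bigl( \Sigma_{\mathrm{riem}}[S] \,\wedge\, R[S] = 0 \bigr)

% (b) Metastatement: a condition P on theories T_Q of a family indexed by Q.
%     What is singled out is a set of theories, not a set of structures.
\{\, Q \;\mid\; P(T_Q) \text{ holds} \,\}
```

In case (a) a single structure S can be tested against the new axioms; in case (b) no single structure settles the matter, since $P$ speaks about the theory $T_Q$ as a whole.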
Since our postulate is of interest mainly with regard to physics I may first remind us of the probably most widely known metastatements occurring in physics: statements of the invariance of a physical law. Wigner liked to point out the proportion that just as in a physical law we talk about events, in a principle of invariance we talk about a physical law. 29 This formulation already suggests the metastatus of an invariance principle, and its most adequate reconstruction is as follows: The physical law is expressed by a sentence in the theory language, and the invariance statement is a statement about this sentence, namely the statement that a certain other sentence expressing the invariance of the first is provable from the axioms of logic or - at the very least - of that part of mathematics on which the physical theory in question is based. Another, much more involved example of a metastatement occurring in physics is what I take to be the truly important statement in the context of general covariance. It says that the axioms of general relativity, being founded on the pseudogroup of differentiable coordinate transformations, cannot be reduced to a smaller pseudogroup of coordinate transformations in the sense in which,

29 Wigner 1979, p. 16 f.
say, Maxwell's theory in its generally covariant formulation can be reduced to the group of Lorentz transformations. 30 Now, in (WP) we are not confronted with as complicated a matter as that but rather with simple provability statements. However, it is not one such statement but a whole family depending on the parameter Q. Thus I am inclined to look at (WP) as being a family of metastatements where, given Q, the statement corresponding to Q is about a sentence, likewise depending on Q, and what it says about it is that a certain other sentence, containing the first, is provable. The first sentence

$\alpha_Q[S]$    (1)

is the axiom of the geometry in question, saying that S is such and such a metrical manifold of nature Q. The second sentence is of the form

$(WP)_Q$:  $\forall S.\ \alpha_Q[S] \rightarrow \beta_Q[S]$

where $\beta_Q[S]$ expresses that geometrical "fact" on which the characterization is based. It is, therefore, nothing but a reformulation of (WP). The metastatement belonging to Q then is

$\vdash (WP)_Q$    (2)

where provability refers to some codification of mathematics, e.g. the system ZF of set theory. For some values of Q these metastatements are true, for others they are false. The characterizations reviewed previously have taught us, for each of the four cases distinguished, which of them are the true ones. More precisely, if dis(Q) is the distinguishing condition on Q, i.e. that Q is an orthogonal group $O_k$ (or an equivalence class of such), then our main theorem, i.e. (F) and (F') taken together, is the statement

$\vdash (WP)_Q \leftrightarrow \mathrm{dis}(Q)$    (3)

From (3) follows

$\vdash (WP)_Q \iff\ \vdash \mathrm{dis}(Q)$,    (4)

and this is what was just said about (2). The metatheoretical status of the statements (2) is underlined by the fact, obvious from the quantifier occurring in $(WP)_Q$, that in terms of models we could not decide upon the truth of (2) by looking at any one of the possible models. That our subject matter is not one that demands this feature is shown by a theorem due to Kobayashi and Nagano. 31 In their paper the authors present this theorem as another version of Weyl's theorem. This is

30 Scheibe 1982c (this vol. sect. VII.31).
31 Kobayashi and Nagano 1965, p. 86, theor. 1.
even more misleading than the cases already mentioned. For their theorem does not only concern geometries other than Weyl's but in addition is based on a differential geometric fact that is quite different from, though related to, the fact that Weyl had chosen as the basis of his characterization. It is the fact that given any tensor field with one contravariant and two covariant indices, skew-symmetric in the latter, there is one and only one linear connection having this field as its torsion and compatible with a metric described by (1). With $\gamma_Q[S]$ expressing this fact the postulate corresponding to (2) would now be

$\vdash (KNP)_Q$    (2')

with $(KNP)_Q$:  $\forall S.\ \alpha_Q[S] \rightarrow \gamma_Q[S]$, and the main theorem would be

$\vdash (KNP)_Q \leftrightarrow \mathrm{dis}(Q)$.    (3')

Indeed, the whole story reviewed in the previous section could be repeated with respect to the new fact expressed by $\gamma_Q[S]$. However, Kobayashi and Nagano obtain a result - for the pseudo-riemannian case - even stronger than (3'). Obviously, the conclusion $\gamma_Q[S]$ used by them is definitely stronger than the one envisaged by Weyl, the Weyl case being the case where the given tensor field is zero. Accordingly, the exact counterpart of Weyl's theorem with the new conclusion as its premise, i.e. (3') from left to right, would be weaker than the original. But Kobayashi and Nagano can make up for this by proving that, given any G-structure having the property just introduced, G is orthogonal. It follows from this that, given G, either all G-structures or none has that property. So we come to realize that even the logical status of the Kobayashi-Nagano characterization is different from that of Weyl's: The structure of the theorem now is

$\vdash \bigl(\mathrm{dis}(Q) \rightarrow \forall S.\ (\alpha_Q[S] \rightarrow \gamma_Q[S])\bigr) \wedge \bigl(\neg\mathrm{dis}(Q) \rightarrow \forall S.\ (\alpha_Q[S] \rightarrow \neg\gamma_Q[S])\bigr)$    (5)
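That (5) is at least as strong as (3') can be checked in a few lines, provided the geometry has a model at all. The sketch below assumes that $(KNP)_Q$ abbreviates $\forall S.\ (\alpha_Q[S] \rightarrow \gamma_Q[S])$ and that $\exists S.\ \alpha_Q[S]$ is provable; both are assumptions of this sketch, not statements taken from the text.

```latex
% Notation (assumed): (KNP)_Q := \forall S.\,(\alpha_Q[S] \rightarrow \gamma_Q[S]).

% Right to left in (3'): the first conjunct of (5) is literally
\mathrm{dis}(Q) \rightarrow (KNP)_Q .

% Left to right in (3'): assume (KNP)_Q and \neg\mathrm{dis}(Q).
% The second conjunct of (5) then gives
\forall S.\,(\alpha_Q[S] \rightarrow \neg\gamma_Q[S]),
% and together with (KNP)_Q
\forall S.\,\bigl(\alpha_Q[S] \rightarrow (\gamma_Q[S] \wedge \neg\gamma_Q[S])\bigr),
\quad\text{i.e.}\quad \forall S.\,\neg\alpha_Q[S].

% This contradicts \exists S.\,\alpha_Q[S]; hence
(KNP)_Q \rightarrow \mathrm{dis}(Q).
```

The converse step from (3') to (5) is not available by logic alone: (3') with $\neg\mathrm{dis}(Q)$ yields only that some model fails $\gamma_Q$, whereas (5) requires that every model fails it.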
In semantical terms: Within our family of theories $\alpha_Q$ we have completeness with respect to $\gamma_Q$: the latter, if true, is provable, and, if false, is disprovable. 32 My concluding remark concerns the epistemological lesson that can be drawn from the foregoing analysis. Weyl interpreted his own achievement in terms of Kant's epistemology. About the nature of a metric field he said: 33 "in it the a priori essence of space-time structure manifests itself". And

32 This comparison of (3) and (5) is on safe ground only for our second case, i.e. for pseudo-riemannian manifolds and G-structures as their generalizations. For the third (or even the fourth) case (5) would still have to be proven.
33 Weyl 1923b, p. 45.
he continues: "By contrast, a posteriori ... is the relative orientation of the metrics at the various points". In another place 34 he describes his postulate as constituting the "synthetic part in KANT's sense", thus separating it from "the mere conceptual analysis and explication of what is contained in the concepts of a metric, metrical connection and parallel displacement as such". These quotations, I think, can be paraphrased by saying that Weyl wanted to re-establish Kant's position on the higher level of the nature of a metric instead of the level of the metric itself. That this really was his intention is confirmed by a later review 35 where he says with respect to his great achievement: "A way to understand the Pythagorean nature of the metric ... exactly through the separation of a priori and a posteriori has been pointed out by the author". Thus we here have a truly ambitious philosophical claim, and it is no wonder that it has found its critic 36: "of course, the derivation of [the pythagorean] metric from Weyl's hypotheses does not really solve the problem of space unless there is reason to take the latter as a priori true. Weyl said next to nothing on this subject, but it is clear that he thought or hoped one could so take them". Indeed Coffa can confirm this view by quoting Weyl saying 37: "It may be that the postulate of the unique determination of 'straight progression' can be justified on the basis of the requirements posed by the phenomenological constitution of space" (my italics). But Weyl also says 38: "In the case of physical space it is possible to counterdistinguish aprioristic and aposterioristic features in a certain objective sense without, like Kant, referring to their cognitive source or their cognitive character" (my italics). And in saying this, Weyl has in mind the new "great divide" between the a priori nature of spacetime and the a posteriori relative orientation of the metric field at the various points.
Trying to "understand" the former (as the only part left for us to be understood) Weyl starts with his concept of Q-geometry, and declares this to be the analytical part of his enterprise: We just have to begin with some very general concept the choice of which does not yet commit us to any substantial view of reality. The next step is to narrow the possibilities for Q. Here it has to be emphasized that although the decision for any one of the Q's is synthetic the resulting theory $\alpha_Q$ is still so general that this decision cannot be an empirical affair or - at the very least - cannot be made on empirical grounds alone. Weyl's solution was to make the question a priori in so far as he gave the restricting postulate $(WP)_Q$ the status of the metastatement (2) which, if true, is true a priori. Of course, $(WP)_Q$ in postulating the theoremhood of a certain statement $\beta_Q$ relative to the geometry $\alpha_Q$ receives a new constituent,

34 Weyl 1923b, p. 49.
35 Weyl 1949, p. 137.
36 Coffa 1979, p. 277.
37 Weyl 1949, p. 137.
38 Weyl 1949, p. 134.
namely that very statement $\beta_Q$, and everything now seems to depend on what this $\beta_Q$ says. Although I shall come back to this question it is worthwhile to pause on the level where the postulate has taken the form that the geometries to be distinguished are those and only those among all Q-geometries in which a certain (Q-dependent) statement $\beta_Q$ of a given type is a theorem. It is worthwhile, because it stresses the a priori character of the enterprise: Whatever the content of $\beta_Q$ may be, once this statement has been fixed the decision in question is an almost logical one. Moreover, taking into account what $\beta_Q$ says we immediately see that what $(WP)_Q$ says cannot be known by empirical means. For it says that for every metrical spacetime such and such is the case. Now, universal statements may very well be empirical. But this one cannot. It talks about a totality only one element of which, if any, can be realized. Finally, looking at the alternative proposal $\gamma_Q$ (for $\beta_Q$) of Kobayashi and Nagano we see that the question whether $\gamma_Q$ holds in any given world S becomes an a priori affair in the sense that, if the geometry $\alpha_Q$ holds in S - i.e. if $\alpha_Q[S]$ - then that question is equivalent to whether $\gamma_Q$ can be proved or refuted from $\alpha_Q$. It does not seem to be known whether this result also holds for Weyl's (weaker) $\beta_Q$. The foregoing observation makes this an interesting subject of further research. A final look at the content of $\beta_Q$ should start with the remark that, as a matter of course, Weyl did not set out to convince people that they could not have done physics without violating his postulate with respect to $\beta_Q$. Newtonian gravitational theory is founded on a metric the nature of which is galilean and as such does not satisfy Weyl's postulate. 39 But the postulate does show what, in an important respect, the difference between Newton's and Einstein's theory is. Besides the Q-metric already present in $\alpha_Q$ a torsionfree linear connection is introduced in $\beta_Q$.
The assumptions determining the latter are analytical as were those determining the Q-metric. Nonetheless, both concepts concern very elementary experiences about measurement and parallel displacement. Furthermore, there is the most natural assumption, combining the two, that the displacement respects congruence. It is then but another version of the intimate relation between gravitation (represented by the connection) and metric postulated by Einstein that is expressed in Weyl's $\beta_Q$. 40 It is particularly this aspect under which Weyl's achievement deserves more attention than it has received hitherto.

39 Künzle 1972. It is interesting that although Newton's theory in a sense is a limiting case of Einstein's, Weyl's postulate does not survive the limiting process.
40 Weyl (1923c), § 29
VII.33 Covariance and the Non-Preference of Coordinate Systems*
I

In his famous paper of 1916 on the foundation of general relativity Einstein formulated the following principle, which he himself calls "the postulate of general covariance" 1:

(C) The general laws of nature are to be expressed by equations that are valid for all coordinate systems, i.e. that are covariant under arbitrary substitutions.

Einstein viewed this principle as a strengthening of what he in the same paper calls "the postulate of general relativity" 2. This postulate is given the following formulation:

(R) The laws of physics have to be such that they are valid for arbitrarily moved reference systems.

Einstein argues that the relativity postulate follows from the covariance postulate by saying 3: For among all substitutions we at any rate find those which correspond to all relative motions of the (three-dimensional) coordinate systems. In arguing thus Einstein obviously assumes that the reference systems mentioned in the relativity postulate can be described by certain coordinate systems mentioned in the covariance postulate such that the relative motions of the former are described by certain transformations (= substitutions) of the latter. Under this assumption the relativity postulate says the same of every element of a certain set as the covariance postulate says of every element of a much larger set. Obviously, the former would then follow from the latter. Only one year after Einstein's basic paper Kretschmann objected to the above argument "that every physical theory, by a mere mathematical ... modification of the equations representing the theory and without changing its content, can be forced to satisfy even the most general relativity postulate" 4. In his rejoinder Einstein immediately admitted this objection, and the general opinion of the physicists - if there is such a thing - also seems to have become, in Einstein's words 5, "that, of necessity, every empirical law can be given a general covariant formulation". In spite of this

* First published as Scheibe 1991f.
1 Einstein 1916b, sect. 3
2 ibid. sect. 2
3 ibid. sect. 3
4 Kretschmann 1917, p. 576
5 Einstein 1918, p. 242
giving in and in spite of this general tendency Einstein continued to argue that physics is in need of a general principle of relativity and that this can be achieved by founding it on a principle of general covariance. Thus, to give but one example, in the widely known book "The Evolution of Physics", written by Infeld but authorized by Einstein, we read in the passage from special to general relativity 6: "Can we formulate physical laws so that they are valid for all coordinate systems, not only those moving uniformly, but also those moving quite arbitrarily, relative to each other? If this can be done, our difficulties will be over." Now, whatever is meant by the difficulties to be overcome, it is suggested that we get rid of them merely by satisfying the principle of general covariance (which here is lumped together with the principle of general relativity). And the theory of general relativity then is introduced precisely by this step. In view of Einstein's adherence to his original standpoint in spite of his admission one can hardly escape the conclusion - believe it or not - that in formulating the principles quoted above Einstein did not succeed in fully expressing the idea that was really important to him. This idea must have been of a sort that would not allow admitting for it what Einstein did admit for its seeming equivalent. Already in the paper of 1916 it is spelled out with respect to both principles. In the sentence immediately preceding the quoted covariance postulate (C) Einstein summarizes a foregoing argument by saying 7:

(C') There is no alternative but to view all possible coordinate systems as being placed on the same footing in principle (prinzipiell gleichberechtigt).

Similarly, with respect to the relativity principle (R) Einstein concludes a corresponding argument by saying 8:

(R') Of all possible spaces in relative motion to each other no one must be preferred a priori.
And immediately after this sentence there follows the above quoted postulate of general relativity (R). These facts about the text, I think, leave no doubt that in formulating (C) Einstein wanted to express, and thought he had expressed, what he had said more informally and provisionally by (C'), and similarly with respect to (R) and (R'). But not only the actual wordings of (C) and (C'), as well as (R) and (R'), but also the arguments preceding these sentences show that something different from (C) and (R) was meant by (C') and (R') respectively. Newton's gravitational theory in its Poisson version can easily be given a generally covariant (equivalent) formulation 9. But this does not do away with the fact that there are preferred (galilean) coordinate systems. Covariance with

6 Einstein/Infeld 1938, p. 212
7 Einstein 1916b, sect. 2
8 ibid. sect. 3
9 Cartan 1923/4
respect to a set of coordinate systems may be necessary in order that none of those systems is preferred. But, as the newtonian example clearly shows, it is by no means sufficient. Therefore, since Einstein wanted to avoid preferred coordinate systems (or, for that matter, reference systems) the question remained which stronger principle, not only necessary but also sufficient for the democracy among the systems, distinguished the theory of general relativity from its predecessor. Up to this point my argument concerned covariance as well as relativity. This was because the text of the paper in question shows that Einstein made his mistake with respect to both, and this may make my claim that there is something wrong more plausible. From now on I shall concentrate on covariance, thereby deliberately leaving out the physically more interesting but, at the same time, much more difficult part of the matter.

II
To obtain precise explications of the postulates (C) and (C') we have to get a clear idea of that part of a physical theory with respect to which those postulates can be given a precise reformulation. It turns out that the part usually called the "formalism" of a theory is sufficient for this purpose. A concept of formalism recently successfully applied to foundational studies is the concept of species of structures in the sense of Bourbaki 10. Roughly, the axioms of a theory then are statements

$\Sigma[X; s]$    (1)

about structures $\langle X; s \rangle$ where X is a finite system of principal base sets and s a finite system of elements of scale sets over the X, i.e. of sets generated by successively applying the operation of power set of a cartesian product, starting with the X. In this way the base sets X are structured by the typified sets s. It will be important for us that "concrete" sets such as the set of real numbers may, as so-called auxiliary base sets, also take part in the process. Species of structures, well known from mathematics, are the algebraic ones, e.g. groups, rings, vector spaces etc., as well as the species of topological spaces, differentiable manifolds etc., but also mixed forms such as Hilbert spaces, Lie groups, etc. That also the formal part of a physical theory can be reconstructed as a species of structures will become plausible in particular through the species of geometrical structures that we are going to study. Before doing so we have to introduce a second concept on the general level: equivalence of species of structures. For already Einstein's statement, connected with (C), "that every empirical law can be given a generally covariant formulation" signals that general covariance is not the invariable concomitant of a theory. The most we can expect in general is the existence of a covariant version of a theory which then, of course, must be equivalent

10 Bourbaki 1968, Ch. IV
to the original one. Similarly, the idea of a class of preferred coordinate systems, related to (C'), essentially means that we could have a version of our theory, equivalent to the original one, in which we can get along with that restricted class. In both cases equivalence includes the actual change of the internal concepts of the theory in question, i.e. a change of the typified sets s. We therefore have to provide for equivalence transformations

$t = q[X, s], \qquad s = q^{-1}[X, t]$    (2)
which have to satisfy certain obvious conditions. A familiar equivalence of this type is, for instance, the one between boolean algebras and boolean lattices. In our further investigations we shall be concerned with certain species of geometrical structures onlyll. But it was important to define the concept of equivalence on a most general level. Approaching geometrical theories we begin with the rather special but fundamental case of coordinate geometries. A coordinate geometry is about a space X structured by a set F of (local) coordinate systems and by nothing else. The coordinate systems, relating subsets of X to subsets of lR n , are transformed by (local) coordinate transformations of a given pseudo-group G of ]Rn12. The best known coordinate geometry is the species of Coo-differentiable manifolds. in this case G is a very large pseudo-group Goo. But we can choose also a relatively small group as, for instance, the Galileo group or the Lorentz group or the euclidean group (of ]Rn). With them are associated well known classical geometries underlying various physical theories. It is true that these geometries usually are not formulated as coordinate geometries 13 . But they are equivalent (in the sense of (2)) to coordinate geometries. The matter is closely related to Felix Klein's Erlangen program 14 , and we may therefore call species of structures equivalent to coordinate geometries Klein geometries. Evidently not every geometry in the traditional sense, e.g. riemannian geometry, is a Klein geometry, - let alone physical theories in general. Let us, therefore, generalize the concept of coordinate geometry by the concept of analytical geometry. The structures that are the subject of these geometries are species structured not only by a set F of coordinate systems but also by further typified sets s, - geometrical objects as they are sometimes called. 
Following the somewhat old-fashioned way of presenting geometry 15 we assume the s not only to have but just to be coordinate representations: they are partial (!) functions
(3a)
assigning to every coordinate system $\varphi$ of a subset $F_s$ of F an element $s(\varphi)$ in the representation space $D_s$ of s. $D_s$ is a subset of a scale set constructed

11 For the following see Scheibe 1982c (this vol. VII.31)
12 For details see Iyanaga/Kawada 1977, 92 D and (a narrower concept) 108 Z.
13 A recent exception is Dixon 1978, pp. 42ff
14 Klein 1925
15 The standard monograph is Schouten 1954
from $\mathbb{R}$, and the underlying pseudogroup G is represented on $D_s$ such that in the intersection of any two $\varphi, \psi \in F_s$,
(3b)
where $\Gamma_s$ is the representation of G. Well known examples of geometrical objects in the sense of (3) are fields. For them
(4)
and the elements of $D_s$ in fact are partial functions on $\mathbb{R}^n$ with values in $\mathbb{R}^{n^{k+l}}$. Riemannian metrics, Lorentzian metrics, affine connections are special cases of fields. But also curves and submanifolds can be viewed as geometrical objects covered by (3). It is, of course, true that such objects as well as fields can also be described by an intrinsic method not directly making use of coordinate systems. In fact, the intrinsic representation of geometrical objects has become fashionable of late 16. However, although this method may be more adequate in some cases, it has to be emphasized that the problems connected with covariance and the non-preference of coordinate systems would become meaningless if we were to follow that method throughout. This may be evident already from the wordings of the principles (C) and (C'): the term "coordinate system" explicitly occurs in them, and it is not to be seen how it could be eliminated without depriving the principles of their essential content. Part of this content is already contained in an assumption that we have to make if we now complete this outline of the concept of analytical geometry. Up to this point we were concerned mainly with the typification of the sets s. If it now comes to the axioms proper it is evident that a condition of invariance has to be imposed on them: Since our geometrical objects are represented by coordinate representations and since these representations are in general different from coordinate system to coordinate system, what we want to say about those objects s in terms of their representations $s(\varphi)$ must be invariant under coordinate transformations.
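The displayed formulas (3a) and (3b) are not legible in this copy of the text. A reading consistent with the surrounding description is sketched below; it is an assumption of this sketch, not a reconstruction of the author's own formulas, and the tensor law at the end is the familiar textbook instance rather than a quotation.

```latex
% (3a), assumed form: a geometrical object is a partial function on F,
% defined on a subset F_s of coordinate systems.
s : F \supseteq F_s \longrightarrow D_s, \qquad \varphi \mapsto s(\varphi)

% (3b), assumed form: the representations in two overlapping coordinate
% systems are related by the representation \Gamma_s of the pseudo-group G.
s(\psi) = \Gamma_s\bigl(\psi \circ \varphi^{-1}\bigr)\, s(\varphi),
\qquad \varphi, \psi \in F_s, \quad \psi \circ \varphi^{-1} \in G

% Familiar instance: a covariant tensor field of rank 2, with x = \varphi(p),
% x' = \psi(p) the coordinates of the same point p.
g'_{\mu\nu}(x')
  = \frac{\partial x^{\alpha}}{\partial x'^{\mu}}\,
    \frac{\partial x^{\beta}}{\partial x'^{\nu}}\;
    g_{\alpha\beta}(x)
```

On this reading the invariance condition stated in the text says that an axiom true of $s(\varphi)$ must remain true of $\Gamma_s(\psi \circ \varphi^{-1})\, s(\varphi)$ for every admissible $\psi$.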
If, for instance, in differential geometry ($G = G_\infty$) we were to establish the relation between a tensor field $g_{\mu\nu}$ and an affine connection $\Gamma^{\lambda}_{\mu\nu}$ that the covariant derivative of the former with respect to the latter vanishes we would express this as usual by
(5)
using any coordinate system. Now the essential thing about a statement like (5) is that it is invariant under coordinate transformations (3b): As a

16 See Misner/Thorne/Wheeler 1973. What I am emphasizing is that, although the definition of, say, the concept of a vector field need not refer to coordinate systems, the definition is based on the concept of a differentiable manifold and this concept usually is defined by using coordinate systems.
consequence of the special transformation rules for the $g_{\mu\nu}$ and $\Gamma^{\lambda}_{\mu\nu}$, if (5) holds in one coordinate system it holds in any other. Obviously, this is what we have to require in general: It is on pain of inconsistency that we have to require that something said about s(
(6b)
for the equivalence transformation q that defines the set $F_1$ of coordinate systems of a space $\langle X; F_1, s_1 \rangle$ belonging to $\Sigma_1$ from the corresponding set F of $\langle X; F, s \rangle$ belonging to $\Sigma$. The best known examples of such equivalences are the differential geometric reformulations of the classical geometries. Let us, for instance, define (locally) affine geometry $\Sigma_{\mathrm{aff}}$ as the coordinate geometry belonging to the pseudo-group $G_{\mathrm{aff}}$ of all (locally) affine transformations of $\mathbb{R}^n$. Then $\Sigma_{\mathrm{aff}}$ is equivalent to the differential geometry (i.e. $G_1 = G_\infty$) $\Sigma^{\infty}_{\mathrm{aff}}$ of a flat affine connection. Clearly (6a) holds for this case, and (6b) is a consequence of the natural definition of $F_1$ as being just the set of coordinate systems on X generated by F and $G_\infty$. Similar situations occur by reformulating (locally) euclidean geometry as a special case of riemannian geometry and (locally) minkowskian geometry as the species of flat lorentzian manifolds. It is worthwhile to pause for a moment and ask how the phenomenon described by (6) is possible. In the examples given so far one of the geometries was supposed to be a coordinate geometry. A case more typical for the general situation is the following. We consider a field theory governed by the simple relativistic wave equation
(7a)
based on minkowskian geometry as a coordinate geometry. The amazing thing about a differential geometric formulation of this theory is that the equation (7a), though invariant under the Lorentz group, simply is not invariant under arbitrary coordinate transformations of $G_\infty$. On the other hand, the wanted formulation certainly has to include an equation that is invariant under the transformations of $G_\infty$. How does this come about? The answer is that what is at work in (7a) is not only the wave function f but also the minkowskian metric g which, however, is disguised since it enters the stage only through special coordinate systems for which

$g_{00} = 1, \quad g_{kk} = -1 \ \text{for } k = 1, 2, 3, \quad g_{\mu\nu} = 0 \ \text{for } \mu \neq \nu$    (8a)
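That the form (8a) singles out special coordinate systems can be checked numerically. The sketch below is illustrative only: it restricts attention to constant linear coordinate changes $x = A x'$, for which the metric components transform as $g' = A^{T} g A$, and all names (`ETA`, `transform_metric`, `boost_x`, `GALILEAN_SHEAR`, `has_form_8a`) are hypothetical helpers introduced here, not anything from the text.

```python
import math

# Metric components in the special form (8a): diag(1, -1, -1, -1).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def transform_metric(g, a):
    """Components of g in coordinates x' when x = A x' (constant linear change)."""
    return matmul(transpose(a), matmul(g, a))


def boost_x(v):
    """Lorentz boost along the first space axis, velocity v in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, gamma * v, 0, 0],
            [gamma * v, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


# A Galilean-type change x = x' + 0.5 t', t = t' (assumed example of a
# transformation lying outside the Lorentz group).
GALILEAN_SHEAR = [[1, 0, 0, 0], [0.5, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def has_form_8a(g, tol=1e-12):
    """True if the components of g agree with (8a) up to rounding."""
    return all(math.isclose(g[i][j], ETA[i][j], abs_tol=tol)
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    print(has_form_8a(transform_metric(ETA, boost_x(0.6))))
    print(has_form_8a(transform_metric(ETA, GALILEAN_SHEAR)))
```

The boost leaves the components in the form (8a), while the shear produces off-diagonal components; in the sheared coordinates the metric can no longer stay hidden, which is the point the text makes about the disguised role of g.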
Thus in fact our dramatis personae are f and g, and there is the wave equation
(7b)
($\nabla_\mu$ the covariant differentiation with respect to g) relating f and an arbitrary lorentzian metric g, and this equation is invariant under $G_\infty$ in precisely the same sense as (7a) is invariant under the Lorentz group. Of course, (7b) is still much too general. But if we require g to be flat by the equation
(8b)
($R^{\kappa}{}_{\lambda\mu\nu}$ the curvature tensor), likewise invariant under $G_\infty$, we are led back to the original equation (7a) through the existence of special coordinate systems with (8a). The study of such examples does, of course, mean little with respect to the question of general theorems related to our phenomenon. As regards theorems, the principles from which we started come to mind 17. First, we have seen Einstein suggesting that the laws of nature should be expressed by equations covariant with respect to arbitrary coordinate transformations. Translated into the terminology developed so far this would mean that those laws have to be formulated as axioms of a differential geometry. One reaction to this proposal is that, since we do not yet know the laws of nature, only the future development of physics will tell us whether Einstein was right. But then there came the objection that the postulate might be vacuous after all, - that we can always satisfy it whatever the final laws of nature may be. In contrast to the intention that Einstein may have had with his original proposal, the intention connected with its analytical version can hardly be

17 A different analysis of the principle of general covariance can be found in Weinberg 1972, pp. 91ff
anything but to bring about a proof of this version. If, however, we want to prove something we must give it a fairly precise formulation, replacing such expressions as ''the laws of nature" by some well defined concept of physical theory. Let us take as such a concept the concept of geometry developed in II. Then the logico-analytical version of the principle of general covariance becomes (C+) Every analytical geometry (with C ~ Coo) is equivalent to an (analytical) differential geometry, i.e. an analytical geometry having the pseudo-group Coo. Is this provable? For a proof we could proceed as follows. Let
$\Sigma[X; F, s]$ (9a)
be the given analytical geometry with pseudo-group C. Then the conjunction
$CC[X; F_\infty] \wedge \Sigma[X; F, s] \wedge F \subseteq F_\infty$ (9b)
where the first member is the coordinate geometry belonging to $G_\infty$, evidently is equivalent to (9a) with (6) being satisfied. To establish the equivalence we only have to define $F_\infty$ as being the set of coordinate systems generated by $F$ and $G_\infty$. But (9b) is not yet an analytical geometry with respect to $G_\infty$. We would still have to bring about the situation described in II, especially by (3). It is far from clear whether this can be done in each and every case, and we will not go any further into this matter.¹⁸ It must suffice to make the reader feel that as soon as we try to be a bit more precise in this business than is usual we find ourselves in a situation not easy to control.
IV
With respect to the equivalences of analytical geometries satisfying (6) there is complete symmetry between the two following questions:
(A) Given an analytical geometry $\Sigma$ with pseudo-group $G$. Is there an analytical geometry $\Sigma_1$ having a larger pseudo-group $G_1$, i.e. satisfying (6a), but still equivalent to $\Sigma$ in the sense of (6b)?
(A') Given an analytical geometry $\Sigma_1$ with pseudo-group $G_1$. Is there an analytical geometry $\Sigma$ having a smaller pseudo-group $G$, i.e. satisfying (6a), but still equivalent to $\Sigma_1$ in the sense of (6b)?
In the previous section we have discussed (A) for the extreme case that $G_1 = G_\infty$. A far reaching positive answer to (A) in this case was (C+). But we raised doubts as to its validity. The corresponding positive answer to (A') certainly is wrong: There is no logico-analytical version (C~) of (C') as there may be one for (C). Rather we have
18 For some further thoughts on the matter see Scheibe 1982c (this vol. VII.31).
(C+) There are differential geometries that are not equivalent in the sense of (6b) to any analytical geometry having a smaller pseudo-group, cf. (6a) with $G_1 = G_\infty$.¹⁹
An uninteresting instance of (C+) would be the coordinate geometry with pseudo-group $G_\infty$, i.e. the theory of infinitely often differentiable manifolds. But also Einstein's theory of general relativity, if it is given a suitable formulation, seems to be a candidate for (C+) although a proof is still missing. However, in pointing out (C+) I do not pretend to have found an adequate explication of Einstein's original (C'). Taken literally it in fact is an explication. But it grants a theory its virtue of not distinguishing special coordinate systems simply by letting its axioms be sufficiently weak. And this, in turn, does not seem to be a virtue of a (metatheoretical) principle. It is here that our decision to concentrate on covariance and the non-preference of coordinate systems leads to consequences showing that that viewpoint may be a bit too narrow. Nonetheless I shall conclude this paper by discussing some variations of the idea of non-preference of coordinate systems. To this end let me introduce two concepts related to the one in question. The essential concept entering (C') was:
(B) The analytical geometry $\Sigma_1$ with pseudo-group $G_1$ is not equivalent in the sense of (6b) to any analytical geometry $\Sigma$ having a smaller pseudo-group $G$ in the sense of (6a).
Consider now the following concept:
($B_1$) For any relevant condition on a coordinate system, if it can be proven from $\Sigma_1$ that there are coordinate systems satisfying that condition then it can also be proven that every coordinate system satisfies the condition. In other words: There is no condition for which it could be proven that some but not all coordinate systems satisfy it.
This is perhaps the most direct explication of the idea that in the geometry $\Sigma_1$ no coordinate systems are preferred to others: In the field theory defined by (7b) and (8b) there are privileged coordinate systems precisely in the sense that we can prove that in some coordinate systems (7a) (or (8a)) holds whereas in others it does not. The new concept ($B_1$) is stronger than (B). For by virtue of (6b) any reduction of the pseudo-group of $\Sigma_1$ immediately leads to a condition distinguishing certain coordinate systems. On the other hand, ($B_1$) would not hold for general relativity because for this theory there are conditions distinguishing certain coordinate systems without reducing $G_\infty$. The condition on a coordinate system adapting it to the light cones at every point of its domain is a case in point.
19 There are, of course, explications different from (C+). One possibility is to restrict the whole question to field theories in the sense of (4). Yet the problem of proving (C+) thus modified again is a matter not too easily settled.
Besides ($B_1$) there is another concept ($B_2$) related to (B) but presumably weaker than it. This concept was suggested by J. Anderson²⁰ and made more precise by M. Friedman.²¹ In the following I give my own version of the matter. Let $\Sigma_1$ be an analytical geometry whose pseudo-group $G_1$ of coordinate transformations is a group acting on $\mathbb{R}^n$.²² It may then happen that $\Sigma_1$ is categorical in the following restricted sense: With respect to the arguments "$X_1$", "$F_1$" and "$s_1$" in
(10)
any two models $\langle X_1; F_1 \ldots s_1 \ldots \rangle$ and $\langle X_1'; F_1' \ldots s_1' \ldots \rangle$ are isomorphic. If this happens and $\langle X_1; F_1 \ldots s_1 \ldots \rangle$ is a model of $\Sigma_1$, then $s_1$ is called an absolute object in that structure. There are absolute objects occurring of necessity: Any two models of $\Sigma_1$ necessarily are isomorphic with respect to their sets of coordinate systems $F_1$ and $F_1'$. If, therefore, $s_1$ is definable in terms of $F_1$, then it will be an absolute object. Such is the case, for instance, if the coordinate geometry on which $\Sigma_1$ is based has the Lorentz group as its group of coordinate transformations and $s_1$ is the usual metric definable on this ground. But there are cases of absolute objects not definable in the coordinate geometry. If $\Sigma_1$ is the differential geometric formulation of Euclidean geometry we have categoricity without the possibility of defining the metric in the coordinate geometry which, in this case, is the species of $G_\infty$ differentiable manifolds. The case of non-definable absolute objects leads to a stronger version of categoricity: $\Sigma_1$ is strongly categorical with respect to "$s_1$" if it is categorical and "$s_1$" is not definable in terms of "$F_1$". Our third concept of irreducibility then is
($B_2$) The analytical geometry $\Sigma_1$ is a coordinate geometry or it is not strongly categorical with respect to any of its arguments "$s_1$".
One can easily see that ($B_2$) follows from (B). For if ($B_2$) does not hold $\Sigma_1$ is not a coordinate geometry. Moreover, it is strongly categorical with respect to at least one of the arguments, say "$s_1$". Given a model $\langle X_1; F_1 \ldots s_1 \ldots \rangle$ of $\Sigma_1$ we define a set of preferred coordinate systems $F \subset F_1$ as follows: Because of the categoricity the model is isomorphic to a standard model $\langle \mathbb{R}^n; G_1 \ldots s_1 \ldots \rangle$ of $\Sigma_1$. The isomorphism is effected by a coordinate system in $F_1$. The set of coordinate systems thus distinguished is smaller than $F_1$ because $\Sigma_1$ was assumed to be strongly categorical.
In this way equivalence to a theory with a smaller group $G$ and the absolute object $s_1$ being eliminated can be shown. Of course, the general concept ($B_2$) does not do away with absolute objects altogether: If $G$ is one of the classical groups we still are where we ever were. Consequently, just as in the case of general covariance the interesting case is the differential geometric one. Anderson
20 Anderson 1967 and 1971.
21 Friedman 1973.
22 This assumption simplifies the concept formation and the argument. But it seems not essential for the matter.
wanted to avoid absolute objects under all circumstances - whether they are definable or not. In order to avoid the definable cases we have to make the group G as large as possible. And this nicely fits into the bunch of ideas originally introduced by Einstein.
VII.34 A Most General Principle of Invariance*
I
The subject of invariance or symmetry that I am going to talk about is an interesting subject for various reasons. To the philosopher of science it is particularly interesting because here he finds the physicist saying things that he usually does not say: he finds him making certain metastatements of a nonempirical status. Put a physicist before the work of Carnap and he will shrug his shoulders. He is more than happy, however, to state that the laws of electrodynamics are invariant under Lorentz transformations or to postulate the relativistic invariance of any future law. But in doing this the physicist does exactly the kind of thing that Carnap wanted a philosopher of science to do: he states or postulates something about a physical law. Moreover, his claims might even be of a purely syntactical nature: frequently one hears it said or reads it in print that it is the form of a law that is invariant. It seems, therefore, that invariance, if anything, is a subject of common interest to the philosopher and physicist, nicely suited to be dealt with in a meeting like this.
The metatheoretical status of an invariance statement is part of what might be called 'Wigner's hierarchy'.¹ By 'the hierarchy of our knowledge', as he himself calls it, Wigner means "the progression from events to laws of nature, and from laws of nature to symmetry or invariance principles". Speaking of a progression Wigner obviously has in mind that just as by physical laws we make statements about events so by invariance principles we make statements about physical laws, thus progressing to a higher abstraction. Wigner² even sees "a great similarity between the relation of the laws of nature to the events on one hand, and the relation of symmetry principles to the laws of nature on the other". The similarity is that just as "if we had a complete knowledge of all events in the world ... there would be no use for the laws of physics" so similarly "if we knew all the laws of nature ...
the invariance properties of these laws would not furnish us new information." This is but another way of saying that invariance statements are analytical, and again it seems a curious observation to see physicists being fond of making such statements. In a recent textbook on particle physics we read:³ "It is no exaggeration to say that symmetries are the most fundamental explanation for the way things behave (the laws of physics)." In view of such a statement one is inclined to ask whether we are right after all in teaching students that analytical statements don't tell us anything. Moreover, it seems that I need not beg your pardon for drawing your attention to a principle of invariance that
* First published as Scheibe 1994b.
1 Wigner 1979, 30 f.
2 Loc. cit., 16 f.
3 Dodd 1984, 41.
on account of its extreme generality has little chance of having any physical meaning at all. Still by way of introduction let me illustrate the typical situation in which a physicist makes an invariance statement by the following geometrical example. Suppose our theory is about figures in Euclidean space, and the basic law of the theory simply states that a figure is a sphere. Then a valid invariance statement is that this law is invariant under Euclidean transformations: the property of any figure of being a sphere obviously is preserved under all Euclidean transformations. In the numerous books on symmetry in physics that we owe to present fashion 4 the authors usually start with the even more concrete and pictorial invariance of a singular figure, e.g., a sphere as being invariant under all rotations around its centre. However, I may be allowed to start with the more abstract situation in which the question is about the invariance of a certain property of a figure and not the figure itself. Even in such a case, which is nearer to that of a physical law in the proper sense, the invariance statement is fairly specific. Our statement that the property of being a sphere is invariant under Euclidean transformations has the overtone that this property is not invariant under all differentiable transformations of space and not even under all affine transformations. Thus it seems that to a given invariance statement statements of non-invariance with respect to the same law are readily at hand, pointing out the specificity of the former. However, the statement that the property of being a sphere is not invariant under all affine transformations is a rather peculiar statement. It is true only under the tacit assumption that our statements about figures before and after the transformation refer to the same Euclidean metric. The same observation holds, of course, for the positive statement of Euclidean invariance. 
However, as soon as we admit the very possibility of submitting also our metric to the transformations in question there is a significant difference between the two cases: whereas the Euclidean metric is preserved under Euclidean transformations it is not preserved under all affine transformations. Therefore, in the second case two possibilities open up: either we stick to the original metric or we submit the metric to the very same transformations by which the sphere is transformed. The point now is that, whereas in the former case we get the statement of non-invariance as mentioned, in the latter case we regain invariance: the affinely distorted sphere becomes again a sphere with respect to the affinely distorted metric. We thus see that the apparent specificity of our original invariance statement is brought about by the concurrence of two factors: 1) a more general and, as we shall see in a minute, an even excessively general invariance - in our case the invariance of the axioms, fixing the idea of a Euclidean space with a sphere distinguished in it, with respect to a very general class of transformations, and 2) a special case of this invariance characterized by a restriction to those transformations leaving invariant a given common fragment of the 4
See, for instance, Genz und Decker 1991; Mayer-Kuckuk 1989.
physical entities considered by the theory - in our example the one Euclidean space of Newtonian physics. According to this two-fold explanation the theory of invariance to be developed in the following views a physical theory as being a pair consisting of axioms as usual and a frame in which the axioms are interpreted. In the axioms we talk about a physical system, and the frame is some fundamental part of the system ranking on the level of the theory like Euclidean space in our theory of spheres. More specifically, we may think of the axioms as being extensions of set theory talking about structures describing physical systems, the frame being itself a particular structure. Then the transformations with respect to which we look for invariances may be arbitrary isomorphisms of structures, and the entities, being invariant, may be either statements about structures or structures themselves. Under these assumptions the most general principle of invariance mentioned in the title of the paper and already alluded to in the given illustration says that the axioms of a physical theory are invariant under arbitrary isomorphisms. However, since it is only the frame of a theory that gives it a physical meaning, only those isomorphisms leaving the frame invariant are of physical interest. Thus the point of the following analysis is that invariance has two aspects: an unconditional and a conditional one. Under the unconditional aspect an invariance statement is viewed as part of the extremely general invariance principle that, put in terms of mathematical logic, any structure isomorphic to a model of a theory is itself a model of that theory. In other words: you may say almost anything about a structure; what you say, if true, is also true of any structure isomorphic to the first one, and if false then false.
But - and here comes the other aspect - invariance statements usually are conditioned by referring only to those isomorphisms that leave invariant some fundamental structure needed for the interpretation of the theory. It is this condition that makes the invariance statement a more or less specific one, leading to names like Galilean invariance, Lorentz invariance, etc. The reason for this specificity is 1) the specificity of the distinguished structure - the frame as I called it - and 2) the fact that, whereas any isomorphism leaves invariant almost any statement about structures, only very few isomorphisms leave a given structure invariant. This double-aspect theory of invariance seems to hold in physics without exception as far as it goes, and it certainly covers the field to a considerable extent. However, not all transformations with respect to which interesting invariance statements have been made in physics are isomorphisms of the physical systems treated by a theory. Therefore, in due course a generalization of our approach has to be indicated.
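The sphere example of this introduction can be checked numerically. The following is only a minimal sketch in two dimensions, under assumptions that are not in the text: a metric is represented by a symmetric matrix, a "sphere" by a handful of sample points equidistant from a centre, and all names (`transport_metric`, `check`, `SHEAR`, and so on) are illustrative. It compares the two options described above for an affine map: keeping the original metric fixed, or submitting the metric to the same transformation.

```python
import math

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sq_dist(metric, p, q):
    """Squared distance of p and q with respect to a metric matrix."""
    d = [pi - qi for pi, qi in zip(p, q)]
    return sum(d[i] * metric[i][j] * d[j] for i in range(2) for j in range(2))

def is_sphere(points, centre, metric, tol=1e-9):
    # Checks only the sampled points: all must be equidistant from the centre.
    radii = [sq_dist(metric, p, centre) for p in points]
    return max(radii) - min(radii) < tol

def transport_metric(metric, a):
    """Image of the metric under x -> a x + b, i.e. (a^-1)^T g a^-1."""
    a_inv = inverse_2x2(a)
    return mat_mul(transpose(a_inv), mat_mul(metric, a_inv))

def apply_affine(a, b, p):
    return [x + y for x, y in zip(mat_vec(a, p), b)]

EUCLID = [[1.0, 0.0], [0.0, 1.0]]
CENTRE = [0.5, -1.0]
CIRCLE = [[CENTRE[0] + 2 * math.cos(t), CENTRE[1] + 2 * math.sin(t)]
          for t in (0.0, 0.7, 1.9, 3.1, 4.4, 5.6)]
SHIFT = [1.0, 3.0]
ROTATION = [[math.cos(0.3), -math.sin(0.3)], [math.sin(0.3), math.cos(0.3)]]
SHEAR = [[2.0, 1.0], [0.0, 1.0]]  # affine, but not Euclidean

def check(a):
    """Return (sphere w.r.t. fixed metric?, sphere w.r.t. transported metric?)."""
    image = [apply_affine(a, SHIFT, p) for p in CIRCLE]
    centre = apply_affine(a, SHIFT, CENTRE)
    return (is_sphere(image, centre, EUCLID),
            is_sphere(image, centre, transport_metric(EUCLID, a)))

print("rotation:", check(ROTATION))
print("shear:   ", check(SHEAR))
```

In this sketch the rotated figure is a sphere in both readings, whereas the sheared figure fails to be a sphere with respect to the fixed metric and is again a sphere with respect to the sheared metric. A finite sample of points is of course no proof; it merely makes the two factors of the argument visible.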
II
After this introductory overview I now come to some details. In the second part of the paper they are details of the general theory of invariance to be proposed. In the first place a word on the concept of physical theory is in order. According to the double-aspect of unconditional and conditional
invariance we have to fix our eyes on two parts of a theory: its formal axioms and its meaning-generative frame. For it is the axioms that show the unconditional general invariance, and it is the frame whose own invariance leads to a conditional invariance of the physical laws (as part of the axioms). Let us first address ourselves to unconditional invariance and therewith to the axioms of a theory. Then two things are important for us: 1) that the axioms say something about some physical system, e.g., a field or a system of particles, and 2) that we may view the physical system as a structure in the sense of modern mathematics and mathematical logic. What we wish to say about a physical system we can then formulate as a statement about a structure⁵ using some codification of set theory as our language and logic (in the formal sense). The axiom of our physical theory thus assumes the form
$\Sigma(X; s) \equiv s \in \sigma(X) \wedge \alpha(X; s)$. (1)
Here X and s each stand for a finite system of sets, such that (X; s) makes up a set-theoretical structure. This is expressed by the first member of the right-hand side in (1) saying that the sets s are elements of scale sets over the sets X, i.e., of sets generated from the X by successive formation of power sets and Cartesian products. The second member $\alpha$ in (1) is the axiom proper, and thus (1) symbolizes a species of structures for each $\sigma$ and $\alpha$. Simple examples, e.g., groups and topological spaces, are well known from mathematics. For us, however, the point is that by their appropriate combination species of structures can be formed that may be used directly as the axioms of a physical theory.⁶ Now in the abstract treatment of structural mathematics as we find it, for instance, in Bourbaki's encyclopedia a well determined property of the axiom proper $\alpha$ in (1) is presupposed or even explicitly required. And this property is an invariance property - canonical invariance as I will call it.⁷ It is an invariance of $\alpha$ under arbitrary isomorphisms of the structure about which $\alpha$ is a statement. Let me first briefly recapitulate what an isomorphism is in this context. If we consider any bijections of the principal base sets X onto sets X' then bijections of every scale set over the X onto the corresponding scale set of the same type $\sigma$ over the X' are canonically induced. In particular, any structure (X; s) is mapped onto an isomorphic structure (X'; s'). Without giving the formal definitions (which are straightforward) I would like to emphasize that these canonical extensions or representations of originally given bijections are in no way dependent on the species to which a structure belongs. Such a dependence might be suggested by terms like homeomorphism, diffeomorphism, and the like as being isomorphisms of topological
5 For details see Bourbaki 1968, Ch. IV.
6 The first to have applied species of structures to physics in a systematic fashion is Ludwig in 1978, ²1990.
7 In Bourbaki, loc. cit., Ch. IV, §1 the term for 'canonical invariance' is 'transportability'.
spaces, manifolds, etc., respectively. However, given, say, a group we do not have to consider the group axioms in order to construct any isomorphism of the group onto another structure. The real phenomenon to be observed is that, although the isomorphism can be chosen completely independently of the group axioms, the structure to which it leads is again a group. This is not quite the most general situation, though. According to its construction the typification indeed is invariant, i.e., we always have
$s \in \sigma(X) \leftrightarrow s' \in \sigma(X')$, (2a)
where the scale term $\sigma$ is the same on both sides of this equivalence. We cannot expect the same to hold for the axiom proper without exception. However, it is remarkable that the requirement, that also
$\alpha(X; s) \leftrightarrow \alpha(X'; s')$, (2b)
with the same $\alpha$ on both sides holds for every isomorphism, is satisfied for any given theory of physics. We shall see in a minute that this invariance in all its generality may not enjoy life to the full for interpretative reasons. Interpretations reduce symmetries. But this does not alter the fact that the physical axioms are canonically invariant as regards their form.⁸ Can we understand why they are? I have no satisfactory answer. Canonical invariance somehow expresses that the axioms don't tell us anything about the nature of the elements of the principal base sets - that these can be chosen or 'interpreted' quite arbitrarily. We cannot, for instance, require that two of the principal sets have a non-empty intersection or that one of them is a particular set without violating our principle (2b). The first case could be a relative and the second an absolute determination of the elements in question. By contrast, for the typified sets s it is uniquely determined what their elements are, once the principal base sets X are fixed. Perhaps one could say that canonical invariance is a very weak condition of lawlikeness of the physical axioms, and this would be desirable at any rate. All further considerations apparently must refer to concrete examples for the time being.
Let me now come to the other part of a physical theory in which the aspect of conditional invariance is linked to the concept of the frame of a theory. Up to this point our consideration was essentially about formulas. We now simulate their physical interpretation by a fixed frame structure that is a common fragment of those structures (X; s) that our theory axiom (1) deals with. The frame of a theory thus becomes a common part also of the physical system to which the theory refers indeterminately. It gives a theory meaning without fixing a reference, to use the Fregean terms. The separate development of theoretical physics beside experimental physics is in need of such a distinction anyway.
The talk of space and time, of particular quantities like momentum and energy, temperature and entropy, electric and magnetic
8 See Scheibe 1982c (this vol. VII.31).
fields, etc., has meaning already within theoretical physics without any real physical system being fixed thereby. We can take account of the frame structure by means of a decomposition
$\Sigma_0(X_0; s_0) \wedge s \in \sigma(X_0) \wedge \alpha(X_0; s_0, s)$ (3)
of $\Sigma$ in (1) where $(X_0; s_0)$ is the frame. Side by side with the axioms the frame is part of the theory such that its change would change the theory, too. By contrast, the set s is still variable within the theory and indicates the many physical systems the possibility of which is stated in (3). Paradigm cases of frame structures in Newtonian physics are, of course, space and time. They are so because of their unique existence. It is hard to see how this can be taken into consideration other than by making these very structures themselves part of the theory. For even if we are in the possession of a categorical theory of, for instance, Euclidean space, our subject is fixed only up to an isomorphism. And we cannot by any means of ordinary axiomatics reduce this multiplicity. We cannot do so precisely because of canonical invariance. Given a frame structure as part of a physical theory it is obvious that the invariance situation changes from an unconditional to a conditional one. Formerly all isomorphisms were to be admitted. Now those isomorphisms leaving invariant the frame structure are distinguished, and whatever their meaning may be it is hard to see what meaning could be attached to isomorphisms not leaving invariant the frame. In this way we obtain the invariance
$\alpha(X_0; s_0, s) \leftrightarrow \alpha(X_0; s_0, s')$ (4a)
as a consequence of unconditional canonical invariance now under the condition that
$X_0' = X_0, \quad s_0' = s_0$. (4b)
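The contrast between the unconditional invariance (2b) and the condition (4b) can be illustrated on a finite toy structure. The sketch below rests on assumptions of my own choosing and is not taken from the text: the structure is the cyclic group of order 4, a composition law is stored as a dictionary of triples, and the names `transport`, `is_group`, `contains_zero` and `structure_preserving` are purely illustrative. It transports the structure along an arbitrary bijection, checks that the group axioms survive, shows that a statement fixing the nature of the elements does not, and counts how few bijections leave the structure itself invariant.

```python
from itertools import permutations, product

def cyclic_group(n):
    """A structure (X; s): X = {0, ..., n-1}, s = addition modulo n."""
    elements = list(range(n))
    op = {(a, b): (a + b) % n for a, b in product(elements, repeat=2)}
    return elements, op

def transport(elements, op, bijection):
    """Canonically induced image (X'; s') of (X; s) under a bijection of X."""
    new_elements = [bijection[x] for x in elements]
    new_op = {(bijection[a], bijection[b]): bijection[c]
              for (a, b), c in op.items()}
    return new_elements, new_op

def is_group(elements, op):
    """The axiom proper: closure, associativity, unit element, inverses."""
    if any(op[p] not in elements for p in product(elements, repeat=2)):
        return False
    if any(op[(op[(a, b)], c)] != op[(a, op[(b, c)])]
           for a, b, c in product(elements, repeat=3)):
        return False
    units = [e for e in elements
             if all(op[(e, a)] == a and op[(a, e)] == a for a in elements)]
    if len(units) != 1:
        return False
    return all(any(op[(a, b)] == units[0] for b in elements) for a in elements)

def contains_zero(elements, op):
    """A statement about the nature of the elements; not canonically invariant."""
    return 0 in elements

def structure_preserving(elements, op):
    """Bijections of X onto itself whose canonical image of s is s again."""
    return [dict(zip(elements, image)) for image in permutations(elements)
            if transport(elements, op, dict(zip(elements, image)))[1] == op]

X, S = cyclic_group(4)
RELABEL = dict(zip(X, ["north", "east", "south", "west"]))
X_NEW, S_NEW = transport(X, S, RELABEL)

print(is_group(X, S), is_group(X_NEW, S_NEW))            # axioms survive transport
print(contains_zero(X, S), contains_zero(X_NEW, S_NEW))  # this statement does not
print(len(structure_preserving(X, S)), "of 24 bijections leave s invariant")
```

The relabelling was chosen without consulting the group axioms, yet the image is again a group, which is the content of (2b) in this toy case. Of the 24 bijections of the base set onto itself only the automorphisms leave the composition law fixed, a small-scale analogue of the restriction expressed by (4b).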
It is these conditional invariances, concerning only s, that we typically meet with in the textbooks as the classical invariances, like the Galilean or Lorentz invariance of a physical law. But we now see that as invariances they are nothing but special cases of canonical invariance following from our general principle together with the special condition that in one case our frame structure is Galilean spacetime, in another case Minkowski spacetime, etc. Once we have decided about the frame no further invariance postulate is needed for any given law.
III
After these general considerations it is now time to look at some examples. Although not all invariances to be found in physics are canonical, sufficiently many canonical invariances are left to make the demonstration of their extensive presence in physics somewhat laborious. The following selection is by no means representative. But it may serve to draw our attention to some issues
that may easily lead to misunderstandings and later on also to some more serious questions. Let us first look at the unitary invariance of the Schrödinger equation
$i\dot{\psi} = H\psi \quad (\hbar = 1)$. (5a)
The term 'unitary' signals that we take the quantum theoretical state space $S$ - a Hilbert space - to be our frame. Precisely if we do this we can restrict further consideration to the canonical representations of automorphisms of $S$, i.e., of unitary transformations. For unitary $U$ the usual formulation of the corresponding transformations of the time development $\psi(t)$ of the states and of the Hamiltonian operator $H$ is
$\psi'(t) = U\psi(t), \quad H' = UHU^{-1}$. (5b)
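The algebra behind the invariance of (5a) under (5b) can be spot-checked on a two-dimensional state space. This is a minimal sketch under assumptions of my own: the particular matrices `H` and `U`, the state `PSI`, and the helper names are arbitrary illustrations, and $U$ is taken to be time-independent. If $i\dot{\psi} = H\psi$, then $i\dot{\psi}' = U H \psi = (UHU^{-1})(U\psi) = H'\psi'$, so it suffices to check that $H'\psi'$ equals $U(H\psi)$ and that $H'$ is again self-adjoint.

```python
import cmath
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dagger(m):
    """Conjugate transpose (adjoint) of a square matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(len(m))]
            for i in range(len(m))]

def flat(x):
    return [z for row in x for z in row] if isinstance(x[0], list) else list(x)

def close(a, b, tol=1e-12):
    return all(abs(p - q) < tol for p, q in zip(flat(a), flat(b)))

def transform(u, h, psi):
    """Apply (5b): psi' = U psi, H' = U H U^-1, with U^-1 = U^dagger for unitary U."""
    return mat_mul(mat_mul(u, h), dagger(u)), mat_vec(u, psi)

# Illustrative choices: a self-adjoint H, a unitary U, and a state psi.
H = [[1.0 + 0j, 2.0 - 1.0j], [2.0 + 1.0j, -0.5 + 0j]]
THETA, PHASE = 0.4, 1.1
U = [[math.cos(THETA) + 0j, -cmath.exp(1j * PHASE) * math.sin(THETA)],
     [cmath.exp(-1j * PHASE) * math.sin(THETA), math.cos(THETA) + 0j]]
PSI = [0.6 + 0j, 0.8j]

H_NEW, PSI_NEW = transform(U, H, PSI)

is_unitary = close(mat_mul(U, dagger(U)), [[1, 0], [0, 1]])
stays_self_adjoint = close(H_NEW, dagger(H_NEW))
equation_preserved = close(mat_vec(H_NEW, PSI_NEW), mat_vec(U, mat_vec(H, PSI)))

print(is_unitary, stays_self_adjoint, equation_preserved)
```

The check is purely algebraic and does not integrate the equation in time; it only confirms, for one arbitrary choice of data, that the right-hand side of (5a) transforms in step with the left-hand side.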
In case you ever wondered why we take these representations to be the 'correct' ones the theory under discussion has the answer: because they are canonical. For the Hamiltonian operator the argument is that its typification in the state space $S$ is
$H \in \mathrm{POW}(S^2)$.
The canonical representation of $U$ on the scale set $\mathrm{POW}(S^2)$ then already yields a uniquely determined image $H'$ of $H$ whatever $H$ may be. If $H$ is an operator then the transformation has the form given in (5b). Moreover, if $H$ is linear and self-adjoint then the same follows for $H'$. In this way the example clearly illustrates how a canonical representation beyond its existence in general is provided with additional properties as a consequence of additional assumptions about the structures in question. The same holds for the state functions in (5b), and from both it follows as usual that (5a) is invariant. The quantum theoretical example may evoke the question why we have chosen the state space to be the frame of the theory and consequently have distinguished the unitary group. The simplest answer is: why not? This is to say that here we have a very obvious freedom indeed. From a purely formal point of view we could as well have kept fixed the Hamiltonian operator or have made the metric of the state space variable. Moreover, not only is the decomposition (3) in general arbitrary. To every fragment $(X_0; s_0)$ we may assign its automorphism group and to this in turn its (relevant) canonical representation. In every such case we shall find the corresponding special canonical invariance (4). To this extent the procedure depends on nothing but a given species of structures together with a decomposition (3). As regards the interpretation, however, our example allows us to observe the following: we know of numerous physically realized instances where for a given interpretation of the state space various Hamiltonian operators are applied, describing so many different interactions. But we do not know of
anything similar for the metric of the state space vis-à-vis its linear structure. For a given interpretation the situation here is the same as we found in the case of Euclidean space. Our freedom of choice is drastically reduced for empirical reasons. Yet general quantum theory has no unique frame structure attached to it. The reason is simply that the theory has no unique interpretation in the sense that it could be represented by a frame. Even quantum mechanics proper, i.e., quantum theory extended by a representation of the canonical commutation relations, can be assumed to have a unique interpretation only after fixing the number of degrees of freedom.⁹ The same situation is met with in classical Hamiltonian mechanics. But one cannot blame someone for making a frame part of a physical theory by arguing that there is no unique frame to be found in these cases. For either these cases are mere formalisms or, if they are interpreted, a frame will have been attached to them.
Already in the introduction it was said that canonical invariance is not the only kind of invariance that has found the interest of physicists. Since non-canonical invariance is not the object of this paper the following example may rather be viewed as an instructive counterexample to canonical invariance than as a beginning of a generalization. The point of canonical invariance is that the solutions of a physical law are always typified sets. Therefore, transformations induced by canonical representations of transformations of those sets by which the former are typified certainly are among all possible transformations. But by no means do they exhaust this class. Gauge transformations are a case in point.¹⁰ They concern, for instance, a quantum mechanical but relativistic particle with state function $\psi$ moving in an electromagnetic field with potentials $A$ where both $\psi$ and $A$ are defined on Minkowski space-time $M$.
The equations of motion (that need not be given here) are then left invariant by the gauge transformations
$\psi'(x) = e^{i\alpha(x)}\,\psi(x)$,
$A'_\mu(x) = A_\mu(x) - \frac{1}{e}\frac{\partial\alpha}{\partial x^\mu}$, (6a)
where $\alpha$ is any real function on $M$. By contrast, the transformations of $\psi$ and $A$ induced by a Lorentz transformation of $M$ would be
$\psi'(x') = \psi(x)$,
$A'_\mu(x') = \frac{\partial x^\lambda}{\partial x'^\mu}(x)\,A_\lambda(x)$. (6b)
It is obvious how the additive group of functions $\alpha$ operates directly on the state functions while a Lorentz transformation on $M$ has to work all the way up to the entities in whose transformations we are interested. The dichotomy between canonical and non-canonical invariance thus illustrated does not coincide with any of the usual distinctions between geometrical and nongeometrical, internal and external transformations and the like. But it is certainly more precisely definable than the latter, and it may even be more important.
9 This is due to the von Neumann-Stone theorem, see Emch 1984, Ch. 8.3 f.
10 See, for instance, Bethge und Schröder 1986, Ch. 5.
IV
With my third example I come back to canonical invariance in order to discuss its relation to the well-known invariances occurring in analytical geometry. Let us take a static scalar field in Euclidean space obeying, for instance, the Laplace equation
$\Delta\phi = 0.$   (7a)
In this case we are dealing with the Euclidean invariance of this equation under the transformations (7b) where $A$ is a Euclidean transformation. The statement that this case, too, is a case of canonical invariance with Euclidean space as our frame is as trivial as was the corresponding statement for the first example. However, the present case is well suited to discuss a somewhat delicate point that sometimes goes under the name of covariance. I take the occasion to emphasize the essential difference between canonical invariance and covariance in the sense of invariance under coordinate transformations. The statement that the law (7a) via the representation (7b) is invariant under Euclidean transformations is ambiguous. It may mean that the transformations of $\mathbb{R}^3$ (!) leaving invariant the quadratic form (7c) also leave invariant the Laplace equation if the latter is understood to be an equation for real functions on $\mathbb{R}^3$ transforming according to (7b). But this numerical invariance statement, as it might be called, certainly cannot be its primary meaning as an invariance statement concerning a physical theory. Rather the primary meaning must be such that the numerical invariance statement is but an expression of the invariance statement proper in a Euclidean coordinate system. This transpires already from the necessity to invoke an equation different from (7a) and a function different from (7c) were we to express our invariance statement proper in a non-Euclidean coordinate system. This observation does not, of course, answer the question of how to give a sound formulation of the invariance statement proper in the present case. And even if an answer were produced we would still be saddled with the different question of what it means that the numerical invariance statement expresses the proper one in a Euclidean coordinate system. A precise and general clarification of this matter would be beyond the scope of this paper.
The following, I hope, will clarify the situation in its essentials. All geometries that ever have been applied in physics allow the introduction of coordinate systems. Even if a coordinate-free axiomatic is available and used, the existence of a coordinate system can be proved. The simplest
procedure, however, is to introduce coordinate systems from the very beginning and to exploit fully the basic principle of analytic geometry, i.e., the principle of saying what we want to say about a physical system by using coordinate representations. In recent publications analytical geometry is dismissed, though not for the very definition of a manifold. Still, the authors try to make their presentations as coordinate-free or intrinsic as possible. Now it is true that equivalent formulations of a geometry can be given where one of them is more intrinsic than the other. At present my point is only that to the extent to which coordinate systems are used a new invariance phenomenon comes into play. I shall be using the term 'covariance' for this phenomenon, being aware of the fact that some authors use the term in a different sense¹¹. Roughly put, covariance is invariance under coordinate transformations. If, for instance, we wish to do Euclidean geometry in an analytical fashion we start with the requirement that there be a (global) coordinate system of space in which the square of the distance is given by the quadratic form (7c). This requirement establishes the class of Euclidean coordinate systems as a distinguished class. If we now want to formulate a field law we can do so by using a representation of the field in a Euclidean coordinate system, demanding that the representation satisfies, for instance, the Laplace equation. For an arbitrary Euclidean coordinate system this gives us a consistent condition for the field itself only if the equation is (numerically) invariant under all Euclidean coordinate transformations. It is this mathematical or - as I called it - numerical invariance that, in its representative role, deserves to be given a name of its own, e.g., covariance. In the first place covariance is the basis for a consistent introduction of a physical field law by a mathematical equation.
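The numerical invariance just described can be checked on a small example. The following is only a sketch under stated assumptions: the quadratic form (7c) is taken to be the sum of squared coordinate differences, the representation (7b) is taken to have the form phi'(Ax) = phi(x) by analogy with (6b), and the particular rotation, translation, test field and all function names are illustrative choices, not the author's.

```python
import math

ANGLE = 0.7                 # rotation angle about the z-axis (arbitrary choice)
SHIFT = (1.0, -2.0, 0.5)    # translation vector (arbitrary choice)

def sq_dist(p, q):
    """Quadratic form assumed for (7c): sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def euclid(p):
    """A Euclidean transformation A: rotate about z, then translate."""
    x, y, z = p
    c, s = math.cos(ANGLE), math.sin(ANGLE)
    return (c * x - s * y + SHIFT[0], s * x + c * y + SHIFT[1], z + SHIFT[2])

def euclid_inv(p):
    """Inverse of A: undo the translation, then rotate back."""
    x, y, z = (p[0] - SHIFT[0], p[1] - SHIFT[1], p[2] - SHIFT[2])
    c, s = math.cos(ANGLE), math.sin(ANGLE)
    return (c * x + s * y, -s * x + c * y, z)

def phi(p):
    """Test field x^2 - y^2, a solution of the Laplace equation."""
    return p[0] ** 2 - p[1] ** 2

def transformed(field):
    """Assumed form of (7b): phi'(A x) = phi(x), i.e. phi'(x) = phi(A^-1 x)."""
    return lambda p: field(euclid_inv(p))

def stretched(field):
    """Control case: a non-Euclidean coordinate map x -> (2 x1, x2, x3)."""
    return lambda p: field((p[0] / 2.0, p[1], p[2]))

def laplacian(field, p, h=1e-3):
    """Central-difference Laplacian at p (exact for quadratics up to rounding)."""
    total = 0.0
    for i in range(3):
        up = list(p)
        up[i] += h
        down = list(p)
        down[i] -= h
        total += field(tuple(up)) - 2.0 * field(p) + field(tuple(down))
    return total / (h * h)

if __name__ == "__main__":
    p, q = (0.3, -1.2, 2.0), (1.0, 0.0, -1.0)
    print(sq_dist(euclid(p), euclid(q)) - sq_dist(p, q))  # close to 0: form preserved
    print(laplacian(phi, p))                # close to 0
    print(laplacian(transformed(phi), p))   # close to 0: still a solution
    print(laplacian(stretched(phi), p))     # close to -1.5: no longer a solution
```

The control case shows why the class of Euclidean coordinate systems is distinguished: under a map that does not preserve the quadratic form, the same numerical equation no longer picks out the same fields.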
So covariance is different conceptually from canonical invariance, even from its conditional version. The two are correlated, though. For the automorphism group of (7c) is simply the coordinate representation of the canonical automorphisms of the Euclidean metric in physical space. The principal difference between canonical invariance - conditional or unconditional - and geometrical covariance becomes even more evident if we admit arbitrary differentiable coordinates in space. In this case we have to find a generally covariant formulation of the Euclidean metric and the field equation. It is well known how this is done by means of the metrical coefficients $g_{ik}$, and it is clear how in this way on the side of coordinate representations the very large group (or pseudo-group) of differentiable transformations (in $\mathbb{R}^3$) comes into play. But all this has only to do with the coordinate representation of our theory, this theory itself still being characterized by Euclidean space and the Laplace law¹².
11 Einstein's favorite view of the matter is discussed in Scheibe 1991f (this vol. VII.35).
12 For the general notion of analytic geometry see the paper mentioned in the preceding footnote.
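The generally covariant formulation referred to above can be written out explicitly. What follows is the standard textbook form, offered as a sketch rather than as the author's own notation; the coordinate symbols $u^i$, the use of $g$ for the determinant of $g_{ik}$, and the angle names in the example are assumptions made here for illustration.

```latex
% Euclidean metric in arbitrary differentiable coordinates u^1, u^2, u^3
\[
  ds^2 = g_{ik}(u)\, du^i\, du^k , \qquad g = \det(g_{ik})
\]
% Laplace equation (7a) in generally covariant form
\[
  \Delta\phi \;=\; \frac{1}{\sqrt{g}}\,
  \frac{\partial}{\partial u^i}\!\left( \sqrt{g}\; g^{ik}\,
  \frac{\partial \phi}{\partial u^k} \right) \;=\; 0
\]
% Example: spherical coordinates (r, \theta, \lambda) with
% g_{ik} = diag(1, r^2, r^2 \sin^2\theta) and \sqrt{g} = r^2 \sin\theta
\[
  \Delta\phi
  = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left( r^2\,\frac{\partial\phi}{\partial r} \right)
  + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial\theta}\!\left( \sin\theta\,\frac{\partial\phi}{\partial\theta} \right)
  + \frac{1}{r^2 \sin^2\theta}\,\frac{\partial^2\phi}{\partial\lambda^2}
\]
```

In Euclidean coordinates $g_{ik} = \delta_{ik}$ and the general expression reduces to the familiar sum of second derivatives; only the coordinate representation has changed, not the theory.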
V
Next to the principle of canonical invariance the theory proposed in this paper rests on the notion of a frame structure as part of a physical theory. Canonical invariance is unconditional with respect to isomorphism, and its principle is one for the whole of physics. Different kinds of invariances do not come into play even by the fact that many formally inequivalent theories, i.e. theories inequivalent by their axioms, are used in physics. Apart from non-canonical invariance we meet with a multiplicity of kinds of invariance only on account of the different frame structures of the theories, i.e. their different contents. Consequently, we are not concerned here with a variable invariance behavior of physical statements but with so many different restrictions of the one canonical invariance to the respective theory frames. This situation suggests the question to what extent the variety of theory frames can be reduced, possibly within general theory reduction, and what criteria of irreducibility, if any, are at hand. It is, of course, almost preposterous to raise such a far-reaching question at the end of this talk. Let me make some remarks in conclusion, though, that are related to the invariance business. We have already seen the most trivial case where a frame structure is chosen to be more special than our reductive abilities would require it to be. Even after the emergence of general quantum theory the quantum mechanics of the hydrogen atom is a theory in its own right. More serious is the case where the fact that we don't reduce because we simply are at a loss to do so becomes the basis for attempts to give reasons why reduction is impossible. Euclidean space and its explanation as a pure intuition certainly are historical examples for this situation. The Euclid-Hilbert theory of space is an interesting theory on many accounts. But in matters of invariance the really important thing is the uniqueness of space.
As Kant put it:¹³ "We can represent to ourselves only one space; and if we speak of diverse spaces, we mean thereby only parts of one and the same unique space." This, then, leads to a unique automorphism group occurring in every theory about objects in space. The relation of Euclidean space to its theory throws light also on the next case where alternatives to a given frame are available but only as models of one categorical theory. Thus, theoretically there are different Euclidean spaces, if only isomorphic ones. Is this situation different in principle from the variety of solutions of, say, Maxwell's equations that, of course, include an infinity of non-isomorphic ones? The answer demands a distinction. The Hilbert spaces of quantum mechanics have different empirical realizations although they are all isomorphic. However, they cannot be used as state spaces in this abstractness anyway. There is no interpreted physical theory having a state space in its frame whose elements remain entirely indeterminate. In the contrary case we would immediately ask: what are the states of the theory? In quantum mechanics it is only the spectral decompositions of a Hilbert space for which we can answer this question. But they are no longer isomorphic.
13 Kant, Critique of Pure Reason, B 39.
The situation is different for Newtonian space and Minkowskian spacetime. Here the question 'what space points or what spacetime points do you mean?' would leave the physicist rather speechless. The categoricity of Euclidean or Minkowskian geometry gains in importance. At present we are still prepared to say: two Euclidean spaces or two Minkowskian spacetimes, if they occur in a fundamental position within a frame, cannot be distinguished by physical means. We could simply not say which of the two spaces or spacetimes is ours. And it is for this reason that we can replace a categorical theory by one of its models. The proviso mentioned, however, is decisive. There are many isomorphic Euclidean spaces provided by the different inertial systems of Minkowskian (or Galilean) spacetime, and already Newton confronted his absolute space with variable relative and as such empirical spaces. Difficulties come up with general relativity. The outstanding event in the transition from special to general relativity was the new contingency of the metric - a contingency reaching far beyond isomorphic models. There is no longer any question of categoricity on this level, and even topology is in the grip of this process. There is hardly anything left for a universal frame, and there is no excuse as we had it for the mechanical theories where irreducibility was not required. In general relativity no deduction is made from the fundamental position of spacetime. So I don't really know what to say in this case. One thing, however, seems to me to be no difficulty at all. There is still local categoricity on the topological level. If we take this as an occasion to choose the manifold of spacetime as the corresponding frame, our automorphism group would become the fairly large group of all diffeomorphisms of the manifold chosen. I now quote Sommerfeld who said in a popular lecture:¹⁴ "The theory of special relativity amounts to a theory of the invariants of the Lorentz transformations ... General relativity, too, is the theory of the invariants of the natural laws. It is only that here the group of Lorentz transformations is replaced by the total group of all coordinate transformations of the four-dimensional universe." In my view this is an essentially correct description of the situation although it is given in the usual sloppy way of the physicist.
14 Sommerfeld 1948.
VIII. Mathematics and Physics
In the last chapter the subject of our investigations is mathematics and its role as an auxiliary discipline for physics. It is undisputed, I presume, that mathematics as a pure construction of the human mind has foundations of its own independent of physics and indeed of any other discipline. Many would, however, say that the reverse does not hold - that the physics of our day cannot exist without mathematics applied to and indeed, as it were, embodied in it. But even this has been denied: there are attempts at an elimination of mathematics from physics to a certain extent.¹ The other extreme was Kant's position. For Kant mathematics and physics are at least partially identical, at any rate in geometry. Elimination of geometry from physics would then be impossible without destroying the latter. In the papers of the present chapter it is assumed without discussion that for the formulation of physical theories the use of mathematics is at least very expedient if not indispensable. There is then, of course, the question of which status one allows mathematics to have by itself and how it occurs as such when applied to physical science. As regards Kant ([35]), his doctrine of the role of euclidean geometry in the acquisition of knowledge about physical reality today is looked upon as being superseded. As an explanation it is sometimes said that Kant did not know the non-euclidean geometries. But this does not get to the core of the matter. For the claim of the apriority of euclidean geometry it may make a psychological difference whether one has it at a time at which alternatives to this geometry are not yet known or else at a time when non-euclidean geometries are taught in high school.
Although the former is true of Kant we see him even under such unfavorable circumstances reflect on the possibility "that whoever were to invent conditions different from those prescribed by [euclidean space] would waste his time because he had to use the very concept of [euclidean space] as means for his fiction."² Whether at this point the later interpretation of non-euclidean spaces in euclidean space comes to mind or not, Kant's mistake, if he did not make it already here, was to identify the three spaces of physics, mathematics and our intuition. But if one adds the transcendental program it is fair to say that it had been very suggestive to
1 Field 1980.
2 Kant 1770, § 15.E.
ascribe to space both objective reality and apriority.³ It is mainly this step taken by Kant to which some remarks are made in [35]. The three other articles ([36]-[38]) are motivated by Wigner's famous saying: "The enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious and there is no rational explanation for it."⁴ In this chapter no explanation is given either, but an honest effort is made to determine the explanandum more precisely. This is done with a view to the fact that the usual textbook formulations of our physical theories, especially those inspired by so-called mathematical physics, often are mathematically overdetermined. One of the first within the physics community to have clearly seen this phenomenon is Bridgman. In 1936 he described the situation created by the new quantum theory by saying that "in our elementary and classical theories we have become used to discarding perhaps one-half of the results of mathematics, ... , but here [in quantum mechanics] ... , except for a few isolated singular points [we] relegate the entire mathematical structure to a ghostly domain with no physical relevance."⁵ This statement provokes the question whether, if this really should be the situation, it is not possible by a reformulation of the respective theories - including quantum mechanics - to eliminate all the surplus mathematics, maintaining the physical contents of the theories. A classical example where this is possible is euclidean geometry based on a distance function. In this case two distance functions differing only by a positive factor are physically equivalent. Accordingly, one does not lose physical substance if one replaces the distance function by the congruence- and betweenness-relation. In this way one gets rid of the real numbers functioning as distances in the original formulation.
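The claim that distance functions differing by a positive factor carry the same physical content can be illustrated on a finite set of points. This is a minimal sketch, not a reconstruction of any axiom system: the sample points, the tolerance, and the definitions of congruence (equal distances between pairs) and betweenness (equality in the triangle inequality) are assumptions chosen for illustration.

```python
from itertools import product

# A few sample points in the plane (illustrative choice).
POINTS = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def dist(p, q):
    """Ordinary Euclidean distance."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def scaled(d, factor):
    """The same distance function measured in a different unit."""
    return lambda p, q: factor * d(p, q)

def squared(p, q):
    """Control case: a non-linear function of the distance."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def congruence(d, tol=1e-9):
    """All (a, b, c, e) with d(a, b) = d(c, e): segment ab congruent to ce."""
    return {(a, b, c, e) for a, b, c, e in product(POINTS, repeat=4)
            if abs(d(a, b) - d(c, e)) <= tol}

def betweenness(d, tol=1e-9):
    """All (a, b, c) with d(a, b) + d(b, c) = d(a, c): b lies between a and c."""
    return {(a, b, c) for a, b, c in product(POINTS, repeat=3)
            if abs(d(a, b) + d(b, c) - d(a, c)) <= tol}

if __name__ == "__main__":
    rescaled = scaled(dist, 3.7)
    print(congruence(dist) == congruence(rescaled))    # same congruence relation
    print(betweenness(dist) == betweenness(rescaled))  # same betweenness relation
    print(betweenness(dist) == betweenness(squared))   # differs for the control case
```

The two relations are unchanged by the choice of unit, which is why nothing physical is lost when they replace the numerical distance function; the control case indicates that the agreement is not automatic for arbitrary numerical re-descriptions.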
It goes without saying that a case of eliminable mathematical overdetermination like the one just mentioned cannot be responsible for Wigner's "unreasonable effectiveness of mathematics in the natural sciences." But it draws our attention to the cases where such an elimination is not possible and which, therefore, might be the true candidates for an explanation of the said 'unreasonable effectiveness'. Cases in point are already the simplest physical laws that connect a finite number of physical quantities, e.g. a gas law like van der Waals' law, Galileo's law of free fall, Kepler's third law, Planck's radiation law etc. The remarkable thing about these laws is the way in which the relation expressing them is specified. There is first the uniformization of the physically quite different quantities combined in the law by replacing their values by things of one and the same kind: by real numbers. This makes possible the second step in which a numerical relation is specified by using the familiar arithmetical operations on real numbers and possibly limiting processes. Now, what is puzzling in this process is that, whereas the numerical relation used
3 See especially the references in no. 7 of [35] as well as § 13, no. 1 in Kant 1783.
4 Wigner 1967, p. 223.
5 Bridgman 1936, pp. 116f.
to represent the law, by the very nature of the process, has got a physical interpretation, no such interpretation is given of the arithmetical operations and the limiting processes by which the relation is defined. Therefore, if the law turns out to be true or nearly true this appears almost as a miracle: We can only be amazed at this effect of physically uninterpreted constituents of our theory. At the same time, this, if anything, is a mathematical overdetermination that, moreover, does not seem to be eliminable. For a physical relation (the law) is here not formed by using elementary physical propositions but mathematical ones instead. Therefore, it is impossible to eliminate the real numbers and the operations with them without throwing also the physics overboard.
A more general case than the foregoing is the following. With set theory as our mathematical framework we have an axiom system (1a) in the terms $x_i$. There are further terms $y_k$ related to the $x_i$ by (1b) (where $\cong$ means isomorphic). Finally we have (in vector notation) the consequence

$\exists x.\ E(x) \wedge (y \cong P(x))$   (1c)
from (1ab). We assume $E$ to be a species of structures (in the sense of Bourbaki) and the $P_k$ to be intrinsic terms with respect to $E$. The $x_i$ are mathematical in nature with no physical interpretation, but should respect the typification which $E$ has as a species of structures. It is the other way round with the $y_k$: They are physically interpreted but we do not know whether their axiom system (1c) is equivalent to one that respects the typification of the $y_k$ induced by (1b). If we take quantum mechanics as our example, (1a) would be a purely mathematical Hilbert space axiomatic whereas the terms (1b) would stand for physical concepts like an observable, state, expectation value function etc. The problem is to re-axiomatize (1c) such that the $x_i$ are eliminated in favor of a formulation solely in the terms $y_k$, respecting their own typification (von Neumann's program). In the example as well as in general we are confronted with a typical situation of mathematical overdetermination for which, moreover, it is an open question whether it is an innocent case, i.e. whether the re-axiomatization is successful, or a case of true overdetermination where the elimination of the $x_i$ in the way required is impossible and where, therefore, Wigner's 'unreasonable effectiveness' shows itself - so it seems - at its best. However, there may be a situation even more serious. If one looks at the manner in which mathematics is used in modern theoretical physics it is very suggestive to reconstruct this whole business by means of some system of set
theory, for instance ZFC, including thereby not only the mathematics but the physics as well. One way of doing this is to imagine a set universe in which the mathematical elements are sets constructed out of the empty set whereas the physical elements are introduced via Zermelo's urelements. Moreover, we would be allowed to have physical axioms containing unrestricted quantifiers. The major problem for such a setting would then again be the question whether there is a re-axiomatization such that in the new axioms quantification is restricted to the sets of a structure (in the technical sense). This would then open the way to replace set theory by a finite type logic where the restrictions mentioned are built into the system from the very beginning. But again we do not know whether this elimination is always possible.⁶
6 See [36], §III as well as Scheibe 1992a, §5 and Schmidt 1992, §2.
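The schema (1a)-(1c) discussed in this introduction can be set out in one place. This is a hedged reading only: the explicit forms given here for (1a) and (1b) are inferred from (1c) and from the surrounding description, and the index ranges $m$ and $n$ are assumptions introduced for illustration.

```latex
\begin{align*}
  & E(x_1,\dots,x_m)
    && \text{axioms in the mathematical terms } x_i \tag{1a}\\
  & y_k \cong P_k(x_1,\dots,x_m), \quad k = 1,\dots,n
    && \text{physical terms } y_k \text{ tied to the } x_i \tag{1b}\\
  & \exists x.\; E(x) \wedge \bigl(y \cong P(x)\bigr)
    && \text{consequence in the } y_k \text{ alone} \tag{1c}
\end{align*}
```

On this reading the re-axiomatization problem asks for axioms stated in the $y_k$ alone, equivalent to (1c), in which the existentially quantified $x$ no longer occur.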
VIII.35 Kant's Philosophy of Mathematics*
* Dedicated to Günther Ludwig on his 60th Birthday. Originally published as Scheibe 1977 and translated for this volume by Hans-Jakob Wilhelm.
My presentation today features three great names. Philosophy and mathematics are two of our basic sciences whose birth more than two and a half thousand years ago marks at the same time the beginning of Western scientific thinking in general. Mathematics, dreaded by many and yet respected by everyone, has always been regarded, since the century of Plato and Euclid, as the paradigm of a science in which our thinking can achieve the highest degree of clarity. In our century, mathematics has experienced another unprecedented advance and is today, also with respect to its applications to other fields of science, more widely branched out within the whole system of the sciences than any other discipline. With respect to philosophy and in particular its traditional core subject, metaphysics, "fate ... has not been so favorable as to permit it to take the sure path of a science", as Kant had to observe with regret. In spite of Kant's attempt to change this situation fundamentally, even today, two hundred years later, one will have to repeat his dictum. Thus, the respect which we pay to philosophy is also not based on a supposed rigorousness which it cannot have. Rather, it is based on the fact that in this discipline one does not shy away from asking final questions, or at least one attempts again and again to advance into the dark zones of the foundations of our so-called positive knowledge. As C. F. von Weizsäcker puts it, one simply may not cease to think, even though this proves to be extremely difficult. Finally, as far as Kant, the third member of the union, is concerned, one would subject oneself to ridicule not only in Germany but on the entire planet, if by way of a preamble one were to canvass for the significance of his doctrines.
Thus it seems that inasmuch as each of these items - Kant, philosophy, mathematics - is considered by itself, there is no need for a more elaborate justification of why one deals with such things and even speaks about them publicly, except that with respect to speaking about them, an apology might be in order for the fact that a certain dullness adheres to such a topic. Things look somewhat different, however, if we pay closer attention to the way in which the three components of our topic are put together. According to the contemporary view, the topic "philosophy of mathematics" belongs mainly to a series in which besides philosophy of mathematics we also have a philosophy of physics, and besides that in turn a philosophy of biology and in general the philosophy of any individual science and finally also the philosophy of science as such. Perhaps not within the same series, but at least in a parallel one, we have topics such as "philosophy of language", "philosophy of time", "philosophy of action", etc., that is, topics or entire philosophical disciplines in which another significant aspect of human existence outside of science is made the object of philosophical analysis. In such divisions of philosophy, and this is their characteristic trait, one begins from the respective
science or other subject matter in order to return to it in philosophical reflection. In any case, one remains with the subject matter in a certain isolation. This procedure was still quite foreign to Kant's time and particularly to Kant himself. It arose only in the course of the disintegration of philosophy and the development of ever new individual sciences in the 19th century and only really established itself as an accepted schema in our century. In Kant's work, by contrast, there is no complete piece which we could now call his philosophy of mathematics in the modern sense indicated, to say nothing of him calling or understanding it thus. Instead, we find scattered throughout the entire oeuvre in this or that place, and sometimes in a central place, pertinent remarks and assertions which, if one were to gather them all, would perhaps not yield more than 30 pages.¹ And even worse, if one were to extract these remarks and assemble them in one document, what we would have in front of us would hardly make any sense at all. Another problem is due to the temporal distance that separates us from Kant. Of course, we always run into problems when engaging a philosopher of the past. But it is possible that this becomes quite unbearable when the object of philosophical reflection is a science which, as has already been observed for our case of mathematics, in the meantime has grown enormously and where this growth has affected the essence of the science. A further difficulty is presented by the fact that Kant really only refers to the ancient stock of mathematics, i.e. to elementary geometry and arithmetic, and that he leaves without comment those novel parts of mathematics which at his time were highly esteemed and widely applied such as the infinitesimal calculus, the theory of infinite series, and the theory of differential equations.
Moreover, the neighboring sciences of mathematics which are relevant for Kant's purposes, such as logic and physics, have in the meantime experienced a similarly significant transformation as mathematics itself. Thus, if one wants to return to Kant, not as a historian of philosophy, but rather with a systematic intent, one must seriously ask the question whether with respect to mathematics such a return is at all worthwhile. Although my treatment of today's topic is made to order, as it were, I would not have accepted this topic did I not believe that it is still worthwhile - and not merely for educational purposes, but for a systematic-scientific one. On the one hand, I see such a purpose, if this can be briefly stated in advance, in a consideration of the circumspect and effortless embedding of Kant's view of mathematics in his epistemological enterprise as a whole. The point is that here we are not dealing with an ad hoc philosophy of mathematics. Rather, we encounter mathematics at a certain place in the framework of a much more general undertaking. I intend to make this clear in the first part of my
1 The more important places are: 1747, sects. 9-11; 1764, Erste und Zweite Betrachtung; 1768 passim; 1770, sects. 12-15; ²1787, Einleitung, Transzendentale Ästhetik, Transzendentale Methodenlehre 1.1 and 2; 1783, Erster Teil; moreover the letters to Joh. Schultz of Nov. 25, 1788 and to Aug. Wilh. Rehberg, (before) Sept. 25, 1790.
lecture. A second point, which I think is important and to which the second part of my lecture will be dedicated, is Kant's attitude towards the question of a specific mathematical objectivity. I believe that he claimed there to be such a thing, even if not in an ontologically relevant sense. In the third and final part of my lecture, I shall again broaden the perspective to include the further development of the issue and the situation today.
I
Proceeding now immediately to the first part, I want to try briefly to indicate the position occupied by mathematics in Kant's theory of cognition [Erkenntnis] and the role he accords it in our cognition. Kant's undertaking can properly be characterized as the attempt to steer theoretical philosophy between Scylla and Charybdis. Scylla and Charybdis are the two main trends in the philosophy he encountered: rationalism, represented by the Wolffian school going back to Leibniz, and empiricism with which Kant was familiar through the works of Locke, Berkeley, and especially Hume. The fundamental difference between these two tendencies to which Kant drew attention concerned the extent of our a priori knowledge, that is, of the knowledge which we can have independently of our actual experience. Rationalism tended to view this extent as great, while empiricism tended to view it as small. On various occasions, Kant has explained this difference with respect to the principle of causality, that is, the principle that everything that happens has a cause from which it follows with necessity. The rationalist tradition took this principle to be valid a priori, but it applied the concept of a cause that occurs in it to transcendent subjects as well. Hume, by contrast, while perhaps not doubting the aprioricity of the principle of causality itself, nevertheless argued that we cannot provide good reasons for it. With respect to the difficulties arising here and elsewhere, Kant's decisive idea was to reformulate the concept of the a priori in such a way that, for example, the principle of causality on the one hand maintains its aprioricity, while on the other hand, as a consequence of this new view, it is limited to objects of experience with respect to its application. Kant's idea of a reformulation of the a priori, which at first glance has something quite striking about it, was the idea "that a priori we can know of things only what we ourselves put into them".²
Kant himself regarded this turn in the conception of the a priori as a revolution, and he hoped that through it metaphysics could be set "upon the secure path of a science". According to Kant's view, his idea indeed constituted a revolution in light of the fact that hitherto one had assumed that "all - and here we must emphasize: all - our knowledge must conform to objects"³ and yet thought that even under this assumption one could a priori know something about the objects. For Kant, however, this constituted nothing less than a contradiction
2 CPR, B XVIII (CPR = ²1787 from here on).
3 ibid., B XVI.
in terms: For if all our knowledge conforms to the objects, then we must learn everything from them in an experience in the narrowest sense of the word, i.e. in an acquisition of knowledge which takes everything out of the object. Hence, Kant now suggests that we try to do the opposite and suppose "that objects must conform to our knowledge, a supposition which would agree better with what is desired, namely, that it should be possible to have knowledge of objects a priori, determining something in regard to them prior to their being given".⁴ It must be noted that with this reversal of a way of thinking, one possibility is at least implicitly rejected, one which would have been compatible with the older view of the a priori: a theory of knowledge oriented along Platonistic lines could assume that there are two completely different kinds of objects which, with regard to the knowledge we have of them, could be characterized as follows: The first kind consists precisely of those objects of which everything we can possibly know of them we know a priori. To the second kind, by contrast, belong all and only those objects of which everything we can possibly know about them we know only through experience. The first class might include ideas in the Platonic sense and perhaps also the objects of mathematics. The second class would contain those things with which we can in principle only have a contact accompanied by sensation or sense-perception in the usual sense. Such a distinction, however, in which the difference in the nature of the objects brings with it a corresponding difference in the kind of knowledge we have of them, is far from Kant's mind. For him it is not a matter of an either-or, but rather of an as-well-as. For Kant, an object in the ontologically relevant sense is above all something about which we can know something both by means of sense-perception, but only once the object is given, as well as a priori, even before it is given.
Thus, in one place, he himself says with respect to the more specific question, whether the objects would have to conform to certain of our concepts, that one must assume "that the objects, or what amounts to the same thing, that the experience in which alone (as given objects) they are cognized, conform to these concepts ..." 5. Here it is stated unequivocally that experience does not stand contradictorily opposed to the new a priori, but that it contains it. Empirical knowledge is thus a mixture, as it were, of aposterioric and aprioric elements, and the latter do not refer to some ontologically independent world, but only to the conditions under which alone experience of the world is possible for us humans. Now, what cognitions of objects of experience do we have, cognitions with which strictly speaking we only cognize what, as Kant says, we put into the objects, and which are in this sense a priori? For everything that follows, it is important to mention above all logic, more precisely in Kant's terminology: general and pure logic. We shall still have to speak about the
4 ibid.
5 ibid., B XVII
fact that Kant had available only a fragment of logic in the modern sense, that is, the Aristotelian logic or syllogistic which, by the way, in Kant's time was in a particularly disfigured state. What he says in spite of this handicap about the idea of logic, however, is still defensible even today. For he says that it is the science "which lays out in detail and strictly proves nothing but the formal rules of all thought" 6. In another place, he says: "Logic abstracts ... from all content of cognition, i.e. from all relation of cognition to the object, and considers only the logical form in the relation among cognitions, i.e. the form of thought in general" 7. Even the fact that here Kant talks of the rules of thought does not give cause for alarm from today's standpoint. For Kant makes it very clear that here he does not intend to refer to actual thought under subjective empirical conditions, but to thought in the sense in which it is free of all psychological or anthropological conditions, thought as it should be - according to the rules of logic. Now, logic in this sense offers cognition a priori because its rules are valid with necessity, and necessity is one of Kant's criteria for the aprioricity of a cognition. Kant attempted a deployment of logic immediately following the definition of the a priori when he set up his table of judgments and sought to gain the table of categories from it. This attempt, however, is dubious at least in its details, and time constraints alone dictate that we completely leave it out of our present considerations. Thus, in any case, logic offers cognitions a priori. But these cognitions, as we have heard, abstract from all content, and that is why they are not yet worth very much 8. If I want to know how many people are currently in this room, not much will have been gained if I am told that it is either 100 or not 100.
This truth, which is a priori because it is a logical truth, does not determine my object, the number of people present here, in any way, and hence one must ask, whether - again generally speaking - there are cognitions a priori which in spite of their aprioricity contribute something to the content and not only to the logical form of our cognition. This is the main problem of the Critique of Pure Reason. Kant gave this problem a technical formulation by means of the distinction between analytic and synthetic cognitions or - as he puts it more frequently - propositions, since it is propositions in which we articulate our cognitions 9. In the definition of this distinction, which is so fundamental to Kant's enterprise, the unsatisfactory state of the logic of his time is especially noticeable. Hence, I do not wish to get into the details (which only cause irritation) at all, but rather accept for now the following interpretation of the distinction in question: An analytic proposition is a proposition the verification (or falsification) of which is possible merely on the basis of an explicit definition of the concept expressions occurring within it. Expressly included in this interpretation is the case in which not even this
6 ibid., B IX
7 ibid., B 79
8 On this issue cf. CPR, B 85.
9 Systematic Introduction to CPR, B 10 ff and 1783, § 2
much, i.e. this return to the explicit definition of the concept expressions, is necessary. The example that is cited again and again, "all bachelors are unmarried", already illustrates both cases: For having realized, in a first step, that one understands a bachelor to be an unmarried man, in a second step, I recognize (and I do so a priori) the truth of the proposition without having to know anything regarding the further meaning of the concept expressions "man" and "unmarried". With synthetic propositions, defined as nonanalytic propositions, things are different, however - as, for example, with the proposition that all bachelors are unhappy. Here I must, besides understanding the occurring concept expressions, also go through the individual persons in order to establish the truth. With empirical propositions, like the one just mentioned, there do not arise any problems in this regard - at least not for Kant. But how is it with synthetic propositions a priori and are there even any such propositions? This is Kant's main problem. Now, for the solution to this problem, another one of Kant's distinctions is of decisive importance: the distinction between intuitions and concepts 10. According to Kant, every cognition includes concepts as well as intuitions, and the emphasis in this remark is on the fact that intuition must be included as well. With this remark, we get significantly closer to the place in which Kant locates mathematics, but by no means have we reached it yet. First, we must ask, how Kant understands this new distinction. For Kant, a concept is a general representation, which refers to objects mediately. We think objects with the help of concepts, but this thinking does not already deliver the objects that fall under the respective concepts. An intuition, on the other hand, is a particular representation which refers immediately to an object. It is just that in which an object is given.
In the definition of the concept of an intuition [Anschauung], it is important that - led by the word "schauen" - one does not already anticipate sensible component parts of an intuition. [Translator's note: 'Schauen' means to 'see', 'behold', or 'view'.] In the first place, Kant only means precisely what he says: In an intuition and only in an intuition are objects given; through concepts, however, they are only thought. The matter becomes even more clear when one considers Kant's further distinction between sensible and intellectual intuition. An intellectual intuition is one in which an object is given in that this kind of intuition at once creates the object, while in a sensible intuition given objects have their existence independently of the intuition. Here too, the word "sensible" should not have us think immediately of its narrow sense, but rather only of what Kant defines, that is, that a sensible intuition is essentially receptive. Even the claim which Kant now stakes with the help of this terminology, namely, that for us human beings intuition is always sensible, for the time being says nothing other than that an object given to us does not owe its existence to us. And in considering this claim, we must above all be mindful of the fact that Kant in the present context understands objects to be exclusively such
10 For the following see CPR, B 33ff, 74ff, 92ff, 145ff
as together with their existence can become known to us only through sensible perception in the narrower sense. This understanding alone leaves open the possibility of a mathematical, albeit ontologically dependent, objectivity which in a certain sense is indeed "created" by us. Now, Kant analyzes sensible intuition further by distinguishing within it an empirical and a pure part: A sensible intuition decomposes, as it were, into an empirical and a pure intuition. We have an empirical intuition of an object insofar as the object affects us and we, as the object's effect on us, have sensations. Thus, the empirical intuition presupposes the actual presence of an object and can hence not be a priori in Kant's sense. At the same time, Kant claims that the other, pure part of a sensible intuition, the one that is not bound up with sensation, is always 1) a priori, and 2) existent: In every sensible intuition, there is a contribution which in its essence does not come from the object, but from our power of intuition: It is the spatio-temporal relations in which an object is given to us. This does not mean, of course, that of a (spatial) object, e.g. the next vase we shall see, we now already know what shape it has or that of a (temporal) course of events, e.g. this lecture, we now already know how long it is going to last. What is meant, rather, is the fact that objects which are in the end objects of possible experience simply are not given to us in any other way than in space and/or time and that we know something a priori - that is, in particular even before an object is given to us - about space and time, which knowledge, however, every object of possible experience must conform to just because it can appear to us only in space and/or in time. But what do we know about space and time, and is this knowledge always a priori? For the benefit of the present argumentation, I want to leave time out of consideration and continue to talk only about space 11.
In Kant's days, the answer to the question, what we know about space, was completely clear: The science of space was geometry, and what was known about space was precisely what was taught by geometry. At that time, however, geometry was still an essential part of mathematics, and moreover, it was still the most systematized part of mathematics. In particular, its truths were accepted as necessary truths. Now, since for Kant, as was already mentioned with regard to logic, necessity of a cognition was a criterion for its aprioricity, it seemed very natural to him to claim for the propositions of geometry aprioricity in his sense as well, that is to say: All the propositions of geometry only tell us the conditions under which alone an object becomes an object of possible experience. Unlike logic, however, the propositions of which, since they are analytic, are a priori in a trivial sense, geometry, as the science of space, does not deliver propositions which are true merely on the basis of the definition, or, as Kant more frequently puts it, of the mere analysis of the concepts occurring within them. Rather, the truth of geometrical propositions rests
11 For the purposes of understanding what follows from this point on, see esp. 1783, § 6 ff
on a (pure) intuition, and it is this intuition as a particular and hence nonconceptual representation which makes these propositions, their aprioricity notwithstanding, synthetic propositions. In my view, the fact that geometry is integrated in this way into an aprioric theory of possible experience constitutes, on the one hand, the philosophical significance, but also - as is so often the case - the weakness of Kant's interpretation of this discipline. There can be no doubt at all about the fact that Kant's intention really went in this direction. For - to add the final piece of terminology in this connection - he claimed again and again the objective reality or the objective validity of mathematics, and of geometry in particular, the aprioricity of these sciences notwithstanding 12. On the contrary, his argumentation for this claim rests on this aprioricity, as he understood it. "All outer objects of our sensible world must necessarily agree precisely with the propositions of geometry" and hence the latter must have objective reality in this sense "because through its form of outer intuition (space), sensibility first makes possible those objects as mere appearances with which the geometer is concerned" 13. This is also why, as Kant often puts it, "the propositions of geometry are not determinations of a mere figment of our poetic fancy" (ibid.) or mere "figments of the mind". This argumentation, no doubt, accentuates once more Kant's striving to secure the synthetic character of mathematics even through, and precisely through, its specific application. Now, whoever is familiar with today's mathematics will perhaps be compelled to classify it, in Kant's sense, as a gigantic figment of the mind, and thus it will be a good idea to look for further arguments which Kant advances for the synthetic character of mathematics, this time from the perspective of mathematics itself.

II
Proceeding with this to the second part of my lecture, I now want to try to elucidate further Kant's theory of mathematics from the point of view of mathematics itself. Here, I want to combine an interpretation which I personally find suggestive with one recently given by Hintikka. The question to be answered is, of course, how Kant could hold that the propositions of mathematics are at once synthetic and a priori. In the first part, we found mathematics, with respect to this question at least, to be lodged between logic and empirical science, and, in addition, we emphasized a certain proximity to the latter. Now, we shall disassociate ourselves somewhat from this position but still guard ourselves against simply identifying mathematics with logic. In order to effect this turn, I first note that it was never a contentious issue that an intuition in Kant's sense is supposed to pave the way for the synthetic aprioric character of the mathematical propositions. In contrast with many interpreters, however, I share Hintikka's view that, especially with respect to
12 Besides the passage cited below, compare also CPR, B 119ff, 194ff, 206f, 298f, 371 note
13 1783, § 13, no. I
mathematics, one should first understand an intuition in Kant's sense in the way in which he defines it. That is to say, we should understand it in contrast with a concept as a particular representation in which an object is given to us and not right away as the sensible intuition claimed by Kant to be our only intuition, that is, as an intuition in the most narrow sense which only includes space and time as its forms 14. Hintikka's interpretation emphasizes intuition insofar as it gives us objects such as straight lines, triangles, circles etc. in geometry or individual numbers in arithmetic. Some of the evidence seems to me to suggest that Kant also regards space as a whole together with the possibilities inherent within it as well as the number series as a whole together with possibilities inherent within it as an object. But what does Kant himself say? When he speaks about mathematics from the perspective of mathematics itself, as it were, he speaks primarily about its method. At least as far as the orderly presentation of his own ideas is concerned, he adopts the view, traditionally held with respect to geometry, that the mathematical method is the axiomatic method. I am consciously formulating this in a careful way, for I do not think that this yet gets to the heart of the matter. And if we look at the two main relevant passages 15, we find that the former adheres much more closely to the idea of an axiomatics than the latter. For our continuing presentation as well, it can only be useful to distinguish, as Kant and the tradition do, between what mathematicians do when they define something (including what they presuppose when they do so) and, on the other hand, what mathematicians do when they prove something (including what they presuppose when they do so). I want to attach particular weight to Kant's doctrine of definition, an area that has been somewhat neglected by critics.
For this doctrine seems to me to show that in a certain sense Kant assumes for mathematics a specific kind of objectivity and that its synthetic character must be understood from this angle 16. Incidentally, we shall have to be cognizant of the fact that, counter to a rationalist trend, Kant strictly distinguishes the mathematical method from the philosophical one, as, of course, he also distinguishes it from the empirical method. Especially statements about what mathematics is not for him can reveal more clearly what mathematics is for him. As regards Kant's doctrine of definition, we have the characteristic view that definitions in the most narrow sense of the word are really only made in mathematics. In the most narrow sense of the word, defining means for Kant "to represent originally the exhaustive concept of a thing within its limits" 17. He calls such a representation original, if the "determination of the limit is not deduced from elsewhere such that it would still need a proof, something that would make the supposed explanation incapable of standing
14 Cf. no. 11 and Hintikka 1974, Chs. 6.4 and 8.3
15 1764, Erste und Zweite Betrachtung; CPR, Methodenlehre 1.1 and 2
16 On this issue compare also Beck 1955.
17 CPR, B 755
at the top of all judgments about an object" 18. Why is it that, on the basis of this definition, only mathematical concepts can be defined? Kant distinguishes between concepts that are given and concepts that are made. It is not possible to give a quick and yet rigorous explication of what Kant means by this distinction, especially since he himself never addressed the issue in a systematic way 19. But perhaps the ordinary sense of the word already suffices to understand that given concepts comprehend empirical concepts such as the concept of water, of a tree, of gold etc., that is, concepts of naturally given things, but also aprioric concepts such as the concept of a cause, a quantity, of right etc. Now, if we were to begin to make definitions in the realm of given concepts, that is, if we were to define, for example, water to be (in the sense of equality) a transparent liquid or a cause to be (again in the sense of equality) that which brings something else about with necessity, we immediately sense that such definitions would require a justification or a proof and would thus not be original representations of a concept in Kant's sense. For, in the case of a priori given concepts, the definition is subject to the danger that another person thinks something else, say, in the concept of a cause than what the definition provides. And in the case of empirically given concepts, the definition is additionally in danger of being empirically refuted, for example, in the case of water, through demonstrating a transparent liquid that is not water but alcohol. Using a technical term still common today, we would say that such concepts can only be explicated. And an explication must be preceded by an analysis of the concept, which is why Kant sometimes also calls explications analytical definitions. One task of philosophy, for example, is to make analytical definitions. For philosophy is bristling with, as Kant calls them, "muddled" (verworren) given concepts.
Things are different in the case of concepts that are made or arbitrarily thought (precisely because they are not given). "In such a case, I can define my concept at any time" - says Kant. "For, after all, I must know what I had wanted to think, since I myself purposely made it" 20. Here there are other difficulties, however. If a definition is not merely supposed to create a word, but indeed a concept, then it must at the same time demonstrate the possibility of the objects that fall under the defined concept. In the empirical realm, Kant holds real definitions, with which we are concerned here, to be impossible. Although there are ships and there are clocks, nevertheless the possibility of a ship clock remains open to doubt until empirical intuition has shown one to exist. In the aprioric realm, as far as the philosophical concepts are concerned, we have the additional fact that an arbitrary fabrication of further concepts would simply be inappropriate, given the actual situation in which we are primarily dealing with an overwhelming abundance of given
18 ibid. note
19 Apart from the passages of the CPR cited below, the reflections on logic 2905-3008 (Akad. Ed. vol. 16) must be considered as well.
20 CPR, B 757
concepts. Simply putting together into a formally consistent system a host of familiar words through arbitrary definitions that do not respect the givenness of the related concepts, as Spinoza has done, for example, does not lead anywhere or leads, as Kant puts it, at best to a house of cards 21. But there is a further class of aprioric concepts - the mathematical concepts - which are definable in the proper sense of the word. Here, the definition is a construction of the concept, and to construct a concept for Kant means "to represent a priori the intuition corresponding to it" 22. But what is this supposed to mean? For the answer, it is important to see that, according to Kant, a mathematical definition contains in general a part which is, in the final analysis, inessential to it, and which he occasionally calls "the mere definition" 23. For Kant too, a definition was first of all a definition of one concept from other concepts. Thus, a triangle, for example, was for him a figure "which is enclosed within three straight lines" 24, and he says moreover that the mere definition is what one "actually thinks" in a concept 25. Thus, what one thinks with a "merely defined" concept is the form in which it is put together from other concepts, quite independently of the meaning of these other concepts. In relation to the content, I think what meaning the defined concept would have if the meaning of the concepts joined together within it were known. That Kant had something like this in mind is shown by the example he repeatedly uses of "the concept of a figure which is enclosed by two straight lines" 26. Kant emphasizes that this concept can be thought without contradiction, and, of course, he can only have this view if in this thought we abstract from the meaning of a straight line in (Euclidean) space.
The practice of finding the meaning of a concept by going back to other concepts through mere definition comes to an end, of course, when we come upon those concepts which Kant calls the basic concepts of a discipline. The answer to the question, what we think with these concepts which no longer have a "mere definition", must be: we think nothing at all in them. When Hilbert says in his "Foundations of Geometry", as he introduces the basic concepts of a point, a straight line, and a plane: "We think three different systems of things ..." 27, Kant would say: thus we think nothing at all. If in this case we are still concerned with meaning - and that is, of course, Kant's concern - then we have no choice but to bring the objects into play to which the basic concepts refer. Definition properly speaking, which begins at this point and only at this point, is the procedure by means of which we provide a
21 CPR, B 755.
22 CPR, B 741.
23 CPR, B 746.
24 CPR, B 744, and hence the concept of a triangle is certainly a concept which is defined from other concepts. Insofar as the latter is the case, Kant speaks of a mere definition.
25 CPR, B 746
26 CPR, B 268
27 Hilbert ⁷1930, Ch. 1, sect. 1.
concept with its object. And once this has been settled for the basic concepts, then the mere definitions of derivative concepts will determine which objects correspond to them as well. Kant calls the procedure in question the aprioric representation of a concept in intuition precisely because, by definition, it is the intuition in which objects are given to us. And in order to elucidate this procedure, it might be useful to begin by considering once more an empirical concept, say, the concept of water, and ask, what it is in this case that helps us to settle the question of meaning once we have left the conceptual plane. Obviously, the only thing that helps us in this case is that we show water to someone or throw the person into it and say: this is water if you really wanted to know. In such a case, we bring about an empirical intuition. We have no choice but to do this, and, in particular, we depend on the existence of the object concerned, which in turn does not depend on us. Someone who has realized what a triangle is, on the other hand, has realized that "he must not investigate what he saw in the figure or the mere concept of it, and read off its properties, as it were. Rather, he must bring about (through construction) what he himself, according to concepts, a priori thought into and presented in the figure, and he ... must not attribute anything to the matter except what necessarily follows from it, i.e. what he himself in accordance with his concept has put into it" 28. With this example, Kant describes what one might call the primordial mathematical phenomenon: Because we know certain procedures for the generation of pure, non-empirical objects, we have the power of making or defining concepts. What is peculiar about these objects in distinction from an empirical object is the fact that a single one of them already adequately expresses the generality of the concept under which it falls.
This is so because with such an object we only pay attention to its principle of generation, something that is not possible with empirical objects just because their existence does not depend on us. The possibility of such conceptual definitions, however, does depend on the givenness of an object into which we make the definitions. Kant expressly stated that he thought of space as an object. For he speaks of "space represented as object (as one actually requires it in geometry)" 29. With regard to numbers, however, the matter is not quite as clear, since their generation is, on the one hand, regarded as a pure act of the spontaneity of our understanding, while, on the other hand, there is talk of an 'intuition of numbers' 30. The difficulty with which Kant saw himself confronted in this regard probably consisted in the fact that for geometry he only had to go back to space, while in arithmetic he would have to have recourse to numbers and to time - a difficulty which I cannot get into now, however.
28 CPR, B XII (my emphasis)
29 CPR, B 161, note
30 Cf. letter to Rehberg in no. 1
I have dealt with Kant's doctrine of definition in some detail because without the peculiar features of this doctrine, the other part with which we shall be concerned now - namely, his view of the propositions of mathematics and of their possible proofs - cannot become intelligible at all. What is probably less important in this regard is the formal analogy which consists in the fact that just as one can define concepts from other concepts, one can also prove propositions from other propositions and that in both cases this procedure must eventually change into something fundamentally different. What is important is rather the fact that, according to the doctrine of definition, mathematics is credited with an objectivity sui generis and that this offers us a starting point for understanding how Kant could have thought of the propositions of mathematics as synthetic. It is clearly true that mathematical objects are not objects in the usual Kantian sense, that is, they are not objects which are only given to us in an empirical intuition. Nevertheless, in mathematics, unlike in logic, one does not abstract from all content so as to be left with nothing but the form of thought. In addition to the form of thought one is also dealing with the form of intuition, or, as Kant expressly emphasizes with regard to space, one is dealing with an object. It is here, if anywhere, that the synthetic character of mathematics must have its origin. Now, it is always a question, which of Kant's statements one wants to cite on behalf of an interpretation, especially since these statements are often very difficult, if not impossible, to reconcile. For a closer examination of the issue in question, I, for my part, want to begin from the following argumentation given by Kant:

Mathematical judgments are altogether synthetic. Hitherto, this fact seems to have escaped the observation of the analysts of human reason.
More than that, it seems to be directly opposed to all of their assumptions, even though it is incontestably certain and very important for what follows. For since it was noticed that the inferences of the mathematicians all proceed according to the principle of contradiction (which every apodictic certainty demands by its very nature), one convinced oneself that the fundamental propositions too were known from the principle of contradiction. This was an error. For though it is true that a synthetic proposition can be discerned in accordance with the principle of contradiction, it can be so discerned only if another synthetic proposition is presupposed from which it can be inferred; it can never be so discerned in and by itself 31.

First of all, we find in this argumentation the formal analogy between proof and definition: Just as we can define concepts from other concepts, so we can prove propositions from other propositions. And just as there are concepts that can only be defined - as we can now say retrospectively - in and by themselves, namely, through an original presentation in intuition, so there
31 CPR, B 14
are propositions the truth of which is only discerned in and by themselves, i.e. without other propositions being presupposed. Moreover, I think it is clear that for Kant this analogy extends to the fact that just as the meaning of a concept defined from other concepts is not fixed without the meaning of the latter being fixed, so the truth of a proposition inferred from other propositions is not fixed without the truth of the latter. If this is presupposed, then we can say, still in analogy to the act of definition, that the proof of a mathematical proposition in general divides into two parts. The one part, which is possibly inessential with regard to the synthetic character of the proposition to be proved, consists in the proof of the proposition from other propositions. I say, "possibly inessential", for the following reason. In the cited passage, Kant, on the one hand, says only that a synthetic proposition can be discerned in accordance with the principle of contradiction and that, if this procedure is employed, the proposition presupposed must likewise be synthetic. Besides this, on the other hand, he also says that the inferences of the mathematicians are all of this kind, i.e. inferences in accordance with the principle of contradiction. If this were to be taken seriously, then, of course, the proof of a proposition from other propositions could in no case ground the synthetic character of the proposition to be proved; rather, the latter would always go back to the respective premises and not to the mere conclusion. Other, more representative, passages, however, give us reason to doubt this interpretation. Hintikka's interpretation, of which we shall speak in a moment, is partly based on this doubt. However this may be, it is clear in any case that the other part of the proof of a mathematical proposition must ground its synthetic character. By this I mean the part which consists in the insight into the truth of some sentences in and by themselves.
For, in this regard, Kant says that this insight cannot be gained in accordance with the principle of contradiction, and with this, of course, he means that we are concerned with an insight which at the same time reveals the synthetic character of the propositions. When it comes to examples, however, e.g. the geometrical proposition that the sum of the angles in a triangle is equal to two right angles or the arithmetic proposition that 7 + 5 = 12, Kant, unfortunately, does not proceed in a consistent exemplification of what he says in the cited passage. That is, in the example given, he then no longer distinguishes the two generally so clearly distinguished parts. Instead, his ceterum censeo is the following. He says that the insight to be gained into the truth of a mathematical proposition is not, as in philosophy, a discursive cognition according to mere concepts, but rather an intuitive cognition through construction of the concepts. In other words, if a mathematical proposition is presented, then, regarding the insight into its truth, one must ask, how the concepts that occur in it are defined. If, in so doing, one limits oneself to the discursive element, i.e. to the mere definition of concepts from other concepts, one will find that one does not make any progress in this way. Only the inclusion of the full definition, including the aprioric generation of the objects falling under these concepts that is characteristic of mathematical concepts, will lead to the desired insight. And it is just this recourse to specific objects that constitutes the synthetic character of the relevant proposition. As evident as this argumentation is for axioms of mathematics, where on the basis of self-evidence we effect an immediate transition from an object to a proposition about it, Kant exemplified this argumentation also in proofs which obviously contain steps leading from propositions already presupposed to other propositions. Thus Kant gave a suspicious commentary to the proof of the geometrical proposition that the sum of the angles in a triangle is equal to two right angles. In this proof, according to Kant, "we are always led by the intuition through a chain of inferences to a completely evident and at the same time general solution to the question" 32. What does it mean, however, that in a chain of inferences we are led by the intuition? Later this became a focal point of attack by Kant's critics, and recently it has received the following interpretation by Hintikka. What today we call the proof of a proposition from other propositions and what we carry out by logical means alone divides into two parts in a tradition that began with Euclid and that was still binding for Kant. The first part introduces the objects with which the proposition to be proved is concerned, e.g. an arbitrary triangle, and in addition, in an auxiliary construction, it introduces those objects which are required for the proof such as, for example, in the angle sum theorem, the parallel to one of the sides of the triangle. In the second part of the proof, no new objects are introduced. Rather, one only draws inferences on the basis of what was constructed as a whole.
Hintikka now argues that Kant acknowledges that the second part of the proof is carried out purely logically and accordingly does not hold it responsible for the synthetic character of a mathematical proposition. In the first part, however, intuition is used precisely in the sense which Kant primarily gives to intuition: That is to say, one or the other object is introduced in its own individual representation. And this is just what introduces a synthetic element into the proof, an element which in the second part - where we have already assembled everything, as it were - does not appear any more. In modern terms, a cut is made through quantificational logic which no longer classifies inferences concerning new individuals as analytic. And there is certainly something to be said for this interpretation, considering the fact that the logic available to Kant had its weak point precisely with regard to quantifiers.33

32 CPR, B 744f (my emphasis).
33 See Chs. 7 and 8 of Hintikka 1974.

III

In my concluding third part, I can only give a very brief outlook on the post-Kantian development of mathematics and its situation today. And here I want to formulate the result of the considerations so far with a certain
liberality, i.e. in abstraction from the details of Kant's definitions and argumentations, to be that Kant claimed the synthetic and nevertheless aprioric character of mathematics in the following dual sense: On the one hand, mathematics, in the form of its theorems, delivers aprioric conditions of possible experience. Viewed in this way, mathematics thus plays a positive role in our knowledge of objects, that is, of objects, the existence of which does not depend on us. On the other hand, mathematics also has its own proper objectivity. This, however, is not ontologically relevant in the traditional sense, i.e. it does not have an existence independent of the human power of cognition. Thus, when we speak, for example, about the spatial form of empirical objects or about the duration of processes, or if we count a set of empirical objects, we create a connection between the two kinds of objects, and this connection is somehow also responsible for the determination of the empirical objects and for our knowledge of them. Now, I want to ask: Was this view of mathematics and its relation to the science of nature fundamentally mistaken - fundamentally, that is, apart from the details of its technical formulation and apart also from the historical conditions to which it was subject? My answer is that it was not. It is true that, as far as mathematics itself is concerned, it might have seemed for a time as though Kant's starting point had to be discarded completely. For mathematics seemed to be on the way to becoming a science of which Russell could say that" ... [it] may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true". Thus, first, geometry was deprived of its traditional object in the course of the logical investigations into the independence of the parallel axiom and the discovery of the so-called non-Euclidean spaces. 
For mathematicians became more and more convinced by the idea that what they were doing as mathematicians when they were doing geometry was nothing but to establish logical inferences, that is, inferences of theorems from axioms, which, as logical inferences, are completely independent of a particular object of geometry. Then there was the logicism of Frege and later of Russell in which one attempted also to get rid of natural numbers as independent objects by conceiving them as properties of concepts. Thus, for example, the number 1 was to be conceived as the property belonging to a concept precisely when there exists an object x which falls under the concept while at the same time all objects which also fall under it are equal to x. In addition to these processes of the elimination of original objects of mathematics, there was an extension of traditional logic, in particular through the introduction of the existential and universal quantifiers, a development which in retrospect could be interpreted as a reduction of the stock of theorems of mathematics. Beyond the developments mentioned so far, however, the most important relevant event by far at the end of the 19th century was the creation of set theory by Cantor. Its historical significance consists in the fact that, on the one hand, it positively determined the self-understanding of mathematics in
our century and led to a completely new orientation of this science, while, on the other hand, together with the development mentioned earlier, it led to the foundational crisis of mathematics. For the contradictions revealed both in set theory, which at that time was still pursued in a naive fashion, and in arithmetic, which Frege had already equipped with all the formal refinements, resulted in a radical re-consideration in which, characteristically, mathematics itself, to a greater or lesser extent, became an object of reflection. The characteristic feature of these investigations, which were initiated especially under the leadership of Hilbert, was the fact that - to use Kant's words - the mathematicians did not ponder their science with a philosophical eye, but with the eyes of the mathematician and specifically with those of the mathematician in Kant's sense. For, in the shape of formal theories, mathematicians created new objects for themselves which, suitably interpreted, were able to represent a more or less extensive stock of mathematics. A formalized logic, a formalized arithmetic, a formalized set theory are all examples of formal theories. Now, if, as a metamathematician, one subjects a formal theory to the characteristic questions, for example, regarding its consistency or its completeness, it is clear that in no case can the potential insight into whether or not these properties obtain be achieved by purely logical means. For these are insights concerning a particular object. All great insights gained in this area - insights into the incompleteness of arithmetic, into its consistency, into the incompleteness of set theory, into the independence of the axiom of choice etc. - have a non-logical core. Without being able today to offer a current definition of the concept "synthetic", it seems to me that, if anything, it is the theorems just mentioned that are synthetic propositions a priori.
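The logicist reading of the number 1 described above can be written out as a formula about concepts. The following is only a sketch in modern notation: the concept variable F and the label $\mathbf{1}$ for the property of concepts are chosen here for illustration and are not taken from the text.

```latex
% Sketch: the number 1 as a property of concepts F (notation assumed here).
% F has the property 1 iff some object x falls under F
% and every object falling under F is equal to x.
\[
  \mathbf{1}(F) \;:\Longleftrightarrow\;
  \exists x \,\bigl( F(x) \,\wedge\, \forall y \,( F(y) \rightarrow y = x ) \bigr)
\]
```

For instance, the concept "natural satellite of the Earth" has this property, since the Moon falls under it and nothing else does.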
But even at the mathematical, as opposed to the metamathematical, level, a similar picture emerges today. With respect to a formalized set theory which today is best able to represent the stock of mathematics, a mathematician can take the position that his activity is to be construed as a carrying out of purely formal inferences in this theory. In that case, he basically sees himself as a metamathematician; his results concern a certain object, and as such they perhaps have a certain meaning but certainly no universality in the logical sense. Or he takes the position that his formal system is interpreted by a model of set theory and serves in getting to know it. In this case, most of what the mathematician does is a logical activity in the sense of the inference of propositions from axioms. What remains, however, is on the one hand the insight into the truth of the axioms, just as in geometry and arithmetic, and on the other hand, because of the incompleteness of the axiomatization, the possible necessity of an immediate return to the model. And again, neither can be achieved by purely logical means. A final remark now about the role of mathematics in empirical knowledge. Kant's achievement consisted in the fact that he overcame the Platonic schism which until his time had either dominated or not yet been understood, that is, the gap between atemporal ideas on the one hand and the objects
of sense on the other. Until the rise of modern natural science, this perhaps did not present a serious problem. After it had become apparent, however, that the successes of the new physics in the science of nature were due above all to the application of mathematics, there was the need for an explanation of how it is that the two worlds happen to fit each other, as it were. So as not to leave this phenomenon in the state of a miracle only to be marvelled at, Kant stripped the world of the mathematical ideas of its independence by declaring it to be a product of the human imagination and by making it - besides other things - the measure of possible cognition of empirical objects. Even this starting point, I still consider worthy of further investigation, only, one must not go as far as Kant and declare - to put it somewhat modernistically - a certain empirical interpretation of a mathematical discipline to be absolutely binding. Nevertheless, mathematics makes possible an empirical science such as physics in the weaker sense that in any testing of a physical theory, together with the measurements to be carried out, certain mathematical propositions play the decisive role. So far, one could not have given assent to or rejected a single theory of physics without making, besides measurements, also calculations, and at times very complicated calculations. This is, of course, due to the fact that the theories in question are already formulated mathematically, and the original ground, as it were, of this possibility must be reconsidered in light of the present state of mathematics and natural science. I am not sure how much of a difficulty this still presents for us today. But I think it is possible that to accomplish this task we could use a Kant of the 20th century.
VIII.36 Mathematics and Physical Axiomatization*

Introduction

It was in Paris where, at the dawn of our century, David Hilbert presented that famous lecture to the Second International Congress of Mathematicians at the end of which he gave a list of 23 mathematical problems still waiting for their solution1. Most of these problems really were of a purely mathematical nature. But there were also questions about the foundations of mathematics, and the 6th problem seemed to be no mathematical problem at all. It was the problem 'to axiomatize those physical sciences in which mathematics plays an important role'. As is well known, about twelve years later Hilbert himself embarked on the project of axiomatizing physics, commenting on this escapade by saying that 'physics is much too hard for physicists'. However, it soon turned out that it was too hard even for a mathematician. In the opinion of Hermann Weyl, Hilbert 'greatly enjoyed this widening of his horizon and his contact with physicists ... The harvest however can hardly be compared with his achievements in pure mathematics ... Hilbert's vast plans in physics never matured'2. For somebody working in the same field the failure of so great a man is, in an obvious sense, frightening and encouraging at the same time. For this paper I shall take it to be an encouragement. For in the case of its failure I shall at least be in good company.

* First published as Scheibe 1986d.
1 Hilbert (1901). The passage quoted is on p. 306 of the 1935 reprint.
2 Weyl (1944). The passage quoted is on p. 171 of the 1968 reprint.

Why is it - this will be the question to guide us into my subject matter - that Hilbert restricted his project of axiomatization to those parts of physics 'in which mathematics plays an important role'? An answer is suggested by the complete wording of Hilbert's 6th problem: 'The investigations on the foundations of geometry suggest the project, with these investigations as our paradigm, to axiomatize those physical sciences in which mathematics plays an important role already now: in the first line these are the probability calculus and mechanics'. According to this formulation Hilbert's view seems to be that the presence of mathematics favors the axiomatization of a discipline because, as in the case of geometry, mathematics itself is - by its very nature - an axiomatic science every part of which either is already axiomatized or easily lends itself to axiomatization. So Hilbert seems to tell us: Reorganize your physics more geometrico wherever the presence of mathematics allows you to do so. We shall see presently that, apart from objections coming from different quarters, it is doubtful whether Hilbert himself really meant to say just this. However, on the face of it our answer is supported by an old and honorable tradition well known to a man like Hilbert. According to this tradition mathematics actually is the form of physical thinking. Such a view is evident
already in Galileo's famous words: 'Philosophy (i.e. physics) is written in this grand book - I mean the universe - which stands continually open to our gaze, but it cannot be understood unless one first learns to comprehend the language in which it is written. It is written in the language of mathematics ... without which it is humanly impossible to understand a single word of it'3. That the tradition opened by Galileo is still alive need not be emphasized. But I may quote Truesdell who puts Leibniz in the place of Galileo and after having deplored that 'in the period between the two wars, the program of Leibniz was neglected' gives but another formulation of the 17th Century view by saying: 'Modern natural philosophy returns to the old program of making the physical concepts themselves mathematical from the outset, and mathematics is needed to formulate theories'4. Clearly for somebody thinking on these lines the systematization of mathematical and physical concepts becomes one and the same thing at least in principle.

3 Quoted from Seeger (1966), p. 51.
4 Truesdell (1967), p. 45.

Up to this point I have mentioned some views of mathematicians and physicists, and so far everything seems to be in complete harmony. But the views are somewhat ambiguous, and this may become visible when we introduce a bit of philosophy to destroy the harmony. In philosophy we find the platonic tradition holding the view that mathematics is about some things sui generis, things that came to be called 'abstract entities'. The tradition of mathematically minded physicists mentioned a moment ago is not committed to platonism. It is compatible with it, but from a modern point of view it is also compatible with some kind of formalism in Hilbert's sense. If it is combined with platonism then the physicist would look at mathematics as a vast store of ideal structures some of which are recovered in nature in a less perfect state but good enough to be identified. The statements of physics would then be statements of quasi-isomorphisms between the mathematical and the physical structures. If, on the other hand, our mathematically minded physicist is a formalist then his business would consist in finding out which parts of his mathematical formalism can be given a physical interpretation. Whereas the first mentioned position may be called applied platonism its formalistic counterpart would be an applied formalism or weak nominalism. Since formalism in the sense of Hilbert allows for arbitrarily strong axiom systems for mathematics the two positions may turn out not to be too far apart from each other if the emphasis is on the viewpoint of application of mathematics. It is different with a decidedly anti-platonic, radical nominalism. This can be seen anew with particular clarity from a recent contribution to it by Hartry Field. In a remarkable book with the provocative title 'Science without Numbers' Field proposes 'to show that the mathematics needed for application to the physical world does not include anything which ... contains references
to ... abstract entities like numbers, functions, or sets'5. Explaining his position in more detail Field goes on to say that 'the part of mathematics that doesn't contain references to abstract entities is really just applied logic: it is the systematic deduction of consequences from axiom systems ( ... containing references only to physical entities). Very little of ordinary mathematics consists merely of the systematic deduction of consequences from such axiom systems: my claim however is that ordinary mathematics can be replaced in application by a new mathematics which does consist only of this'6. It is evident that, whereas applied platonism, and perhaps even weak nominalism would approve of the idea that the mathematics applied in physics be our guide in physical axiomatization, Field's claim is at variance with this idea. For his claim is that a great deal of ordinary mathematics as we actually find it in the usual formulations of physical theories can be eliminated without loss of physical content and, therefore, should be eliminated at least in principle since its existence can only be misleading as regards the proper physical ideas.

5 Field (1980), p. 1 f.
6 Field (1980), p. 107, n. 1.

Let me illustrate the situation by two extreme examples taken from physics: empirical laws and quantum mechanics. An empirical law is a relation between physical quantities that is isomorphic to a numerical relation - a relation between the numerical values of those quantities. Galileo's law of free fall, Kepler's third law, Planck's radiation law, the van der Waals equation and many other similar physical laws are cases in point. The remarkable thing about these laws is the way in which the relation expressed by them is specified. There is first the uniformization of the physically different quantities combined in the law by replacing their values by things of one and the same kind: by real numbers. This makes possible the second step in which a numerical relation is specified by using the familiar arithmetical operations on real numbers together with limiting processes based on the standard topology of the reals. Now, what is puzzling in this process is that, whereas the numerical relation used to represent the law, by the very nature of the process, has got a physical interpretation, no such interpretation is given to the arithmetical operations and the limiting processes by which it is defined. Consequently we cannot run through the process of understanding that we are used to whenever some relation is defined in terms of other, basic relations. If, for the sake of simplicity, the numerical relation is assumed to be rational then its truth conditions are contained in a diagram of atomic sentences giving a complete list of the elementary arithmetical 'facts' about numbers. But there does not exist a physical reduction corresponding to this mathematical one: The truth conditions of the physical relation expressed by the law cannot be traced back to elementary physical facts corresponding to those arithmetical ones or to whatever. There are no further physical facts,
and, consequently, no further physical insight is provided by the mathematical representation of the law. It is for this reason that if an empirical law turns out to be true or nearly true then this appears almost as a miracle: We can only be amazed at this effect of physically uninterpreted constituents of our theory. The Pythagoreans, discovering the isomorphism between musical intervals and numerical ratios, may have been the first to have experienced such a miracle. The empirical laws of modern physics are even more impressive illustrations of a state of mathematical overdetermination as regards interpretation, and Field seems to be justified in his desire to get rid of any surplus mathematics in physical theory. On the other hand, the very same laws are also good examples of the difficulties to be expected in pursuing this goal: It is hard to see how any one of them could be reformulated without loss of physical content and without using numbers. There is, to be sure, the possibility of reducing an empirical law to a more comprehensive theory, and indeed this reduction always has been among the most distinguished aims of physics. But in principle the problem is only shifted thereby to the question what and how much of mathematics is used in formulating the reducing theory. And this cannot be a trifle since it is the mathematical formulation of the empirical laws that is derived in the reduction. My second example leads us back to Hilbert's project and the way he approached it. The example - quantum mechanics - shows that also in the physical axiomatization of mature and comprehensive theories the double aspect of a system of physical concepts and a formalism having something specifically mathematical about it is embodied in the enterprise. At the same time in this case we can be almost certain that more mathematics is involved than would be needed on account of physical reasons.
In a paper of 1928 on the axiomatization of the new quantum mechanics Hilbert and his co-authors von Neumann and Nordheim give a general account of their endeavors7. Probabilities being the basic entities in quantum mechanics the authors say: 'Certain physical requirements are imposed on the probabilities, suggested by our experience ... and implying certain relations between the probabilities. Then we look for a simple analytical formalism involving quantities that satisfy just these relations ... The aim is to formulate the physical requirements with just sufficient completeness to define precisely the analytical formalism'. The authors then mention the geometrical paradigm where the analytical formalism is the arithmetical interpretation of geometry, carefully to be distinguished from the geometrical concepts and axioms proper.

7 Hilbert et al. (1928), § 1.

Already the geometrical example, comprising the arithmetical realization, shows that the presence of mathematics is not a guarantee that we have found the physically appropriate axiomatization. Analytical geometry certainly is a very convenient tool for solving mathematical problems in geometry. But
from a purely physical point of view too much mathematics is involved in it. An axiom system for Euclidean geometry not mentioning the real numbers will - ceteris paribus - be 'more physical' than one that makes use of the reals. In quantum mechanics the analytical formalism mentioned by the authors is functional analysis in a complex Hilbert space. This again is too much mathematics and this time not only with respect to what is involved in a spectral representation of Hilbert space. Contrary to what is expressed in the text quoted, in quantum mechanics it is not possible to define the Hilbert space on the basis of sufficiently complete physical requirements: For all we know two vectors in Hilbert space belonging to the same 1-dimensional subspace have only one possible physical referent. Therefore an axiomatization of quantum mechanics would have to abstract from Hilbert space a new structure better adapted to real differences admitted by nature. In a literal sense this amounts to the elimination of Hilbert space although the process is by degrees and in the present case does not take us away very far. In his book Field is mainly engaged in a case study. He there tries to find a nominalistically admissible substitute for scalar field theories in flat spacetime such as Newton's gravitational theory in its field version. I am rather skeptical about so ambitious a program as that of finding a physics without numbers. However, my introductory remarks were meant to indicate that it is one thing to have this or that piece of mathematics in a physical theory and quite another to be in the possession of a physically lucid axiomatics. It may therefore be worthwhile to start an elimination program and use it as a method to learn more about the role of mathematics in physics.
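The remark above that two vectors in the same 1-dimensional subspace have only one possible physical referent can be illustrated by a short computation in the standard textbook formalism. The sketch assumes, beyond what the text states, that observables are represented by self-adjoint operators $A$ and that expectation values are computed by the usual normalized inner product; the symbols $\psi$, $c$ and $A$ are chosen here for illustration.

```latex
% Sketch: a nonzero vector \psi and any multiple c\psi (c \neq 0)
% yield the same expectation value for every observable A,
% because the inner product is antilinear in one argument and linear in the other.
\[
  \frac{\langle c\psi \,|\, A \,|\, c\psi \rangle}{\langle c\psi \,|\, c\psi \rangle}
  \;=\;
  \frac{\bar{c}\,c\,\langle \psi \,|\, A \,|\, \psi \rangle}{\bar{c}\,c\,\langle \psi \,|\, \psi \rangle}
  \;=\;
  \frac{\langle \psi \,|\, A \,|\, \psi \rangle}{\langle \psi \,|\, \psi \rangle},
  \qquad c \in \mathbb{C},\; c \neq 0 .
\]
```

On these assumptions the factor $c$ carries no physical information, which is the sense in which the vector space contains more mathematics than the physics requires.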
I. Set Theoretical Axiomatization of Physical Theories

Coming now to a more systematic development of the matter, the first question to be answered is: What kind of axiomatization of a physical theory shall we choose as the starting point for all subsequent investigations? I will choose set theoretical axiomatization for two reasons: First there is general agreement that classical mathematics can be reconstructed in one or the other of the usual set theoretical systems. Second, most current work on higher level theories of physics and their axiomatization makes ample and almost reckless use of informal set theory. Therefore at present a rigorous reconstruction of these endeavors can be obtained most easily within a set theoretical system. I begin with the formalization business in which I want to stick to the common tripartition logic - mathematics - physics and try to articulate it. Accordingly the following subdivision of the formal part of a physical theory will be accepted:
A) First order logic in one of its codifications.
B) The set theoretical system ZF of Zermelo-Fraenkel.
C) A species of structures in the sense of Bourbaki based on ZF and expressing the physical axioms: $\Sigma(X, A; s)$, containing a typification $s \in \sigma(X, A)$ where $\sigma$ are terms constructed from the X and A by means of Cartesian products and power sets. The X and s are new constants and the A are defined constants8.

According to these requirements a physical theory is a first order theory, and its basic concepts
are first order concepts. However, the presence of set theory has the consequence that species of structures have an internal type theoretical structure: The formation of the scale terms entering the typification exactly corresponds to the formation of higher order predicates in type theory. It is obvious that this opens a way to an elimination program, and I shall come back to this aspect later on. Examples of species of structures abound in mathematics: In point of fact all the well known concepts of a group, ring, vector space, topological space, manifold, fibre bundle, etc. are defined by axioms that can easily be reconstructed as so many species of structures. It is likewise a fact that these mathematical concepts are frequently applied in theoretical physics, especially in its higher level theories. This application does not yet show that physical theories themselves can formally be reconstructed as species of structures. However, as recent investigations seem to show, such a reconstruction is possible9.

Let us now look at our set theoretical foundations. Field seems to think that if physical axiomatics is based on set theory at all then we have to resort to a version allowing for individuals (= Urelemente in the sense of Zermelo)10. The argument is, of course, that besides mathematical objects, accounted for by pure set theory, there must be room for physical objects as well. However, it can be shown that if we start with a version of set theory admitting individuals then to every structure belonging to a species there is an isomorphic structure belonging to the same species and consisting of mathematical objects only11. From a purely structural point of view we can, therefore, restrict ourselves to ZF as our set theoretical framework.

8 Bourbaki (1968), Ch. IV. X, A and s are abbreviations for finite series $X_1, \ldots, X_n$; $A_1, \ldots, A_\ell$ and $s_1, \ldots, s_n$ respectively.
9 Direct application of set theoretical predicates was first suggested in Suppes (1957), Ch. 12; see also Suppes (1970) and Sneed (1971). Species of structures, a subclass of set theoretical predicates, are used in Ludwig (1978) and (1981). For a comparison of the two approaches, differing also in other respects, see Scheibe (1982b), this vol. III.12, and (1983).
10 Field (1980), p. 9.

How then would applied formalism proceed in its attempt to give a purely physical interpretation of ZF completed by some physical axioms? Such an enterprise seems preposterous if by an interpretation we understand what is usually understood by it in first order semantics. For in this case the natural expectation of looking for interpretations in order to find models would amount to no less than to expect to find a physical model of ZF. In view of this hopeless situation I shall indeed qualify the idea in question but not before having made one comment in matters of principle. As a matter of principle the situation in question is only an extreme case of the normal situation in which we find ourselves whenever we are going to apply any piece of formalism in physics. We not only never know in advance whether our formalism will be satisfied by nature. We sometimes do not know it for a long time, and frequently we actually know in advance that it will not be satisfied in some respect. Physicists have created a euphemism for describing this situation: They say that in mathematics we idealize the real physical situation, meaning thereby that we knowingly make a mistake or admit some essential incompleteness. This notorious accompaniment of physical theory does no harm as long as we are successful with respect to some part of our formalism. Moreover, we should learn from it that no mathematical formalism, including set theory, is taboo as regards its direct physical application. In principle, every formalism is applicable, and it is only for historical reasons that in some cases, the most prominent of which may be number theory, we hesitate to give them a physical interpretation12.
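To make the schema of a species of structures with a typification, introduced under C) above, more concrete, the familiar concept of a group might be written out along the following lines. This is only an illustrative sketch: the choice of a single base set X, the absence of auxiliary sets A, and the particular way the axioms are phrased are assumptions made here, not a quotation from Bourbaki or from the text. The block uses the `align*` environment of the amsmath package.

```latex
% Sketch (assumed presentation): the species of groups, \Sigma(X; s).
% One base set X, no auxiliary sets, one structure constant s.
% The scale term is built from X by Cartesian products and one power set.
\begin{align*}
  &\text{Typification:}
    && s \in \mathcal{P}\bigl( (X \times X) \times X \bigr) \\
  &\text{Axioms:}
    && s \text{ is the graph of a map } X \times X \to X,\ (x, y) \mapsto x \cdot y, \\
  & && \forall x, y, z \in X:\ (x \cdot y) \cdot z = x \cdot (y \cdot z), \\
  & && \exists e \in X\, \bigl[\, \forall x \in X\, ( e \cdot x = x \cdot e = x )
       \;\wedge\; \forall x \in X\, \exists y \in X\, ( x \cdot y = y \cdot x = e ) \,\bigr].
\end{align*}
```

The typification only fixes the set theoretical type of s; the group axioms proper are the additional conditions imposed on s.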
11 In a set theory admitting individuals this follows immediately from the quite innocent axiom (or theorem) that to every set there exists a set with the same cardinality consisting of mathematical objects only. A mathematical object is a set x such that every descending chain $\ldots \in x_n \in \ldots \in x_1 \in x$ terminates with the empty set. (See Suppes (1960) for a formulation of set theory admitting individuals, Kunen (1980), p. 8 f, for the concept of a mathematical object and Takeuti/Zaring (1982), p. 21 f, for the finiteness of descending chains.)
12 The following is a quotation from Hilbert (1918), p. 149 of the 1970 reprint: "In the theory of real numbers it is shown that ... the so-called Archimedean axiom is independent of the other arithmetical axioms. This finding ... leads to the following result: the fact that by adding up terrestrial distances we finally obtain distances of cosmic dimension ... and, likewise, the fact that distances in the atom can be expressed in terms of the meter rule are by no means a logical consequence of the theorems on congruences and geometrical configuration. Rather they have to be established by independent empirical investigation. In this sense the validity of the Archimedean axiom in nature has to be confirmed by experiment just as it is the case in a well-known sense with the theorem on the sum of the angles in a triangle". In this passage an arithmetical axiom (although one of an immediate geometrical significance) is exposed to empirical test pari passu with geometrical theorems proper. But still we seem to be prejudiced against an
VIII.36 Mathematics and Physical Axiomatization
Moreover, with physical axiomatics as our goal, why don't we look at set theory as a natural extension of first order logic, a quasi-logical extension to be employed whenever sets are coming up? As regards membership, physical entities seem to be members of sets of such entities in exactly the same sense as, for instance, numbers are members of sets of numbers. Another reason for the quasi-logical character of membership is that we are hardly willing to step from set theory to physics by postulating further axioms about membership, just as we would add the axiom of choice or the continuum hypothesis. It is true that if we add a species of structures $\Sigma(X, A; s)$ as a physical axiom then $\bigvee\xi\eta.\ \Sigma(\xi, A; \eta)$ follows, and this is a formula of ZF. But it is very natural to make it a consistency condition on $\Sigma$ that the formula in question already follows from ZF:
(G') $\vdash_{ZF} \bigvee\xi\eta.\ \Sigma(\xi, A; \eta)$
It then follows from (G') that $\Sigma(X, A; s)$ (with new set constants X and s) is a conservative extension of ZF in the sense that any formula of ZF provable in $ZF \cup \Sigma(X, A; s)$ is already provable in ZF. Thus the physical axioms never strengthen the set theoretical ones. However, and here comes the qualification announced a moment ago, we need not require the physical interpretations to be models of set theory but only partial models in the following sense. An interpretation of a (first order) theory is a partial model of it if and only if it is a restriction of a substructure of some model 13. That this is a reasonable concept may at once be seen from geometry. There was a time when Euclidean geometry was viewed as an a priori science. When it became empirical it should have been clear from the outset that its actual realization (by rigid bodies and light rays) had no chance whatsoever to be a model of Euclidean geometry if only because it was so hopelessly incomplete and rudimentary. And whoever believes Euclidean geometry to be refuted by general relativity does so because its existing realization cannot even be a partial model of the latter. In the present context the concept of partial models suggests itself because we could restrict the interpretation of our formal language to the constants X and s and to the membership as far as it does concern their referents and still have every reason to expect this interpretation to be a partial model even of ZF, let alone the physical axioms. It is in this sense that we may look at ZF as a reasonable candidate for being directly applied in physics 14.
empirical meaning of arithmetic. Can we think of a development, analogous to that of geometry, that would adequately be described by saying that ordinary arithmetic has been empirically refuted in favor of such and such other number systems?
13 For this definition it is assumed that no function symbols are admitted lest the concept of substructure becomes too narrow.
14 It has to be noted that a partial model of ZF may be a structure without auxiliary base sets although the proof that it is a partial model may involve these sets. It is in this way that the most suspicious candidates for purely mathematical entities are taken into account.
II. Replacement Within Set Theory
Having argued in favor of set theory as a most convenient and fairly reasonable formal basis for the reconstruction of a physical theory, we must now recall that the main question of this paper, as it was conceived at the end of the introduction, would now have to be reformulated as being the question: Having accepted set theory as our basic formalism, how can we get rid of it again? Or, more accurately: How much of it do we really need in order to do physics and, on the other hand, how much of it can be missed without loss of physical content? In the part of the paper I am now entering, our logico-mathematical framework, i.e. the system ZF of set theory, will still be retained. But the problem of its elimination will already be prepared. In this part I am going to point out a general scheme for the replacement of one species of structures by another one. If the species of structures are used as the formal part of a physical theory, this replacement may under certain favorable conditions be an improvement in the formulation of the theory. Moreover, under very special conditions the replacement in question may even initiate the elimination of set theory in favor of some more modest logical framework for physical theory. To introduce the scheme let me first give an illustration that, although perhaps only of mathematical interest, is particularly suitable for introduction. Starting with a structure $\langle X; s \rangle$ where s is a set of subsets of X, closed with respect to intersection and complement, we may deduce from this structure of species $\Sigma$ a set algebra in the usual way by providing s with set theoretical intersection, union and complement. Suppose we then ask ourselves: What is it for a structure to be isomorphic to a set algebra? With this question we are looking for a second species of structures $\Theta$ such that a structure would belong to this species if and only if it is isomorphic to a set algebra.
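The deduction of a set algebra from a structure of subsets closed under intersection and complement can be illustrated on a finite example. The following is only a minimal sketch: the function names and the small universe are hypothetical choices made for illustration, and the union is defined from the two primitive operations by De Morgan's law rather than taken as given.

```python
from itertools import combinations


def generate_set_algebra(universe, generators):
    """Smallest family of subsets of `universe` that contains `generators`
    and is closed under intersection and complement (illustrative helper)."""
    universe = frozenset(universe)
    family = {frozenset(g) & universe for g in generators} | {universe}
    changed = True
    while changed:
        changed = False
        snapshot = list(family)
        for a in snapshot:
            if (universe - a) not in family:
                family.add(universe - a)
                changed = True
        for a, b in combinations(snapshot, 2):
            if (a & b) not in family:
                family.add(a & b)
                changed = True
    return family


def union(universe, a, b):
    """Union expressed through complement and intersection (De Morgan)."""
    universe = frozenset(universe)
    return universe - ((universe - a) & (universe - b))


def satisfies_boolean_laws(universe, family):
    """Check closure under union plus complement, absorption and
    distributive laws for every choice of members of `family`."""
    universe = frozenset(universe)
    empty = frozenset()
    for a in family:
        if (a & (universe - a)) != empty:
            return False
        if union(universe, a, universe - a) != universe:
            return False
        for b in family:
            if union(universe, a, b) not in family:
                return False
            if (a & union(universe, a, b)) != a:
                return False
            for c in family:
                left = a & union(universe, b, c)
                right = union(universe, a & b, a & c)
                if left != right:
                    return False
    return True


# Hypothetical example: X = {1, 2, 3, 4}, generated by the subsets {1} and {2}.
X = {1, 2, 3, 4}
algebra = generate_set_algebra(X, [{1}, {2}])
```

In this example the generated family consists of all unions of the three blocks {1}, {2} and {3, 4}, so the check only illustrates, for one finite case, that a set algebra obeys the laws one would expect of the abstract species sought in the text.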
According to a well known theorem one reasonable answer is: Let $\Theta$ be the species of boolean algebras. Indeed, the set algebra deduced first is a boolean algebra, and any boolean algebra is isomorphic to a set algebra. I say that we here have a 'reasonable' answer because our question has no unique answer. The answer given is reasonable because it is formulated, so to speak, in the natural language for universal algebras. But a species defined as consisting of structures isomorphic to set algebras would have been another, if only trivial, answer to our question. Sometimes the achievement mentioned first is paraphrased by saying that we have succeeded in characterizing in abstracto (here: as a boolean algebra) what was originally only given in concreto (here: as a set algebra). If we are looking at the process in the reverse direction we would say that we have found a most general representation of a species of structures (here: boolean algebras) by structures (here: set algebras) deduced in a certain way from structures of an originally given species. The foregoing example is typical for the most general case of an abstraction or representation scheme because it includes a change in the principal base sets of the structures involved. An example from physics of this type
comes from quantum mechanics where we have reasons to replace Hilbert space by a certain algebra with a new principal base set. But I will introduce the matter from the physical point of view by a more elementary case taken from geometry. Suppose we define an n-dimensional Euclidean space X with a distance function $d \in \mathrm{Pow}(X^2 \times \mathbb{R})$ by the requirement that there be a bijection of X onto $\mathbb{R}^n$ which maps d canonically onto the usual Euclidean distance function on $\mathbb{R}^n$. Although this would be a perfectly precise definition of a species of structures, from the viewpoint of physical axiomatization it may be criticized for various reasons. As regards physical meaning it could be argued that no unit of length is intrinsically distinguished by the nature of space. Although the distance function d would have physical meaning, its meaning would contain a completely arbitrary element waiting for elimination. This elimination could improve on another weak point of the formulation: the explicit use of the real numbers as possible values of distances. However, the most inadequate part of our axiomatics of Euclidean geometry presumably is the introduction of coordinate systems by bluntly requiring their existence in an axiom instead of proving it from axioms directly dealing with the geometrical subject matter 15. Now it is well known that an axiomatization of Euclidean geometry meeting all these objections has been given by Tarski 16. And if we look at how it is related to the one mentioned a moment ago then we have another example of our representation scheme before us: Starting with a Euclidean space as defined we deduce a congruence relation and a betweenness relation on the space, prove Tarski's axioms for them, and finally show that any space directly introduced by Tarski's axioms can be obtained from a space as introduced here in the way indicated.
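The step from a distance function to the two relations can be sketched numerically. The definitions used below (congruence as equality of distances, betweenness as equality in the triangle inequality) are the usual ones and are assumed here, since the text does not spell them out; all function names are hypothetical. The point of the sketch is that both relations are unchanged when the unit of length is changed.

```python
import math


def euclidean_distance(p, q):
    """Usual Euclidean distance between two coordinate tuples."""
    return math.dist(p, q)


def rescaled(d, factor):
    """Same distance function measured in a different unit of length."""
    if factor <= 0:
        raise ValueError("a change of unit needs a positive factor")
    return lambda p, q: factor * d(p, q)


def congruent(d, x, y, u, v, tol=1e-9):
    """Segment xy is congruent to segment uv: both have the same length."""
    return math.isclose(d(x, y), d(u, v), rel_tol=tol, abs_tol=tol)


def between(d, x, y, z, tol=1e-9):
    """y lies between x and z: the triangle inequality holds with equality."""
    return math.isclose(d(x, y) + d(y, z), d(x, z), rel_tol=tol, abs_tol=tol)


# Hypothetical points in the plane, measured once in one unit and once in another.
metres = euclidean_distance
inches = rescaled(euclidean_distance, 1 / 0.0254)

same_length = congruent(metres, (0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (4.0, 5.0))
same_length_other_unit = congruent(inches, (0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (4.0, 5.0))
on_segment = between(metres, (0.0, 0.0), (1.0, 1.0), (2.0, 2.0))
off_segment = between(metres, (0.0, 0.0), (1.0, 0.0), (2.0, 2.0))
```

Neither relation mentions a numerical value of a distance, which is why they can serve as primitives once the arbitrary unit has been dropped.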
Since in this case there is no change of the principal base set (the space) there is no need for inserting an isomorphism for the representation. Apart from the improvements mentioned Tarski's axioms are 'almost' of the first order in the sense of the internal order structure of a species of structures given by its typification. This is an aspect I shall resume in the last part of the paper. For the moment the case in point is the elimination of the real numbers from the original axioms. In his work Field extends Tarski's results to scalar field theories based on Euclidean geometry 17. There are physical laws for
15 A particularly clear case of introducing physically meaningless elements would be a structure $\langle X; d, y \rangle$ where $\langle X; d \rangle$ is a Euclidean space as before and y is one of the Cartesian coordinate systems distinguished by d. From the physical viewpoint the further distinction of one of these coordinate systems appears to be completely arbitrary. On the other hand, the distinguished coordinate system y and indeed any other Cartesian coordinate system leads to an arithmetical interpretation of geometry and thus provides for that 'simple analytical formalism' that was postulated by Hilbert independently of the requirement of a physically reasonable axiomatization, see no. 7.
16 See Tarski (1959), Theor. 1, with axiom A 13 replaced by the second order axiom on p. 18.
17 Field (1980), Chs. 6-8.
scalar fields in space that are invariant against linear transformations of the field value. Pretty much as in the case of the geometrical distance function we may in such a case say that neither a unit nor a zero point is intrinsically fixed for these fields. Examples are temperature with respect to the laws of heat conduction and gravitation in the Poisson version of Newton's theory. The true subject matter of such theories is not a scalar field with well determined numerical values. Rather it is a certain equivalence class of such fields, and since such equivalence classes are even more suspicious entities than are the fields themselves one has to characterize them by more elementary objects in the same way as Tarski succeeded in characterizing distance functions modulo a positive factor by the first order relations of congruence and betweenness. The scalar field is replaced by a sort of congruence relation concerning quadruples x, y, u, v of points in space telling us whether the absolute field difference between x and y equals that between u and v. Likewise a betweenness relation involving three points x, y and z tells us whether the field value in y is between that in x and z. Then physically reasonable axioms are proved about these new relations, and these axioms are strong enough to show every structure satisfying them to be representable by a scalar field. Thus in pursuing his goal Field tries to find what I have called representation schemes. In fact his representation theorems all belong to the class with no change in the principal base sets of the relevant structures. It goes without saying that the representation scheme indicated is only the first station on a long trip that eventually leads to a reformulation of, say, the laws of gravitation. On this trip Field has to look after those dependent concepts of the conventional theory that happen to make sense also on account of the new basic concepts.
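The two relations that replace the scalar field can be sketched in the same spirit. The sketch assumes that "linear transformation of the field value" means a map of the form value -> scale * value + offset with a non-zero scale; the function names and the sample field are hypothetical. Both relations come out the same for a field and for any such transform of it, so they depend only on the equivalence class of the field.

```python
def field_congruent(phi, x, y, u, v, tol=1e-9):
    """Absolute field difference between x and y equals that between u and v."""
    return abs(abs(phi(x) - phi(y)) - abs(phi(u) - phi(v))) <= tol


def field_between(phi, x, y, z):
    """Field value at y lies between the field values at x and z."""
    low, high = sorted((phi(x), phi(z)))
    return low <= phi(y) <= high


def transformed(phi, scale, offset):
    """Same field with a different unit (scale) and zero point (offset)."""
    if scale == 0:
        raise ValueError("scale must be non-zero")
    return lambda p: scale * phi(p) + offset


# Hypothetical field on a line of points; psi is the same field in other units.
phi = lambda p: p * p
psi = transformed(phi, -2.0, 5.0)

checks = [
    (field_congruent(phi, 0, 1, 0, -1), field_congruent(psi, 0, 1, 0, -1)),
    (field_congruent(phi, 0, 1, 1, 2), field_congruent(psi, 0, 1, 1, 2)),
    (field_between(phi, 0, 1, 2), field_between(psi, 0, 1, 2)),
    (field_between(phi, 0, 3, 2), field_between(psi, 0, 3, 2)),
]
```

Each pair in `checks` compares the verdict for the original field with the verdict for its transform; agreement in every pair is what the invariance claim amounts to for this example.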
Together with the scalar fields, represented by numerical functions, the physical laws as expressed by differential equations go overboard. They have to be replaced by laws directly referring to the new physical entities, and that has to be done in such a way that, roughly speaking, the new entities obey the new laws if and only if the scalar fields replaced by them obey the original differential equation. And this is what has to be done in most cases of representation schemes if they are used for physical reaxiomatization. Since my aim is only to lay the general foundations of what happens in these endeavors I need not pursue the further development, and may now quickly give the precise definition of the concept illustrated so far. I shall call a representation scheme (or alternatively: an abstraction scheme) any triple consisting of a species of structures $\Sigma$, terms $(P, q)$ intrinsic with respect to $\Sigma$, and another species of structures $\Theta$ such that the following two conditions are satisfied 18:
18 The first condition is essentially what Bourbaki calls "a procedure of deduction of a structure of species $\Theta$ from a structure of species $\Sigma$", Bourbaki (1968), p. 266 f.
(1) $\Theta$ follows from an extension of $\Sigma$ by means of the defining terms $(P, q)$, i.e.
$\Sigma(X, A; s) \wedge Y = P(X; s) \wedge t = q(X; s) \vdash_{ZF} \Theta(Y, B; t)$
(2) $\Theta$ is a maximal consequence in the sense that if $\gamma(Y, B; t)$ is any invariant consequence of the premise in (1), formulated in the $(Y, t)$-language, then
$\Theta(Y, B; t) \vdash_{ZF} \gamma(Y, B; t)$.
Viewing both sides of the implication (1) as first order extensions of ZF the condition (2) turns out to be the usual condition that the left side be a conservative extension of the right side, this condition being confined to invariant formulas 19. The condition (2) of the general formulation is not the one used in the illustrations given before. But there is the following remarkable connection: Given (1) the condition (2) is equivalent to
(2') $\Theta(Y, B; t) \vdash_{ZF} \bigvee\xi\eta.\ \Sigma(\xi, A; \eta) \wedge \langle Y; t \rangle \cong \langle P(\xi; \eta); q(\xi; \eta) \rangle$
where $\cong$ means that the two structures are isomorphic (with respect to the typification of $\Theta$) 20. The concept of a representation scheme is an obvious asymmetric generalization of the concept of equivalence of two species of structures, in particular of strict equivalence where the structures are left untouched and only the axioms are changed. This can be seen by realizing that the passage from $\Sigma$ to $\Theta$ can be divided into two steps, the first leading to the right side of (2') and the second being a mere re-axiomatization in the sense of strict equivalence. It goes without saying that in the physical context such a transition is not by itself an improvement of the formulation of a physical theory. On the other hand, most physical axiomatization affairs known to me can be subsumed under the general representation scheme. According to the nature of the case it will be impossible to specify precise conditions sufficient to guarantee improvement of physical axiomatization in a representation scheme. However, by way of illustration we have seen cases in which species of structures having no 'immediate' or a somewhat questionable meaning are replaced by others having meaning.
These are cases in which the representation is not unique. Other important cases of this kind come from quantum theory 21. Secondly, in the geometrical case and also in the field theories studied by Field number systems are eliminated as auxiliary
19 For the common concept of a (syntactically) conservative extension see Shoenfield (1967), p. 41.
20 The general idea of a representation scheme is implicit in Field's work, see Chs. 3, 7 and 8 of his (1980). However, he is also influenced by Krantz et al. (1971) where a different concept of representation is used, see the discussion in Ch. 1.4. The concept presented here grew out of a clarification of the concept of an axiomatic base in Ludwig (1978), §7.3. It was accepted by Ludwig in his (1981). As can be seen immediately from (2') our concept is a straightforward set theoretical reformulation of the so-called Ramsey elimination of 'theoretical' terms, these terms here being the X and s occurring in $\Sigma$.
21 Typical results are Varadarajan (1968), Theor. 7.40, p. 179, for the lattice theoretical approach to quantum theory and Bratteli/Robinson (1979), Theor. 2.1.10, p. 60, for the approach using C*-algebras.
base sets from the typification. Whereas this elimination concerns the explicit occurrence of $\mathbb{R}$ in a typified set (distance function and scalar field respectively) there is also the implicit mention of $\mathbb{R}$ by talking about coordinate systems in the axiom proper. Again, as we have seen, this reference can be avoided in the new formulation. Other highly non-trivial results of this type are known from geometry 22. Finally, it has been indicated that the axioms proper may be improved by lowering their order. Thus in Tarski's axiomatization of Euclidean geometry all axioms except one continuity axiom are of the first order, the latter being of the second. It is this aspect to which I will draw our attention in the last part of my paper.
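As a hedged illustration, the set-algebra example from the beginning of this part can be written out as an instance of conditions (1) and (2'). The concrete choice of $P$ and $q$ below is an assumed reading of that example and not a formula taken from the text; auxiliary base sets are taken to be absent, and the requirement that $s$ be non-empty is added so that the deduced structure has a least and a greatest element.

```latex
% Sketch under stated assumptions: the set-algebra example as a representation scheme.
\begin{align*}
\Sigma(X; s) &:\quad s \subseteq \mathrm{Pow}(X),\ s \neq \emptyset,\
   s \text{ closed under } \cap \text{ and complement in } X,\\
Y = P(X; s) &:= s, \qquad
t = q(X; s) := \text{the operations } \cap,\ \cup,\ X \setminus (\cdot) \text{ restricted to } s,\\
\Theta(Y; t) &:\quad \langle Y; t \rangle \text{ is a boolean algebra.}
\end{align*}
% Condition (1): the deduced set algebra is a boolean algebra.
\[
\Sigma(X; s) \wedge Y = P(X; s) \wedge t = q(X; s) \vdash_{ZF} \Theta(Y; t)
\]
% Condition (2'): every boolean algebra is isomorphic to some set algebra
% (this is the content of Stone's representation theorem).
\[
\Theta(Y; t) \vdash_{ZF} \bigvee \xi \eta.\ \Sigma(\xi; \eta) \wedge
\langle Y; t \rangle \cong \langle P(\xi; \eta); q(\xi; \eta) \rangle
\]
```

Read this way, the principal base set changes from $X$ to $Y = s$, which is why an isomorphism has to appear in (2'), in contrast to the geometrical case where the space itself is retained.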
III. Replacement of Set Theory
Already at the end of Part I the idea was contemplated that a direct physical application of set theory would be reasonable only if the part of the world to which our theory is applied is not required to be a model in the usual sense but only a partial model of set theory. In Part II it was assumed that structures satisfying additional axioms, i.e. belonging to some species of structures, were the subject matter of a physical theory. Now these structures taken by themselves may very easily be conceived as being partial models of set theory. But if we take into account what is said about them in the axioms of a species of structures then this may still involve the whole of a model of set theory. The idea of a partial model of set theory is, therefore, illusory as long as we don't show that the content of the specific, the physical, axioms does not depend on the whole of a model of set theory but only on the structure under investigation. There is one chief source of trouble in this respect, and these are the bound variables in the axioms. They may make their appearance directly as, for instance, in the usual axioms defining a species of free algebras. Thus in the definition of a free boolean algebra we have that part of it in which it is said: given a mapping of the generators into any boolean algebra, etc. Since every set is the principal base set of some boolean algebra our bound variable here ranges over the whole of a universe of set theory. In the present case the critical requirement can be shown to be equivalent to a quite harmless statement with all bound variables restricted to the structure one is talking about 23. However, there is no guarantee that every species of structures is
22 One of them is the amazingly far-reaching solution of Hilbert's 5th problem contained in Yamabe (1953): The species of topological groups that are Lie groups is strictly equivalent to the species of topological groups that are locally compact and have a neighborhood of the identity containing no non-trivial invariant subgroup. Although this result has no direct physical significance it has recently been used in a group theoretical characterization of Euclidean geometry, see Schmidt (1979), Ch. 5.
23 Sikorski (1969), §14.
equivalent to one having its bound variables restricted in this way. Again, unrestricted quantification may enter the scene in a more disguised form as part of the usual definition of, say, the set of natural numbers or of some larger number set. Since these definitions are absolute, i.e. not relative to some additional constants, they must contain unrestricted quantifications. In such cases the only way of avoiding them would be to replace the definition by an abstract, possibly categorical, characterization of the defined set. Having indicated what our chief obstacle is I am now introducing a condition saying what it would mean to have removed it. It is a condition imposed on the axiom of a species of structures $\Theta(Y, B; t)$ making precise the idea that the axiom does not transcend the structure $\langle Y, B; t \rangle$:
(3) The axiom of $\Theta(Y, B; t)$ is typified in accordance with the typification of $\Theta(Y, B; t)$.
This condition is to be viewed as an additional requirement for a representation scheme. An improvement would be achieved with such a scheme if condition (3) did not hold in the original species $\Sigma$. But what is meant by saying that the axiom of $\Theta$ is typified? Roughly it means that the axioms of $\Theta$ are the natural translations of sentences of a higher order language having basic types corresponding to the (principal and auxiliary) base terms of $\Theta$. But first we should try to find a formulation without referring to a separate language. Very briefly this would be the following. First, it was to be avoided that the bound variables in the axioms vary over the whole universe of set theory. This is avoided by restricting quantification to scale sets over the (principal and auxiliary) base sets Y and B of the axioms in $\Theta(Y, B; t)$. Second, the elementary formulas occurring in the axioms have to be in accordance with the restrictions of quantification. Thus the occurrence of $\langle \xi_1, \dots, \xi_n \rangle \in \xi$ would correspond to restrictions (with the scale sets $\sigma_i = \sigma_i(Y, B)$)
$\xi_1 \in \sigma_1, \dots, \xi_n \in \sigma_n$ and $\xi \in \mathrm{Pow}(\sigma_1 \times \dots \times \sigma_n)$,
the whole axiom being, in this sense, stratifiable. Third, if auxiliary base sets B occur in the typification, sets typified by them should also occur and some extra axioms should give a sufficiently complete account of the resulting structures. It is now fairly obvious how our physical theory could be detached from its set theoretical framework if a reformulation of it satisfying the new condition (3) were obtained. Under this condition $\Theta$ can be viewed as the canonical set theoretical translation T' of a selfcontained higher order theory T. This translation is essentially the one we use when we say what it means for a finite type structure to be an interpretation of a finite type language 24. If we now assume that we have chosen a logic L for this language then we may ask ourselves whether the following main theorem holds: If $\alpha$ is a sentence in the finite type language of the theory T then
$T \vdash_L \alpha$ if and only if $ZF \cup T' \vdash \alpha'$
where T' and $\alpha'$ are the translations of T and $\alpha$ respectively. The importance of this theorem as regards the role of mathematics in physical theory is obvious: The theorem, if it holds, would amount to the elimination of set theory in favor of the finite type logic L. But does the theorem hold? Or rather: in which cases does it hold? For there is the dependence on the logic L. The arguments that were given by Field seem to me to be good enough for an answer in the affirmative if T and L are of the first order 25. Unfortunately, however, I can think of no first order reduction
24 The details are as follows. (For the sake of simplicity we confine the explication to the first order case. The generalization to higher orders is straightforward.) Assume any one-sorted first order theory T without function symbols and with finitely many constants only to be given. We can then represent T by a species of structures $\Theta$ (over ZF) in the following way. Choose any new set constant Y and for every constant r of T a new set constant r'. If r is an n-ary relation constant write down the typification $r' \in \mathrm{Pow}(Y^n)$; if r is an individual constant give r' the typification $r' \in Y$. This already settles the typifications. If now $\alpha$ is an axiom of T (all axioms being sentences) let $\alpha'$ be the set theoretical formula resulting by the following modification of $\alpha$: Map the variables $\xi$ of T injectively on variables $\xi'$ of set theory. Then throughout $\alpha$ replace
$\bigwedge\xi \dots$ by $\bigwedge\xi'.\ \xi' \in Y \rightarrow \dots$
$\bigvee\xi \dots$ by $\bigvee\xi'.\ \xi' \in Y \wedge \dots$
$r\,\xi_1, \dots, \xi_n$ by $\langle \xi'_1, \dots, \xi'_n \rangle \in r'$
$\xi = \eta$ by $\xi' = \eta'$
where in the two latter cases the $\xi$ and $\eta$ may also be individual constants. If we let the $\alpha'$ thus constructed be the new axioms we have obtained a species of structures $T'(Y; \dots, r', \dots)$ satisfying (3). Evidently, with the necessary but trivial precautions this assignment is uniquely reversible, and we have obtained a 1-1 correspondence between all first order theories T (with the restrictions mentioned) and some species of structures satisfying (3). In the general case of a finite type language underlying T special additional conditions have to be imposed on T in order to obtain the essential invariance condition required for species of structures.
25 Field (1980), Ch. 1. It has to be noted, however, that Field's setting differs from ours in the following respect: He considers an extension of the first order theory T itself by ZFU whereas we have considered an extension of ZF or, for the sake of the argument, ZFU by the set theoretical imitation of T. In Field's approach one has to connect set theory with T by comprehension axioms saying that, given any formula $A[x]$ of T, there is a set u such that $x \in u$ iff $A[x]$.
of physical theory such that all descriptive symbols admit of an essentially physical interpretation, as opposed to the case where we are coming from. On the other hand, although not every species of structures $\Sigma$ will be reducible via a representation scheme to a species $\Theta$ satisfying (3), I can think of no physical theory that could not be reconstructed in this way if (3) is not confined to the first order case 26. However, as regards the main theorem, if T and L are of higher order, although the only-if-part will present no problems I am not sure whether the if-part can be trusted. We therefore end up with two open questions: 1) Can we reconstruct physical theory directly as finite type theories, 'directly' in the sense that all descriptive terms are interpreted by specific physical concepts with the eventual exception of some mathematical terms belonging to a restricted part of mathematics that is really needed for the reconstruction? 2) What elimination theorems for set theory can be proved leading to such finite type theories? As regards the first question I think we should take the trouble and develop physics directly in some finite type predicate calculus 27. Although such a calculus is a formulation of only an incomplete portion of the general notion of set, even in mathematical practice there are very few theorems that can be obtained only in full set theory. On the other hand, elimination results for set theory will be of interest as long as common usage in theoretical physics consists in set theoretical reconstruction. Moreover, the higher order logics approach will presumably also involve interesting elimination results, for instance, to the effect of reducing the order 28.
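The translation of a first order sentence into its set theoretical counterpart, which underlies the main theorem of this part, can be sketched as a small recursive function. This is only an illustration under stated assumptions: formulas are encoded as nested tuples (a hypothetical encoding chosen here), the base set is called Y, variables and constants are primed to obtain their set theoretical counterparts, and the quantifiers are printed as ∀ and ∃ rather than in the notation of the text.

```python
def primed(symbol):
    """Set theoretical counterpart of a variable or constant of T."""
    return symbol + "'"


def translate(formula, base="Y"):
    """Relativize a first order formula to the base set `base`.

    Formulas are nested tuples (hypothetical encoding):
      ("forall", x, body), ("exists", x, body),
      ("rel", r, (t1, ..., tn)), ("eq", t1, t2),
      ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)
    """
    tag = formula[0]
    if tag == "forall":
        _, var, body = formula
        v = primed(var)
        return f"∀{v}({v} ∈ {base} → {translate(body, base)})"
    if tag == "exists":
        _, var, body = formula
        v = primed(var)
        return f"∃{v}({v} ∈ {base} ∧ {translate(body, base)})"
    if tag == "rel":
        _, name, terms = formula
        if len(terms) == 1:
            # A unary relation constant is typified as a subset of the base set.
            return f"{primed(terms[0])} ∈ {primed(name)}"
        args = ", ".join(primed(t) for t in terms)
        return f"⟨{args}⟩ ∈ {primed(name)}"
    if tag == "eq":
        _, left, right = formula
        return f"{primed(left)} = {primed(right)}"
    if tag == "not":
        return f"¬({translate(formula[1], base)})"
    if tag in ("and", "or", "implies"):
        symbol = {"and": "∧", "or": "∨", "implies": "→"}[tag]
        left = translate(formula[1], base)
        right = translate(formula[2], base)
        return f"({left} {symbol} {right})"
    raise ValueError(f"unknown formula tag: {tag!r}")


# Hypothetical axiom of T: the binary relation r is symmetric.
symmetry = ("forall", "x", ("forall", "y",
            ("implies", ("rel", "r", ("x", "y")), ("rel", "r", ("y", "x")))))
symmetry_translated = translate(symmetry)
```

Every quantifier in the output is restricted to the base set, which is the feature condition (3) asks for in the first order case; nothing in the sketch bears on whether the main theorem itself holds.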
IV. Outlook
Let me now briefly summarize my argument and conclude it. I have distinguished between three positions as regards the application of mathematics in physics: Platonism, applied formalism and radical nominalism. And, for the time being, I want to recommend applied formalism as a medium position between the two other ones. As opposed to radical nominalism applied formalism doesn't insist on first order theories with a total physical interpretation: it allows for physically uninterpreted descriptive terms and for partial models. As opposed to platonism applied formalism doesn't give those terms a supernatural interpretation. Rather a formalism is viewed as a mental construction that is used to bring order into some part of reality. If a formalism
26 Field has the additional problem that he works with cardinality quantifiers, see Ch. 9 of his (1980).
27 Again it has to be emphasized that this has to be done in a 'direct' manner that can hardly be expressed in precise terms. But the whole argument of this paper should have made it clear that we did not want to eliminate first order set theory only in order to reintroduce it on a higher level.
28 See Kreisel/Krivine (1971), Ch. 7; Takeuti (1975), Ch. 3.
doesn't fit then the position in question gives no definite answer to the question: What shall we do about the non-fitting rest? It tries to improve the situation. But at the same time it freely confesses that at present we simply don't have better means to do the job. The situation can be compared with a similar one that emerged within mathematics during the 19th century. It was summarized by Kronecker when he said: The natural numbers have been created by God. All else is the work of man. By this Kronecker meant to say: Within mathematics only the natural numbers are given to us as something that really exists. Correspondingly, the semi-ring of natural numbers is a partial model of many theories the total models of which are only convenient completions of that structure, adapted to the mind's desire, for instance, for calculating differences and quotients. Similarly, a physicist could claim that only matter has been created by God, and everything else was man-made. Moreover, physical reality could be essentially finite so that, on this view, not even the total system of natural numbers, but only finite sections of it with an unknown upper limit would be realized. Again, these sections would only be partial models of, say, first order Peano arithmetic, and also in general mathematical structures and theories would only be infinite limits of the finite structures of reality. At the same time, although in a sense mathematics is non-trivial precisely because it is a theory of the infinite, from the viewpoint of application finite mathematics would be hopelessly complicated, and it is only the infinite idealizations that make theorizing about nature humanly possible. The general situation is perhaps most adequately described in terms of the real and the possible. Already in ordinary thinking we grasp the real by making drafts of the possible. This strategy assumes a very peculiar form in physical theory.
With the help of a physical law we conclude that, if such and such possibilities, admitted by the theory, are real then such and such further possibilities, admitted by the theory, also are real. In the reconstruction of a physical theory given in Part I the structure distinguished by a theory is, roughly speaking, a structure of possibilities restricted by the axioms of the theory. In applying the theory more and more parts of the hypothetical structure are realized. They are then added to the structure and predictions are made as to what further parts of the structure are real 29. Now in a wide sense of the word mathematics is the theory of the possible. Therefore insofar as we need the possible in understanding the real we need mathematics in physics. And this would be the case even if we did not make idealizations. For the predictions made with a physical law it doesn't matter that the premises are real. In this sense a law always covers cases that may never be realized. I would not say that it is for this reason that mathematics with all its specificity plays so important a role in physics. But it is for this reason that we should not be afraid of introducing mathematical ideas
29 The details of this procedure have been investigated by Ludwig in his (1978), § 10.
representing those physical possibilities. I am quite prepared to accept the view that once the world is deprived of its material basis then no Popperian world of numbers, sets, propositions, etc. will remain. But I am convinced that, if the world is deprived of the mathematics as we actually have it in our minds then the structure of matter could no longer be understood. The question raised by Field as to what extent mathematics has to be invoked in science cannot be ultimately answered because it depends on the development of science and the growth of our knowledge 30. But it is a question we must not lose sight of. Although the physicist, interested in the progress of his discipline, will always be looking for new parts of mathematics to be applied to nature, the philosopher has to control this widening and to determine the true borderline between the real and the imaginary.
30 In 1930 Hilbert still could say: "Pure number theory is that domain of mathematics that by now has not found any application", Hilbert (1930), p. 386 of the 1935 reprint. Recently a book has been published expressly devoted to the applications of number theory in physics, biology and other parts of science: Schroeder (1984).
VIII.37 Calculemus! The Problem of the Application of Logic and Mathematics*
1. The Dream of a mathesis universalis
As the title of my address indicates, I am going to treat a systematic subject, but in doing so I will not fail to take Leibniz as my point of departure - which is indeed the very least one may expect of an address opening a congress dedicated to Leibniz. As we all know, the numerous plans entertained - but never completed - by Leibniz also included a plan for a so-called characteristica universalis or lingua generalis, that is to say, for a universal language with the wonderful property that its mere grammatical mastery would make one speak truths and nothing but truths, including truths that would be novel ones in a very essential sense. Earlier, Descartes had, under certain conditions, dared "to hope for a readily recognizable universal language, easy to pronounce and to write, which, to mention the main point, would also help the human intellect in presenting all objects so clearly to it that it would be well-nigh impossible for it to be deceived ( ... ), and by means of which peasants could judge on truth better than philosophers can now"¹. That was in 1629, and less than half a century later we find Leibniz entertaining similar ideas:
"If one could find characters or symbols", he says, "which would be capable of expressing all our thoughts as clearly and precisely as arithmetic expresses numbers and analytic geometry expresses lines, then one would evidently be able to do with all objects, insofar as they are subject to rational thinking, that which one does in arithmetic and geometry".² Hence the example after which the universal language of thought is to be patterned is for Leibniz - as it was in a sense for Descartes, too - mathematics, and it is also clear just what it was about mathematics which one hoped to exploit in the new, far more sweeping enterprise: the things one desired to make philosophical capital of were its proofs and its mechanically reproducible calculations, from whose stringency and simplicity one wished that even the very process of thinking itself should benefit. What a blissful state of rationality, once one had accomplished that! "One would", wrote Leibniz, "convince everyone of one's findings or discoveries, since the calculations could easily be checked out ( ... ).
* Originally published as Scheibe 1990b, translated by J. Zwart
¹ Descartes/AT, I, pp. 80f. See also Leibniz VE 7, pp. 1480f
² Leibniz/Couturat p. 155
And if anyone should doubt my words, I would tell him: 'Let's calculate, Sir!' and, taking pen and ink, we would soon extricate ourselves from our embarrassment".³ Leibniz also left us clues as to how he let himself be guided by mathematics in constructing a characteristica universalis. The mental germ-cell was some sort of a principle of greater explicitness of language or the reduction of arbitrariness in the symbolic representation of contents. Let us take, for example - to follow Leibniz⁴ - the arithmetical fact that three times three equals nine. In the decimal system we express this truth in a form by which no one can tell how this equation came about. The correct formulation of this equation in the decimal system is a mere matter of designation. In the binary system, on the other hand, this question is already disposed of with the first two numbers zero and one, and the representations of the numbers three and nine are already expressions of facts in the binary system. In particular, when calculating in the usual fashion we will obtain together with the product also, in a way, its designation. Correspondingly, in the case where non-mathematical and in particular philosophical subjects are included, the intention probably was to construct the universal language in such a way that in its formal structure it would become, to the highest possible extent, an image of the contents of the objects it was designed to express. As Leibniz gushed as late as 1695: "If God grants me enough time of life and freedom, I hope to design a kind of philosophy no one has yet seen the likeness of, for it will rightly possess the clarity and certainty of mathematics, containing as it will something similar to calculation. Admittedly, it is not yet possible to decide all questions with its aid, but such decisions as are taken on this basis are indisputable. ( ... ) Once the trail has been blazed, posterity will march forward on it".⁵ Has it so marched forward, and where do we stand today? These are the questions on which I wish to say something in the following - but not, mind you, as a historian, which I am not, but in a reflection by a philosopher of science.⁶ In so doing I hope to be able to proceed from the assumption that people like Descartes and Leibniz positively felt that the mathematical disciplines of arithmetic and geometry, already available then as more or less complete, self-contained systems, not only were capable of being developed further intrinsically, but also still fell short of being representative of the entire realm of the mathematically possible in the first place. The development of mathematics in the 16th and 17th centuries was certainly conducive to strengthening such a feeling in any person. The new algebra, the beginnings of analytic geometry and the invention of infinitesimal calculus were clear indications of a beginning expansion of mathematics both in a methodical and an objective respect. It took all the philosophical optimism of the epoch, however, to jump right away to entertaining, and seriously pursuing, the idea of a universal language of thought or a mathesis universalis. Even in the present age of giant computers and artificial intelligence we are far removed from imagining that, in the end, all rational thinking is - let alone should be - mathematical thinking. But we can all the more readily sympathize with the expectation of the time that mathematics was about to undergo a major expansion, knowing, as we do, with all the undeserved superiority granted by historical hindsight, that that is exactly what happened. Our reflections in the following will not, however, be restricted to the question of how far the dreams, inspired by the mathematics of the epoch, of a lingua generalis, an ars inveniendi, a mathesis universalis have led at least to a new and expanded vision of the mathematically possible. In the very spirit of the aforementioned classical authors, the concept of the universality of the mathematical includes more than doing justice to the full structural richness in abstracto. It also includes the concrete occurrence of abstract structures in as many fields of reality as possible - for example the far-reaching embodiment of the mathematical in nature. Together with the question "Just what is generally understood by the term mathematics?", Descartes raises the further question "Why not only (arithmetic and geometry), but also astronomy, music, optics, mechanics and several other (branches of science) are designated as mathematical disciplines".⁷
³ Ibid. p. 156. See also ibid. p. 176 and Leibniz/Gerhardt, vol. VII, pp. 124f, 198ff. I owe the reference to the collection of Calculemus citations to Hide Ishiguro.
⁴ Leibniz/Couturat, p. 284
⁵ Leibniz, Acad. Ed. ser. I, vol. 11, p. 420f
⁶ For an interpretation of the relevant undertakings of Leibniz and his contemporaries see Arndt 1971 and Schneider 1988
Today one will be the most readily understood if alongside the question of the scope and systematics of mathematics itself one poses the question of its applicability in principle and the extent of its actual application. Now right here is the point where we have reached the main title of this address, having crossed, as it were, the bridge leading to it from Leibniz's "Calculemus!". The question at issue is how and to what extent the rationalistic claim of the universality of the mathematical, presumptuous though it probably was at the time, has meanwhile been discharged in theory and practice. In making a few remarks on this subject in the following, and thus speaking about mathematics and also a little bit about logic, I will be speaking about something which is not everyone's cup of tea. Although everyone will at some point in his or her life have come into contact with mathematics, for many a one the upshot of this experience will be no more than the recollection of seemingly endless hours of mathematical lessons at school. Mathematics and logic have entered into everyday language in seemingly different ways. We hear people say that this or that matter is just "higher mathematics" to them, or that some other thing is just "logical", meaning in the first case: "This I don't understand, it is beyond me", and in the second case: "That goes without saying; it is crystal clear". Thus, logic seems to be faring even a little better in popular language than does mathematics. In actual fact, however, what is meant by the second locution is just as little logic in the proper sense as the first one is mathematical in the proper sense. Despite this, on the whole, none too encouraging situation I may of course be assured in this circle of Leibniz scholars and Leibniz fans that the subject I have selected will not appear to be out of place. In view of my ensuing remarks my references to Leibniz will not be in the nature of a cloak covering up a merely casual interest of this great man in mathematics. There is a nice story about Hilbert. When at a gathering everyone was asked to say what question he would ask on being woken from a three hundred years' sleep of death and being permitted to ask one single question as to how things had meanwhile progressed on earth, Hilbert said he would ask whether Riemann's conjecture had meanwhile been proven. Now if Leibniz were given this opportunity here and now, he might well ask us, I think, how matters were with his mathesis universalis. So let's tell him!
⁷ Descartes/AT, X, p. 377
2. Two Internal Achievements of Mathematics
To start with a formality: We already learned that from ancient times mathematics was subdivided into arithmetic and geometry. Added to them in the course of time were a few fields of application we heard Descartes mention, and in the 17th century mathematics in a narrower sense was joined by algebra and infinitesimal calculus. As far back as 1868, the yearbook on Progress in Mathematics subdivides mathematics (including its fields of application) into 12 subfields, followed, for greater clarity, by a still more detailed subdivision into 38 fields. In the Mathematical Reviews of 1979, two comparable subdivisions produce 60 and approximately 3400 subfields respectively.⁸ Thus, particularly within the past 100 years, we are confronted here with an expansion and differentiation of mathematics which actually defies description: an absolutely fantastic development which would certainly have rendered even our bold prophets of a mathematical universal science speechless. At the same time it is clear that it would be simply ridiculous to try to present, in a lecture, an adequate impression of the state of things, let alone of their development. Nevertheless, in this second part of my address, still with the whole of mathematics before our eyes, I propose the following threefold subdivision for consideration. Unlike the classifications already mentioned, intended as means to organize the immense mass of material, our division into three is oriented to the question of just what, in a more qualitative sense, mathematics accomplishes. And here the possibility suggests itself of distinguishing between an algorithmic, a demonstrative and a descriptive accomplishment.
⁸ See Davis/Hersh 1982, p. 29
This distinction is not one that has only become possible for modern mathematics. All three accomplishments have been known ever since antiquity, all of them are present in Leibniz's design for a universal mathematics, and each one of them has undergone a tremendous expansion since then. Algorithms are known to us all in the form of the first four rules of arithmetic concerning rational numbers in the decimal system. Everyone knows how two natural numbers are to be added, and if the numbers are not too large, he or she is also able to actually perform the addition. This is simply a matter of calculating the value of a function for given values of the independent variables. Another function one is taught at school to calculate is the function by which the greatest common divisor of two natural numbers is obtained: one calculates this with the aid of the so-called Euclidean algorithm. Quite generally an algorithm is a - so it is said - purely mechanical procedure which in a finite number of steps yields a well-defined result from given data. The decisive thing is that wholly unambiguous instructions prescribe just how every single step and how the sequence of steps is to be carried out. The availability of an algorithm is, in the given case, the compliance with Leibniz's "Calculemus!" While the pertinent basic idea is as old as elementary calculation, it is only for little more than fifty years that we have had a precise conception of the algorithm.⁹ The definition of this concept and thus the establishment of a strict science of the calculable is, in this first field of accomplishment of the mathematical, the outstanding event par excellence since the 17th century. The adequacy of the definition is expressed in Church's thesis that every intuitively calculable function is also calculable in the sense of the precise definition, a thesis which today is accepted by every mathematician.
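To illustrate what "wholly unambiguous instructions" can look like, the Euclidean algorithm mentioned above may be written out in a few lines. What follows is only a minimal sketch in Python; the function name `euclid_gcd` and the example numbers are chosen here for illustration and do not come from the text.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor of two natural numbers by Euclid's algorithm."""
    if a < 0 or b < 0:
        raise ValueError("expected natural numbers")
    # Every step is fully determined: replace the pair (a, b) by (b, a mod b).
    # The second component strictly decreases, so the loop stops after
    # finitely many steps with a well-defined result.
    while b != 0:
        a, b = b, a % b
    return a


print(euclid_gcd(1071, 462))  # prints 21
```

Nothing in the procedure calls for insight or choice: given the same two numbers, anyone (or any machine) following the rule arrives at the same result, which is the feature the text singles out as decisive.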
This statement on the theory of the matter cannot be made without mentioning also the corresponding practice. It is well known that besides the, shall we say, Platonic tradition of philosophy with its high esteem of mathematics there has also been the tradition of a rather anti-mathematical orientation and that Hegel, for example, found less than kind words for the mathematical activity of the human mind. These negative judgments pertain predominantly to the algorithmic accomplishment of mathematics, and in fact, of course, the mere adherence to an algorithm, once one has it, is so stupid an affair that one may assign it to a machine. On the other hand, we know better today than any preceding generation that a disavowal taking place in so isolated a fashion is totally out of place. For on the one hand the computer revolution we are witnessing today - and I believe we may really speak of a revolution here - is not, for its part, a mere algorithmic accomplishment. Rather it is a highly complicated technological development based not only on mathematical, but also on physical progress. And in any event it is based indirectly, by way of physics, on a mathematical progress which has nothing at all or little to do with algorithms. On the other hand the fact remains that the transformation of our world through the computer is based on a thoroughly effective integration of its algorithmic capability with other accomplishments. I need not describe here in greater detail what undreamed-of influence modern computers are meanwhile exerting not only on our everyday life, but also on the progress of science. There is only one thing I wish to mention expressly. Normally the use of computers for scientific purposes has a conclusive character: within the framework of a sizable project they furnish e.g. numerical data which form a decisive part of the overall result, and this they do also in computer-assisted proofs within pure mathematics, for example in proving the Four Color Theorem.¹⁰ In addition, however, computers also play a heuristic part in research. True, an ars inveniendi such as was meant by Leibniz and held possible until well into the 19th century we consider today to be impossible. But that the heuristic use of computers in the recent past has advanced research cannot be overlooked. A typical example is the theory of deterministic chaos.¹¹ Here the problem is the description of processes, e.g. turbulent motions in a liquid, which obey a quite simple mathematical law, but which both in the individual case and in their totality may take place in an extremely complicated way, in a word: chaotically. To obtain an overview of such processes seems to overtax even the brains of trained mathematicians. A computer, on the other hand, gives one quite rapidly a vivid impression of the processes going on and of essential structural characteristics. Usually this is quite sufficient for the physicists, and the mathematicians will find that the theorems they will have to prove are now occurring to them. Euler is reported to have said: "If I only had the theorems already! I would have no trouble finding the proofs".
⁹ See Davis ²1982, p. 10; in Davis 1965 the basic works have been reprinted; see also Fisher 1982, Ch. 8
At least in the first part of this task computers have an essential part today. So much about the algorithmic accomplishment of mathematics. Now, as the next thing, a word about its demonstrative function. Mathematics - it is said - is the proving science par excellence. What is a mathematical proof? This, too, is something most of us will probably have been confronted with at least once at school. That, of course, is not sufficient to give us an impression of the fact that the finding of proofs is the main business of mathematicians, or of how they go about it. Characteristically, however, all professional attempts undertaken so far to subsume the proofs of mathematicians under a precise concept have failed to be as successful as they were in the case of the algorithm.¹² There is no Churchian thesis for the concept of 'intuitive' proof. We have several explications, but the practice of proving is not identical with any of them. By comparison, it would make little sense to apply an algorithm without striving to be absolutely precise in doing so. If we want to know the exact sum of two numbers, we must apply the rules of addition exactly. In contrast, mathematical proofs are often more plausible when they do not exactly follow the rules of an explicit proof concept. Nevertheless it must now be said here, too, that certain insights into the concept of proof which we have gained in the past 100 years through explicatory attempts constituted a giant step forward when these efforts are viewed in the light of Leibniz's aspirations and compared with the state of things at that time. The essential recognition was that to a decisive, formerly underestimated extent the mathematical proof is simply a logical inference. The drawing of logically correct inferences has first of all, like calculation, the formal aspect that it occurs according to precise rules which can be combined to describe, in the aggregate, a calculatory procedure - a procedure governed by logic. Seen thus, the drawing of conclusions is, therefore, related to calculation. A big difference, however, is that the rules of calculation prescribe what - step by step - one is obliged to do, whereas those for drawing conclusions prescribe only what one is permitted to do. Permissible - roughly stated - is anything which preserves the truth - which, without limitation of generality, leads from true premises to a true conclusion. The freedom left to the proving person within this framework, as contrasted with the blind "thou shalt" of calculation, is at the same time that which makes proving harder than calculating. The realization that proofs are essentially logical inferences - only inferences - seems to reduce mathematics to applied logic, which is something mathematicians are loath to hear. In addition to that, there is the fact that the proof of a thesis, although not being an algorithm per se, may, in certain cases, quite well be replaced by one - by a decision procedure, as they call it here.
¹⁰ See Appel/Haken 1986
¹¹ See, for instance, Hofstatter 1982
¹² As an introduction into the 'many faces' of logic Bell/Machover 1977 is useful, and for further reading take, for instance, Gabbai/Guenthner 1983
That this trivialization of mathematics does not come to pass in the more interesting cases is expressed by a limitation theorem of Gödel.¹³ The dream of a complete algorithmization of mathematics, which Leibniz, too, entertained, has been dreamt to its unsuccessful conclusion. The metamathematical analysis of proofs and possibilities of proof must not, however, be regarded as an attempt to describe what mathematicians actually do. Rather, the sole issue at hand is the problem of relating mathematical proofs to a concept so that, on the basis of this proof concept, essential parts of mathematics should become reconstructible. The aforementioned solution by having recourse to logic is the best solution we know.¹⁴ It may well come as a surprise to the outsider that the reconstruction of mathematics' proof-producing apparatus as being a logical apparatus is an insight that was gained only in the post-Leibniz period, in fact only little more than 100 years ago. Was not logic invented by as ancient a community as the Greeks, and had not, since Euclid's opus, the demonstratio more geometrico become a paradigm of scientific thinking? Both the one and the other are perfectly true, and indeed logic and mathematics have continued since then to be felt time and time again to be somehow related. But this does not yet mean - far from it - that e.g. the proofs given by Euclid had been expressly based on logic as it was known then. As we know today, the underdeveloped status of Greek logic at the time completely ruled out this happening in the first place. It is only toward the end of the 19th century, first and foremost in Hilbert's Grundlagen der Geometrie (Foundations of Geometry), that it becomes transparent that the mathematical share in geometric proofs consists of no more than logical conclusions from the axioms of geometry.¹⁵ The step forward taken in this connection was a step of logic, not of mathematics. For the possibility of logical deduction is based on the occurrence, in the propositions connected by a proof, of components of purely logical significance, such as e.g. the words 'and', 'or', 'not'. But for the formulation of mathematical statements and the insight into their logical interrelationships it is only the correct treatment of generality and existence - hence of the logical components of statements we express in everyday language with 'for all' and 'there is' - which is absolutely decisive. Now these statements had, however, since Aristotle, hence for more than 2000 years, been explicated only rudimentarily in the syllogistic basic forms 'B belongs to all A' and 'B belongs to some A'. Even the simplest theorems of geometry are not correctly analyzable syllogistically. Unbelievable though it may sound, it was not until close to the end of the 19th century that the mathematically fully relevant use of generality and existence was correctly recognized, particularly through the works of Frege¹⁶, to which this and that was added later, but which undoubtedly constituted the breakthrough.
¹³ See the references of note no. 9.
¹⁴ On the concept of rational reconstruction in comparison to history of science see Scheibe 1984a (this vol. III.13)
3. The Description of Nature
The characterization given so far of the demonstrative power and accomplishments of mathematics is possibly incomplete. When a mathematician is asked what the purpose of a proof is it will be natural for him or her to answer that the purpose is the insight acquired into the truth of the theorem proven. He (or she) might also say that the purpose is the establishment of a logical implication: the theorem proven follows from these or those other propositions. This latter answer would definitely close our subject. But the former answer, the one putting the truth issue in the foreground, is heard more frequently. For those mathematicians are probably in the majority who believe that they are dealing with a mathematical subject sui generis and unearthing truths about it. However, a proof as described so far leads only - and this as a matter of principle - to a shift or a postponement of the truth question rather than to its resolution. For in every case the question of the truth of those propositions remains open from which, as premises, the proof was arrived at. If one wants more than that, the description of the demonstrative accomplishment becomes dependent on the question as to the object of mathematics. With this question one penetrates right into the center of the philosophic discussion of mathematics - to the question as to the - as I will call it - descriptive power of mathematics, which question will as of now occupy us until the end. In this third section we will first of all examine the separate, subordinated question of to what extent mathematics itself will be able to provide us with an answer. The answer I will give to the question as to the subject and descriptive power of mathematics will - in accordance with this dual formulation - be a twofold one. For one thing, the descriptive power of mathematics is essentially - to put it somewhat paradoxically - an abstractive power which, in far-reaching independence of the object, presents only some such thing as its form and the form of what can be said about it, this, however, with a certain completeness in that all possible forms susceptible to application are shown. In the terminology that has become customary for this accomplishment of mathematics one might express this also by saying that mathematics considers structures and types of structures in abstracto. And, as we already did before in the case of algorithms and proofs, we can now also say with respect to structures and types of structures that we have developed for them in our 20th century a conceptuality granting us expanses and depths of vision which would have made the heart of a Leibniz beat faster. Similar to and in connection with the concept of proof, it again is the expansion of logic and of its languages which has made this new perspective possible.
¹⁵ Hilbert 1899 (⁵1922); compare the description of the development in Becker 1954, Ch. 4, sect. 1
¹⁶ See Frege 1879 and 1893
But, again, we find that here, too, the conceptuality of structure has not been definitely settled. For this, too, we have no Churchian thesis. Yet the conceptuality developed so far leaves everything far behind it which is understood elsewhere by 'structures' - a vogue-word, a fashionable expression of the 20th century. Above all, however, we are truly confronted here with a mathesis universalis: an incredibly wide formal description framework which leaves the contents to a large extent open. This framework is far wider than what Descartes understood by order and measure when he said "that, to be precise, everything must be considered as mathematics which is marked by a search for order and measure"¹⁷. And when he continues "that it does not at all matter here whether this measure is to be looked for in the numbers or in the figures or the stars or in the tones or in any other object", then we can, with far more right, say the same thing of the modern mathematics of abstract structures. Somewhat more precise and illustrative of developments since then is how George Boole expressed himself 200 years later with the words:
"They who are acquainted with the present state of the theory of Symbolic Algebra, are aware, that the validity of the processes of analysis does not depend upon the interpretation of the symbols which are employed, but solely upon the laws of their combination. Every system of interpretation which does not affect the truth of the relations supposed, is equally admissible, and it is thus that the same process may, under one scheme of interpretation, represent the solution of a question on the properties of numbers, under another, that of a geometrical problem, and under a third, that of a problem of dynamics or optics."¹⁸
¹⁷ Descartes/AT, X, p. 378
But Boole, too, is standing - in the mid-19th century - only at the beginning of the uninterrupted upswing toward the universal mathematics of structure. This upswing was only made possible by Cantor's theory of sets and Hilbert's formalistic program. Under the influence of Hilbert, including his interest in the physical applications of mathematics, it thereupon was the Göttingen school of mathematicians, particularly Emmy Noether and her students, who contributed essentially to the development of the new views. Van der Waerden's Moderne Algebra of 1930/31 probably was the first textbook in the new style, with Bourbaki's mathematical encyclopaedia of the 1950s and 1960s forming the crowning conclusion.¹⁹ Now what are structures and species of structures in the sense of modern mathematics? With a view to traditional mathematics one will assume that e.g. geometric figures - straight lines, circles, polyhedrons, etc. - are mathematical structures, as are, without a doubt, the natural numbers of arithmetic. That is quite correct, too, if in addition the following essential consideration is made: When we say of a geometric figure that it is a circle, or of a number that it is a prime number, then in doing so we are referring to a larger entity - to the system of all numbers or to space as a whole -, and we relate also certain universal structures to these entities - e.g. multiplication, or the function of distance -, and without our doing this we would be wholly unable to say anything about the individual structures, so familiar to us, of a number and figure. Structures in the sense of modern mathematics are, therefore, fairly comprehensive, usually infinite formations consisting of one or more basic domains whose elements, subsets, etc. are structured by properties and relationships.
Against traditional logic, the matter to be particularly emphasized here is the many-termed (proper) relation, the understanding of which was a source of difficulties until far into the 19th century. Here in the descriptive field, matters are exactly the same as they are in the logical field with respect to existence and generality: Without including proper relations in our considerations a reconstruction of scientific assertions worthy of the name is out of the question. The second essential insight which made the modern concept of structure possible was the inclusion into the considerations of properties and relations of higher order²⁰. The property of being a prime number is in the system of natural numbers a property of the 1st order, since it concerns the elements of this system. On the other hand, the property of being a circle no longer concerns the points of the given space, but its subsets. Here we are dealing with a concept of the 2nd order, and even concepts of a still higher order are continually being used today in applications of mathematics. Many-termed concepts of higher order form today the germ cell for a recursive procedure for introducing within a theory of sets or a logic of types the general concept of structure²¹. Now to what extent is use being made within and outside mathematics of this newly-acquired generality? When we look first of all to the applications, the answer, in a strict sense, must be: to a minute extent. In all strictness, however, this is only intended to mean that by the very nature of things we can only make a finite use of a potentially infinite diversity of types of structures, and in principle there is nothing at all we can do to change this ratio. But the situation that existed in the 17th century has meanwhile been considerably expanded. First of all there have been expansions in the sense that wholly new types of structures have had to be resorted to in order to arrive at an adequate description of the objects of application. The most impressive examples of this are furnished us by physics, still constituting as it does the most mathematics-oriented empirical science we have. In generalizing the Newtonian space-time, but simultaneously in deviating from it, the general theory of relativity has led us to consider the so-called Lorentzian manifolds. A particularly dramatic turn was brought about by the quantum theory, when Hilbert spaces and Banach algebras were used to describe states or properties of an atom or elementary particle.
¹⁸ Boole 1847, p. 3
¹⁹ Van der Waerden 1971; Bourbaki 1939ff
This marked the first time that, to the great surprise of physicists, non-commutative algebras were introduced into physics. Likewise, the classic probability spaces resorted to describe common statistical phenomena must be included in the list of novel structures frequently being applied today ~ far beyond physics ~ in the empirical sciences. 22 Whereas these expansions occupied physicists particularly in the first half of this century, we are since recently confronted with the realization that internal expansions of already known types of structures are becoming physically relevant. As an example we may mention number-theoretical structures. To the outsider this may sound surprising, thinking as he does that, if anything, the natural numbers have been populating physics for a long time. This is undoubtedly correct, but only in the sense that, from a mathematical 20
21 22
The modern understanding of relations and the introduction of concepts of higher order likewise are due to Frege, see no.16 as well as the later Whitehead/Russell 1910 For a set-theoretical introduction see Bourbaki 1968, Ch. IV; for a modeltheoretical treatment see Gabbai/Guenthner 1983, vol. I, Ch. 4 Also textbooks in theoretical physics accept the modern view of mathematics, see Thirring 1977
point-of-view, the number structures that had found application were fairly uninteresting ones. Number theory in the narrower sense has always been the l'art-pour-l'art show-object of mathematics. The British number theorist Hardy even prided himself on the utter uselessness of his doings, and in the Anglo-Saxon realm one speaks of Hardyism as the attitude that claims the self-sufficiency of mathematics. 23 But things have changed recently, and Steven Weinberg reported only the other day of his satisfaction over having been able, in a paper on the string theory of the elementary particles, to quote Hardy, whose determination of the so-called partitio numerorum - the number of additive splittings-up of a natural number - he has used in his work. 24 But number-theoretical structures have also penetrated into a field so close to life as room acoustics, to mention only one further example. To improve acoustics in modern concert halls with a too low ceiling, a ceiling profile has been proposed which follows the powers of a primitive root of a Galois field. 25 Another class of structures whose recent appearance in physics came as a surprise are the so-called fractals. 26 If a hundred years ago mathematicians had made bets on what mathematical structures would most certainly never find application outside mathematics, highly plausible candidates for such bets would have been, for example, the so-called Cantor sets or the function, found by Weierstrass, that is continuous everywhere but nowhere differentiable. Now how do such adventurous structures ever find application? The Greeks never made even so much as a start on physics, since the natural goings-on on earth appeared immeasurably complicated to them. Modern physics lived for 300 years off the discovery that these complicated goings-on nevertheless obey simple laws.
Now that we have come quite far already in knowing and understanding the laws of nature, interest is increasingly being directed toward the contingent happenings in all their complexity. And there we find, e.g. in the deterministic chaos theory already mentioned, that for characterizing the behavior of the solutions of quite simple equations such exotic sets offer themselves as the aforementioned Cantor sets. 27 Such a set is arrived at by starting from a finite interval which is divided into three equal parts, of which one leaves out the middle one (without its end points), following which one performs exactly the same procedure with the two remaining intervals, then again with the intervals remaining after this second round, and so forth. The residual set will cover the original interval as thinly as desired, yet it still contains exactly as many points as the original interval. The discovery of such a monstrosity was worthy of a Cantor. What we are
23 See Hardy 1940 as well as the literature quoted in Davis/Hersh 1982, pp. 85ff.
24 Mathematics 1986, esp. p. 731
25 Schroeder ²1987, sections 13.9 and 26.6
26 Mandelbrot ²1983
27 Peitgen/Richter 1986; Grossmann 1983; Devaney 1986
confronted with is the problem why we find such structures in textbooks of mathematical physics. So far I have spoken of the descriptive power of mathematics only insofar as it can be left open where the structures come from in concreto which mathematics considers in abstracto. At the close of this section a word is still needed on whether mathematics itself does not already furnish us structures. Two remarks on this question must suffice us for the following. On the one hand the remark that models of a theory of sets answer this question adequately at least when in the spirit of the purpose of this address present-day mathematics is regarded in the light of the idea of a mathesis universalis. 28 Evidently this answer is not unequivocal, but each one of its intended specifications would permit, in a superabundant measure as far as the applications are concerned, a uniform construction of mathematical structures. In this connection it is not necessary at all - and here comes the second remark - to visualize a model based on the theory of sets as a platonic heaven. Sufficient for us is the empirical fact that man is capable of the mental constructions concerned. No matter how he may have reached this point, we can furthermore note that in this spiritual world truths apply which we are able to realize without resorting to experience, without experiments and without observations on material objects and which realizations are accompanied by an uncommon measure of certainty. Now what, under these assumptions and in the light of everything said so far, does the application of mathematics to nature look like?
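The middle-third construction of a Cantor set described earlier in this section can be followed step by step in a few lines of code. The following is only a minimal sketch under stated assumptions: the starting interval is taken to be [0, 1], exact fractions are used, and the function names are illustrative choices that do not come from the text.

```python
from fractions import Fraction


def remove_middle_thirds(intervals):
    """One round of the construction: drop the open middle third of each interval."""
    kept = []
    for left, right in intervals:
        third = (right - left) / 3
        kept.append((left, left + third))      # the left closed third stays
        kept.append((right - third, right))    # the right closed third stays
    return kept


def cantor_approximation(rounds):
    """Closed intervals remaining after the given number of rounds, starting from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(rounds):
        intervals = remove_middle_thirds(intervals)
    return intervals


def total_length(intervals):
    """Sum of the lengths of the remaining intervals."""
    return sum(right - left for left, right in intervals)


if __name__ == "__main__":
    for n in range(6):
        stage = cantor_approximation(n)
        print(n, len(stage), total_length(stage))
```

After n rounds there are 2^n intervals of total length (2/3)^n, which is the sense in which the residual set covers the original interval "as thinly as desired". The claim about the number of points is not something such a finite computation can show; it rests on the standard argument that the remaining points are exactly those with a ternary expansion using only the digits 0 and 2.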
4. The 'Unreasonable Effectiveness' of Mathematics
Ever since the beginning of modern physics, physicists have been convinced that - as Galileo already put it - "the book of nature is written in the language of mathematics". 29 Furthermore, it has been expressed time and time again that the positive usability of mathematics for our understanding of nature borders on the miraculous. To Kepler and Galileo this miracle consisted in our being able here, if anywhere, to directly read God's thoughts. A modern physicist, Eugene Wigner, says: "The enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious, and there is no rational explanation for it". 30 The only possibility of an explanation thereupon suggested by Wigner is an aesthetic one, adopted by him from Einstein: "The observation which comes closest to an explanation ( ... ) is Einstein's statement that the only physical theories which we are willing to accept are the beautiful ones". But Einstein still had other things to say on the matter, and in this final section I will take up his cue and that voiced in
28 See, for instance, Jensen 1967
29 Galilei/Opere, vol. VI, p. 232
30 Wigner 1979, pp. 222ff, 229f
a parallel remark by Steven Weinberg, one of the founders of the theory of electroweak interaction. Einstein and Weinberg likewise make no secret of the fact that they find themselves confronted here with a miracle of sorts. Einstein speaks of the "riddle which has troubled researchers of all times so much. How is it possible that mathematics, which after all is a product of human thinking independent of all experience (and whose theorems are absolutely certain and indisputable), fits the objects of the real world so perfectly?" 31 Weinberg presents as it were an empirical confirmation of the miracle in enumerating the many cases in which a species of structures used by physics had been found already before by the mathematicians and now merely needed to be correctly applied. "It is positively spooky", says Weinberg, "how the physicist finds the mathematician has been there before him or her". 32 The mathematician becomes so-to-speak the physicist's Man Friday. Is there any explanation for this teamwork? Einstein has tried to solve this riddle through his now famous statement: "Insofar as the theorems of mathematics refer to reality they are not certain, and insofar as they are certain they do not refer to reality". Weinberg offers us, in contrast, the following explanation: "Mathematics is the science of order; so perhaps the reason the mathematician discovers kinds of order which are of importance in physics is that there are only so many kinds of order". These two explanations seem to state wholly different things. In actual fact, however, they form part of the same picture and complement each other. Each is associated with a specific basic feature of modern universal mathematics as I pictured it: The attempted reduction of the mathematical in the proper sense to the logical-formal drawing of conclusions, thus simultaneously gaining the immense richness of possible structures which lend themselves to such drawing of conclusions.
Einstein elucidates his view by remarking that it was only through modern, axiomatically oriented mathematics that we received absolute clarity as to the fact "that through it a clean break was achieved between the logical-formal and the objective ( ... ) contents (and that) only the logical-formal ( ... ) (forms) the object of mathematics". It is thus precisely through this isolation that mathematics acquires its much admired certainty. But as soon as we take mathematics out of this isolation and apply it to reality it loses this certainty, or, to put it more precisely, it acquires as applied mathematics an uncertainty: the uncertainty, namely, of deciding which of the infinitely many species of structures that can find application we should select in a concrete case of application. This, now, is the point where Weinberg's statement intervenes. Formulated roughly, his statement says: Some kind of structure will do the job. It is like shopping in a department store: Some suit will fit. Modern mathematics offers us, in its present-day form, all forms of exact thinking man is capable of. By selecting
31 Einstein 1989, pp. 119f
32 Cf. Mathematics 1986, pp. 725 ff. See also Steen 1988
one of them to use, we do the one and only thing we are in a position to do at all. And the choice we have is gigantic. Small wonder that we find the right thing. Does the Einstein-Weinberg view explain the pre-established harmony of mathematics and reality? On this, many a thing could be said. I would like to conclude my address with the attempt to describe a difficulty which is left out in this explanation and which still surrounds the functioning of the matter with the aura of the miraculous. To begin with, it is of course correct that in comparison with the traditional stock of mathematics the immense structural richness of present-day mathematics scales down the miracle of its applicability. In the 17th century the rejection of geometry would have meant the rejection of an entire half of mathematics. One would not have known at all what to put in its place. Once, however, the new universal-mathematical perspective had been gained, the abandonment of the old geometry in favor of another one appears simply as a transition from one kind of structure to the next. This does not mean that we or our descendants will never have to be astonished again. No one can tell whether we won't find ourselves compelled some day, for reasons coming e.g. from physics, to abandon the aforedescribed contents-oriented mathematics in favor of an alternative. In quantum field theory, and thus in a solid piece of fundamental physics, a variety of 'mathematics' is used today which does not possess a set-theoretical model, thus constituting insofar a riddle. 33 Likewise, we are acquainted today with mathematically or physically motivated expeditions into border areas of mathematics in the contemporary sense such as e.g. non-standard analysis, non-Cantorian theory of sets, multivalued logic, quantum logic and the like. 34
But on a mathematics of quantum field theory we still lack even the beginning of an idea, and the other undertakings have not, in any case, led so far to a revolution of mathematized science which one would be compelled to follow. But also with respect to our present-day understanding of the subject there remains, as stated before, a residue. I will call it the phenomenon of the mathematical overdetermination of physics. 35 Roughly put, it consists in our having, in the theories of physics, frequently more mathematics than we can interpret physically. Let us get the intrusion of this surplus straight in a very simple case, e.g. that of the state equation of a gas. With a gas equation the physicist would like to formulate a lawlike relation, valid for many gases, between pressure, volume and temperature. Although united in one gas, these quantities are rather dissimilar in nature, and at first glance it is not evident at all where a possibility should come from to formulate a relationship - any relationship - between them. The trick by which this is de facto done goes as follows: pressure, volume and temperature have this in common that their
33 Physicists tend to ignore (rightly) this circumstance since they have great successes with this method in their renormalization theories.
34 Compare the literature quoted in no. 12
35 For the following see Scheibe 1986d (this vol. ch. VIII.36)
values can be described by numbers. Through this uniformization, that which first seemed impossible now all of a sudden becomes possible: the entire fullness of three-termed relations between numbers is available for the formulation of a gas equation. However, a price must be paid for this: these relations between numbers likewise do not gratuitously fall down from heaven; rather, they are based on the elementary arithmetical operations and on the limiting processes possibly involved. And the mathematical entities thereby appearing on the scene have no significance in the gas theory arrived at in the given case. Hence we did not acquire our physical law here by reconstruing it as a proposition in concepts that are physically understandable throughout. Instead, we have acquired the physical structures sought for by embedding them into richer structures at the price that their elements will, and even should, remain physically unintelligible. And that we obtain physically useful laws in this fashion is really a miracle. Nevertheless this miracle would not have to upset us if it were an isolated case here. In fact, however, this is only a description of what happens normally. It is wholly normal that in physical theories - semantically formulated - terms occur for which no physical significance, however indirect, has ever been even so much as intended, although these terms occur in a descriptive position. Anyone not knowing how the formalism is to be interpreted in the first place might with equal justification regard these de facto uninterpreted terms as interpreted in just the same way as the actually interpreted ones. For this reason there can, at first glance and without further consideration, be no question of the borderline between forms and contents coinciding, according to Einstein's ideas, with that between mathematics and physical reality.
Rather, theories formulated in this fashion are mixed forms which describe a material world by relating it to a mathematical one. Do we now also have an explanation for the phenomenon of mathematical overdetermination? It is noteworthy that the attempts at an explanation have mainly consisted in causing the phenomenon to disappear, i.e. in showing that theories manifesting it possess physically equivalent formulations from which it is eliminated. 36 Paradigmatic for this continues to be, even to this day, Euclidean geometry. Its modern version, preferred in physics, as analytic geometry employs coordinate systems in space and thus numbers. It can be shown, however, that one can also do without this analytical apparatus and that an equally strong formulation in purely geometric concepts exists. 37 We will consider another case somewhat more precisely. Geometry is, in the common view, equipped with a distance concept which lets the distance between any two points in space be an unequivocally determined number. This distance structure contains somewhat more than is given in a physically objective fashion. We will obtain a specific number only if we
36 As a program this reaxiomatization was first formulated at the beginning of Hilbert/v.Neumann/Nordheim 1927
37 For this and the following case see Tarski 1959
arbitrarily lay down a unit of measure. Objectively given is only the equality of two distances: the so-called congruence. Now it is indeed possible to present a formulation of Euclidean geometry which proceeds exclusively from the congruence and betweenness relations and from which distance numbers have disappeared. What has thereby been achieved? When we say that the distance from Hannover to Heidelberg measures some 400 km we have interrelated two places on our planet by a number. It is difficult to argue away the fact that the number concerned enters into this distance relationship in exactly the same fashion as the two spatial partners. Now two places materially defined in space are just as certainly physical realities as a number - the third partner in our relationship - is not. Why is it necessary to talk in physics not only about material realities (in a broad sense) but also about something entirely different, e.g. about numbers? One is tempted to answer that there is something wrong here already in the very question - that the numbers do in fact play a different role in the given theory than its actual objects. That may well be so. But unfortunately we do not possess a reconstruction which would make this difference plain and thus explain our phenomenon. In the given case we can instead make the phenomenon disappear: as stated before, things will work here also without distance numbers. But is this answer satisfactory and will this always work? Both questions, I am afraid, must be answered in the negative. The newer field theories, including the quantum field theories, have all been formulated with spacetime coordinate systems being resorted to. Now even many physicists have a tendency to keep the further development of these theories free of coordinates. But this does not remove the sting placed here in the very beginning.
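The contrast drawn above between distance numbers, which depend on an arbitrarily chosen unit, and congruence, which does not, can be checked on a small example. The sketch below is only an illustration under stated assumptions: the four points and their coordinates are hypothetical, the function names are not taken from the text, and the model itself still uses coordinates, so it illustrates the invariance of congruence and not the number-free axiomatization cited from Tarski.

```python
from fractions import Fraction


def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate tuples (exact arithmetic)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def congruent(p, q, r, s):
    """True if segment pq has the same length as segment rs."""
    return squared_distance(p, q) == squared_distance(r, s)


def change_unit(point, factor):
    """Re-express a point's coordinates in a unit that is 1/factor as large."""
    return tuple(factor * c for c in point)


# Hypothetical coordinates, read as kilometres.
P, Q = (Fraction(0), Fraction(0)), (Fraction(3), Fraction(4))
R, S = (Fraction(1), Fraction(1)), (Fraction(5), Fraction(-2))

if __name__ == "__main__":
    in_metres = [change_unit(x, 1000) for x in (P, Q, R, S)]
    # The distance numbers differ between the two units ...
    print(squared_distance(P, Q), squared_distance(in_metres[0], in_metres[1]))
    # ... while the congruence relation gives the same verdict in both.
    print(congruent(P, Q, R, S), congruent(*in_metres))
```

Changing the unit multiplies every distance number by the same factor, so the numbers themselves carry a conventional element, whereas the verdict "these two segments are equally long" is unaffected.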
On the part of philosophy of science, the attempt was recently made to eliminate, by the same process as just outlined for the distance function, numerical values also from true field functions. 38 The result is in these cases of appalling complexity. A preferred object of reaxiomatizing attempts has been furthermore, ever since its physical establishment 60 years ago, quantum mechanics. Its original formulation, used today in all textbooks, possesses a not even particularly conspicuous, but - in its consequences - far-reaching mathematical overdetermination in the form of complex Hilbert space. Here the reformulations have frequently been attempted for wholly different purposes and, accordingly, have yielded nothing that would help us in our question. Other attempts have not yet been sufficiently clarified to permit a clear decision as to their success. 39 From the point of view of physics as a whole, all these undertakings are only isolated efforts, even though the points where they are undertaken may be crucial ones. If nevertheless we wish to draw a lesson from them already now, we seem to find the rule confirmed that the attempted economizing on ontological assumptions, hence here the avoidance of mathematical entities in the position of objects - if practicable at all -,
38 Field 1980
39 Cf. Ludwig 1985
frequently leads to undesirable complications. But this rule, too, cannot yet be considered as fully understood. Where - so I ask in definite conclusion - has this investigation led us? I have tried to outline in what the decisive advancements of logic, mathematics and their applications since Leibniz's times can be seen to lie if developments are viewed in the light of the idea of a universal language and universal science. For this purpose, three domains of accomplishment were distinguished. The algorithmic success is the most conspicuous one: Leibniz's little calculation machine has been replaced by our worldwide, even satellite-wide integrated large-scale computing systems. And these systems can calculate anything regarded as theoretically calculable today. The success achieved in the field of proof theory consists above all in logic having caught up with mathematics, so that appreciable parts of the latter can now be treated axiomatically. This did not involve, however, a complete reduction of mathematics to logic. Noteworthy, finally, is the immense gain in descriptive potency and the updating thereof, which appear to express the universality of present-day mathematics most clearly. Quite a few things have come to pass here which no one could foresee in the 17th century. Other hoped-for things have not been realized. In all, mathematics has achieved greater independence vis-à-vis other forms of knowledge, thus netting us the so-called application problem. This wide-branching problem I have pursued only along one line. Starting out from the amazement at the "unreasonable effectiveness of mathematics", as Wigner calls it, I have described an attempt at a solution which starts out from the universalistic gains achieved by modern mathematics. We found, however, that difficulties are encountered here which, admittedly, attempts have been made to overcome, but not yet with real success.
The difficulty here is that mathematics is more than logic and shows us its teeth on the descriptive level. Thus an important idea, which Leibniz, as an early forerunner of logicism, had entertained, too, has not been fulfilled. That, too, we would therefore have to tell him in our story. If he were not merely allowed to ask us the one question we started out by permitting him to ask, but also capable of counseling us in this situation, we would not be assembled here and now in so large a number without lending him our ears.
VIII.38 The Mathematical Overdetermination of Physics*
I
You are fully entitled not to know what I mean by the "mathematical overdetermination of physics". As a first approximation to such an understanding I would like to remind you of the related though different so-called theoretical overdetermination of a corpus of observational data. Just as a physical theory often exhibits an unnecessarily rich structure when compared with the observational data to be explained by it, so the mathematics introduced to formulate a physical theory frequently brings a wealth of structures into play that cannot be matched by the physical elements of that theory. One might even be tempted to identify the two cases. But the distinction between theoretical and observational terms, on which the second overdetermination is based, is generally different from the distinction between mathematical and physical terms. Theoretical terms may be intended to have physical referents, though unobservable ones. By contrast, mathematical terms occurring in a physical theory may be meant to have no physical interpretation within this theory. Yet there are structural similarities between the two cases. In both we find ourselves deluded in the expectation that for a precise reformulation of an informally given corpus of statements only two things have to be considered: (1) the concepts characteristic for the corpus in question, and (2) the logical notions binding together those concepts. Rather, a third component has to be taken into account. Let me illustrate this for our case through the example of empirical laws. An empirical law establishes a relation between physical quantities by giving a numerical relation isomorphic to the former. By a gas law, for instance, the physicist wants to express a relation between pressure, volume and temperature of a gas. Though united in one and the same gas, the quantities in question are of entirely different kinds, and no further physical concepts are available in order to bridge the gulf.
In this situation we have to borrow from mathematics. There is one thing that our quantities have in common: Their values can be described - one by one - by real numbers. With one stroke this uniformization makes possible what seemed impossible before: the wealth of ternary relations between numbers is at our disposal to formulate the law. However, for this gain we have to pay a price. Whereas the mathematical relation chosen can be understood by its definition in terms of the elementary arithmetic operations, no corresponding physical understanding is possible. It is not that we do not understand what is meant by the law in order to test it. But since the arithmetical operations defining the law have received no physical interpretation, the truth conditions of the physical relation assumed
* First published as Scheibe 1997a
by the law cannot be traced back to elementary physical facts, as is possible for the corresponding mathematical relation with respect to elementary mathematical facts. Rather, our physical law bears the burden of a piece of seemingly uneliminable surplus mathematics. There have been essentially three attitudes towards the said state of affairs. There is first the Pythagorean tradition, renewed in modern times by Galileo's saying that "the book of nature is written [by God] in the language of mathematics". According to this tradition, the example given is looked at as just being a miracle, first discovered by the Pythagoreans when they found the isomorphism between basic musical intervals and simple numerical relations. This attitude is still alive. Einstein speaks of the "enigma that has worried researchers of all times so much: How is it possible that mathematics, a product of the human mind independent of any experience, so excellently fits the objects of physical reality?". 1 Wigner states "that the unreasonable effectiveness of mathematics is something bordering on the mysterious". 2 And for Steven Weinberg "it is positively spooky how the physicist finds the mathematician has been there before him or her". 3 Alongside such awestruck utterances there are secondly statements that would rather downgrade the phenomenon and view it in a more sober attitude. Such is the case with P. W. Bridgman, who is particularly impressed by the extent that mathematical overdetermination has assumed in the quantum theory. With respect to it he says: 4
The mathematical structure ... has an infinitely greater complexity than the physical structure with which it deals. In our elementary and classical theories we have become used to discarding perhaps one-half of the results of the mathematics ... But here ... except for a few isolated singular points, we relegate the entire mathematical structure to a ghostly domain with no physical relevance.
On the other hand Bridgman is not willing to become puzzled about this situation. He is aware of the tradition that I just alluded to:
The feeling that all the steps in a mathematical theory must have their counterpart in the physical system is the outgrowth ... of a certain mystical feeling about the mathematical construction of the physical world. Some sort of an idea like this has been flitting about in the background ... of the thinking of civilization at least since the days of Pythagoras ....
However, in Bridgman's view
1 Einstein 1989, pp. 119f
2 Wigner 1979, pp. 223 and 229f
3 Weinberg 1986, pp. 725 and 727
4 Bridgman 1936, pp. 116f, 67, 66 and 65
There would seem to be no necessity ... that all mathematical operations should correspond to recognizable processes in the physical system ... All that is required of the theory is that it should provide the tools for calculating the behavior of the physical system, and it is capable of doing this if there is correspondence between those aspects of the physical system which it engages to reproduce and some of the results of the mathematical manipulations.
Behind Bridgman's considerations a hidden nominalism is at work. But as a physicist Bridgman is simply not interested in the question whether or not the mathematics actually used in physical theories is really necessary. There is a third approach in which mathematical overdetermination is neither stared at in wonder nor left untouched with a shrug of the shoulders but is investigated in detail. A case in point is Hartry Field's forceful attack. In his book Science Without Numbers Field proposes to show "that the mathematics needed for application to the physical world does not include anything which ... contains references to ... abstract entities like numbers, functions, or sets". 5 Explaining his position in more detail Field says: "Very little of ordinary mathematics consists merely of the systematic deduction from axiom systems: My claim however is that ordinary mathematics can be replaced in application by a new mathematics which does consist only of this". 6 Substantial work to the same effect of eliminating some mathematics from physical theory has also been done by Günther Ludwig. 7 In his case however he himself does not claim, and presumably would not even agree, to have done just this. His major claim is to have found an axiomatic basis for quantum mechanics, i.e., an axiom system whose "[interpretation] can be deduced solely by means of concepts already interpreted by pretheories [of quantum mechanics]". 8
In this way, however, Ludwig eliminates de facto that part of the mathematics of Hilbert space that has no physical interpretation, e.g., the absolute phase. Of these three attitudes I sympathize with the third, at least insofar as it includes an honest attempt to clarify the role of mathematics in physics. There are of course various ways of doing this. In what follows I will briefly touch upon two problems: (1) the problem which frame of logical systematization we should use for a reconstruction of physical theories, and (2) the problem of the elimination or, conversely, introduction of a piece of mathematics on the basis of one particular frame of systematization, namely set theory. 9
5 Field 1980, pp. 1f
6 ibid. p. 107, no.1
7 Ludwig ²1990
8 Ludwig 1985
9 Cf. Scheibe 1986d (this vol. ch. VIII.36); Scheibe 1992a
II
As to the problem of possible systematization frames I will first mention the extreme view according to which every physical theory we shall ever come to know of can, in principle, be reconstructed within first order logic with all non-logical symbols being interpreted directly by physical entities. A binary predicate, for instance, may then not be interpreted by set membership. Rather its meaning might be that, for instance, one body has a larger volume than another one, or that the temperature of the first is higher than that of the second, etc. A physical theory thus reformulated would not even contain genuine mathematical statements - let alone be about mathematical entities. For the time being I take this extreme view to be unrealizable. In spite of the admirable efforts made by Field, the view is unrealizable certainly for technical reasons and probably for reasons of principle, as well. Present theoretical physics is formulated by making reckless use of modern mathematics, and although, as we shall see, some purifications are possible, the total elimination of the mathematics embodied in physics would put us far beyond our present capabilities - not to mention the question whether a total elimination program is desirable after all. Mathematical entities would still be avoided if we succeeded in reformulating physics by resorting to higher order logics, all non-logical symbols being directly interpreted by physical entities as before. In such an approach the logically true higher order statements could be viewed as mathematical statements although they were not about mathematical entities. I think there is a chance for such an undertaking to be partially successful. We could deal with physical structures of higher order, as they are at least very convenient in the more advanced theories of physics, and we could make higher order statements on first order structures, as they are needed in all cases involving a continuum.
With our next step we still remain within the boundaries of first or higher order logics, but now allow for the explicit introduction of mathematical entities into the theory. In a many-sorted version of the logic chosen, this would mean we have to assume that one or the other sort of variables run over the natural numbers or the real numbers or a real number space of any dimension or the complex numbers or what have you. If the remaining sorts of variables are given a physical interpretation we already have a very powerful instrument at our disposal, and I am pretty sure that, as far as physics can be axiomatized at all, it could be axiomatized within this frame. However, once we have assumed higher order logics and non-physical descriptive terms we have, so to speak, passed the Rubicon, and the question becomes urgent, having gone that far, why not then proceed to the last step and introduce set theory right at the beginning. There are at least four reasons for taking advantage of this opportunity. In first and higher order logics the languages are usually interpreted by abstract structures, i.e., by certain systems of sets. Sets, therefore, come in anyway,
and it is suggestive to make them the object of investigation quite explicitly. We would thus obtain with one stroke what otherwise would be the piecewise introduction of structure by structure as the nature of the case demands it. With regard to logic we would not have to go beyond the first order, and yet the whole stock of mathematics would be at our disposal. So if we want to analyze the role of mathematics in physics a reconstruction on the basis of set theory seems to have considerable advantages even if our final goal is the controlled destruction of this edifice by gradual elimination of its mathematics. Now set theory as a rational reconstruction of mathematics is a commonplace which, in a sense, is a curious thing. Let me first recall that our common understanding of what sets are is entirely neutral with respect to the distinction of abstract and concrete entities. There are sets of cows in exactly the same sense as there are sets of numbers, and I am a member of the assemblage of people now in this room in exactly the same sense as 3 is a member of the set of prime numbers. So sets and membership are universal, but the sets we meet in daily life more often than not are sets of concrete entities. We are even familiar with sets mixed from both abstract and concrete entities as, for instance, a list enumerating the individuals attending this meeting would be. And we are certainly less familiar with lists of abstract entities. From the viewpoint of daily life set theory seems farther away from mathematics than from any science concerned with bodily beings and matters of fact. So, why not try to reformulate physics on the basis of an adequate set theory? In doing this a second point will bring us back to mathematics. The point is that the sets we meet in physical theory usually are not sets of real things either.
They are sets of real possibilities as, for instance, the set of events considered in special relativity or the set of states of a quantum mechanical system. Although we commonly speak of events and states, what we actually mean are possible events and possible states. A physical theory, even in its application to an experiment actually performed in the laboratory, is universal precisely because it transcends reality. It gives us an explanation of the experiment by telling us how this particular real system might have behaved, had the determining conditions been other than what in fact they were. It is because of this kind of use made of sets that a set-theoretical reconstruction of physics is not trivial and becomes related to mathematics. One could think of a set theory dealing with both physical and mathematical entities. Such a thing was developed by Zermelo already in 1908.¹⁰ Zermelo introduced in a set-theoretical universe entities different from the empty set but, like it, having no elements. These individuals or Urelemente, as Zermelo called them, are not needed in the construction of the universe of mathematical entities. They can be freely identified, for instance, with points of physical space or spacetime, with mass points, field strengths or other fundamental physical entities.

¹⁰ For a modern presentation see Suppes 1960.

Now in the usual Zermelo-Fraenkel set theory containing an axiom of foundation there are no infinite chains

$$x_0 \ni x_1 \ni \cdots \ni x_n \ni \cdots. \tag{1}$$
This gives us the following tripartition of our universe of discourse. Given $x_0$, either all descending chains (1) end up with the empty set, or all end up with an individual, or some do the first and some the second thing. Accordingly, if all individuals are assumed to be physical, in the first case $x_0$ is mathematical, in the second physical, and in the third it is mixed. The most interesting sets in physics seem to be the mixed ones. With respect to a Zermelo-like set-theoretical system one can in principle ask whether it has models. If this is done with the intended interpretation of the individuals as physical entities, then we have to face the further question of possible empirical confirmation or falsification of set-theoretical statements like the powerset axiom or the axiom of choice. If this approach makes us feel somewhat uneasy, it is perhaps only because we are used to looking at set theory as a purely mathematical theory which, as such, becomes interesting if we enter the domain of inaccessible cardinals and similar things. I will not speculate that the situation may change one day in this respect, just as it did change for certain set-theoretical curiosities like Cantor's set or Weierstrass' everywhere continuous and nowhere differentiable function. For the time being there is the more modest possibility of partial interpretations of our mixed set theory with respect to its physical terms. Parts of a fictitious set universe that lend themselves to such restrictions are certain classes of structures. Set-theoretical formulas defining such classes - so-called species of structures¹¹ - are conjunctions with two members. The first member (2a) typifies the sets $s_i$, the structures proper, by means of the base sets $X_k$ in the sense that the former are elements of scale sets $\sigma_i(\dots)$ constructed from the latter by successively forming power sets of Cartesian products.
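The tripartition into mathematical, physical and mixed sets described above can be sketched for hereditarily finite sets. The following is only an illustration under stated assumptions: sets are modelled as Python `frozenset` values, anything that is not a `frozenset` (here: strings) is treated as an individual, and the names `chain_ends`, `classify` and `pair` are hypothetical helpers that do not occur in the text.

```python
def chain_ends(x):
    """Collect where descending membership chains starting at x terminate.

    Assumption of this sketch: sets are frozensets, everything else is an
    individual (Urelement) without elements.
    """
    if not isinstance(x, frozenset):
        return {"individual"}      # chain ends with an individual
    if not x:
        return {"empty set"}       # chain ends with the empty set
    ends = set()
    for member in x:
        ends |= chain_ends(member)
    return ends


def classify(x):
    """Return 'mathematical', 'physical' or 'mixed' for the entity x."""
    ends = chain_ends(x)
    if ends == {"empty set"}:
        return "mathematical"
    if ends == {"individual"}:
        return "physical"
    return "mixed"


def pair(a, b):
    """Kuratowski ordered pair {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


# Pure sets built from the empty set alone (von Neumann numerals).
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

# A set of individuals, and a set relating individuals to numerals.
bodies = frozenset({"body_1", "body_2"})
labelled = frozenset({pair("body_1", one), pair("body_2", two)})

print(classify(two))       # built from the empty set only
print(classify(bodies))    # built from individuals only
print(classify(labelled))  # chains end in both ways
```

The recursion terminates because every membership chain in a finite nested `frozenset` is finite, which mirrors the role of the axiom of foundation in the argument.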
In this manner type-logical predication is simulated in set theory to deal with many-sorted structures of arbitrary types and (finite) orders. With regard to the previously introduced division of all entities into physical, mathematical and mixed, the base sets in most (but not in all) cases are either physical or mathematical. The typified sets then are generally mixed.

¹¹ See Ludwig ²1990, Ch. 4, and Bourbaki 1968, Ch. IV.

The topology s of a topological space X, for instance, has the typification

$$s \in \mathrm{Pow}(\mathrm{Pow}(X))$$

if s is taken to be the set of open subsets of X. Whereas s would be a physical set if X were one, the typification of the distance in a metric space X, as usually understood, is

$$s \in \mathrm{Pow}(X^2 \times \mathbb{R})$$

(with $\mathbb{R}$ as the set of real numbers) which makes s a mixed set. If, finally, we wish to express the triangle inequality for s we have to introduce addition into $\mathbb{R}$, and this gives us a mathematical entity typified by

$$+ \in \mathrm{Pow}(\mathbb{R}^2 \times \mathbb{R}).$$

The second member of a species of structures, the axiom proper, is of the form

$$\alpha(X; s). \tag{2b}$$

It is required that $\alpha$ be invariant under arbitrary isomorphisms of the structure (X; s) - the mathematical sets being kept fixed -

$$\alpha(X; s) \leftrightarrow \alpha(X'; s'). \tag{3}$$
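The invariance requirement (3) can be illustrated on small finite structures. The sketch below is not taken from the text: the axiom chosen ("s is a strict linear order on X"), the helper names `transport`, `alpha` and `mentions_stone`, and the sample elements are all assumptions made for illustration. It transports a structure (X; s) along a bijection and compares the truth value of the axiom before and after.

```python
def transport(f, X, s):
    """Carry the structure (X; s) along the bijection f (given as a dict)."""
    X2 = frozenset(f[x] for x in X)
    s2 = frozenset((f[a], f[b]) for (a, b) in s)
    return X2, s2


def alpha(X, s):
    """Hypothetical axiom proper: s is a strict linear order on X."""
    irreflexive = all((x, x) not in s for x in X)
    transitive = all((a, d) in s for (a, b) in s for (c, d) in s if b == c)
    total = all(x == y or (x, y) in s or (y, x) in s for x in X for y in X)
    return irreflexive and transitive and total


def mentions_stone(X, s):
    """A statement that depends on what the elements are, not on structure."""
    return "stone" in X


X = frozenset({"stone", "leaf", "drop"})
s = frozenset({("stone", "leaf"), ("leaf", "drop"), ("stone", "drop")})
f = {"stone": 1, "leaf": 2, "drop": 3}   # a bijection onto {1, 2, 3}

X2, s2 = transport(f, X, s)
print(alpha(X, s), alpha(X2, s2))                    # same truth value
print(mentions_stone(X, s), mentions_stone(X2, s2))  # truth value changes
```

A statement like `alpha` speaks only about how the elements are related, so its truth value survives the relabelling, whereas `mentions_stone` refers to a particular element and is therefore not invariant in the sense of (3).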
This invariance being fulfilled by the typification automatically, our requirement can be paraphrased by saying that what is said by a species of structures is true or false of a structure no matter what the nature of the elements of the base sets of that structure is. All physical theories share this canonical invariance, and many of the named invariances like Lorentz invariance, Galileo invariance etc. are but special cases of it, resulting from restricting the transformations to base sets with particular structures. On the other hand, the property of being a physical structure is not canonically invariant. Moreover, under quite modest general assumptions every physical structure is isomorphic to a mathematical one. Therefore, from a purely structural point of view, we could dispense with physical entities as explicit objects of our investigation and be satisfied with mathematical descriptions of them. In the following I will stick to the two-sorted approach which has some advantages for the purpose of the intended demonstration.

III
The main question that poses itself in view of a multitude of systematization frames for physical theories is the question of their pairwise equivalence with respect to physical content. The pursuit of this question leads to deeper investigations that I shall not include in this paper. The preparation of such work requires the investigation of equivalences within one of those frames. In the last part of the paper I will give some examples of such equivalences for set theory as our systematization frame, and under the aspect of the occurrence or non-occurrence of mathematical entities in the respective physical theories. According to the foregoing considerations the formal parts of the theories are species of structures. The equivalences admitted are then given by

$$\begin{aligned} \Sigma(X; s) \wedge t = q(X; s) &\vdash \Sigma'(X; t) \\ \Sigma'(X; t) \wedge s = q^{-1}(X; t) &\vdash \Sigma(X; s) \end{aligned} \tag{4}$$
where $\Sigma$ and $\Sigma'$ are the two species of structures and $q$ and $q^{-1}$ are appropriate equivalence transformations. We are interested in cases where $\Sigma$ contains a mathematical term that is missing in $\Sigma'$. In a first case the terms may essentially be either defined constants or bound variables occurring in the axioms. A very simple case of the first kind is given by any number statement. In the axiom proper of $\Sigma$ we would have a statement like

$$\mathrm{card}(X) = 2 \tag{5a}$$

where X is a physical base set. The explicit occurrence of a mathematical term in (5a) can be easily avoided by replacing this sentence by its equivalent

$$\exists x\, y\colon\; x, y \in X \wedge x \neq y \wedge \forall z.\; z \in X \rightarrow z = x \vee z = y, \tag{5b}$$
in which no number term appears. Moreover, it is obvious how this elimination procedure can be generalized for any finite cardinal. The case may serve as a paradigm solution for the kind of problem under discussion. My second example concerns Euclidean geometry. Let us think of space and the point relations of congruence and betweenness as being physical sets typified according to
(6a)

(X being the space). There may be a controversy about the precise meaning of saying that space points are physical entities. But in view of what is now to come we certainly can put aside all quarrels in this respect. In our axiom proper we introduce the real number space $\mathbb{R}$ by requiring that there exists a coordinate system $\phi$ on X such that the congruence and betweenness relations are carried into certain numerical relations according to

(6b) and $(xyz) \in \mathit{be}$

respectively ($x_i^{\phi}$ being the components of x in $\phi$). Again the details do not matter. All that is required from us now is to be impressed by the way in which formulas (6b) and (6c) give an answer to the question what congruence and betweenness are like in Euclidean space. In the spirit of analytic geometry the answer is given, not in physical, but in mathematical terms of numbers and algebraic operations with them. The mathematical overdetermination is evident from the group of Euclidean transformations relating any two coordinate systems in which congruence and betweenness have representations according to (6b) and (6c) respectively. Yet the case of Euclidean geometry is also a "solvable case". The tradition of synthetic geometry, going back to Euclid and culminating in the work of Hilbert and Tarski, has provided us with axiom systems equivalent to the foregoing one, in which no non-geometrical entities are mentioned.¹² As a representative example I will mention the axiom of segment construction

(6d)
in which both relations appear. The solvability of the two foregoing examples does not mean that the subclass under discussion - mathematical sets mentioned in the axiom proper - is unproblematic in general. There are amazingly far reaching results indeed as, for instance, the solution of Hilbert's 5th problem: The species of topological groups that are Lie groups is equivalent to the species of topological groups that are locally compact and have a neighborhood of the identity containing no non-trivial invariant subgroup.¹³ But in general the situation stands ill. Many physical theories, including classical and quantum mechanics, not only are extensions of Euclidean geometry, but usually are presented and developed by the method of analytic geometry. What has to be said on the essential physical entities is said, not in unquestionable physical terms, but in terms of coordinate representations. Numerical functions and systems of differential equations dominate the scene. The physical and mathematical elements of the theory are not clearly separated.

¹² Tarski 1959
¹³ Yamabe 1953

The mention of analytic geometry brings us to the second subcase of the elimination problem for mathematical constants. In it we envisage the occurrence of a mathematical constant already in the typification and, therefore, a fortiori in the axiom proper of the theory. This is the case where the very object of the theory is a mixed structure. Such a sphinx is most typically brought about by making sets of coordinate systems part of the structure under investigation. To a certain extent there is a rather trivial way of avoiding coordinate systems in this function by applying the method used in the example of Euclidean geometry: We just shift their occurrence from the typification to the axiom proper by existential quantification. But this method may become very clumsy and it is then much more convenient to work with differentiable manifolds as we typically do in Hamiltonian mechanics and general relativity theory. The method of coordinate systems (or: analytic geometry) is perhaps the most ingenious invention ever made in the field of mathematical physics. This manifests itself in particular if we try to do without it. The introductory example of empirical laws is a case in point. It makes tacit use of coordinate systems in the sense of scaling physical quantities. In such cases one distinguishes one coordinate system, by making the values of the quantities real numbers. If several quantities are to be related, scaling is a process of uniformization which allows for the relating just by exploiting the mathematics of the real numbers. Though it is a simple example it seems to be an open question with respect to how this business can be done otherwise. Moreover, if here, and similarly in the more general case where we describe a spacetime event by a quadruple of numbers, it were asked "what is the meaning of the numbers or number quadruples?" the answer would have to be: "it is the value of a physical quantity", and "it is the event described". If the same were asked with respect to the corresponding symbols we could hardly give the same answer. Rather their referents are numbers and number-tuples because in using a coordinate system we want to relate a physical entity to a mathematical one. We may try to eliminate this whole method. But if we accept it, as it seems, we also accept mathematics as being about genuine mathematical entities. It is interesting to observe that the development of differential geometry has undergone a turn from coordinate representations of geometrical objects to so-called intrinsic or coordinate-free formulations. As far as this happened under the influence of physical applications the move to intrinsic representations seemed a move back to physical meaning.
However, as long as the notion of a manifold based on local coordinates lurks behind the scene, the result of this move is not completely convincing. The typification of, for instance, a linear connection $\nabla$ in coordinate representation is given by

(7a)

where M is the (n-dimensional) manifold. If, on the other hand, the connection is viewed intrinsically as differentiation of a vector field by another vector field its typification becomes

$$\nabla \in \{\mathrm{Pow}[M \times \mathrm{Pow}(\mathrm{Pow}(M \times \mathbb{R}) \times \mathbb{R})]\}^3 \tag{7b}$$
which clearly shows that the situation, though improved in the relevant respect, is not completely restored. In conclusion let me briefly look at the other kind of elimination problems. It concerns the bound variables in the axiom proper of a species of structures. In set theory quantifications are by definition over the whole set universe, and this is an extremely undesirable situation in a case as ours where we
want to make a statement on our particular physical system, reconstructed as a set-theoretical structure. This structure being a system of finitely many sets, why must we go into the depths of the whole set universe in order to make a statement on such a tiny fragment of it? There is one rather effective method for mitigating the situation: We simply restrict the bound variables to appropriate scale sets over the base sets of our structure. This is indeed done in many cases of species of structures well known in mathematics. In the axioms defining a group, for instance, quantifications are restricted even to the base set of the group. But there are exceptions. Part of the usual definition of a free Boolean algebra B with a set G of generators reads: Given any Boolean algebra B' and any mapping h from G into B', h can be extended to become a homomorphism from B into B'. Now in this case the axiom can indeed be reduced to one where quantifications are restricted in the manner indicated. We can equivalently require that G be independent in the sense that all finite conjunctions of its elements or their complements be different from zero. However, the possibility of such restrictions is certainly not the general situation, and it seems a non-trivial problem to obtain criteria for it. There would be no need to exert oneself in this respect if set-theoretical reconstructions of physical theories that take into account the restrictions in question were straightforward. Many of them are. But there are also very important cases that present considerable difficulties. Quantum mechanics is a case in point. It is well known that the mathematics of quantum mechanics is the mathematics of Hilbert space, but also that this identification cannot be extended to the physical part of the theory. One reason for this are the complex numbers. But there is another and independent reason. It seems that no two different but proportional Hilbert vectors represent different physical states. 
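The remark about proportional Hilbert vectors can be made concrete with a small numerical check. This is a sketch under assumptions that are not spelled out in the passage itself: physical content is taken to be carried by expectation values of the form ⟨ψ|A|ψ⟩/⟨ψ|ψ⟩, the Hilbert space is two-dimensional, and the matrix `A`, the vector `psi` and the factor `c` are arbitrary illustrative choices. Checking one observable does not prove the general claim; it only shows the mechanism.

```python
def expectation(A, psi):
    """Normalised expectation value <psi|A|psi> / <psi|psi>.

    A is a Hermitian matrix given as nested lists, psi a list of complex
    amplitudes. Both encodings are assumptions of this sketch.
    """
    n = len(psi)
    numerator = sum(
        psi[i].conjugate() * A[i][j] * psi[j]
        for i in range(n)
        for j in range(n)
    )
    norm_squared = sum(abs(z) ** 2 for z in psi)
    # For Hermitian A the numerator is real up to rounding error.
    return (numerator / norm_squared).real


A = [[1, 1j], [-1j, 2]]          # Hermitian: A[0][1] is the conjugate of A[1][0]
psi = [1 + 2j, 3 - 1j]           # some Hilbert vector
c = 2 - 3j                       # arbitrary non-zero complex factor
scaled = [c * z for z in psi]    # a different but proportional vector

difference = abs(expectation(A, psi) - expectation(A, scaled))
print(difference < 1e-12)
```

The factor `c` enters the numerator and the norm as |c|² alike and cancels, which is why the two vectors cannot be told apart by expectation values of this form.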
No Hilbert vector, therefore, is a physical element of quantum mechanics, and Hilbert space is not even a mixed set. It is purely mathematical and with it (and the complex numbers) all its typified sets. Moreover, strictly speaking this is true for every derived structure as, for instance, the C* -algebra of bounded operators or the orthocomplemented lattice of closed subspaces. In case one ever wondered what certain linear mappings or subspaces of Hilbert space have to do with observables or properties of physical systems one was entirely justified in doing so. The best we can do in some cases of derived structures, like the ones mentioned, is looking at them as mathematical descriptions of certain physical structures in the sense of being isomorphic to the latter. Of certain clearly physical structures made up of observables, states and an expectation function we could assume that they be isomorphic to certain structures derived from an infinite dimensional Hilbert space. It is in the spirit of such considerations that, for instance, the last axiom in Mackey's axiomatization of quantum mechanics reads:
The partially ordered set of all questions in quantum mechanics is isomorphic to the partially ordered set of all closed subspaces of a separable, infinite dimensional Hilbert space.¹⁴
Mackey hastens to comment that this axiom is entirely ad hoc. Why does he do this? The general situation before us is characterized by an axiom proper of the form

$$\exists X s \in \mathrm{Ma}.\; \Sigma(X; s) \wedge Y \cong P(X; s) \wedge t \cong q(X; s) \tag{8a}$$

Here the variables X and s are restricted to mathematical sets, $\Sigma$ is a species of structures, P and q are appropriate terms and "$\cong$" means "isomorphic". Above all (Y; t) is a structure about which (8a) is a statement. The peculiarity of this statement calls for two comments. First, at face value the existential quantification is reminiscent of cases of the first kind where coordinate systems were introduced in order to get the physical sets represented by mathematical ones. But coordinate systems can be typified and thus be made part of the structures being investigated. Although the resulting species of structures are mathematically infected by this method, they will not suffer any more from quantifications going beyond the structures investigated. By contrast, in quantum mechanics no such solution seems possible. Nevertheless the problem has been attacked by other methods since 1936 when Birkhoff and v. Neumann obtained a relevant result for the finite-dimensional case. Further investigations have been made by the Jauch school and recently most successfully by Günther Ludwig.¹⁵

Unrestricted existential quantification in (8a) recedes into the background in certain other cases while the danger of mathematical overdetermination is still imminent. In electrodynamics it seems that two potentials leading to the same field strengths are physically indistinguishable. The formulation in terms of potentials is, therefore, mathematically overdetermined and can indeed be replaced by one in terms of field strengths alone. This is a case of (8a) where existential quantification would concern only typified sets in which case the isomorphisms in (8a) may even be replaced by equality. (8a) then reduces to

$$\exists s \in \mathrm{Ma}.\; \Sigma(Y; s) \wedge t = q(Y; s). \tag{8b}$$
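The electrodynamic case mentioned above can be spelled out in the usual textbook form. The notation is an assumption of this sketch and is not used in the text: $\varphi$ and $\mathbf{A}$ denote the scalar and vector potentials, $\mathbf{E}$ and $\mathbf{B}$ the field strengths (SI conventions), and $\chi$ an arbitrary smooth function. Two pairs of potentials related by $\chi$ lead to the same field strengths:

```latex
\begin{align*}
\mathbf{E} &= -\nabla\varphi - \partial_t \mathbf{A},
  & \mathbf{B} &= \nabla \times \mathbf{A}, \\
\varphi' &= \varphi - \partial_t \chi,
  & \mathbf{A}' &= \mathbf{A} + \nabla\chi, \\
\mathbf{E}' &= -\nabla(\varphi - \partial_t \chi) - \partial_t(\mathbf{A} + \nabla\chi)
  = \mathbf{E},
  & \mathbf{B}' &= \nabla \times (\mathbf{A} + \nabla\chi) = \mathbf{B}.
\end{align*}
```

The first equality uses that $\nabla$ and $\partial_t$ commute on smooth functions, the second that the curl of a gradient vanishes. In the terms of (8b), the potentials would play the part of the existentially quantified set s and the field strengths that of t.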
Another example of this kind is Euclidean geometry axiomatized in terms of a distance function d. This could be criticized by pointing out the arbitrariness of fixing a unit of length. The elimination of d in favor of congruence and betweenness would be a case in point that moreover eliminates the real number set implied by d. It is true that this is another solvable case, and that truly mysterious mathematical overdetermination would occur only in an unsolvable case where the strange manner in which (8) makes a statement about (Y; t) could not be replaced by ordinary statements composed according to the standards of (higher order) predicate logic. Even if there were such cases it would be very difficult to prove that there are. If, on the other hand, all physically relevant cases (8) were solvable, the solutions, although of interest in principle, may turn out to be too complicated to be used in practice. The situation would still be half-mysterious in that only a mathematical roundabout way would make things acceptable to our intellectual capacity.

¹⁴ Mackey 1963, p. 71.
¹⁵ To a large extent the material is in Hooker 1975 and 1979. See also Varadarajan 1968, Ch. VII; Ludwig's results are presented in Ludwig 1985.
Acknowledgements
In the following the editor would like to thank the publishers and the author for their kind permission to reprint the articles in this volume.

1. "Remarks on the Concept of Cause" reprinted from: Christensen, D. E., et al. (eds.): Contemporary German Philosophy, Vol. 3. University Park: The Pennsylvania State University Press, 1984, pp. 223-243. ©1984 by the Pennsylvania State University. Reproduced by permission of the publisher.
2. "Aspects of Wholeness in Science and Philosophy" reprinted from: Medical Systems with a Holistic Approach. Ed. by S. N. and Y. B. Tripathi, Varanasi 1993, pp. 13-29, with kind permission of Y. B. Tripathi.
3. "Kant's Apriorism and some Modern Positions" reprinted from: Scheibe, E. (ed.): The Role of Experience in Science. Proc. of the 1986 Conf. of the Acad. Int. de Philos. des Sciences (Bruxelles), held at the University of Heidelberg. Berlin: Walter de Gruyter, 1988, pp. 1-22. Reprinted by permission of the author.
4. "C. F. von Weizsäcker and the Unity of Physics" translated by Hans-Jakob Wilhelm from "C. F. von Weizsäcker und die Einheit der Physik", in: Philosophia naturalis 30 (1993), pp. 126-145. Printed by permission of Vittorio Klostermann.
5. "Between Rationalism and Empiricism: The Path of Physics" translated by Hans-Jakob Wilhelm from "Zwischen Rationalismus und Empirismus: Der Weg der Physik", in: Vernunftbegriffe der Moderne. Stuttgarter Hegel-Kongreß 1993. Ed. by F. Fulda and R. P. Horstmann, Klett-Cotta: Stuttgart - Bad Cannstatt 1994, pp. 73-95. Printed by permission of the author.
6. "The Physicists' Conception of Progress" reprinted from Studies in the History and Philosophy of Science, Vol. 19, pp. 141-159, ©1988, with permission from Elsevier Science.
7. "Erwin Schrödinger and the Philosophy of the Physicists" reprinted from Erwin Schrödinger's World View. The Dynamics of Knowledge and Reality. Götschl, J. (ed.). Kluwer: Dordrecht 1992, pp. 25-34. Reprinted with kind permission of Kluwer Academic Publishers.
8. "Albert Einstein: Theory, Experience, Reality" translated by Charito Pizarro from "Albert Einstein: Theorie, Erfahrung, Wirklichkeit", in: Heidelberger Jahrbücher XXXVI (1992), pp. 121-138. Printed with kind permission of Springer Verlag Heidelberg.
9. "Heisenberg's Concept of a Closed Theory" translated by Hans-Jakob Wilhelm from "Heisenbergs Begriff der abgeschlossenen Theorie", in: Werner Heisenberg. Physiker und Philosoph. Ed. by B. Geyer et al. Heidelberg: Spektrum Akademischer Verlag 1993, pp. 251-257. Printed with permission of Spektrum Akademischer Verlag.
10. "The Origin of Scientific Realism: Boltzmann, Planck, Einstein" reprinted from Observability, Unobservability and Scientific Realism. Ed. by M. Pauri. Kluwer: Dordrecht (forthcoming). ©1999 by Kluwer Academic Publishers. Reprinted with kind permission of Kluwer Academic Publishers.
11. "A Comparison of Two Recent Views on Theories" reprinted from: Metamedicine 3 (1982), pp. 233-255. Reprinted with kind permission of Kluwer Academic Publishers.
12. "On the Structure of Physical Theories" reprinted from: I. Niiniluoto and R. Tuomela: The Logic and Epistemology of Scientific Change. Amsterdam: Elsevier Science 1979, pp. 205-224. Reprinted by permission of the author.
13. "Towards a Rehabilitation of Reconstructionism" translated by Hans-Jakob Wilhelm from "Zur Rehabilitierung des Rekonstruktionismus", in: Rationalität. Philosophische Beiträge. Ed. by H. Schnädelbach, Suhrkamp: Frankfurt a. M. 1984, pp. 94-116. Printed by permission of the author.
14. "Paul Feyerabend and Rational Reconstructions" translated by Hans-Jakob Wilhelm from "Paul Feyerabend und die rationalen Rekonstruktionen", in: Wozu Wissenschaftsphilosophie? Positionen und Fragen zur gegenwärtigen Wissenschaftsphilosophie. Ed. by P. Hoyningen-Huene and G. Hirsch, de Gruyter: Berlin 1988, pp. 149-171. Printed by permission of the author.
15. "Coherence and Contingency: Two Neglected Aspects of Theory Succession" reprinted from: Noûs XXIII (1989), pp. 1-16. Reprinted by permission of Blackwell Publishers.
16. "Predication and Physical Law" reprinted from: Topoi 10 (1991), pp. 3-12, with kind permission of Kluwer Academic Publishers.
17. "Substances, Physical Systems, and Quantum Mechanics" reprinted from: Advances in Scientific Philosophy. Essays in Honor of P. Weingartner. Ed. by G. Schurz and G. Dorn. Rodopi: Amsterdam 1991, pp. 215-230. Reprinted by permission of Editions Rodopi B. V.
18. "General Laws of Nature and the Uniqueness of the Universe" reprinted from: Philosophy and the Origin and Evolution of Universe. Ed. by E. Agazzi and A. Cordero. Kluwer: Dordrecht 1991, pp. 341-360, with kind permission of Kluwer Academic Publishers.
19. "On Limitations of Physical Knowledge" reprinted from: Philosophia naturalis 35 (1998), Heft 1, pp. 41-57, with permission of Vittorio Klostermann.
20. "The explanation of Kepler's laws by means of Newton's law of gravitation" translated by Hans-Jakob Wilhelm from "Die Erklärung der Keplerschen Gesetze durch Newtons Gravitationsgesetz", in: Einheit und Vielheit. Festschrift für C. F. v. Weizsäcker zum 60. Geburtstag. Ed. by E. Scheibe and G. Süßmann, Vandenhoeck & Ruprecht: Göttingen 1973, pp. 98-118. Printed by permission of the author.
21. "Are there Explanations of Theories?" reprinted from: Christensen, D. E., et al. (eds.): Contemporary German Philosophy, Vol. 3. University Park: The Pennsylvania State University Press, 1984, pp. 141-158. ©1984 by the Pennsylvania State University. Reproduced by permission of the publisher.
22. "A Case Study Concerning the Limiting Case Relation in Quantum Mechanics" translated by Hans-Jakob Wilhelm from "Eine Fallstudie zur Grenzfallbeziehung in der Quantenmechanik", in: Grundlagenprobleme der modernen Physik. Festschrift für P. Mittelstaedt zum 50. Geburtstag. Ed. by J. Nitsch et al., Bibliographisches Institut: Mannheim 1981, pp. 257-269. Printed with permission of Spektrum Akademischer Verlag.
23. "A New Theory of Reduction in Physics" (pp. 249-271) reprinted from Philosophical Problems of the Internal and External Worlds: Essays on the Philosophy of A. Grünbaum, John Earman, Allen I. Janis, Gerald J. Massey, and Nicholas Rescher, eds., by permission of the University of Pittsburgh Press. ©1993 by University of Pittsburgh Press.
24. "The Rationality of Reductionism" reprinted from: Natural Sciences and Human Thought. Ed. by R. Zwilling. Springer: Berlin 1995, pp. 101-109. Reprinted by permission of Springer.
25. "Quantum Logic and Some Aspects of Logic in General" reprinted from: Recent Developments in Quantum Logic. Ed. by P. Mittelstaedt and E.-W. Stachow. Bibliographisches Institut: Mannheim 1986, pp. 115-128. Reprinted with kind permission of Spektrum Akademischer Verlag.
26. "What Kind of Hidden Variables are Excluded by Bell's Inequality?" reprinted from: Foundations of Physics. 7th Int. Congr. of Logic, Methodology and Philosophy of Science, Physics Section. Ed. by P. Weingartner and G. Dorn. Hölder-Pichler-Tempsky: Vienna 1986, pp. 252-271. Reprinted with kind permission of öbv & hpt Verlagsgesellschaft.
27. "The Copenhagen School and its Opponents" translated by Hans-Jakob Wilhelm from "Die Kopenhagener Schule und ihre Gegner", in: Wieviele Leben hat Schrödingers Katze? Zur Physik und Philosophie der Quantenmechanik. Ed. by J. Audretsch und K. Mainzer, Bibliographisches Institut: Mannheim 1990, pp. 157-182. Printed with permission of Spektrum Akademischer Verlag.
28. "J. von Neumann's and J. S. Bell's Theorem. A Comparison" translated by Hans-Jakob Wilhelm from "J. von Neumann's and J. S. Bell's Theorem. Ein Vergleich", in: Philosophia naturalis 28 (1991), pp. 35-53. Printed by permission of Vittorio Klostermann.
29. "EPR-Situation and Bell's Inequality" reprinted from: Existence and Explanation. Essays Presented in Honor of K. Lambert. Ed. by W. Spohn et al. Kluwer: Dordrecht 1991, pp. 115-129, with kind permission of Kluwer Academic Publishers.
30. "Three Remarks Concerning Bell's Inequality" reprinted from: Bell's Theorem and the Foundations of Modern Physics. Ed. by A. van der Merwe et al., Singapore: World Scientific 1993, pp. 428-435, with kind permission of World Scientific Publishing.
31. "Invariance and Covariance" reprinted from: Scientific Philosophy Today. Essays in Honor of Mario Bunge. Eds. J. Agassi and R. S. Cohen. Reidel: Dordrecht 1982, pp. 311-331, with kind permission of Kluwer Academic Publishers.
32. "Hermann Weyl and the Nature of Spacetime" reprinted from: Exakte Wissenschaften und ihre philosophische Grundlegung. Internationaler Hermann Weyl Kongress Kiel 1985. Ed. by W. Deppert et al., Frankfurt/Main: Peter Lang 1988, pp. 61-82, with kind permission of the editor.
33. "Covariance and the Non-Preference of Coordinate Systems" reprinted from: Causality, Method and Modality. Essays in Honor of J. Vuillemin. Ed. by G. G. Brittan, Jr. Kluwer: Dordrecht 1991, pp. 23-40, with kind permission of Kluwer Academic Publishers.
34. "A Most General Principle of Invariance" reprinted from: Philosophy, Mathematics and Modern Physics. A Dialogue. Ed. by E. Rudolph and I. O. Stamatescu. Springer: Berlin 1994, pp. 213-225. Reprinted by permission of Springer.
35. "Kant's Philosophy of Mathematics" translated by Hans-Jakob Wilhelm from "Kants Philosophie der Mathematik", in: Mittlgn. Math. Ges. Hamburg X (1977), pp. 353-372. Printed by permission of the author.
36. "Mathematics and Physical Axiomatization" reprinted from: Mérites et limites des méthodes logiques en philosophie. Ed. by Fondation Singer-Polignac, Paris: Vrin 1986, pp. 251-277, with kind permission of Vrin.
37. "Calculemus! The Problem of the Application of Logic and Mathematics" first appeared in "Knowledge Organization" Heft II 1996, ©1996 by Indeks Verlag Frankfurt, Germany, ©1997 by Ergon Verlag Dr. H.-J. Dietrich, Würzburg, Germany.
38. "The Mathematical Overdetermination of Physics" reprinted from: Philosophy of Mathematics Today. Ed. by E. Agazzi and G. Darvas. Kluwer: Dordrecht 1997, pp. 269-285, with kind permission of Kluwer Academic Publishers.
Literature
(* = item appearing in the Introduction only)
Adler, F.: Die Einheit des physikalischen Weltbildes. Naturwissenschaftl. Wochenschrift. Neue Folge VIII (1909) 817-22
Anderson, J.L.: Principles of Relativity Physics. New York 1967
Anderson, J.L.: Covariance, Invariance, and Equivalence: a Viewpoint. Gen. Rel. Grav. 2 (1971) 161-72
Appel, K., and W. Haken: The Four Colour Proof Suffices. Math. Intelligencer 8 (1986) 10-20
Arndt, H.W.: Methodo scientifica pertractatum. Berlin 1971
Aristotle: Analytica Posteriora
Aristotle: Metaphysica
Aristotle: De Poetica
Ashtekar, A.: On the Relation between Classical and Quantum Observables. Commun. Math. Phys. 71 (1980) 59-64
Ayala, F.J., and T. Dobzhansky (Eds.): Studies in the Philosophy of Biology. Reduction and Related Problems. Berkeley, Ca., 1974
Baker, G.A., Jr.: Formulation of Quantum Mechanics Based on the Quasi-Probability Distribution Induced on Phase Space. Phys. Rev. 109 (1958) 2198-206
Ballentine, L.E.: The Statistical Interpretation of Quantum Mechanics. Revs. Mod. Phys. 42 (1970) 358-81
Balzer, W., and J.D. Sneed: Generalized Net Structures of Empirical Theories. Studia Logica 36 (1977) 195-211 and 37 (1978) 167-94 (German version: Verallgemeinerte Netz-Strukturen empirischer Theorien. In: Zur Logik empirischer Theorien. Ed. by W. Balzer and M. Heidelberger. Berlin 1983. 117-68)
Balzer, W., Moulines, C.U., and J.D. Sneed: An Architectonic for Science. The Structuralist Program. Dordrecht 1987
Barrow, J.D., and F.S. Tipler: The Anthropic Cosmological Principle. Oxford 1986
Bavink, B.: Bedeutung des Konvergenzprinzips für die Erkenntnistheorie der Naturwissenschaften. Zeitschr. für philos. Forschg. 2 (1947) 111-30
Beck, L.W.: Can Kant's Synthetic Judgements be Made Analytic? Kant-Studien 47 (1955) 168-81
Beehner, J.: Bibliography on Quantum Logic. In: Studies in the Foundations of Quantum Mechanics. Ed. by P. Suppes. East Lansing, Mich., 1980. 223-56
Belinfante, F.J.: A Survey of Hidden-Variables Theories. Oxford 1973
Bell, J.S.: On the Einstein-Podolsky-Rosen Paradox. Physics 1 (1964) 195-200 (repr. in Bell 1987, 14-21)
Bell, J.S.: On the Problem of Hidden Variables in Quantum Mechanics. Revs. Mod. Phys. 38 (1966) 447-52 (repr. in Bell 1987, 1-13)
Bell, J.S.: Introduction to the Hidden-Variable Question. In: Foundations of Quantum Mechanics. Proc. Int. School of Phys. 'Enrico Fermi', Course IL. Ed. by B. d'Espagnat. London 1971. 171-81 (repr. in Bell 1987, 29-39)
Bell, J.S.: Speakable and Unspeakable in Quantum Mechanics. Cambridge 1987
Bell, J.L., and M. Machover: A Course in Mathematical Logic. Amsterdam 1977
Becker, O.: Grundlagen der Mathematik. Freiburg 1954
Bergmann, G.: The Metaphysics of Logical Positivism. Madison, Wis., 1954; ²1967
Berkeley, G.: The Analyst. In: The Works of George Berkeley, vol.IV. Ed. by A.A. Luce and T.E. Jessop. London 1951
Bertotti, B.: The Later Work of E. Schrödinger. Stud. Hist. Phil. Sci. 16 (1985) 83-100
Bethge, K., and U.E. Schröder: Elementarteilchen. Darmstadt 1986
Birkhoff, G., and J. v. Neumann: The Logic of Quantum Mechanics. Ann. of Math. 37 (1936) 823-43 (repr. in Hooker 1975, 1-26)
Blackmore, J.T.: Boltzmann's Concession to Mach's Philosophy of Science. In: Ludwig Boltzmann 1981ff, Vol. 8, 155-90
Blanshard, B.: The Nature of Thought, 2 vols. London 1939
Blanshard, B.: Reason and Analysis. London 1962
Bohm, D.: Quantum Theory. Englewood Cliffs, N. J., 1951
Bohm, D.: A Suggested Interpretation of the Quantum Theory in Terms of 'Hidden' Variables. Phys. Rev. 85 (1952) 166-79, 180-193
Bohm, D.: Proof that Probability Density Approaches |ψ|² in Causal Interpretation of the Quantum Theory. Phys. Rev. 89 (1953) 458-66
Bohm, D.: Causality and Chance in Modern Physics. London 1957
Bohm, D.: Wholeness and the Implicate Order. London 1980
Bohm, D., and J. Bub: A Refutation of the Proof by Jauch and Piron that Hidden Variables Can be Excluded in Quantum Theory. Revs. Mod. Phys. 38 (1966) 453-69
Bohm, D., and J. Bub: A Proposed Solution of the Measurement Problem in Quantum Mechanics by a Hidden Variable Theory. Revs. Mod. Phys. 38 (1966) 453-69
Bohm, D., and J. Bub: On Hidden Variables - A Reply to Comments by Jauch and Piron and by Gudder. Revs. Mod. Phys. 40 (1968) 235f
Bohm, D., and B.J. Hiley: On the Intuitive Understanding of Non-locality as Implied by Quantum Theory. Found. of Phys. 5 (1975) 93-109
Bohm, D., and B.J. Hiley: Measurement Understood through the Quantum Potential Approach. Found. of Phys. 14 (1984) 255-74
Bohm, D., Hiley, B.J., and P.N. Kaloyerou: An Ontological Basis for Quantum Theory. Physics Reports 144 (1987) 321-75
Bohm, D., and J.P. Vigier: Model of the Causal Interpretation of Quantum Theory in Terms of a Fluid with Irregular Fluctuations. Phys. Rev. 96 (1954) 208-16
Bohr, N.: Atomtheorie und Naturbeschreibung. Berlin 1931 (Engl. Ed.: Atomic Theory and the Description of Nature. Cambridge 1934; 1961)
Bohr, N.: Can Quantum Mechanical Description of Physical Reality be Considered Complete? Phys. Rev. 48 (1935) 696-70
Bohr, N.: Atomic Physics and Human Knowledge. New York 1958 (German Ed.: Atomphysik und menschliche Erkenntnis. Aufsätze und Vorträge aus den Jahren 1933-1955. Braunschweig 1958; ²1964)
Bohr, N.: Physique Atomique et connaissance humaine. Ed. by C. Chevalley. Paris 1991
Bohr, N.: The Causality Problem in Atomic Physics. In: New Theories in Physics. Ed. by Intern. Inst. of Intellectual Co-operation. Paris 1939. 11-45
Bohr, N.: Essays 1958-1962 on Atomic Physics and Human Knowledge. New York 1963 (German Ed.: Atomphysik und menschliche Erkenntnis II. Aufsätze und Vorträge aus den Jahren 1958-1962. Braunschweig 1966)
Bois-Reymond, E. Du: Über die Grenzen des Naturerkennens. Leipzig 1872; ¹¹1916
Boltzmann, L.: Vorlesungen über Gastheorie. Leipzig 1896 (repr. in: Boltzmann 1981ff, vol. 1)
Boltzmann, L.: Populäre Schriften. Leipzig 1905 and Braunschweig 1979 (ed. by E. Broda)
Boltzmann, L.: Theoretical Physics and Philosophical Problems. Selected Writings. Ed. by B. McGuiness. Dordrecht 1974
Boltzmann, L.: Gesamtausgabe. Ed. by R.D. Sexl. Braunschweig 1981ff
Boltzmann, L.: Principien der Naturfilosofi. Ed. by J.M. Fasol-Boltzmann. Berlin 1990
Bondi, H.: What is Progress in Science? Epistemologia 6 (1983) 87-98
Boole, G.: The Mathematical Analysis of Logic. Cambridge 1847. Oxford 1948
Born, M.: Natural Philosophy of Cause and Chance. London 1949
Born, M.: Ist die klassische Mechanik tatsächlich deterministisch? Phys. Blätter 11 (1955) 49-54
Born, M., and D.J. Hooton: Statistical Dynamics of Multiple Periodic Systems. Zeitschr. für Physik 142 (1955) 201-18
Bourbaki, N.: Éléments de Mathématique. Paris 1939ff
Bourbaki, N.: Elements of Mathematics. Theory of Sets. Paris 1968
Bratelli, O., and D.W. Robinson: Operator Algebras and Quantum Statistical Mechanics, vol.1. Berlin 1979
Bridgman, P.W.: The Nature of Physical Theory. Princeton 1936
Brittan, G.G.: Kant's Theory of Science. Princeton, N.J., 1978
Broglie, L. de: La théorie de la mesure en mécanique ondulatoire. Paris 1957
Brush, St.G.: Ludwig Boltzmann and the Foundations of Natural Sciences. In: Boltzmann 1990, 43-61
Bub, J.: Hidden Variables and the Copenhagen Interpretation - A Reconciliation. Brit. Journ. Phil. Sci. 19 (1968) 185-210
Bub, J.: What is a Hidden Variable Theory of Quantum Phenomena? Int. Journ. Theor. Phys. 2 (1969) 101-23
Campbell, N.R.: Physics: The Elements. Cambridge 1920 and New York 1957
Capasso, V., Fortunato, D., and F. Selleri: Von Neumann's Theorem and Hidden Variable Models. Riv. del Nuovo Cimento II (1970) 149-99
Carnap, R.: Der logische Aufbau der Welt. Berlin 1928; Hamburg ²1961
Carnap, R.: Scheinprobleme in der Philosophie: Das Fremdpsychische und der Realismusstreit. Berlin 1928; Hamburg ²1961
Carnap, R.: Testability and Meaning. Philos. of Sci. 3 (1936) 420-71 and 4 (1937) 1-40
Carnap, R.: Logical Foundations of the Unity of Science. In: Encyclopedia of Unified Science. Ed. by O. Neurath. Chicago 1938. 42-62
Carnap, R.: Logical Foundations of Probability. Chicago 1950; ²1962
Carr, B.J., and M.J. Rees: The Anthropic Principle and the Structure of the Physical World. Nature 278 (1979) 605-12
Cartan, E.: Sur un théorème fondamental de M. H. Weyl. Journ. Math. pure appl. 2 (1923) 167-92
Cartan, E.: Sur les variétés à connexion affine et la théorie de la relativité généralisée. Ann. sci. École Normale Supér. 40 (1923) 326-412 and 41 (1924) 1-25
Carter, B.: Large Number Coincidences and the Anthropic Principle in Cosmology. In: Confrontation of Cosmological Theories with Observational Data. Ed. by M.S. Longair. Dordrecht 1974
Cartwright, N.: How the Laws of Physics Lie. Oxford 1983
Chew, G.F.: 'Bootstrap': A Scientific Idea? Science 161 (1968) 762-5
Choquet-Bruhat, Y., de Witt-Morette, C., and M. Dillard-Bleick: Analysis, Manifolds and Physics. Amsterdam 1977
Clauser, J., and A. Shimony: Bell's Theorem: Experimental Tests and Implications. Rep. Progr. Phys. 41 (1978) 1881-1927
Coffa, J.A.: Elective Affinities: Weyl and Reichenbach. In: Hans Reichenbach. Logical Empiricist. Ed. by W. Salmon. Dordrecht 1979. 267-304
Cohen, I.B.: Revolution in Science. Cambridge, Mass., 1985
Davies, P.C.W.: The Accidental Universe. Cambridge 1982
Davis, Ph. J., and R. Hersh: The Mathematical Experience. Brighton, Suss., 1982
Davis, M.: Computability and Unsolvability. New York 1958; ²1982
Davis, M. (Ed.): The Undecidable. New York 1965
Descartes, R.: Oeuvres. Ed. by Ch. Adam and P. Tannery. Paris 1897ff. New Ed. ibid. 1974ff
Descartes, R.: Méditations sur la philosophie première. Paris 1641
Devaney, R.L.: Chaotic Dynamical Systems. New York 1986
Dixon, W.G.: Special Relativity. Cambridge 1978
Dodd, J.E.: The Ideas of Particle Physics. Cambridge 1984
Dombrowski, H.D., and K. Horneffer: Der Begriff des physikalischen Systems in mathematischer Sicht. Nachr. Akad. Wiss. Göttingen, Math.-phys. Kl. 1964. 67-100
Dorling, J.: Schrödinger's Original Interpretation of the Schrödinger Equation: a Rescue Attempt. In: Schrödinger. Centenary Celebration of a Polymath. Ed. by C.W. Kilmister. Cambridge 1987. 10-40
Duhem, P.: La Théorie physique: Son objet et sa structure. Paris 1906; ²1914 (Engl. Ed.: The Aim and Structure of Physical Theory. Princeton, N. J., 1954; New York 1962)
Ehlers, J.: The Nature and Structure of Spacetime. In: The Physicist's Conception of Nature. Ed. by J. Mehra. Dordrecht 1971. 71-91
Ehlers, J.: Über den Newtonschen Grenzwert der Einsteinschen Gravitationstheorie. In: Grundlagenprobleme der modernen Physik. Ed. by J. Nitsch, J. Pfarr and E.W. Stachow. Mannheim 1981. 65-84
Ehlers, J.: On Limit Relations between, and Approximate Explanations of, Physical Theories. In: Logic, Methodology, and Philosophy of Science VII. Ed. by R. Barcan Marcus et al. Amsterdam 1986. 387-403
Einstein, A.: Über das Relativitätsprinzip und die aus demselben gezogenen Folgerungen. Jahrb. der Radioaktivität und Elektronik 4 (1907) 411-62
Einstein, A.: Antrittsrede. Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften XXVIII (1914) 739-42
Einstein, A.: Ernst Mach. Phys. Zeitschr. 17 (1916) 101-4
Einstein, A.: Die Grundlage der allgemeinen Relativitätstheorie. Ann. der Phys. 49 (1916) 769-822
Einstein, A.: Über die spezielle und die allgemeine Relativitätstheorie. Braunschweig 1917; ²³1988
Einstein, A.: Prinzipielles zur allgemeinen Relativitätstheorie. Ann. der Phys. 55 (1918) 241-4
Einstein, A.: Prüfung der allgemeinen Relativitätstheorie. Naturwissenschaften 7 (1919) 776
Einstein, A.: Grundzüge der Relativitätstheorie. Braunschweig 1922; ⁸1990
Einstein and the Philosophies of Kant and Mach. Nature 112 (1923) 253
Einstein, A.: Elsbachs Buch: Kant und Einstein. Deutsche Literaturztg. 45 (1924) 1685-92
Einstein, A.: Über den gegenwärtigen Stand der Feldtheorie. In: Festschrift für A. Stodola. Zürich 1929. 126-32
Einstein, A.: Mein Weltbild. Ed. by C. Seelig. Amsterdam 1934; Frankfurt 1989
Einstein, A.: Quantenmechanik und Wirklichkeit. Dialectica 2 (1948) 320-3
Einstein, A.: On the Generalized Theory of Gravitation. Sci. Amer. 182 (1950), no.4, 13-7
Einstein, A.: Autobiographical Notes. In: Schilpp 1949. 1-95 (German Ed.: Autobiographisches. In: Schilpp 1955. 1-35)
Einstein, A.: Remarks to the Essays Appearing in this Co-operative Volume. In: Schilpp 1949. 663-88 (Germ. Ed.: Bemerkungen zu den in diesem Bande vereinigten Arbeiten. In: Schilpp 1955. 493-511)
Einstein, A.: Ideas and Opinions. New York 1954
Einstein, A.: Einleitende Bemerkungen über Grundbegriffe. In: George 1955, 13-7
Einstein, A.: Aus meinen späten Jahren. Stuttgart 1979; ³1984
Einstein, A., and L. Infeld: The Evolution of Physics. Cambridge 1938
Einstein, A., Podolsky, B., and N. Rosen: Can Quantum Mechanical Description of Physical Reality be Considered Complete? Phys. Rev. 47 (1935) 777-80
Albert Einstein/Hedwig und Max Born: Briefwechsel 1916-1955. München 1969
Elkana, Y.: Boltzmann's Scientific Research Programme and its Alternatives. In: Some Aspects of the Interaction between Science and Philosophy. Ed. by Y. Elkana. Atlantic Highlands 1971, 243-79
Ellis, G.F.R.: The World's Environment: the Universe. South African Journal of Science 75 (1979) 529-33
Emch, G.: Geometric Dequantization and the Correspondence Problem. Int. Journal of Theor. Physics 22 (1983) 397-420
Emch, G.G.: Mathematical and Conceptual Foundations of 20th-Century Physics. Amsterdam 1984
Exner, F.: Vorlesungen über die physikalischen Grundlagen der Naturwissenschaften. Vienna 1919
Feyerabend, P.: Explanation, Reduction, and Empiricism. In: Minnesota Studies in the Philosophy of Science III. Ed. by H. Feigl and G. Maxwell. Minneapolis, Minn., 1962. 28-97
Feyerabend, P.: How to be a Good Empiricist - a Plea for Tolerance in Matters Epistemological. In: Philosophy of Science. The Delaware Seminar, vol. 2. Ed. by B. Baumrin. New York 1963. 3-39
Feyerabend, P.: Reply to Criticism. In: Boston Studies in the Philosophy of Science, vol. 2. Ed. by R.S. Cohen and M.W. Wartofsky. New York 1965. 223-61 (German version: Antwort an Kritiker. In Feyerabend 1981a, 126-60)
Feyerabend, P.: On the Meaning of Scientific Terms. Journ. of Philos. 62 (1965) 266-74
Feyerabend, P.: Problems of Empiricism. In: Beyond the Edge of Certainty. Ed. by R.G. Colodny. Englewood Cliffs, N.J., 1965. 145-260
Feyerabend, P.: Problems of Empiricism, Part II. In: The Nature and Function of Scientific Theories. Ed. by R.G. Colodny. Pittsburgh 1970. 275-353
Feyerabend, P.: Against Method. In: Minnesota Studies in the Philosophy of Science IV. Ed. by M. Radner and S. Winokur. Minneapolis, Minn., 1970. 17-130
Feyerabend, P.: Consolations for the Specialist. In: Lakatos/Musgrave 1970. 197-230 (Extended German version: Kuhns Struktur wissenschaftlicher Revolutionen. Ein Trostbüchlein für Spezialisten. In: Feyerabend 1978. 153-204)
Feyerabend, P.: Die Wissenschaftstheorie - eine bisher unbekannte Form des Irrsinns? In: Natur und Geschichte. X. Deutscher Kongress für Philosophie, Kiel 1972. Ed. by K. Hübner and A. Menne. Hamburg 1973. 88-124 (repr. in Feyerabend 1978, 293-338)
Feyerabend, P.: Against Method. Outline of an Anarchistic Theory of Knowledge. London 1975 (German Ed.: Wider den Methodenzwang. Skizze einer anarchistischen Erkenntnistheorie. Frankfurt 1976)
Feyerabend, P.: Der wissenschaftstheoretische Realismus und die Autorität der Wissenschaften. Ausgewählte Schriften, Bd.1. Braunschweig 1978
Feyerabend, P.: Probleme des Empirismus. Ausgewählte Schriften, Bd.2. Braunschweig 1981
Feyerabend, P.: Realism, Rationalism and Scientific Method. Philosophical Papers Vol.1, Cambridge 1981
Feyerabend, P.: Problems of Empiricism. Philosophical Papers Vol.2, Cambridge 1981
Field, H.: Science without Numbers. A Defence of Nominalism. Princeton, N.J., 1980
Fisher, A.: Formal Number Theory and Computability. Oxford 1982
Folland, G.B.: Weyl Manifolds. Journ. Differ. Geom. 4 (1970) 145-53
Folse, H.J.: The Philosophy of Niels Bohr. Amsterdam 1985
Forman, P.: Weimar Culture, Causality and Quantum Theory, 1918-1927. Hist. Stud. in the Phys. Sciences 3 (1971) 1-115
Fraenkel, A., Bar-Hillel, Y., and A. Levy: Foundations of Set Theory. Amsterdam ²1973
Frank, Ph.: Einstein, Mach, and Logical Positivism. In: Schilpp 1949, 269-86 (Schilpp 1955, 173-87)
Frege, G.: Begriffsschrift. Halle 1879
Freistadt, H.: The Causal Formulation of Quantum Mechanics of Particles. Suppl. al Nuovo Cimento V, ser.X, (1957) 1-70
Freudenthal, H.: Zu den Weyl-Cartanschen Raumproblemen. Arch. der Math. 11 (1960) 107-15
Freudenthal, H.: Lie Groups in the Foundations of Geometry. Advances in Math. 1 (1964) 145-90
Freudenthal, H.: Im Umkreis der sog. Raumprobleme. In: Essays on the Foundation of Mathematics, Dedicated to Prof Fraenkel. Ed. by Y. Bar-Hillel et al. Jerusalem 1967. 322-7
Friedman, M.: Relativity Principles, Absolute Objects, and Symmetry Groups. In: Space, Time, and Geometry. Ed. by P. Suppes. Dordrecht 1973. 296-320
Gabbai, D., and F. Guenthner (Eds.): Handbook of Philosophical Logic. 3 vols. Dordrecht 1983
Gale, G.: The Anthropic Principle. Sci. Amer. 245 (1981) 114-22
Galilei, G.: Le Opere di Galileo Galilei. Editione Nazionale, ed. by A. Favaro. Florence 1890-1909
*Gell-Mann, M.: The Quark and the Jaguar. New York 1994 (German Ed.: Das Quark und der Jaguar. München 1994)
Genz, H., and R. Decker: Symmetrie und Symmetriebrechung in der Physik. Braunschweig 1991
George, A. (Ed.): Louis de Broglie. Physicien et Penseur. Paris 1953 (German ed. (shortened): Louis de Broglie und die Physiker. Hamburg 1955)
Gleason, A.M.: Measures on the Closed Subspaces of a Hilbert Space. Journ. of Math. and Mech. 6 (1957) 885-93
Goldblatt, R.I.: Semantic Analysis of Orthologic. Journ. of Philos. Logic 3 (1974) 19-35
Goldblatt, R.I.: Orthomodularity is not Elementary. Journ. of Symbolic Logic 49 (1984) 401-4
Grossmann, S.: Chaos, Unordnung und Ordnung in nichtlinearen Systemen. Phys. Blätter 39 (1983) 139-45
Gudder, S.P.: Hidden Variables in Quantum Theory Reconsidered. Revs. Mod. Phys. 40 (1968) 229-31
Gudder, S.P.: On Hidden Variable Theories. Journ. of Math. Phys. 11 (1970) 431-6
Haken, H.: Erfolgsgeheimnisse der Natur. Stuttgart 1984
Hardy, G.H.: A Mathematician's Apology. Cambridge 1940
Hart, H.L.A., and A.M. Honore: Causation in the Law. Oxford 1959
Hart, M.: The Evolution of the Atmosphere of the Earth. Icarus 33 (1978) 23-39
Havas, P.: Four-Dimensional Formulation of Newtonian Mechanics and their Relation to the Special and General Theory of Relativity. Revs. Mod. Phys. 36 (1964) 938-65
Hawking, St.: Is the End in Sight for Theoretical Physics? Cambridge 1980
Heilbron, J.L.: The Dilemmas of an Upright Man. Max Planck as Spokesman for German Science. Berkeley, Cal., 1986
Heilbron, J.L.: Max Planck. Ein Leben für die Wissenschaft. 1858-1947. Stuttgart 1988
Heisenberg, W.: Die physikalischen Prinzipien der Quantentheorie. Stuttgart 1930; ²Mannheim 1958
Heisenberg, W.: Wandlungen der Grundlagen der exakten Naturwissenschaft in jüngster Zeit. Angew. Chemie 47 (1934) 697-702 (repr. in ⁴1943, 7-23, and in 1984, CI, 96-101)
Heisenberg, W.: Wandlungen in den Grundlagen der Naturwissenschaft. Leipzig 1935; ⁴1943
Heisenberg, W.: Prinzipielle Fragen der modernen Physik. In: Neuere Fortschritte in den exakten Wissenschaften. Leipzig and Vienna 1936. 91-102 (repr. in ⁴1943, 38-50, and in 1984, CI, 108-19)
Heisenberg, W.: Die Einheit des naturwissenschaftlichen Weltbildes. Leipzig 1942 (repr. in 1984, CI, 181-92)
Heisenberg, W.: Der Begriff "Abgeschlossene Theorie" in der modernen Naturwissenschaft. Dialectica 2 (1948) 331-6 (repr. in 1971, 87-94, and in 1984, CI, 335-40)
Heisenberg, W.: Platons Vorstellungen von den kleinsten Bausteinen der Materie und die Elementarteilchen der modernen Physik. In: Im Umkreis der Kunst. Eine Festschrift für Emil Preetorius. Wiesbaden 1953. 137-40 (repr. in 1984, CI, 394-7)
Heisenberg, W.: Elementarteile der Materie. In: Vom Atom zum Weltsystem. Stuttgart 1954. 45-58 (repr. in 1984, CI, 421-33)
Heisenberg, W.: Physics and Philosophy. New York 1958. (German Ed.: Physik und Philosophie. Stuttgart 1959; repr. in 1984, CII, 3-203)
Heisenberg, W.: Die Plancksche Entdeckung und die philosophischen Probleme der Atomphysik. Universitas 14 (1959) 135-48 (repr. in 1984, CII, 205-12)
Heisenberg, W.: Der Teil und das Ganze. München 1969. (Engl. Ed.: Physics and Beyond. London 1971; repr. in 1984, CIII, 3-334)
Heisenberg, W.: Abschluss der Physik? Süddeutsche Zeitung 6. 10. 1970 (repr. in 1971, 306-13, and in 1984, CIII, 385-92)
Heisenberg, W.: Schritte über Grenzen. München 1971 (Engl. Ed. by P. Heath: Across the Frontiers. New York 1974)
Heisenberg, W.: Die Richtigkeitskriterien der abgeschlossenen Theorien in der Physik. In: Einheit und Vielheit. Festschrift für C.F. v. Weizsäcker zum 60. Geburtstag. Ed. by E. Scheibe and G. Süßmann. Göttingen 1973. 140-4 (repr. in 1984, CIII, 417-21)
Heisenberg, W.: The Development of Concepts in Physics of the 20th Century. In: Connaissance Scientifique et Philosophie. Ed. by Acad. Roy. Belgique. Bruxelles 1975. 161-7 (repr. in 1984, CIII, 447-463)
Heisenberg, W.: Ordnung der Wirklichkeit. Ed. by H. Rechenberg. München 1989 (repr. in 1984, CI, 217-306; French Ed. by C. Chevalley: Philosophie. Le manuscrit de 1942. Paris 1998)
Heisenberg, W.: Collected Works. Ed. by W. Blum, H.-P. Dürr and H. Rechenberg. Series C (5 vol.). München 1984
Hellman, G.: Stochastic Einstein-Locality and the Bell Theorem. Synthese 53 (1982) 461-504
Helmholtz, H. von: Über das Verhältnis der Naturwissenschaften zur Gesamtheit der Wissenschaften. In: Vorträge und Reden 1. Braunschweig ⁵1903. 159-85
Helmholtz, H. von: Die Tatsachen der Wahrnehmung. Berlin 1879 (Quoted Ed. Darmstadt 1959)
Hempel, C.G., and P. Oppenheim: Studies in the Logic of Explanation. Philos. of Science 15 (1948), 135-75 (Repr. in: Hempel 1965, 245-90)
Hempel, C.G.: Aspects of Scientific Explanation and other Essays in the Philosophy of Science. New York 1965
Hensel, E. (Ed.): Paul Hensel. Sein Leben in seinen Briefen. Wolfenbüttel/Hannover 1947
Hentschel, K.: Die Korrespondenz Einstein - Schlick: Zum Verhältnis der Physik zur Philosophie. Annals of Science 43 (1986) 475-88
Hentschel, K.: Einstein's Attitude towards Experiments: Testing Relativity Theory 1907-1927. Stud. Hist. Phil. Sci. 23 (1992) 593-624
Hermann, R.: Vectorbundles in Mathematical Physics. 2 vols. New York 1970
Hertz, H.: Untersuchungen über die Ausbreitung der elektrischen Kraft. Leipzig 1892; ³1914
Hertz, H.: Die Prinzipien der Mechanik. Leipzig 1894; Darmstadt 1963
Hilbert, D.: Grundlagen der Geometrie. Leipzig 1899; ⁷1930
Hilbert, D.: Mathematische Probleme. Arch. für Math. und Phys. 3d ser., vol.1 (1901) 44-63, 213-37 (also in: Ges. Abhdlg. vol.III ²Berlin 1970, 290-329)
Hilbert, D.: Axiomatisches Denken. Math. Ann. 78 (1918) 405-15 (also in: Ges. Abhdlg. vol.III. ²Berlin 1970. 146-56)
Hilbert, D.: Naturerkennen und Logik. Naturwissenschaften 18 (1930) 959-63 (also in: Ges. Abhdlg. vol.III. ²Berlin 1970. 378ff)
Hilbert, D., Neumann, J.v., and L. Nordheim: Über die Grundlagen der Quantenmechanik. Math. Ann. 98 (1928) 1-30
Hintikka, J.: Knowledge and the Known. Dordrecht 1974
Hoffmann, B., and H. Dukas: Albert Einstein. Creator and Rebel. New York 1972
Hofstadter, D.R.: Mathematische Spielereien. Spektrum der Wiss. 1982. Jan., 7-17
Holton, G.: Mach, Einstein, and the Search for Reality. Daedalus 97 (1968) 636-73
Holton, G.: Einstein's Scientific Program: The Formative Years. In: Some Strangeness in the Proportion. Ed. by H. Woolf, Reading, Mass., 1980. 49-65
Holton, G.: Thematic Origins of Scientific Thought. From Kepler to Einstein. Cambridge, Mass., 1973
Holton, G.: Thematische Analyse der Wissenschaft. Die Physik Einsteins und seiner Zeit. Frankfurt 1981
Hooker, C.A. (Ed.): The Logico-Algebraic Approach to Quantum Mechanics, Vol.I and II. Dordrecht 1975 and 1979
Howard, D.: Einstein and Duhem. Synthese 83 (1990) 363-84
Hoyningen-Huene, P.: Zu Problemen des Reduktionismus der Biologie. Philos. Natur. 22 (1985) 271-86
Hund, F.: Geschichte der physikalischen Begriffe. Mannheim 1972
Hüttemann, A.: Idealisierungen und das Ziel der Physik. Berlin 1997
Iyanaga, S., and Y. Kawada (Eds.): Encyclopedic Dictionary of Mathematics. Cambridge, Mass., 1977
Jaki, St.: The Relevance of Physics. Chicago 1966
Jammer, M.: The Philosophy of Quantum Mechanics. New York 1974
Janich, P.: Die Sprache der Physik und die Wirklichkeit der Naturwissenschaften. Dialectica 31 (1977) 301-12
Janich, P., Kambartel, F., and J. Mittelstraß: Wissenschaftstheorie als Wissenschaftskritik. Frankfurt 1974
Janich, P., and H. Tetens: Protophysik. Eine Einführung. Philos. Natur. 22 (1985) 3-21
Jauch, J.M., and C. Piron: Can Hidden Variables be Excluded in Quantum Mechanics? Helv. Phys. Acta 36 (1963) 827-37
Jauch, J.M., and C. Piron: Hidden Variables Revisited. Revs. Mod. Phys. 40 (1968) 228f
Jensen, R.B.: Modelle der Mengenlehre. Berlin 1967
Jost, R.: Boltzmann und Planck. Die Krise des Atomismus um die Jahrhundertwende und ihre Überwindung durch Einstein. In: Einstein-Symposion Berlin. Ed. by H. Nelkowski et al. Berlin 1979. 128-45
Jungnickel, Ch., and R. McCormmach: Intellectual Mastery of Nature. Theoretical Physics from Ohm to Einstein. 2 vols. Chicago 1986
Kalmbach, G.: Orthomodular Lattices. London 1983
Kamber, F.: Die Struktur des Aussagenkalküls in einer physikalischen Theorie. Nachr. Akad. Wiss. Göttingen, Math.-phys. Kl. 1964. 103-24
Kamber, F.: Zweiwertige Wahrscheinlichkeitsfunktionen auf orthokomplementären Verbänden, Math. Ann. 158 (1965) 158-96
Kant, I.: Gedanken von der wahren Schätzung der lebendigen Kräfte. Königsberg 1747
Kant, I.: Untersuchung über die Deutlichkeit der Grundsätze der natürlichen Theologie und der Moral. Königsberg 1764 (quoted from Akademie-Edition, vol. II, pp.273-301)
Kant, I.: Von dem ersten Grunde des Unterschiedes der Gegenden im Raume. 1768
Kant, I.: De mundi sensibilis atque intelligibilis forma et principiis. Königsberg 1770
Kant, I.: Kritik der reinen Vernunft. Riga 1781 (A); ²1787 (B)
Kant, I.: Prolegomena. Riga 1783
Karp, C.R.: Languages with Expressions of Infinite Length. Amsterdam 1964
Kemeny, J.G., and P. Oppenheim: On Reduction. Philos. Studies 7 (1956) 6-19
Kienle, H.: Vom Wesen astronomischer Forschung. Bremer Beiträge zur Naturwissenschaft 1 (1933) 113-25
Kirchhoff, G.: Vorlesungen über mathematische Physik. Vol.1: Mechanik. Leipzig 1876
Klein, F.: Elementarmathematik vom höheren Standpunkt aus, vol. II: Geometrie. ³Berlin 1925
Kline, M.: Mathematics. The Loss of Certainty. New York 1980
Klingenberg, W.: Eine Kennzeichnung der Riemannschen sowie der Hermiteschen Mannigfaltigkeiten. Math. Zeitschr. 70 (1959) 300-9
Kobayashi, S., and T. Nagano: On the Fundamental Theorem of Weyl-Cartan on G-Structures. Journ. Math. Soc. Japan 17 (1965) 84-101
Kobayashi, S., and K. Nomizu: Foundations of Differential Geometry, vol.1, New York 1963
König, J.: Bemerkungen über den Begriff der Ursache. In: Das Problem der Gesetzlichkeit. Ed. by Jungiusgesellschaft der Wissenschaften in Hamburg, Hamburg 1949. 25-120 (Repr. in: Josef König. Vorträge und Aufsätze. Ed. by G. Patzig. Freiburg 1978. 122-255)
Krantz, D.H., Luce, R.D., Suppes, P., and A. Tversky: Foundations of Measurement, vol.1. New York 1971
Kreisel, G., and J.L. Krivine: Elements of Mathematical Logic. Amsterdam ²1971
Kretschmann, E.: Über den physikalischen Sinn der Relativitätspostulate. A. Einsteins neue und seine ursprüngliche Relativitätstheorie. Ann. der Phys. 53 (1917) 575-614
Krips, H.: Quantum Theory and Measures on Hilbert Space. Journ. of Math. Physics 18 (1977) 1015-21
Kruszynsky, P.: Extensions of Gleason's Theorem. In: Quantum Probability and Applications to the Quantum Theory of Irreversible Processes. Ed. by L. Accardi et al. Berlin 1984, 210-27
Kunen, K.: Set Theory. An Introduction to Independence Proofs. Amsterdam 1980
Künzle, H.P.: Galilei and Lorentz Structures on Spacetime: Comparison of the Corresponding Geometry and Physics. Ann. Inst. Henri Poincaré XVII (1972) 337-62
Künzle, H.P.: Covariant Newtonian Limit of Lorentz Space-Times. Gen. Rel. Grav. 7 (1976) 445-57
Kuhn, Th.S.: The Structure of Scientific Revolutions. Chicago 1962; ²1970
Kuhn, Th.S.: Logic of Discovery or Psychology of Research? In: Criticism and the Growth of Knowledge. Ed. by I. Lakatos and A. Musgrave. Cambridge 1970. 1-23
Kuhn, Th.S.: The Essential Tension. Chicago 1977 (German Ed.: Die Entstehung des Neuen. Ed. by L. Krüger. Frankfurt 1977)
Kuhn, Th.S.: Commensurability, Comparability, Communicability. In: PSA 1982. Ed. by P.D. Asquith and T. Nickles. East Lansing, Mich., 1983. 669-88
Lakatos, I.: Falsification and the Methodology of Scientific Research Programmes. In: Criticism and the Growth of Knowledge. Ed. by I. Lakatos and A. Musgrave. Cambridge 1970. 91-195 (repr. in Lakatos 1978, vol.1)
Lakatos, I.: History of Science and its Rational Reconstruction. In: Boston Studies in the Philosophy of Science VIII. Ed. by R.C. Buck and R.S. Cohen. Dordrecht 1971. 91-136 (repr. in Lakatos 1978, vol.1)
Lakatos, I.: Philosophical Papers, 2 vols. Ed. by J. Worrall and G. Currie. Cambridge 1978
Lambert, J.K.: Free Logic and the Concept of Existence. Notre Dame Journal of Formal Logic VIII (1967) 133-44
Lanford, O.E.: The Evolution of Large Classical Systems. In: Dynamical Systems, Theory and Applications. Ed. by J. Moser. Berlin 1975. 1-111
Lanford, O.E.: On a Derivation of the Boltzmann Equation. Astérisque 40 (1976) 117-37
Laugwitz, D.: Über eine Vermutung von Hermann Weyl zum Raumproblem. Arch. der Math. 9 (1958) 128-33
Laurikainen, K.V.: Beyond the Atom. Berlin 1988
Leibniz, G.W.: Sämtliche Schriften und Briefe. Ed. by the Berlin-Brandenburgische (formerly Preussische) Acad. of Scis. Berlin 1923ff
Leibniz, G.W.: Pre-Edition (Vorausedition) of the Philosophical Writings (=VE). Fasc. 7. Münster 1988
Leibniz, G.W.: Die philosophischen Schriften von G.W. Leibniz. 7 vols. Ed. by C.J. Gerhardt. Berlin 1890; Hildesheim 1961
Leibniz, G.W.: Discours de Métaphysique. 1686; Meiner: Hamburg 1958
Leibniz, G.W.: La logique de Leibniz d'après des documents inédits. Ed. by L. Couturat. Paris 1901
Leibniz, G.W.: Opuscules et fragments inédits de Leibniz. Ed. by L. Couturat. Paris 1903
Leibniz, G.W.: Textes inédits. Ed. by G. Grua. 2 vols. Paris 1948
Leibniz, G.W.: Nouveaux Essais sur l'Entendement Humain. Leipzig 1765; In: G.W. Leibniz. Philosophische Schriften. Ed. by W. v. Engelhardt and H.H. Holz. Vol. III, Darmstadt 1959
Leplin, J. (Ed.): Scientific Realism. Berkeley, Cal., 1984
Locke, J.: An Essay Concerning Human Understanding. London 1700; Ed. by P.H. Nidditch. Oxford 1975
Lorenz, K.: Kants Lehre vom Apriorischen im Lichte gegenwärtiger Biologie. Blätter für Deutsche Philos. 15 (1941) 94-125. (Repr. in: Die Evolution des Denkens. Ed. by K. Lorenz and F.M. Wuketits. München 1983. 95-124)
Lorenz, K.: Die Rückseite des Spiegels. München 1977
Lorenzen, P.: Differential und Integral. Frankfurt 1965
Lorenzen, P.: Methodisches Denken. Frankfurt 1968
Lorenzen, P.: Konstruktive Wissenschaftstheorie. Frankfurt 1974
Lorenzen, P., and O. Schwemmer: Konstruktive Logik, Ethik und Wissenschaftstheorie. ²Mannheim 1975
Lottermoser, M.: Über den Newtonschen Grenzwert der Allgemeinen Relativitätstheorie und die relativistische Erweiterung Newtonscher Anfangsdaten. Dissertation Ludwig-Maximilian Universität, München, 1988
Ludwig, G.: Deutung des Begriffs 'physikalische Theorie' und axiomatische Grundlegung der Hilbertraumstruktur der Quantenmechanik durch Hauptsätze des Messens. Berlin 1970
Ludwig, G.: Die Grundstrukturen einer physikalischen Theorie. Berlin 1978; ²1990
Ludwig, G.: An Axiomatic Basis for Quantum Mechanics, Vol. 1. Berlin 1985
Lukasiewicz, J.: Aristotle's Syllogistic from the Standpoint of Modern Formal Logic. Oxford 1951; ²1957
Mach, E.: Die Geschichte und die Wurzel des Satzes von der Erhaltung der Arbeit. Prag 1872; quoted Ed.: Amsterdam 1969
Mach, E.: Die Mechanik in ihrer Entwicklung. Leipzig 1883; quoted Ed.: ⁷1912
Mach, E.: Die Analyse der Empfindungen. Jena 1885; quoted Ed.: Darmstadt 1991
Mach, E.: Die Prinzipien der Wärmelehre. Leipzig 1896; quoted Ed.: ²1900
Mach, E.: Erkenntnis und Irrtum. Leipzig 1905; quoted Ed.: Darmstadt 1968
Mach, E.: Die Leitgedanken meiner naturwissenschaftlichen Erkenntnislehre und ihre Aufnahme durch die Zeitgenossen. Phys. Zeitschr. XI (1910) 599-606
Mackey, G.W.: Mathematical Foundations of Quantum Mechanics. New York 1963
Mal'cev, A.I.: The Metamathematics of Algebraic Systems. Amsterdam 1971
Mandelbrot, B.B.: The Fractal Geometry of Nature. New York 1977; ²1983
Margenau, H.: Einstein's Conception of Reality. In: Schilpp 1949, 243-68 (Schilpp 1955, 151-72)
Mathematics: The Unifying Thread in Science. Notices of the Amer. Math. Soc. 33 (1986) 716-33
Mayer-Kuckuk, Th.: Der gebrochene Spiegel. Basel 1989
Mayr, E.: The Growth of Biological Thought. Diversity, Evolution and Inheritance. Cambridge, Mass., 1982 (German ed.: Die Entwicklung der biologischen Gedankenwelt. Vielfalt, Evolution und Vererbung. Berlin 1984)
Mayr, E., and St. Weinberg: The Limits of Reductionism. Nature 331 (1988) 475-6
McMullin, E. (Ed.): Galileo. Man of Science. New York 1967
Medawar, P.: A Geometric Model of Reduction and Emergence. In: Ayala/Dobzhansky 1974. 57-63
Medawar, P.B. and J.S.: Aristotle to Zoos. A Philosophical Dictionary of Biology. London 1985
Mill, J.St.: A System of Logic. London 1847
Minkowski, H.: Raum und Zeit. Speech before the 80. Versammlung Deutscher Naturforscher und Ärzte in Cologne, 1908 (repr. in: Das Relativitätsprinzip. Ed. by O. Blumenthal, Leipzig ⁴1922. 54-71)
Mintzer, D.: Transporttheorie in Gasen. In: Die Mathematik für Physik und Chemie, vol.2. Ed. by H. Margenau and G.M. Murphy. Frankfurt 1967. Ch. 1
Misner, Ch.W., Thorne, K.S., and J.A. Wheeler: Gravitation. San Francisco, Ca., 1973
Mittelstaedt, P.: Empiricism and Apriorism in the Foundations of Quantum Logic. Synthese 67 (1986) 497-525
Mittelstaedt, P., and E.-W. Stachow (Eds.): Recent Developments in Quantum Logic. Mannheim 1985
Mittelstaedt, P.: Klassische Mechanik. Mannheim ²1995
Mittelstraß, J.: Changing Concepts of the A Priori. In: Proc. of the 5th Int. Congr. on Logic, Methodology, and Philosophy of Science. Ed. by R.E. Butts and J. Hintikka. Dordrecht 1977. 113-28
Mittelstraß, J.: Rationale Rekonstruktion der Wissenschaftsgeschichte. In: Wissenschaftstheorie und Wissenschaftsforschung. Ed. by P. Janich. Munich 1981. 89-111 and 137-48
Monk, J.D.: Mathematical Logic. Berlin 1976
Moore, G.E.: Philosophical Papers. London 1959
Moore, G.E.: Proof of an External World. Proc. Brit. Acad. 1939, 273-300; quoted after: Moore 1959, 127-50
Müller, A.: Naturwissenschaft und reale Außenwelt. Naturwissenschaften 28 (1940) 705-9
Nagel, E.: The Structure of Science. Problems in the Logic of Scientific Explanation. New York 1961
Nernst, W.: Theoretische Chemie. Stuttgart 1893; 11th-15th Ed. 1926 (Engl. Ed.: Theoretical Chemistry. London 1911)
Nernst, W.: Zum Gültigkeitsbereich der Naturgesetze. Naturwissenschaften 10 (1922) 489-95
Neumann, J. von: Mathematische Grundlagen der Quantenmechanik. Berlin 1932 (Engl. Ed.: Mathematical Foundations of Quantum Mechanics. Princeton, N.J., 1955)
Neurath, O., et al.: Wissenschaftliche Weltauffassung - Der Wiener Kreis (1929), repr. in: Logischer Empirismus - Der Wiener Kreis. Ed. by H. Schleichert. München 1975. 201-22. (Engl. Ed.: The Scientific Conception of the World: The Vienna Circle. Dordrecht 1973)
Niiniluoto, I.: Truthlikeness: Comments on Recent Discussion. Synthese 38 (1978) 281-329
Ochs, W.: Lassen sich Quantenzustände als Ensembles streufreier Zustände darstellen? I. Zeitschr. für Naturforschg. 25a (1970) 1546-55
Ochs, W.: Can Quantum Theory be Presented as a Classical Ensemble Theory? II. Zeitschr. für Naturforschg. 26a (1971) 1740-53
Oppenheim, P., and H. Putnam: Unity of Science as a Working Hypothesis. In: Minnesota Studies in the Philosophy of Science, Vol. II. Ed. by H. Feigl et al. Minneapolis, Minn., 1958. 3-26. (German Ed.: Einheit der Wissenschaft als Arbeitshypothese. In: Erkenntnisprobleme der Naturwissenschaften. Ed. by L. Krüger. Cologne 1970. 339-71)
Ostwald, W.: Die Überwindung des wissenschaftlichen Materialismus. Verhandlg. Gesellsch. Deutscher Naturforscher und Ärzte 67 (1895) Erster Teil, 155-68
Ostwald, W.: Vorlesungen über Naturphilosophie. Leipzig 1902
Pascal, B.: Über die Religion (Pensées). Ed. by E. Wasmuth. ³Heidelberg 1946
Patzig, G.: Die aristotelische Syllogistik. Göttingen 1959; ³1969 (Engl. Ed.: Aristotle's Theory of the Syllogism. Dordrecht 1968)
Pauli, W.: Bemerkungen zum Problem der verborgenen Parameter in der Quantenmechanik und zur Theorie der Führungswelle. In: George 1955, 26-35
Pauli, W.: Aufsätze und Vorträge über Physik und Erkenntnistheorie. Braunschweig 1961
Pearson, K.: The Grammar of Science. London 1892; repr. Bristol 1991
Peitgen, H.-O., and P.H. Richter: The Beauty of Fractals. Berlin 1986
Peres, A.: Nonlinear Variants of Schrödinger's Equation Violate the Second Law of Thermodynamics. Phys. Rev. Letters 63 (1989) 1114-5
Philippidis, C., Dewdney, C., and B.J. Hiley: Quantum Interference and the Quantum Potential. Nuovo Cimento 52 B (1979) 15-28
Pinch, T.J.: What Does a Proof Do if it Does not Prove? In: The Social Production of Scientific Knowledge. Ed. by E. Mendelsohn et al. Dordrecht 1977. 171-215
Pitowsky, I.: Quantum Probability - Quantum Logic. Berlin 1989
Planck, M.: Acht Vorlesungen über Theoretische Physik. Leipzig 1910
Planck, M.: Zur Machschen Theorie der physikalischen Erkenntnis. Phys. Zeitschr. XI (1910) 1186-90
Planck, M.: Vorlesungen über die Theorie der Wärmestrahlung. Leipzig 1913
Planck, M.: Naturwissenschaft und reale Außenwelt. Naturwissenschaften 28 (1940) 778-9
Planck, M.: Vorträge und Erinnerungen. Stuttgart ⁵1949
Planck, M.: Wissenschaftliche Selbstbiographie. Ed. by Ch. Scriba. Halle 1990
Plato: Phaedo
Plato: Theaitetus
Poincaré, H.: Wissenschaft und Hypothese. ³Leipzig 1914
Pool, J.T.C.: Mathematical Aspects of the Weyl Correspondence. Journal of Math. Physics 7 (1966) 66-76
Popper, K.: Logik der Forschung. Wien 1935. ⁵Tübingen 1973 (Engl. Ed.: The Logic of Scientific Discovery. London 1959)
Popper, K.: The Poverty of Historicism. London 1957
Popper, K.: The Aim of Science. Ratio 1 (1958) 24-35 (German version: Über die Zielsetzung der Erfahrungswissenschaft. In German ed. of Ratio, 1 (1957) 21-31)
Popper, K.: Truth, Rationality, and the Growth of Scientific Knowledge. In: Conjectures and Refutations. London 1963. 215-50
Popper, K.: Objective Knowledge. Oxford 1972
Primas, H.: Chemistry, Quantum Mechanics, and Reductionism. Berlin 1981
Primas, H.: Kann Chemie auf Physik reduziert werden? Chemie in unserer Zeit 19 (1985) 109-19, 161-6
Putnam, H.: How not to Talk about Meaning. In: Boston Studies in the Philosophy of Science, vol. II. Ed. by R.S. Cohen and M.W. Wartofsky. New York 1965. 205-22
Putnam, H.: Is Logic Empirical? In: Boston Studies in the Philosophy of Science, vol. V. Ed. by R.S. Cohen and M.W. Wartofsky. Dordrecht 1969. 216-41
Putnam, H.: What is Realism? In: Leplin 1984. 140-53
Quine, W.V.: Methods of Logic. New York 1950
Quine, W.V.: From a Logical Point of View. ²New York 1961
Quine, W.V.: Grundzüge der Logik. Frankfurt 1969
Quine, W.V.: The Ways of Paradox and other Essays. ²Cambridge, Mass., 1976
Rédei, M.: Bell's Inequalities, Relativistic Quantum Field Theory and the Problem of Hidden Variables. Philos. of Sci. 58 (1991) 628-38
Reichenbach, H.: Experience and Prediction. Chicago 1938 (German version: Erfahrung und Prognose. Braunschweig 1983 [= vol.4 of the Gesammelte Werke, Ed. by A. Kamlah and M. Reichenbach])
Rohracher, H.: Die Arbeitsweise des Gehirns und die psychischen Vorgänge. München 1967
Rohrlich, F.: Classical Charged Particles. Reading, Mass., 1965
Rosen, G.: Galilean Invariance and the General Covariance of Nonrelativistic Laws. Amer. Journ. Phys. 40 (1972) 683-7
Rosenfeld, L.: Die Evidenz der Komplementarität. In: George 1955. 36-57
Rosenfeld, L.: Review of Bohm 1957. Nature 181 (1958) 658
Russell, B.: Logic and Knowledge. Essays 1901-1950. London 1956
Russell, B.: History of Western Philosophy. London 1946
Russell, B.: My Philosophical Development. London 1959
Samburski, S.: The Physical World of the Greeks. London 1956
Scheibe, E.: Über das Weylsche Raumproblem. Journ. reine angew. Math. 197 (1957) 162-207
Scheibe, E.: Die kontingenten Aussagen in der Physik. Frankfurt 1964
Scheibe, E.: Bemerkungen über den Begriff der Ursache. In: Vom Geist der Naturwissenschaft. Ed. by H.H. Holz and J. Schickel. Zürich 1969. 105-34 (Engl. transl. by D.J. Marshall jr.: Remarks on the Concept of Cause. In: Contemporary German Philosophy, vol.4, ed. by D.E. Christensen et al. London 1984. 223-43; this vol. I.1)
Scheibe, E.: Ursache und Erklärung. In: Erkenntnisprobleme der Naturwissenschaften. Ed. by L. Krüger. Cologne 1970. 253-75
Scheibe, E.: Ein vernachlässigter Aspekt physikalischer Erklärung I. Naturwissenschaften 58 (1971) 1-6
Scheibe, E.: Die Erklärung der Keplerschen Gesetze durch Newtons Gravitationsgesetz. In: Einheit und Vielheit. Festschrift für C.F. v. Weizsäcker zum 60. Geburtstag. Ed. by E. Scheibe and G. Süßmann. Göttingen 1973. 98-118 (this vol. V.20, translated by H.J. Wilhelm)
Scheibe, E.: The Approximative Explanation and the Development of Physics. In: Logic, Methodology and Philosophy of Science IV. Ed. by P. Suppes et al. Amsterdam 1973. 931-42
Scheibe, E.: The Logical Analysis of Quantum Mechanics. Oxford 1973
*Scheibe, E.: Gesetzlichkeit und Kontingenz. In: Natur und Geschichte. X. Deutscher Kongress für Philosophie. Kiel 1972. Ed. by K. Hübner and A. Menne. Hamburg 1973. 170-89
Scheibe, E.: Vergleichbarkeit, Widerspruch und Erklärung. In: Philosophie und Physik. Ed. by R. Haller and J. Götschl. Braunschweig 1975. 57-71
Scheibe, E.: Gibt es Erklärungen von Theorien? Allg. Zeitschr. für Philos. 1 (1976) 26-45 (Engl. transl. by J.A. Novak: Are There Explanations of Theories? In: Contemporary German Philosophy, vol. 3. Ed. by D.E. Christensen et al. London 1983, 141-58; this vol. V.21)
Scheibe, E.: Conditions of Progress and the Comparability of Theories. In: Essays in Memory of Imre Lakatos. Ed. by R.S. Cohen, P. Feyerabend and M.W. Wartofsky. Dordrecht 1976. 547-68
Scheibe, E.: Kants Philosophie der Mathematik. Mittlg. Math. Ges. Hamburg X (1977) 353-72 (this vol. VIII.35, translated by H.J. Wilhelm)
Scheibe, E.: On the Structure of Physical Theories. In: The Logic and Epistemology of Scientific Change. Ed. by I. Niiniluoto and R. Tuomela. Amsterdam 1979. 205-24 (this vol. III.11; German version: Über die Struktur physikalischer Theorien. In: Zur Logik empirischer Theorien. Ed. by W. Balzer and M. Heidelberger. Berlin 1983. 169-88)
Scheibe, E.: Quantentheorie und verborgene Parameter. Physikunterricht 15 (1981) 56-74
Scheibe, E.: Eine Fallstudie zur Grenzfallbeziehung in der Quantenmechanik. In: Grundlagenprobleme der modernen Physik. Festschrift für P. Mittelstaedt zum 50. Geburtstag. Ed. by J. Nitsch, J. Pfarr, and E.-W. Stachow. Mannheim 1981. 257-67 (this vol. V.22, translated by H.J. Wilhelm)
Scheibe, E.: Zum Theorienvergleich in der Physik. In: Physik, Philosophie und Politik. Festschrift für C.F. von Weizsäcker zum 70. Geburtstag. Ed. by K.M. Meyer-Abich. München 1982. 291-309
Scheibe, E.: A Comparison of Two Recent Views on Theories. Metamedicine 3 (1982) 233-53 (this vol. III.12)
Scheibe, E.: Invariance and Covariance. In: Scientific Philosophy Today. Essays in Honor of Mario Bunge. Ed. by J. Agassi and R.S. Cohen. Dordrecht 1982. 311-31 (this vol. VII.31)
Scheibe, E.: Ein Vergleich der Theoriebegriffe von Sneed und Ludwig. In: Erkenntnis- und Wissenschaftstheorie. Akten des 7. Int. Wittgenstein Sympos. 1982. Ed. by P. Weingartner and H. Czermak. Vienna 1983. 371-83
Scheibe, E.: Zur Rehabilitierung des Rekonstruktionismus. In: Rationalität. Philosophische Beiträge. Ed. by H. Schnädelbach. Frankfurt 1984. 94-116 (this vol. III.13, translated by H.J. Wilhelm)
Scheibe, E.: Explanation of Theories and the Problem of Progress in Physics. In: Reduction in Science. Ed. by W. Balzer et al. Dordrecht 1984. 71-94
Scheibe, E.: Quantum Logic and some Aspects of Logic in General. In: Recent Developments in Quantum Logic. Ed. by P. Mittelstaedt and E.-W. Stachow. Mannheim 1985. 115-28 (this vol. VI.25)
Scheibe, E.: What Kind of Hidden Variables are Excluded by Bell's Inequality? In: Foundations of Physics. Ed. by P. Weingartner and G. Dorn. Vienna 1986. 251-71 (this vol. VI.26)
Scheibe, E.: Kohärenz und Kontingenz. Die Einheit der Physik und die rationalistische Tradition. Zeitschr. für philos. Forschg. 40 (1986) 321-36
Scheibe, E.: The Comparison of Scientific Theories. Interdisciplinary Science Reviews 11 (1986) 148-52
Scheibe, E.: Mathematics and Physical Axiomatization. In: Mérites et limites des méthodes logiques en philosophie. Ed. by Fond. Singer-Polignac. Paris 1986. 251-77 (this vol. VIII.36)
Scheibe, E.: The Increase of Contingencies in Science. Epistemologia X (1987) 171-86 (German version: Die Zunahme des Kontingenten in der Wissenschaft. Neue Hefte für Philos. 24/25 (1985) 1-13)
Scheibe, E.: Ganzheitsaspekte in Philosophie und Wissenschaft. In: Jahrbuch der Akad. der Wiss. in Göttingen 1987. Göttingen 1987 (Engl. version: Paradigms of Holistics in Science and Philosophy. In: Medical Systems with a Holistic Approach. Ed. by S.N. and Y.B. Tripathi. Varanasi 1993. 13-29; this vol. I.2)
Scheibe, E.: Kant's Apriorism and some Modern Positions. In: The Role of Experience in Science. Proc. of the 1986 Conf. of the Acad. Int. de Philos. des Sciences (Bruxelles), held at the Univ. of Heidelberg. Ed. by E. Scheibe. Berlin 1988. 1-22 (this vol. I.3)
Scheibe, E.: The Physicists' Conception of Progress. Stud. Hist. Philos. Sci. 19 (1988) 141-59 (this vol. II.6)
Scheibe, E.: Struktur und Theorie in der Physik. In: Philosophie und Physik der Raum-Zeit. Ed. by J. Audretsch and K. Mainzer. Mannheim 1988. 103-19
Scheibe, E.: Äquivalenz und Reduktion. Zur Frage ihres empirischen Status. Conceptus 22 (1988) 91-105
Scheibe, E.: Hermann Weyl and the Nature of Spacetime. In: Exakte Wissenschaften und ihre philosophische Grundlegung. Ed. by W. Deppert et al. Frankfurt 1988. 61-82 (this vol. VII.32)
Scheibe, E.: Paul Feyerabend und die rationalen Rekonstruktionen. In: Wozu Wissenschaftsphilosophie? Positionen und Fragen zur gegenwärtigen Wissenschaftsphilosophie. Ed. by P. Hoyningen-Huene and G. Hirsch. Berlin 1988. 149-71 (this vol. III.14, translated by H.J. Wilhelm)
*Scheibe, E.: Do We Want to Be Irrational? Interdisciplinary Science Reviews 13 (1988) 84-76
Scheibe, E.: Two Types of Successor Relations between Theories. Zeitschr. für allg. Wissenschaftstheorie 14 (1989) 68-80
Scheibe, E.: Die Kopenhagener Schule. In: Klassiker der Naturphilosophie. Ed. by G. Böhme. München 1989. 374-92
Scheibe, E.: Coherence and Contingency. Two Neglected Aspects of Theory Succession. Noûs XXIII (1989) 1-16 (this vol. IV.15)
Scheibe, E.: Die Kopenhagener Schule und ihre Gegner. In: Wieviele Leben hat Schrödingers Katze? Zur Physik und Philosophie der Quantenmechanik. Ed. by J. Audretsch and K. Mainzer. Mannheim 1990. 157-82 (this vol. VI.27, translated by H.J. Wilhelm)
Scheibe, E.: Calculemus! Das Problem der Anwendung von Logik und Mathematik. In: Leibniz' Auseinandersetzung mit Vorgängern und Zeitgenossen. Ed. by I. Marchlewitz and A. Heinekamp. Stuttgart 1990. 200-16 (Engl. transl. by J. Zwart in: Knowledge Organization 23 (1996) 67-76; this vol. VIII.37)
Scheibe, E.: Erwin Schrödinger und die Philosophie der Physiker. Zeitschr. für Wissenschaftsforschg. 6 (1991) 9-19 (Engl. version: Erwin Schrödinger and the Philosophy of the Physicists. In: Erwin Schrödinger's World View. The Dynamics of Knowledge and Reality. Ed. by J. Götschl. Dordrecht 1992. 25-34; this vol. II.7)
Scheibe, E.: Predication and Physical Law. Topoi 10 (1991) 3-12 (this vol. IV.16)
Scheibe, E.: General Laws of Nature and the Uniqueness of the Universe. In: Philosophy and the Origin and Evolution of the Universe. Ed. by E. Agazzi and A. Cordero. Dordrecht 1991. 341-60 (this vol. IV.18)
Scheibe, E.: J.v. Neumann's und J.S. Bell's Theorem. Ein Vergleich. Philos. Natur. 28 (1991) 35-53 (this vol. VI.28, translated by H.J. Wilhelm)
Scheibe, E.: EPR-Situation and Bell's Inequality. In: Existence and Explanation. Essays Presented in Honor of K. Lambert. Ed. by W. Spohn, B.C. van Fraassen and B. Skyrms. Dordrecht 1991. 115-29 (this vol. VI.29)
Scheibe, E.: Covariance and the Non-Preference of Coordinate Systems. In: Causality, Method, and Modality. Essays in Honor of Jules Vuillemin. Ed. by G.G. Brittan, Jr. Dordrecht 1991. 23-40 (this vol. VII.33)
Scheibe, E.: Substances, Physical Systems, and Quantum Mechanics. In: Advances in Scientific Philosophy. Essays in Honor of P. Weingartner. Ed. by G. Schurz and G. Dorn. Amsterdam 1991. 215-30 (this vol. IV.17)
Scheibe, E.: The Role of Mathematics in Physical Science. In: The Space of Mathematics. Philosophical, Epistemological, and Historical Explorations. Ed. by J. Echeverria et al. Berlin 1992. 141-55
Scheibe, E.: Albert Einstein: Theorie, Erfahrung, Wirklichkeit. Heidelberger Jahrbücher XXXVI (1992) 121-38 (this vol. II.8, translated by Charito Pizarro)
Scheibe, E.: A New Theory of Reduction in Physics. In: Philosophical Problems of the Internal and External Worlds. Essays on the Philosophy of Adolf Grünbaum. Ed. by J. Earman et al. Pittsburgh, Pa., 1993. 248-71 (this vol. V.23)
Scheibe, E.: Three Remarks Concerning Bell's Inequality. In: Bell's Theorem and the Foundations of Modern Physics. Ed. by A. van der Merwe et al. Singapore 1993. 428-35 (this vol. VI.30)
Scheibe, E.: C.F. von Weizsäcker und die Einheit der Physik. Philos. Natur. 30 (1993) 126-45 (this vol. I.4, transl. by H.J. Wilhelm)
Scheibe, E.: Heisenbergs Begriff der abgeschlossenen Theorie. In: Werner Heisenberg. Physiker und Philosoph. Ed. by B. Geyer et al. Heidelberg 1993. 251-7 (this vol. II.9, transl. by H.J. Wilhelm)
Scheibe, E.: Zwischen Rationalismus und Empirismus: Der Weg der Physik. In: Vernunftbegriffe in der Moderne. Stuttgarter Hegel-Kongress 1993. Ed. by F. Fulda and R.-P. Horstmann. Stuttgart 1994. 73-95 (this vol. I.5, translated by H.J. Wilhelm)
Scheibe, E.: A Most General Principle of Invariance. In: Philosophy, Mathematics and Modern Physics. A Dialogue. Ed. by E. Rudolph and I.O. Stamatescu. Berlin 1994. 213-25 (this vol. VII.34)
Scheibe, E.: The Rationality of Reductionism. In: Natural Sciences and Human Thought. Ed. by R. Zwilling. Berlin 1995. 101-9 (this vol. V.24)
Scheibe, E.: L'origine du réalisme scientifique: Boltzmann, Planck, Einstein. In: Les savants et l'épistémologie vers la fin du XIXe siècle. Ed. by M. Panza and J.-C. Pont. Paris 1995. 157-72 (Engl. version: The Origin of Scientific Realism: Boltzmann, Planck, Einstein. In: The Reality of the Unobservable. Ed. by M. Pauri. Dordrecht 1999; this vol. II.10)
Scheibe, E.: Laws and Theories: Generality vs. Coherence. In: Laws of Nature. Ed. by F. Weinert. Berlin 1995. 208-26
Scheibe, E.: The Mathematical Overdetermination of Physics. In: Philosophy of Mathematics Today. Ed. by E. Agazzi and G. Darvas. Dordrecht 1997. 269-85 (this vol. VIII.38)
*Scheibe, E.: Die Reduktion physikalischer Theorien. Ein Beitrag zur Einheit der Physik. Teil I: Grundlagen und elementare Theorie. Berlin 1997. Teil II: Inkommensurabilität und Grenzfallreduktion. ibid. 1999
Scheibe, E.: On Limitations of Physical Knowledge. Philos. Natur. 35 (1998) 41-57 (this vol. IV.19)
Schelling, F.W.J.: Werke. Ed. by M. Schröter. München 1927, vol. II
Schilpp, P.A. (Ed.): Albert Einstein. Philosopher - Scientist. Evanston, Ill., 1949 (German Ed.: Albert Einstein als Philosoph und Naturforscher. Stuttgart 1955)
Schilpp, P.A. (Ed.): The Philosophy of Rudolf Carnap. La Salle, Ill., 1963
Schilpp, P.A. (Ed.): The Philosophy of Karl Popper. La Salle, Ill., 1974
*Schlieder, S.: EPR-Relations, von Neumann's Standard Forms and a Proof Concerning a Conjecture of E. Scheibe. Comm. Math. Phys. 169 (1995) 589-96
Schmidt, H.-J.: Axiomatic Characterization of Physical Geometry. Berlin 1979
*Schmidt, H.-J.: The Status of Set-theoretic Axioms in Empirical Theories. In: The Space of Mathematics. Philosophical, Epistemological, and Historical Explorations. Ed. by J. Echeverria et al. Berlin 1992. 156-67
Schneider, M.: Funktion und Grundlegung der Mathesis Universalis im Leibnizschen Wissenschaftssystem. In: Leibniz: Questions de logique. Ed. by A. Heinekamp. (Studia Leibnitiana, Sonderheft 15) Stuttgart 1988. 162-82
Schouten, J.A.: Ricci-Calculus. Berlin 1954
Schroeder, M.R.: Number Theory in Science and Communication. Berlin 1984; ²1987
Schrödinger, E.: Quantisierung als Eigenwertproblem II (1926). In 1984, vol.3, 98-136
Schrödinger, E.: Abhandlungen zur Wellenmechanik. Leipzig 1927; ²1928
Schrödinger, E.: Energieaustausch nach der Wellenmechanik (1927). In 1984, vol.3, 267-79
Schrödinger, E.: Der erkenntnistheoretische Wert physikalischer Modellvorstellungen (1928). In 1984, vol.4, 288-94
Schrödinger, E.: Das Gesetz der Zufälle (1929). In 1984, vol.4, 316-7
Schrödinger, E.: Was ist ein Naturgesetz? (1929) In 1984, vol.4, 295-7
Schrödinger, E.: Antrittsrede (1929). In 1984, vol.4, 303-7
Schrödinger, E.: Die Wandlung des physikalischen Weltbegriffs (1930). In: Was ist ein Naturgesetz? München 1987. 18-26
Schrödinger, E.: Über Indeterminismus in der Physik (together with: Ist die Naturwissenschaft milieubedingt?). Leipzig 1932
Schrödinger, E.: Die gegenwärtige Situation in der Quantenmechanik (1935). In 1984, vol.4, 484-501
Schrödinger, E.: Discussion of Probability Relations between Separated Systems (1935). In 1984, vol.1, 424-32
Schrödinger, E.: Probability Relations between Separated Systems (1935). In 1984, vol.1, 433-9
Schrödinger, E.: Die Besonderheit des Weltbilds der Naturwissenschaften (1948). In 1984, vol.4, 409-53
Schrödinger, E.: Are there Quantum Jumps? (1952) In 1984, vol.4, 478-502
Schrödinger, E.: Nature and the Greeks. Cambridge 1954
Schrödinger, E.: The Philosophy of Experiment (1955). In 1984, vol.4, 558-68
Schrödinger, E.: Might perhaps Energy be a merely Statistical Concept? (1958) In 1984, vol.1, 502-10
Schrödinger, E.: Meine Weltansicht. Vienna 1961
Schrödinger, E.: Science and Humanism. Physics in our Time. Cambridge 1961
Schrödinger, E.: Gesammelte Abhandlungen. Ed. by Österreichische Akademie der Wissenschaften. Vienna 1984
Scriven, M.: The Limits of Physical Explanation. In: Philosophy of Science. The Delaware Seminar, vol.2. Ed. by B. Baumrin. New York 1963. 107-35
Seeger, R.J.: Galileo Galilei, his Life and his Works. Oxford 1966
Seelig, C.: Albert Einstein und die Schweiz. Zürich 1952
Selleri, F.: A Stronger Form of Bell's Inequality. Lettere al Nuovo Cimento 3 (1972) 581-2
Selleri, F., and G. Tarozzi: Quantum Mechanics, Reality, and Separability. Riv. del Nuovo Cimento 4, N.2 (1981) 1-53
Selleri, F.: History of the Einstein-Podolsky-Rosen Paradox. In: Quantum Mechanics versus Local Realism. Ed. by F. Selleri. New York 1988. 1-61
Shimony, A.: Reflections on the Philosophy of Bohr, Heisenberg, and Schrödinger. In: Physics, Philosophy and Psychoanalysis. Essays in Honor of Adolf Grünbaum. Ed. by R.S. Cohen and L. Laudan. Dordrecht 1983. 209-21
Shoenfield, J.: Mathematical Logic. Reading, Mass., 1967
Siegel, C.L.: Vorlesungen über Himmelsmechanik. Berlin 1956
Sikorski, R.: Boolean Algebras. Berlin 1969
Sklar, L.: Types of Inter-theoretic Reduction. Brit. Journal for the Phil. of Sci. 18 (1967) 109-24
Smoluchowski, M.: Über den Begriff des Zufalls und den Ursprung der Wahrscheinlichkeitsgesetze in der Physik. Naturwissenschaften 6 (1918) 253-63
Sneed, J.D.: The Logical Structure of Mathematical Physics. Dordrecht 1971
Sneed, J.D.: Philosophical Problems in the Empirical Science of Science. A Formal Approach. Erkenntnis 10 (1976) 115-46
Sommerfeld, A.: Einige grundsätzliche Bemerkungen zur Wellenmechanik. Phys. Zeitschr. XXX (1929) 866-71 (Quoted from Sommerfeld 1968, 1-6)
Sommerfeld, A.: Wege zur physikalischen Erkenntnis. Scientia, April 1936, 181-7 (Quoted from Sommerfeld 1968, 609-15)
Sommerfeld, A.: Philosophie und Physik seit 1900. Naturwiss. Rundschau I (1948) 97-100 (Quoted from Sommerfeld 1968, 640-3)
Sommerfeld, A.: To Albert Einstein's Seventieth Birthday. In: Schilpp 1949. 97-106 (German version in Schilpp 1955, 37-42)
Sommerfeld, A.: Gesammelte Schriften. Braunschweig 1968, vol. IV
Steen, L.A.: The Science of Patterns. Science 1988, No 240, 611-6
Stegmüller, W.: Wissenschaftliche Erklärung und Begründung. Berlin 1969
Stegmüller, W.: Personelle und statistische Wahrscheinlichkeit, vol.1. Berlin 1973
Stegmüller, W.: The Structure and Dynamics of Theories. Berlin 1976 (Engl. version of Theorienstrukturen und Theoriendynamik. Berlin 1973)
Stegmüller, W.: The Structuralist View of Theories. Berlin 1979
Stegmüller, W.: Erklärung, Begründung, Kausalität. Berlin 1983 (2nd ed. of Stegmüller 1969)
Stegmüller, W.: Evolutionäre Erkenntnistheorie, Realismus, Wissenschaftstheorie. In: Evolutionstheorie und menschliches Selbstverständnis. Ed. by R. Spaemann et al. Weinheim 1984
Summers, St.J., and R. Werner: Bell's Inequalities and Quantum Field Theory. Journ. Math. Phys. 28 (1987) 2440-56
Suppe, F.: The Search for Philosophical Understanding of Scientific Theories. In: The Structure of Scientific Theories. Ed. by F. Suppe. Urbana, Ill., 1974. 3-241
Suppes, P.: Introduction to Logic. New York 1957
Suppes, P.: Axiomatic Set Theory. Princeton, N.J., 1960
Suppes, P.: Set-Theoretical Structures in Science. Stanford, Cal., 1970 (Typoscript)
Suppes, P., and M. Zanotti: When are Probabilistic Explanations Possible? Synthese 48 (1981) 191-9
Süßmann, G.: Über den Messvorgang. Abh. der Bayer. Akad. der Wiss., Math.-Nat. Klasse, Heft 88. München 1958
Takeuti, G.: Proof Theory. Amsterdam 1975
Tarski, A.: Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica I (1936) 261-405. (Engl. Ed.: The Concept of Truth in Formalized Languages. In: Logic, Semantics, Metamathematics. Papers from 1923 to 1938. Oxford 1956, pp. 152-278)
Tarski, A.: What is Elementary Geometry? In: The Axiomatic Method. Ed. by L. Henkin et al. Amsterdam 1959. 16-29
Thiele, J.: Ein zeitgenössisches Urteil über die Kontroverse zwischen Max Planck und Ernst Mach. Centaurus 13 (1968) 85-90
Thirring, W.: Lehrbuch der mathematischen Physik. 4 vols. Vienna 1977
Toulmin, St.: The Philosophy of Science. London 1953
Toulmin, St.: Human Understanding. Oxford 1972
Trautman, A.: Comparison of Newtonian and Relativistic Theories of Space-Time. In: Perspectives in Geometry and Relativity. Ed. by B. Hoffmann. Bloomington, Ind., 1967
Trautman, A.: Invariance of Lagrangian Systems. In: General Relativity. Ed. by L. O'Raifeartaigh. Oxford 1972
Truesdell, C.A.: Foundations of Continuum Mechanics. In: Delaware Seminar in the Foundations of Physics. Ed. by M. Bunge. Berlin 1967. 35-48
Van der Waerden, B.L.: Moderne Algebra. Vol. I: ⁸Berlin 1971. Vol. II: ⁵Berlin 1967
Van der Waerden, B.L.: Synthetische Urteile a priori. In: Quanten und Felder. Ed. by H.P. Dürr. Braunschweig 1971. 51-65
Van Fraassen, B.C.: The Scientific Image. Oxford 1980
Varadarajan, V.S.: Geometry of Quantum Theory, vol.1. Princeton, N.J., 1968
Vollmer, G.: Was können wir wissen?, vol.1: Die Natur der Erkenntnis. Beiträge zur Evolutionären Erkenntnistheorie. Stuttgart 1985; vol.2: Die Erkenntnis der Natur. Stuttgart 1986
Whitehead, A.N., and B. Russell: Principia Mathematica. Cambridge 1910; ²1963
Weinberg, St.: Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity. New York 1972
Weinberg, St.: In: Mathematics 1986. 725-8
Weinberg, St.: Newtonianism, Reductionism and the Art of Congressional Testimony. Nature 330 (1987) 433-7
Weinberg, St.: Dreams of a Final Theory. New York ²1994 (German Ed.: Der Traum von der Einheit des Universums. München 1993)
*Weinberg, St.: Reductionism Redux. The New York Review, Oct. 1995, 39-42
Weingartner, P.: Wissenschaftstheorie. Stuttgart 1971ff
Weizsäcker, C.F. von: Die Geschichte der Natur. Zürich 1948; Göttingen ³1956
Weizsäcker, C.F. von: Die Einheit der Natur. München 1971 (Engl. Ed.: The Unity of Nature. New York 1980)
Weizsäcker, C.F. von: The Preconditions of Experience and the Unity of Physics. In: Transcendental Arguments and Science. Ed. by P. Bieri et al. Dordrecht 1979. 123-58
Weizsäcker, C.F. von: Der Aufbau der Physik. München 1985
Wertheimer, M.: Über Gestalttheorie. Erlangen 1925
Wessels, L.: Schrödinger's Interpretation of Wave Mechanics. Diss. Indiana Univ. at Bloomington 1975
Weyl, H.: Erläuterungen zu B. Riemann: Über die Hypothesen, welche der Geometrie zu Grunde liegen. Ed. by H. Weyl. Berlin 1919. 24-47
Weyl, H.: Die Einzigartigkeit der pythagoreischen Maßbestimmung. Math. Zeitschr. 12 (1922) 114-46 (also in 1968, vol. II. 263-95)
Weyl, H.: Das Raumproblem. Jahresbericht DMV 31 (1922) 205-21 (also in 1968, vol. II. 328-44)
Weyl, H.: Zur Charakterisierung der Drehungsgruppe. Math. Zeitschr. 17 (1923) 293-320
Weyl, H.: Mathematische Analyse des Raumproblems. Vorlesungen in Barcelona und Madrid. Berlin 1923
Weyl, H.: Raum, Zeit, Materie. Vorlesungen über Allgemeine Relativitätstheorie. ⁵Berlin 1923
Weyl, H.: David Hilbert and his Mathematical Work. Bull. Amer. Math. Soc. 50 (1944) 612-54 (also in 1968, vol. IV, 130-72)
Weyl, H.: Philosophy of Mathematics and Natural Science. Princeton, N.J., 1949
Weyl, H.: Gesammelte Abhandlungen. K. Chandrasekharan (Ed.). Berlin 1968
Whitehead, A.N., and B. Russell: Principia Mathematica. Cambridge 1910
Wigner, E.: Symmetries and Reflections. ²Woodbridge, Conn., 1979
Wiener, O.: Die Erweiterung unserer Sinne. Deutsche Revue über das gesamte nationale Leben der Gegenwart. 25. Jahrg. (1900), vol. 4, 25-41
Will, C.M.: Theory and Experiment in Gravitational Physics. Cambridge 1981
Wittgenstein, L.: Tractatus Logico-Philosophicus, with an English translation by D. Pears and B. McGuinness, London 1961
Yamabe, H.: A Generalization of a Theorem of Gleason. Ann. of Math. 58 (1953) 351-65
Index
Anderson, J., 499
Aquinas, Th., 71
Archimedes, 47
Aristotle, 24, 25, 28, 36, 75, 105, 206, 246, 264, 268, 289, 293, 372, 560
Ayala, F., 369, 370, 373
Ballentine, L. E., 403, 404
Bell, J. S., 391, 402, 423, 424, 434, 435, 441, 443-447
Bergmann, G., 195
Berkeley, G., 204, 519
Bernays, P., 177, 179
Beth, E., 160
Birkhoff, G., 582
Blanshard, B., 25, 75, 76, 153, 233, 234
Bohm, D., 391, 402, 405, 410, 412, 414, 415, 444
Bohr, N., 33, 34, 99, 101-103, 106, 107, 109, 223, 290, 388, 403-407, 410, 412, 415
Boltzmann, L., 27, 41, 57, 91-93, 97, 98, 103, 106, 114, 115, 119, 142, 143, 145, 291, 356
Bondi, R., 97
Boole, G., 561, 562
Borel, E., 394
Born, M., 109, 114, 402
Bourbaki, N., 160, 162, 163, 165, 181, 354, 356, 458, 460, 461, 492, 504, 540, 545, 562
Bradley, J., 71
Bridgman, P. W., 572, 573
Broglie, L. de, 402, 409
Brunschvicg, L., 124
Campbell, N. R., 13
Cantor, G., 177, 533, 562, 564, 576
Carnap, R., 40, 49, 55, 196, 197, 213, 232, 327, 501
Cartan, E., 359, 479
Cartwright, N., 289
Chevalley, C., 290
Clausius, R., 145
Cohen, I. B., 91
Copernicus, N., 30, 150, 237
Coulomb, C. A., 295
Darwin, Ch., 40
Democritus, 105
Descartes, R., 31, 44, 86, 133, 144, 205, 216, 553-556, 561
Dilthey, W., 44
Dirac, P. A. M., 110, 116, 239
Duhem, P., 29, 30, 308, 317
Eddington, A., 120
Einstein, A., 27, 28, 59, 70-74, 81, 82, 85, 95, 96, 98, 99, 108, 109, 117, 119, 142, 148, 152, 153, 234, 241, 266, 273, 289, 292, 330, 357, 413, 417, 441, 443, 447, 474, 475, 482, 490, 492, 496, 498, 565, 566, 568, 572
Elsbach, A. C., 124
Euclid, 222, 335, 531, 579
Euler, L., 558
Exner, F., 114, 115
Faraday, M., 150
Feyerabend, P., 55, 81, 99, 107, 203, 212, 213, 217, 219, 223, 327, 340, 362
Field, R., 536-540, 544-546, 549, 552, 573, 574
Folse, R., 224
Forman, P., 114
Fraenkel, A., 164, 176, 249, 458, 540, 576
Frank, P., 72, 119
Frege, G., 179, 206, 210, 247, 264, 532, 533, 560
Fresnel, A. J., 94
Freudenthal, R., 483
Friedman, M., 499
Galilei, G., 26, 33, 79, 95, 105, 110, 198, 220, 261, 273, 278, 293, 300, 310, 326, 352, 363, 536, 565, 572, 577
Gilson, E., 70, 143
Gleason, A. M., 397, 423
Goethe, J. W., 25
Hüttemann, A., 293
Haken, H., 24
Hardy, G. H., 564
Harnack, A. von, 70, 142
Hart, H. L., 7
Hegel, G. W. F., 142, 557
Heidegger, M., 44
Heisenberg, W., 28, 29, 63, 82-84, 99-104, 106, 109, 122, 136, 223, 268, 293, 335, 402, 405, 429, 433
Helmholtz, H., 70, 144, 475
Hempel, C. G., 7, 204, 316, 324
Hertz, H., 112, 119, 146, 147
Hilbert, D., 527, 533, 535, 536, 538, 544, 556, 560, 562, 579
Hintikka, J., 240, 524, 525, 530, 531
Honoré, A.-M., 7
Hume, D., 4, 75, 233, 519
Hund, F., 276
Husserl, E., 44
Huxley, A., 15
Huygens, Ch., 110
James, W., 71
Janich, P., 46, 198
Jeans, J. H., 334
Jordan, E. P., 109
König, J., 4
Kambartel, A., 198
Kant, I., 4, 36, 61, 69, 71, 74, 121, 124, 132, 144, 153, 203, 206, 247, 286, 357, 487, 511, 517
Kemeny, J. G., 106
Kepler, J., 26, 27, 80, 161, 238, 258, 292, 306, 326, 352, 565
Kienle, H., 96, 97
Kirchhoff, G., 145
Klein, F., 370, 493
Klingenberg, W., 479
Kobayashi, S., 486
Kretschmann, E., 490
Kronecker, L., 551
Kuhn, Th. S., 51, 55, 91, 92, 99, 106, 198, 210, 213, 219, 223, 327, 362, 370
Lakatos, I., 51, 55, 97, 210, 328
Laue, M. von, 151
Leibniz, G. W., 36, 75, 78, 246-248, 261, 264, 267-270, 519, 536, 553, 554, 556, 558, 559, 561, 570
Lie, S., 475
Locke, J., 86, 248, 261, 267, 519
Lorentz, H. A., 99, 358, 363, 501, 577
Lorenz, K., 41, 42
Lorenzen, P., 44
Loschmidt, J., 147
Ludwig, G., 163, 175, 470, 573, 582
Mach, E., 72, 79, 119, 121, 122, 145, 147, 150, 151
Mackey, G. W., 173, 581, 582
Mark, H., 73, 122
Maxwell, J. C., 146, 369, 486
Mayr, E., 369, 371
Medawar, J., 24
Medawar, P., 24, 370
Mill, J. St., 27
Millikan, R. A., 47
Minkowski, H., 220, 242, 358
Misner, Ch. W., 90, 91
Mittelstaedt, P., 292
Mittelstraß, J., 198, 199
Moore, G. E., 144, 290, 291
Nagano, T., 486
Nagel, E., 370, 371
Nernst, W., 27-29, 93-95, 97, 106
Neumann, J. von, 49, 160, 177, 179, 383, 391, 405, 410, 412-414, 417, 419, 424, 427, 443, 485, 538, 582
Newton, I., 26-28, 74, 95, 96, 99, 101, 110, 128, 162, 165, 198, 224, 238, 239, 258, 266, 272, 292, 306, 326, 352, 489
Nordheim, L., 538
Oppenheim, P., 7, 55, 106, 204, 218, 239, 324
Ostwald, W., 142, 143, 146
Pascal, B., 25
Pauli, W., 239, 262, 276, 402, 405, 406, 410
Peirce, C. S., 49
Planck, M., 69, 70, 72, 73, 81, 96, 108, 109, 119, 142, 143, 148, 151, 334, 356, 406
Plato, 144, 236
Podolsky, B., 117
Poincaré, H., 114, 123, 124, 269
Popper, K., 51, 55, 106, 196, 238, 324
Primas, H., 227, 369
Protagoras, 144
Ptolemäus, C., 110
Putnam, H., 55, 160, 218, 239, 383
Quine, W. V., 29, 140
Ramsey, W., 183, 187
Rayleigh, J. W. Strutt, Lord, 96, 334
Reichenbach, H., 114, 160, 196, 199, 214
Rickert, H., 41
Riemann, B., 123, 358, 556
Rohrlich, F., 90
Rosen, N., 117
Rosenfeld, L., 402
Rosenthal-Schneider, I., 120
Russell, B., 43, 71, 75, 179, 246, 247, 264, 267, 290, 291, 532
Sambursky, S., 105
Schelling, F. W. J., 69, 70, 142
Schleiermacher, F., 202
Schlick, M., 130, 153
Schopenhauer, A., 111
Schrödinger, E., 33, 34, 108-118, 234, 268, 277
Schwarzschild, K., 336
Shimony, A., 109
Sneed, J. D., 160, 163, 172, 173, 175, 251, 461
Socrates, 236
Sommerfeld, A., 70, 142, 143, 512
Spinoza, B. de, 111, 527
Stefan, J., 91, 93
Stegmüller, W., 163, 199
Stone, M. H., 386
Suppe, F., 160, 175
Suppes, P., 160, 163, 461
Tarski, A., 204, 544, 579
Thorne, K., 90, 91
Toulmin, St., 15, 16, 197, 213
Truesdell, C. A., 536
Van Fraassen, B. C., 160
Van der Waerden, B. L., 562
Vollmer, G., 42
Waals, J. D. van der, 359
Weierstrass, K., 564, 576
Weinberg, St., 71, 81-84, 371, 564, 566, 572
Weingartner, P., 261
Weizsäcker, C. F. von, 28, 47-50, 54, 80, 82, 83, 104, 105, 136, 137, 140, 225, 306, 314, 517
Wertheimer, M., 23
Weyl, H., 49, 114, 162, 475, 535
Wheeler, J. A., 52, 90, 91
Whitehead, A. N., 247
Wigner, E., 162, 410, 501, 565, 570, 572
Windelband, W., 41
Wittgenstein, L., 25, 33, 42, 76, 115, 195, 233
Wolff, C., 519
Zermelo, E., 164, 176, 249, 458, 540, 575, 576