INTUITION AND THE AXIOMATIC METHOD
THE WESTERN ONTARIO SERIES IN PHILOSOPHY OF SCIENCE A SERIES OF BOOKS IN PHILOSOPHY OF SCIENCE, METHODOLOGY, EPISTEMOLOGY, LOGIC, HISTORY OF SCIENCE, AND RELATED FIELDS
Managing Editor
WILLIAM DEMOPOULOS
Department of Philosophy, University of Western Ontario, Canada
Department of Logic and Philosophy of Science, University of California/Irvine

Managing Editor 1980–1997
ROBERT E. BUTTS
Late, Department of Philosophy, University of Western Ontario, Canada
Editorial Board JOHN L. BELL,
University of Western Ontario
JEFFREY BUB,
University of Maryland
PETER CLARK,
St Andrews University
DAVID DEVIDI,
University of Waterloo
ROBERT DiSALLE,
University of Western Ontario
MICHAEL FRIEDMAN,
Indiana University
MICHAEL HALLETT,
McGill University
WILLIAM HARPER,
University of Western Ontario
CLIFFORD A. HOOKER,
University of Newcastle
AUSONIO MARRAS,
University of Western Ontario
JÜRGEN MITTELSTRASS,
Universität Konstanz
JOHN M. NICHOLAS,
University of Western Ontario
ITAMAR PITOWSKY,
Hebrew University
VOLUME 70
INTUITION AND THE AXIOMATIC METHOD
Edited by
EMILY CARSON
McGill University, Montreal, Canada
and
RENATE HUBER
University of Dortmund, Germany
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN-10 1-4020-4039-3 (HB)
ISBN-13 978-1-4020-4039-9 (HB)
ISBN-10 1-4020-4040-7 (e-book)
ISBN-13 978-1-4020-4040-5 (e-book)
Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. www.springer.com
Printed on acid-free paper
All Rights Reserved
© 2006 Springer
No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.
Printed in the Netherlands.
Contents

Preface
Acknowledgements

Part I
Mathematical Aspects

Locke and Kant on Mathematical Knowledge
Emily Carson

The View from 1763: Kant on the Arithmetical Method before Intuition
Ofra Rechter

The Relation of Logic and Intuition in Kant’s Philosophy of Science, Particularly Geometry
Ulrich Majer

Edmund Husserl on the Applicability of Formal Geometry
René Jagnow

The Neo-Fregean Program in the Philosophy of Arithmetic
William Demopoulos

Gödel, Realism and Mathematical ‘Intuition’
Michael Hallett

Intuition, Objectivity and Structure
Elaine Landry

Part II
Physical Aspects

Intuition and Cosmology: The Puzzle of Incongruent Counterparts
Brigitte Falkenburg

Conventionalism and Modern Physics: a Re-Assessment
Robert DiSalle

Intuition and the Axiomatic Method in Hilbert’s Foundation of Physics
Ulrich Majer, Tilman Sauer

Soft Axiomatisation: John von Neumann on Method and von Neumann’s Method in the Physical Sciences
Miklós Rédei, Michael Stöltzner

The Intuitiveness and Truth of Modern Physics
Peter Mittelstaedt

Functions of Intuition in Quantum Physics
Brigitte Falkenburg

Intuitive Cognition and the Formation of Theories
Renate Huber

Preface
Following developments in modern geometry, logic and physics, many scientists and philosophers in the modern era considered Kant’s theory of intuition to be obsolete. Frege’s and Russell’s logicism seemed to go against Kant’s claim that mathematics is based on synthetic a priori judgments. In any case, according to Russell, the sole reason Kant introduced intuition into his philosophy of mathematics and physics was that he did not have available to him modern logic, i.e., the quantified logic of relations. Following this view, the articulation of the new logic thus rendered Kantian intuition redundant. Moreover, the enormous expansion of modern mathematics in the nineteenth century, an expansion accelerated by the development of the modern abstract theory of sets, appeared to take mathematics out of the reach of Kantian ‘sensible’ intuition. A good deal of this mathematics was also used in various ways in mathematical physics. Even in a relatively concrete case, namely geometry, sensible intuition seemed incapable of deciding between alternative geometrical systems. Many mathematicians and physicists also adopted a version of the ‘axiomatic method’, derived from Hilbert’s work on the foundations of geometry, which allowed that mathematical theories need not have a unique ‘content’ founded in logic or intuition or empirical theories, but that axiomatically presented systems are free to be interpreted as mathematical or scientific requirements dictate. Thus, there is no uniquely correct geometrical theory of space. Moreover, Reichenbach and Carnap in particular argued that special and general relativity disprove Kant’s theories of space and time. Reichenbach and Carnap were two of the most prominent representatives of logical empiricism, which owed much to the work of Frege and Russell. The rise of logical empiricism saw the entrenchment of these views of Kant’s philosophy of mathematics and science. 
But this only represents one side of the story concerning Kant, intuition and twentieth century science. Several prominent mathematicians and physicists were convinced that the formal tools of modern logic, set theory and the axiomatic method are not sufficient for providing mathematics and physics with satisfactory foundations. All of Hilbert, Gödel, Poincaré, Weyl and Bohr thought that intuition was an indispensable element in describing the foundations of science. They had very different reasons for thinking this, and they had very different accounts of what they called intuition. But they had in common that their views of mathematics and physics were significantly influenced by their readings of Kant.

In the present volume, various views of intuition and the axiomatic method (and their combination) are explored, beginning with Kant’s own approach. By way of these investigations, we hope to understand better the rationale behind Kant’s theory of intuition, as well as to grasp many facets of the relations between theories of intuition and the axiomatic method, dealing with both the strengths and the limitations of the latter; in short, the volume covers logical and non-logical, historical and systematic issues in both mathematics and physics, and also views both sympathetic to, as well as critical of, Kant’s own account and use of intuition. It goes without saying that this collection represents only a modest step in understanding the full impact which Kant’s theory of intuition had on the development of the exact sciences.

Part I of this volume deals with the mathematical aspects of the relations between intuition and the axiomatic method, Part II with the physical aspects; the volume thus falls naturally into two parts, although the separation cannot be exact. Both parts begin with detailed investigations of Kant’s own views and of the limitations of what we now call the axiomatic method, and of the ways in which intuition is needed to overcome such limitations. The contributions shed light on modern views of these Kantian topics in the context of modern logic, mathematics and physics. The contributions to Part I deal with Kant’s theory of geometry and arithmetic, with modern interpretations of Kant’s reasons for introducing intuition into the foundations of mathematics, with Husserl’s and Gödel’s views of the role of intuition in mathematics, and with neo-Fregean logicism and category theory as programmes to replace intuition by the use of formal tools. The distinct approaches show that the role of intuition in mathematics is far from being uncontroversial.
Emily Carson sets the stage by considering a traditional correlate of the axiomatic method as it is presented by Locke and the pre-Critical Kant in order to show how Kant’s central notion of pure intuition fills in certain epistemological gaps in Locke’s and the early Kant’s accounts of mathematical knowledge. Ofra Rechter undertakes to shed light on Kant’s Critical philosophy of arithmetic by examining his discussion of the symbolic method of arithmetic in the Prize Essay of 1763. This elucidation of the connection between arithmetical definitions and arithmetical symbolisms serves to clarify the notion of symbolic construction which Kant distinguishes from the ostensive constructions of geometry, and thus to clarify the notion of construction in intuition in general. Ulrich Majer makes the bridge to modern interpretations of Kant’s philosophy of mathematics. In particular, he distinguishes the logical and phenomenological approaches to interpreting Kant’s theory of intuition. According to the former, Kant must appeal to intuition in his account of mathematics because of his restricted conception of logic. According to the latter, however, the appeal to intuition as a non-logical source of knowledge is necessary
regardless of the type of logic available. Majer appeals to Hilbert’s work on the foundations of geometry to argue in favour of the phenomenological approach. René Jagnow extends the analysis of intuition to Husserl’s account of geometry, which distinguishes between formal, geometric, and intuitive space (the latter being the space of everyday experience). The task here, given these distinctions, is to account for the possibility that the results of the analysis of formal space apply to the space of geometry, and that the results of geometry apply to intuitive space. Jagnow outlines Husserl’s account and argues that its central aim was to guarantee the conceptual continuity between these different notions of space. The applicability of formal inquiry to intuitive space ensures that the former expresses a genuine concept of space. William Demopoulos addresses the idea of founding arithmetic on second-order logic and ‘Hume’s Principle’. In the modern neo-Fregean programme, Hume’s Principle is represented as true by stipulation. Demopoulos argues that we should reject this claim, that it is rather a substantive truth, one that provides the basis for a successful conceptual analysis of our notion of number, deriving the numbers’ theoretically most salient property from the principle underlying their application. This is a partial (but only partial) vindication of Frege’s original project of showing that our knowledge of number has the character of logical knowledge. Among other things, it avoids the assumption that the numbers are ‘given in intuition’. Michael Hallett discusses the respects in which Gödel’s famous appeal to intuition is based on ideas in Kant. Gödel thought Kant’s notion of sensible intuition too restrictive for understanding modern mathematical knowledge. What he takes from Kant is rather the idea that there is an underlying conception of physical object through which perception is interpreted.
Using this analogy, Gödel claims that there is a notion of mathematical object through which we interpret mathematical ‘facts’; this notion is given by the iterative concept of set, described by axiomatic set theory. The fundamental incompleteness of mathematics means the description is essentially open-ended. But Gödel thinks that the ‘discovery’ of large cardinal axioms extending the iterative hierarchy (thus closing mathematical incompletenesses) represents an unfolding of this concept. Finally, Elaine Landry sketches category theory as a modern mathematical tool that might replace the functions which Kant attributed to intuition in our knowledge of objects, namely the schemata of formal concepts. Part II complements Part I with investigations on the role of intuition in modern physics. The contributions focus on Kant’s pre-Critical worries about the foundations of physical geometry and cosmology, on general relativity and quantum theory, on conceptual analysis and conventionalism, on Hilbert’s and von Neumann’s views of the axiomatic foundations of physics, and on recent views of the role of intuition in modern physics. Brigitte Falkenburg investigates how Kant’s theory of intuition emerged from his use of the analytical method in cosmology. Conceptual analysis of space convinced Kant that the geometrical properties of incongruent counterparts are at odds with Leibniz’s relational account of space, although he always
adhered to Leibniz’s criticism of Newton’s notion of Absolute Space. Thus, Kant faced a problem, since neither the relational nor the absolute view of space seemed tenable. It is this dilemma which the theory of spatial intuition is used to overcome. The paper also points out that the problem Kant saw with incongruent counterparts reemerges with parity violation and PCT-invariance. Robert DiSalle analyses the conventionalist view of physical geometry. This arose from the confrontation between the Kantian synthetic a priori and two nineteenth century insights: (i) there is no unique set of a priori conditions of the possibility of spatio-temporal experience; and (ii) geometric principles encapsulate the meanings of fundamental concepts, having thus the character of definitions rather than of synthetic claims about nature. Conventionalists inferred that such principles are matters for free choice, guided by pragmatic rather than by epistemic considerations. DiSalle rejects this inference; he suggests that the ‘definitions’ expressed by a priori principles in physics are not chosen for convenience, but arise from conceptual analysis of empirical knowledge and practice. This partially explains why such concepts appeared (e.g., to Kant) to be founded in intuition, and why the separation of physical geometry from intuition in modern physics did not separate its fundamental concepts from empirical knowledge. DiSalle concludes that the conventionalism of the logical empiricists, while rightly emphasizing the role of a priori constitutive principles in science, failed to appreciate the empiricist motivations that such principles embody. Ulrich Majer and Tilman Sauer investigate Hilbert’s view of the axiomatic method in physics and his understanding of Kant’s a priori; they interpret the relation between the a priori and the historical development of physics through a recursive epistemology. 
According to Hilbert’s account of scientific knowledge, our a priori assumptions about the world are inevitably permeated by anthropomorphic elements which have to be reduced as much as possible; as a consequence, the proportion of the a priori in our overall knowledge of the world shrinks in the course of development of natural science. Miklós Rédei and Michael Stöltzner examine the way in which von Neumann applied the axiomatic method to physics, especially in his famous axiomatization of quantum mechanics. Among some physicists, von Neumann’s method is highly regarded, and is often compared to a kind of mathematical perception. Others, however, consider it to be too pedantic. Indeed von Neumann himself distinguished between the strict, formal axiomatics of mathematics and a less formal, ‘soft’ axiomatization which is required in physics. Brigitte Falkenburg applies central ideas of Kant’s theory of intuition to Bohr’s and Heisenberg’s views of quantum theory. There is no unique axiomatic basis for quantum physics; formal quantum theory has to be supplemented with appeal to semi-classical models and measurement theories. Heisenberg’s generalized correspondence principle establishes semantic bridges between the classical and the quantum domains, and these bridges can be seen as based on intuition in a Kantian sense. Peter Mittelstaedt argues that our Kantian intuition of physical objects, which is tailored to a classical, Newtonian world, is in fact
too highly structured with respect to the notions of simultaneity and substance. He suggests a weaker, non-Kantian account of intuition according to which neither relativity theory nor quantum theory appears as unintuitive. He also emphasizes the point that classical mechanics itself is very unintuitive when seen from the perspective of everyday experience. The collection is completed by a new, consciously non-Kantian theory of intuition developed by Renate Huber. Her theory of intuition is based on an empirically grounded theory of knowledge supported by neuroscience. It sheds new light on various traditional philosophical views of intuition, and it permits the distinction between several kinds of direct or indirect intuitions. These distinctions then give rise to distinctions between directly and indirectly intuitive theories, the latter in contradistinction to non-intuitive theories. The new approach is illustrated with several examples from physics and is also then applied to Poincaré’s use of intuition in mathematics. The contributions to this volume emerged from a three-year collaboration between Emily Carson, Robert DiSalle, Brigitte Falkenburg, Michael Hallett, Renate Huber and Ulrich Majer. Our project of investigating various facets of intuition and the axiomatic method as they appear in Kant’s work, in mathematics and in physics was generously supported by the GAAC (German-American Academic Council, TransCoop Program) from 1998 to 2000, and in various ways by the Social Sciences and Humanities Research Council of Canada. In addition to individual research contacts between the collaborators, a series of workshops and conferences was held, in Dortmund (1998 and 2000), Göttingen (1999) and Montreal (1999). Through these in particular, we learned a great deal, not just from each other, but also from the other invited participants. This volume represents reworkings of some of the papers presented.
The Editors of the volume, Renate Huber and Emily Carson, were responsible for its final realization, and exhibited much patience and tolerance with the unwritten, written, and then rewritten versions of the papers. William Demopoulos proposed the publication of the book in the Western Ontario Series in the Philosophy of Science, and has shown great support throughout. Tobias Fox was responsible for most of the technical editing work using LaTeX, and Dirk Schlimm dealt with the final round of corrections. Enormous thanks are due to all of them.

Dortmund and Montreal, February 2004
Brigitte Falkenburg and Michael Hallett
Acknowledgements
The Editors wish to thank the following: The Journal of the History of Philosophy and the British Journal for the History of Philosophy for permission to use material from the papers by Emily Carson, “Kant on the Method of Mathematics”, Journal of the History of Philosophy, Volume 37 (1999), pp. 629–652, and “Locke’s Account of Certain and Demonstrative Knowledge”, British Journal for the History of Philosophy, Volume 10 (2002), pp. 359–378 [http://www.tandf.co.uk]. The Notre Dame Journal of Formal Logic and Philosophical Books for permission to use material from the papers by William Demopoulos, “On the Origin and Status of our Conception of Number”, in Notre Dame Journal of Formal Logic, Volume 41 (2000), pp. 210–226, and “On the Philosophical Interest of Frege Arithmetic”, in Philosophical Books, Volume 44 (2003), pp. 220–228. The journal Topoi for permission to use material from the paper by Elaine Landry, “Logicism, Structuralism and Objectivity”, Topoi, Volume 20 (2001), pp. 79–95. The journal Noûs and Blackwell Publishing for permission to reprint the article “Conventionalism and Modern Physics: a Re-Assessment” by Robert DiSalle, which originally appeared in Noûs, Volume 36 (2002), pp. 169–200.
I
MATHEMATICAL ASPECTS
LOCKE AND KANT ON MATHEMATICAL KNOWLEDGE
Emily Carson
McGill University, Canada
Both Locke and Kant sought, in different ways, to limit our claims to knowledge in general by comparing it to our knowledge of mathematics. On the one hand, Locke thought it a mistake to think that mathematics alone is capable of demonstrative certainty. He therefore tried to isolate what it is about mathematics that makes it thus capable, in the hope of showing that other areas of inquiry — morality, for example — admit of the same degree of certainty. Kant, on the other hand, attributed much of the metaphysical excess of philosophy to the attempt by metaphysicians to imitate the method of mathematicians. He therefore sought to limit that excess by examining the mathematical method, like Locke, in order to isolate what is special about mathematics that accounts for its certainty. This paper represents a small part of a larger project relating Kant’s views on mathematics to the emergence of the Critical philosophy. Kant’s recognition of the role of intuition in mathematics had important implications for his theoretical philosophy. What that role is, and what those implications are, however, are controversial questions. I want to approach these questions via Kant’s insistence, even in his pre-Critical work, on a sharp distinction between the method appropriate to mathematics and that appropriate to metaphysics. He often asserted that the application of the mathematical method to metaphysics resulted in flights of dogmatic fancies. An obvious question that arises is why the mathematical method issues in genuine knowledge in the one case, but only results in dogmatic fancies in the other. I argue that a philosophically adequate answer to the question raised by Kant’s pre-Critical account of the mathematical method requires the Critical doctrine of pure intuition. Understanding the development of Kant’s views on mathematics in this way sheds light, I think, on the role of intuition in mathematics for Kant by revealing what questions it is supposed to answer.1
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 3–19. © 2006 Springer. Printed in the Netherlands.
In this paper, I want to compare Kant’s pre-Critical account of mathematics with Locke’s in order to provide further indirect support for my reading of the role of intuition in the Critical account of mathematics. I argue that Locke offers an account of mathematical knowledge very similar to that offered by Kant in the Prize Essay. In particular, both emphasize what I call the ideality of mathematical knowledge in order to explain its peculiar certainty. But there is a tension between this ideality, on the one hand, and what Locke calls the reality of such knowledge, on the other. This tension is only resolved by Kant’s doctrine of pure intuition. The comparison to Locke is thus intended to bring out the shared concern for the peculiar features of mathematics — that it gives us certain, real, and instructive knowledge — as well as the difficulty of explaining these features. This in turn brings out the importance of Kant’s doctrine of pure intuition for such an explanation. The paper falls into three parts. I will argue, first, that Locke’s account was an inadequate representation of the method of attaining mathematical certainty. Secondly, I will present Kant’s early account of the mathematical method in the Prize Essay of 1763, in which he presents a somewhat improved version of the Lockean account, but fails to give an adequate philosophical foundation for the resulting view. Finally, I want to show that Kant’s notion of pure intuition was the key to the development of an adequate account of the mathematical method and that this, in turn, makes clear why this method was of no use outside of mathematics. I hope to show that Kant’s introduction of the notion of construction in pure intuition was thus not simply an application of the independently-developed Critical apparatus, but rather that it is the natural result of philosophical reflection on a shared conception of the mathematical method.
1. Locke on mathematical knowledge2
For Locke, there are only two kinds of propositions which we can know with perfect certainty.3 First there are trifling propositions which have mere verbal certainty and are therefore not instructive: for example, a purely identical proposition like ‘A spirit is a spirit’, Locke says, does not advance our knowledge. The other propositions which we can know with perfect certainty are those “which affirm something of another, which is a necessary consequence of its precise complex idea, but not contained in it”. Locke’s stock examples of these propositions are geometrical. For example, since the relation of the outward angle of a triangle to either of the opposite internal angles is no part of the complex idea of triangle, the proposition that the external angle of all triangles is bigger than either of the opposite internal angles is “a real truth, and conveys with it instructive real knowledge”. Locke contrasts this kind of certain knowledge with our knowledge of substances: since the only access we have to ideas of substances is through our senses, “we cannot
make any universal certain propositions concerning them”. General propositions about substances, insofar as they are certain, are merely trifling; insofar as they are instructive, they are uncertain [4.8.9]. This gives us some idea of why instructive propositions about substances — the subject matter of natural philosophy — cannot be known with certainty. But the question we are concerned with here is why we can have instructive certain knowledge in other areas of inquiry. Knowledge, for Locke, is the ‘perception of the connexion and agreement’ of our ideas, where ideas are whatever is the object of our understanding when we think [4.1.1]. The ideas of which we may have certain knowledge, according to Locke, are those whose agreement or disagreement with each other may be intuitively perceived. These ideas also admit of demonstrative certainty. Demonstration, for Locke, is necessary where we cannot compare two such ideas immediately. In that case, the mind must discover the agreement or disagreement of these ideas by means of intermediate ideas. So a demonstration is really a chain of intuitive perceptions of agreement and disagreement among ideas [4.8.3]. The question we have to concern ourselves with now is what is the special feature of such ideas that allows their agreement or disagreement to be intuitively perceived. This requires a brief reminder of Locke’s account of ideas. Our ideas are either simple or complex. Simple ideas, the materials of all our knowledge, are produced in the mind by means of the operation of objects on our senses. Although the mind is “wholly passive” [2.30.3] with respect to simple ideas, once the understanding has stored them, it has the power to “repeat, compare and unite them” to “an almost infinite variety”, and thereby “can make at Pleasure new complex Ideas” [2.1.2]. So complex ideas are combinations of simple ideas put together and united under one general name. 
Among complex ideas, Locke distinguishes ideas of substance, which are ideas of something “self-subsisting”, representing “distinct particular things”, from ideas of modes, which are of “dependencies on, or affections of substance”. So ideas of substance are, roughly, ideas of natural kinds of material things, like the idea of lead or of animal. Ideas of modes are best explained in terms of their origin. Whereas ideas of substance, because they purport to refer to self-subsisting things, are combinations of simple ideas which we notice by experience and observation “go constantly together”, ideas of modes are voluntary combinations of ideas. Locke emphasises the free activity of the mind, its power to repeat and join its own ideas “as it pleases” and “without the help of any extrinsical Object, or any foreign suggestion” [2.13.1]. The simple ideas of which ideas of modes are composed are not to be thought of as characteristic marks of any real beings with “steady existence”, but as “scattered and independent Ideas put together by the Mind”. Unlike ideas of substances, they have their origin and existence “more in the Thoughts of Men, than in the reality of things”. To form such ideas “it sufficed that the Mind put the parts of them together and that they were consistent in the Understanding, without considering whether they had any real Being”. For example, although the idea
of man has no more connection in nature with the idea of killing than does the idea of sheep, we combine the ideas of man and of killing, and make it into a species of action, signified by the word ‘murder’. It is this difference, according to Locke, which underlies the difference between demonstrative sciences like mathematics, and natural philosophy. Ideas of modes, the subject matter of mathematics, are just combinations of simple ideas which the mind puts together arbitrarily, of its own choice, without reference to any “real existence”, and subject only to the condition that the simple ideas be “consistent in the understanding”. Because they have their existence “in the thoughts of men” rather than “in the reality of things”, we can have perfect knowledge of them. On the other hand, ideas of substance, the subject matter of natural philosophy, purport to refer to things as they really exist, and to represent that constitution on which all their properties depend. Thus we can never be sure that we have captured all the various qualities belonging to the thing. Locke formulates the difference in terms of his conception of essence. Recall that for Locke, real essence is “the real internal, but generally in substances, unknown constitution of things, whereon their discoverable qualities depend” [3.3.15]. Nominal essence is not the real constitution of things, but rather “the artificial constitution of genus and species”; it is the “workmanship of the understanding” in ranking things into sorts. In the case of substances, the real essence is different from the nominal essence. The real essence is the unknown, perhaps corpuscular, constitution of a substance, while the nominal essence is a combination of perceptible properties of the substance. In the case of modes, however, the real essence and the nominal essence are the same. Both are the ‘workmanship of the mind’, ‘creatures of the understanding’. 
To illustrate the difference, Locke compares the idea of triangle with that of gold:

. . . a Figure including a Space between three Lines, is the real, as well as nominal Essence of a Triangle; it being not only the abstract Idea to which the general Name is annexed, but the very Essentia, or Being, of the thing itself, that Foundation from which all its Properties flow, and to which they are all inseparably annexed.

In the case of gold, however, the real essence is

. . . the real Constitution of its insensible Parts, on which depend all those Properties of Colour, Weight, Fusibility, Fixedness, etc. which are to be found in it. Which Constitution we know not; and so having no particular Idea of, have no Name that is the Sign of it. But yet it is its Colour, Weight, Fusibility, and Fixedness, etc. which makes it to be Gold, or gives it a right to that Name, which is therefore its nominal Essence [3.3.18].

This, finally, is the key difference between ideas of modes and of substances which explains why we have certain knowledge of the one, but not of the other. Because ideas of modes are combinations of ideas which the mind puts together arbitrarily without reference to any real existence outside it, the real essence just is the nominal essence. Because we know the nominal essence (we create it), ideas of modes have knowable real essences. Ideas of substances do not. It is upon this ground, Locke says, that he is “bold to think” . . . that Morality is capable of Demonstration, as well as Mathematicks: Since the precise real Essence of the Things moral Words stand for, may be perfectly known; and so the Congruity, or Incongruity of the Things themselves, be certainly discovered, in which consists perfect Knowledge [3.11.16].
In summary, then, it seems that the relevant difference between ideas of substance and ideas of modes is that ideas of modes are in some sense purely ideal. Mathematics is, Locke says, “only of our own Ideas” [4.4.6]. Discourses about morality are “about Ideas in the mind . . . having no external Beings for Archetypes which they . . . must correspond with” [3.11.17]. Two things follow from the ideality of ideas of modes. First, because they do not refer to anything outside the mind, “they have no other reality but what they have in the minds of men”, they “have the perfection that the mind intended them to have”. For example, the idea of a figure with three sides meeting at three angles is, Locke says, a complete idea, requiring nothing else to make it perfect; it contains “all that is, or can be essential to it, or necessary to complete it, wherever or however it exists”. So complete knowledge of the idea amounts to complete knowledge of the object of the idea. Secondly, Locke seems to think that it follows from the fact that these ideas are ‘the Workmanship of the Understanding’ that they are transparent to us: thus, we can have complete knowledge of ideas of modes, and thus of the objects of those ideas. Reformulated in terms of essences, the two important consequences of Locke’s thesis of the ideality of ideas of modes are (1) that the real essence is the same as the nominal essence, and (2) that the real essence is (therefore) knowable. This immediately gives rise to a question regarding what Locke calls the ‘reality’ of such ideas and of our knowledge of them. The problem is that if knowledge consists only in the perception of the agreement or disagreement of our own ideas, then it seems that regardless of “how things are”, the reasoning of a wise person will be just as certain as the most extravagant fancies of an “enthusiast”. 
Ideas which are mere chimeras may be spoken of consistently and coherently, and thus “such Castles in the Air, will be as strong Holds of Truth, as the Demonstrations of Euclid” [4.4.1]. But what value or use is there in such knowledge of our own imaginations? For what “gives value to our Reasonings, and preference to one Man’s Knowledge over another’s, [is] that it is of Things as they really are, and not of Dreams or Fancies” [4.4.1]. Although Locke raises this question with regard to knowledge in general, it seems particularly pressing with respect to his account of our ideas of modes. The content of ideas of substances is easily explained by their reference to “extrinsical objects”. But if ideas of modes really are just voluntary collections of ideas, put together without any consideration as to whether they have ‘real being’, then how can they provide us with real knowledge? What makes our reasoning about squares and circles count as knowledge where our reasoning about mere chimeras like harpies and centaurs fails to count as knowledge? How do we distinguish mathematical theories from mere fairy tales about castles in the air? More to the point, how do we distinguish our knowledge of triangles from our knowledge of two-sided rectilinear figures? The problem is that, at least in the case of geometry, we don’t want to say that any combination of simple ideas results in a real idea of a mode. But how does Locke rule out, say, ideas of two-sided rectilinear figures? He says that ideas of modes must be “consistent in the understanding”, but what does this mean? Is it mere logical consistency?
We find the answer to this question in Locke’s account of the various modes of space, where he explains how our ideas of geometrical figures are generated by the activity of the mind “repeating its own Ideas and joining them as it pleases”. He describes the power of the mind to join lines of whatever length to other lines of different lengths and at different angles until it encloses a space, and thereby multiply figures both in their size and shape in infinitum. These products of the mind are the subject matter of geometry. What Locke describes here as the generation of complex geometrical ideas is the construction of figures against a spatial background. So it turns out that geometrical ideas have spatial properties built into them. It’s not that the idea of the two-sided rectilinear figure is logically inconsistent: it is rather that those simple ideas cannot be combined in this way, in space. This reading fits with Locke’s example of a geometrical demonstration: since the mind cannot compare the sizes of three angles of a triangle and two right angles immediately, it finds some other angles to which the three angles of the triangle are equal, and then determines that those are equal to two right ones, thereby coming to know the equality of the three angles to two right ones. Ideas here then are to be taken as quasi-sensible images.
The problem with this reading is that it doesn’t fit with Locke’s two claims based on the ideality of modes, the two claims that I suggested are essential to his account of why these ideas admit of complete certainty. This problem comes out most clearly when we consider Locke’s claim that mathematics is not the only science capable of demonstrative certainty. In particular, he thinks that our ideas of a supreme being and of ourselves are clear enough to provide “such Foundations of our Duty and Rules of Action as might place Morality amongst the Sciences capable of Demonstration” [4.3.18]. He proceeds to back up this claim by arguing that the proposition that where there is no property, there is no injustice “is a Proposition as certain as any Demonstration in Euclid”. The idea of property, he says, is a right to anything; the idea to which the name ‘injustice’ is attached is the invasion or violation of that right. With these names attached to these ideas, the proposition can be known to be true as certainly as any proposition in mathematics. Similarly, if the idea of government is the establishment of society upon certain rules which demand conformity, and the idea of absolute liberty is for any one to do whatever he pleases, then we know with demonstrative certainty the truth of the proposition that no government allows absolute liberty.
These examples lead one naturally to question the strength of the analogy with geometrical demonstrations. What is striking about geometrical examples, and what struck Locke, is how they provide us with certain, real and instructive knowledge. The same cannot be said of Locke’s examples of demonstrated ethical truths. As Locke presents them, these so-called demonstrations seem to involve nothing more than analyses of complex concepts into their simple constituents. At least one contemporary critic, Henry Lee, pointed out that these demonstrations seem to result only in trifling or vacuous propositions of no use to us.4 On the other hand, they do seem to fit better with Locke’s account of the ideality of modes: these ethical ideas really are arbitrary combinations of simple ideas. They do seem amenable to certain knowledge in virtue of their transparency: we put them together out of simple ideas, and we can therefore break them down to those simple ideas and recombine them. But the resulting propositions are not instructive in the way that geometrical propositions are. As we’ve seen, geometrical ideas are not arbitrary combinations in the same sense. There are some combinations of simple ideas such that if the mind tries to combine them, it will fail. There are external non-logical constraints on the combination of the simple ideas of space: that combination is therefore governed by rules in a way in which the combination of simple ethical ideas appears not to be. The problem with this is that, as I argued above, Locke’s claim that we have certain knowledge of these ideas rests on their ideality: on the fact that they are only ideas in our mind, referring to nothing outside our minds, that we know their real essences. This is supposed to give us privileged epistemic access to them. But the ideas of geometry are not ideal in the relevant sense. 
The fact that there are external ‘extra-mental’ constraints, that space, in effect, acts as a background theory, shows that this is not the case. In short, Locke’s account of the content of mathematics conflicts with his account of the certainty of mathematics. Locke does acknowledge a disanalogy between geometrical demonstration and ethical demonstration in that there is a special role for figures in geometry. But he takes this to be a merely heuristic role: figures are “helps to the memory” in retaining the many ideas involved in any given demonstration; because the diagrams are copies of the ideas, they have a greater correspondence with the ideas than do words or sounds. What he fails to see is that the role of the diagram reflects an essential difference between these kinds of ideas. On the contrary, he thinks that this disadvantage to ethical ideas can easily be overcome by means of definitions: we simply set down the collection of simple ideas which every term shall stand for and then use the term steadily and constantly for that precise collection. Then presumably, to perceive that two ideas agree involves something like running through the list of simple ideas contained in each. It is hard to see how this results in anything more than what Locke calls trifling knowledge, particularly since we put those ideas there in the first place. The difference in the kinds of demonstration appropriate to ethics and to geometry could not be made clearer.
In summary, then, Locke’s account of modes is supposed to capture what is different between demonstrative sciences like mathematics and ethics on the one hand, and natural philosophy on the other. Because mathematics and ethics are only of our own ideas, we have certain demonstrative knowledge of them. I have tried to suggest (i) that insofar as geometry is a body of demonstratively certain instructive truths, as Locke describes it, its objects are not arbitrary creations of the mind, and (ii) insofar as the objects of ethics are arbitrary creations of the mind, it does not admit of demonstrative certainty of instructive truths in the way that geometry does. The problem here lies with Locke’s failure to account for the essential role of spatial constructions in geometrical demonstration. More important than the failure of Locke’s analogy between mathematics and ethics, however, is that his account of mathematical knowledge is radically incomplete. The claim is supposed to be that we have privileged knowledge of the properties of geometrical figures because they are only in our minds, we know the real essences. The fact that there are external ‘extra-mental’ constraints — that space, in effect, acts as a background theory — shows that this cannot be the case. In failing to integrate these constraints on the generation of geometrical ideas and our knowledge of these constraints into his account of demonstrative certainty, Locke has failed to explain how we come to have demonstrative certainty even in mathematics.
2. Kant on the method of mathematics
In the Prize Essay of 1763, Kant takes up the question of whether metaphysics is capable of the same degree of certainty as mathematics. Like Locke, he examines the method of mathematics to determine whether it can be applied in areas other than mathematics. Unlike Locke, however, his conclusion is negative, at least with respect to metaphysics. Indeed, Kant thinks that nothing has been more damaging to philosophy than the imitation of the mathematical method. The primary difference in method that Kant considers in the Prize Essay concerns the role of definition. This difference is summarised in the following passage: In mathematics I begin with the definition of my object, for example, of a triangle or a circle, or whatever. In metaphysics I may never begin with a definition. Far from being the first thing I know about the object, the definition is nearly always the last thing I come to know. In mathematics, namely, I have no concept of my object at all until it is furnished by the definition. In metaphysics I have a concept which is already given to me although it is a confused one. My task is to search for the distinct, complete and determinate concept.5
This is made possible by the fact that “mathematics arrives at all its definitions synthetically, whereas philosophy arrives at its definitions analytically” (2:276). A synthetic definition, according to Kant, is arrived at by “the arbitrary combination of concepts”. The concept thus defined is not given prior to
the definition, but rather “comes into existence” as a result of the definition. For example, [w]hatever the concept of cone may ordinarily signify, in mathematics the concept is the product of the arbitrary representation of a right-angled triangle which is rotated on one of its sides [2:276].
To take another example, the concept of a square is the result of the arbitrary combination of the concepts four-sided, equilateral, and rectangle.6 This is not the result of an analysis of some concept given in another way — it is not, for example, abstracted from our experience of squares in nature; the concept is, as Kant says, first given by the definition itself. In philosophy, on the other hand, the concepts are always given in some way, but “confusedly or in an insufficiently determinate fashion”. The task of the philosopher is then to discover by means of analysis the characteristic marks in the confused concept in order to arrive at a complete and determinate concept, that is, a definition. Thus Kant says for example, “everyone has the concept of time”. This idea that everyone has must be examined in all kinds of relations, and once the characteristic marks have been made distinct, and then combined together, the resulting concept has to be compared with the concept of time which was originally given in order to determine whether or not it has captured the original idea. If by contrast we tried to arrive at a definition of time synthetically, by arbitrarily combining concepts, it would have been a “happy coincidence” if the resulting concept had been exactly the same as the idea of time which is given to us [2:277]. So in mathematics, we produce concepts by means of synthetic definitions. In philosophy, we analyse given concepts in order to arrive at analytic definitions. In order to appreciate the importance of this methodological distinction, we have to consider Kant’s theory of definition in a bit more detail. He elaborates on this in his lectures on logic from the early 1770’s. A definition is essentially a distinct and complete concept of a thing.7 A concept is distinct insofar as one is conscious of the marks contained in the concept [24:120]. 
A concept is complete when the marks are sufficient to cognise, first, the difference of the definitum from all other things, and secondly, the identity of it with other things. Kant claims that the synthetic definitions of mathematics are definitions in this sense, and in fact, that mathematical concepts are the only ones that admit of definition. First of all, he says, all fabricated concepts are “produced simultaneously with their distinctness”: I am conscious of each of the marks included in the concept because I put them there in defining the concept, and “one can most easily be conscious of that which one has oneself invented” [24:153]. Similarly, the definition is complete because the mathematician . . . thinks everything that suffices to distinguish the thing from all others, for [it] is not a thing outside him, which he has cognised in part according to certain determinations, but rather a thing in his pure reason, which he thinks of arbitrarily and in conformity with which he attaches certain determinations, whereby
he intends that the thing should be capable of being differentiated from all other things [24:125].
In other words, if the thing defined is first given by the definition, then the definition is of course complete. This is in sharp contrast to empirical concepts which, Kant says, are capable only of description, not of definition. Since in that case the concept is given, in order to make it distinct I must “enumerate all the marks that I think in connection with the expression of the definitum”. But one can never know that the marks that one has enumerated at any point are “sufficient to distinguish the thing from all remaining things” [24:124]. The most we can hope for is comparative completeness, “when the marks of a thing suffice to distinguish it from everything that we have cognised in experience until now”. Since philosophical concepts, like empirical ones, are also given, “the philosopher cannot so easily be certain that he has touched on all the marks that belong to a thing, and that he has insight into these completely perfectly”; consequently, many marks “may still belong to the thing of which he knows nothing” [24:153]. This suggests that philosophical concepts, like empirical ones, do not in the end admit of definition either. At best, any purported definition will be uncertain. It seems, then, that Kant’s account of the certainty of mathematical knowledge shares the essential features of Locke’s account. What is special about mathematical concepts is that they are given by synthetic definitions — by the arbitrary combination of concepts. Because I defined the concept, I am conscious of each of the marks included in it; because the thing defined is not a thing outside me, but is first given by the definition, then all the marks which I include in the definition of the thing are all the marks that belong to the thing. In other words, to explain the certainty of mathematics, Kant, like Locke, appeals to what I called earlier the ‘ideality’ of mathematical concepts: because we make the concepts of mathematics, we have perfect insight into them. 
Just as Locke expresses this in terms of his doctrine of real and nominal essences — claiming that since in the case of ideas of modes, the nominal essence is also the real essence, it follows that we can have knowledge of the real essences — Kant, in the lectures on logic, distinguishes between real and nominal definitions. A definition is nominal when its marks are “adequate to the whole concept that we think with the expression of the definitum”; a real definition is one “whose marks constitute the whole possible concept of the thing” [24:132]. Alternatively, nominal definitions “contain everything that is equal to the whole concept that we make for ourselves of the thing”, whereas real definitions “contain everything that belongs to the thing in itself”. In particular, Kant says that all definitions of arbitrary concepts that are made, as opposed to given, are real definitions . . . because it lies solely with me to make up the concept and to establish it as it pleases me, and the whole concept thus has no other reality than merely what my fabrication wants; consequently I can always put all the parts that I name
into a thing, and these must then constitute the complete, possible concept of the thing, for the whole thing is actual only by means of my will [24:268].
Empirical concepts, on the other hand, would be capable of at best nominal definition since “I do not define the object but instead only the concept that one thinks in the case of the thing” [24:271]. The difference is that in the case of arbitrary concepts, the marks of the “whole possible concept of the thing” just are the marks of the concept that we think in the case of the thing: in defining the concept that one thinks, one at the same time defines the object. Because mathematical definitions are of arbitrary concepts, they are also, by Kant’s lights, real definitions. Kant attributes much mistaken philosophy to the failure to recognise this fundamental methodological difference between philosophy and mathematics: that the one arrives at its definitions by analysis, the other by synthesis. Indeed, it underlies his diagnosis in the Prize Essay of the main problem of philosophy: “nothing has been more damaging to philosophy”, he says, than the imitation of the method of mathematics “in contexts where it cannot possibly be employed” [2:283]. For example, a philosopher could offer a synthetic definition by “arbitrarily thinking of a substance endowed with a faculty of reason and calling it a spirit”. However, this would not be a philosophical definition, but a “grammatical” one, a mere linguistic stipulation, and “no philosophy is needed to say what name is to be attached to an arbitrary concept” [2:277]. Indeed, Kant accuses Leibniz of having made this mistake in imagining “a simple substance which had nothing but obscure representations” and calling it a ‘slumbering monad’. He did not thereby define the monad, “he merely invented it, for the concept of a monad was not given to him but created by him”. This charge immediately gives rise to the question of what licenses this way of drawing the distinction between mathematics and philosophy.
Kant claims that his treatise contains nothing but “empirical propositions”, a neutral description of the different methods appropriate to mathematics and to philosophy. But for his prescription against the application of the mathematical method in philosophy to carry any weight, he owes us an account of why the mathematical method is appropriate in the one and not the other. What is the relevant difference between mathematical concepts and metaphysical ones, a difference which accounts for the admissibility of arbitrary concepts in the one but not in the other: why is invention permissible, even required, in mathematics, but not in philosophy? Why is the synthetic definition of a trapezium legitimate, and Leibniz’s invention of the slumbering monad not? After all, both seem to involve the formation of complex concepts from given primitive ones. More to the point, the question arises for Kant just as it did for Locke: why does the synthetic definition of a trapezium issue in a legitimate mathematical concept, while the definition of a figure enclosed by two straight lines does not? In short, Kant is subject to Locke’s worry that reasoning about ‘Castles in the Air’ will be ‘as strong Holds of Truth as the Demonstrations of Euclid’, but Kant has somewhat more to say on this matter than Locke does. The comparison of the role of definitions in metaphysics and mathematics is only part of the general comparison of their respective methods as presented in the Prize Essay. For one thing, unlike Locke, Kant recognises a role for ‘indemonstrable propositions’ in mathematics. Even if they admit of proof elsewhere, he says, “they are nonetheless regarded as immediately certain in this science” [2:281]. These propositions are set up at the beginning of mathematical inquiry “so that it is clear that these are the only obvious propositions which are immediately presupposed as true”.
So the following picture emerges from the Prize Essay of the method of mathematics. It begins with a few given concepts, which mathematicians cannot and must not define, such as magnitude in general, unity, plurality and space, and a small number of indemonstrable, immediately certain propositions, such as the propositions that the whole is equal to all its parts taken together, and that there can only be one straight line between two points. Further concepts are built up out of the given ones by arbitrary combination (by synthesis) in accordance with the fundamental propositions. The mathematician then derives further propositions from these complex concepts and the fundamental propositions. Kant has made some progress over Locke because he explicitly distinguishes mathematical demonstration from conceptual analysis; he says little, however, about how the theorems are derived from the complex concepts and fundamental propositions. Nonetheless, we have here the beginnings of an answer to the question about how to rule out the figure enclosed by two straight lines: the figure cannot be defined in accordance with the indemonstrable propositions, for it contradicts the proposition that between two points only one straight line may be drawn. The indemonstrable propositions therefore place constraints on the arbitrary combination of concepts.
But this then simply pushes the question onto the indemonstrable propositions. What is their status? It is not enough to presuppose them as true; they must in fact be true, and be known to be true if we are to distinguish the demonstrations of Euclid from mere castles in the air. Kant says that they are immediately certain, but what makes them so? The problem here is that Kant’s description of the mathematical method seems to correspond roughly (the technique of derivation aside) to that appropriate to a formal axiomatic system. But unless some explanation is given of the content of those primitive concepts and propositions and the ground of their certainty, this account collapses into a kind of formalism. Regardless of Kant’s views about formalism as a philosophy of mathematics (and it’s clear that he opposes it), the threat of formalism undermines his attempt to distinguish the methods appropriate to mathematics and metaphysics. If the geometer is simply deducing properties and relations of imaginary or ideal objects given by arbitrary definitions, what is to stop the metaphysician from developing an axiomatic system for slumbering monads in a similar way? In what sense can we say that mathematics is a body of truths, and the theory of slumbering monads is not? More importantly, given Kant’s concern with the relative certainty of
mathematics and metaphysics, how can we say that we know these truths with certainty? To sum up then, Kant’s account of mathematics in the Prize Essay seems to leave open the question of the relation between the method of mathematics, its content, and its certainty. First of all, it’s not clear how mathematical concepts are anything but arbitrary inventions with no objective content; secondly, mathematical propositions then seem to lose their claims to truth as opposed to mere deducibility from axioms and definitions; and thirdly, it’s therefore not clear that Kant is entitled to the sharp distinction he wants to draw between the certainty of mathematics and that of metaphysics. The key to all of these questions with respect to geometry is the relationship between geometry and space. Like Locke, Kant does recognise a role for symbols in mathematical proofs — drawn figures, in the case of geometry — as “an important device which facilitates thought”. The examples of geometrical proofs in the Prize Essay are clearly diagrammatic proofs. But, again like Locke, he fails to integrate this feature into the general account of the method of mathematics: he tells us that “figures and visible signs” play a role in mathematical proofs, but he fails to explain what that role is and how they fulfill it. Considering that his goal is to contrast the nature of mathematical and philosophical certainty, it would seem that he owes us an account of why the distinguishing features of mathematics are guarantees of the certainty of mathematics. It seems then that the important task is to provide an epistemological grounding for the mathematical method. He has to show that the mathematical method of attaining certainty is in fact a method of attaining certainty.
3. Construction in pure intuition
Kant undertakes this task in the Critique of Pure Reason. In ‘The discipline of pure reason in its dogmatic employment’ near the end of the Critique, he once again takes up the question of whether the mathematical method of attaining certainty is identical with the method of attaining certainty in philosophy. The answer, again, is negative, but the reasons appear different. The essential difference between these two kinds of knowledge is again a formal difference, that “philosophical knowledge is the knowledge gained by reason from concepts” whereas “mathematical knowledge is the knowledge gained by reason from the construction of concepts”.8 To construct a concept is, for Kant, to “exhibit apriori the intuition which corresponds to the concept”. To take one of his examples, the geometer constructs a triangle “by representing the object which corresponds to this concept either by imagination alone, in pure intuition, or in accordance therewith also on paper, in empirical intuition — in both cases completely apriori, without having borrowed the pattern from any experience”. Although the intuition is a single object, “it expresses universal validity for all possible intuitions which fall under the same concept” because it is “determined by certain universal conditions of construction” [A714/B742]. Contrast
with this a philosophical concept, like that of cause or reality. “No one can obtain an intuition corresponding to the concept of reality otherwise than from experience; we can never come into possession of it apriori out of our own resources, and prior to the empirical consciousness of reality” [A714/B742]. So far, this is again just a description of the difference: objects corresponding to mathematical concepts can be provided a priori, but this is not the case in metaphysics. Recognising the need for an explanation, though, Kant asks “what can be the reason of this radical difference in the fortunes of the philosopher and the mathematician, both of whom practise the art of reason, the one making his way by means of concepts, the other by means of intuitions which he exhibits apriori in accordance with concepts?” Why is it possible for mathematicians to obtain a priori intuitions corresponding to their concepts, but not for philosophers? The answer, according to Kant, is given by the “fundamental transcendental doctrines” which he has just elaborated. According to these doctrines, there are two elements in the field of appearance: the form of intuition (space and time), and the matter (the physical element). Whereas the material element cannot be given in determinate fashion other than empirically, the formal element, Kant says, “can be known and determined completely apriori”. What does this mean? Objects are given only through intuition. The only intuition given a priori, as Kant argued in the Transcendental Aesthetic, is that of the form of appearances. Because space and time, as the form of appearances, are given a priori, “a concept of space and time as quanta, can be exhibited apriori in intuition, that is, constructed” [A720/B748]. Objects for philosophical concepts, such as that of reality or substance, however, can only be given in empirical intuition, aposteriori. 
So, Kant concludes, corresponding to two elements in the field of appearance, there is a twofold employment of reason: the mathematical employment of reason through the construction of concepts, and the philosophical employment of reason in accordance with concepts. So it’s clear that the doctrine of pure intuition accounts for the difference between these two methods. But what does this tell us about the role of intuition in the method of mathematics? To answer this question, I want to consider how the account of mathematical method in the Critique of Pure Reason relates to the earlier account given in the Prize Essay. In the Prize Essay, the difference between mathematics and philosophy, between the synthetic and the analytic method, rested largely on the different roles of definitions in each. Similarly in the Critique, Kant attempts to show once and for all that mathematics and philosophy are so different that “the procedure of the one can never be imitated by the other”. He does this by once again considering the means of achieving certainty in mathematics — that is, “definitions, axioms, and demonstrations” — and showing that “none of these, in the sense in which they are understood by the mathematician, can be achieved or imitated by the philosopher” [A726/B754]. I’ll begin with the account of definition in the Critique.
Again, Kant says that to define means to present the complete and distinct concept of a thing. An empirical concept cannot be defined because the limits of the concept are never assured: for example, new observations remove some properties and add others. Concepts given a priori (such as substance or cause) cannot be defined because the completeness of the analysis will always be only probably, never apodeictically, certain. The only concepts which remain are “arbitrarily invented concepts”. With regard to these, Kant says,
. . . a concept which I have invented I can always define; for since it is not given to me either by the nature of understanding or by experience, but is such as I have myself deliberately made it to be, I must know what I have intended to think in using it [A729/B757].
However, he goes on, “I have [not] thereby defined a true object”. If the concept depends on any empirical conditions, “this arbitrary concept of mine does not assure me of the existence or of the possibility of its object”. To borrow Kant’s phrase from the Prize Essay, it would just be a “happy accident” if there were an object corresponding to my invented concept. Mathematical concepts, on the other hand, “contain an arbitrary synthesis that admits of apriori construction”. The constructibility of such concepts in pure intuition assures us of the possibility of the corresponding object. Consequently, only mathematics has definitions proper “for the object which it thinks, it exhibits apriori in intuition, and this object certainly cannot contain either more or less than the concept, since it is through the definition that the concept of the object is given” [A729/B757]. Herein lies the reason why synthetic definitions are admissible in mathematics and not in philosophy: the arbitrary combination of concepts in mathematics admits of a priori construction, which assures us of the existence, or better, the possibility of the objects. It is in this sense, then, that mathematical definitions are also real definitions: a real definition, Kant says, “does not merely substitute for the name of a thing other more intelligible words, but contains a clear property by which the defined object can always be known with certainty” [A242n]. Thus, Kant says earlier in the Critique, a real definition “makes clear not only the concept but also its objective reality”. Because mathematical definitions present the object in intuition, in conformity with the concept originally framed by the mind, they are real definitions. Mathematical definitions are, Kant says, constructions of concepts [A730/B758]. What this amounts to, I think, is a rejection of what I have called the ideality thesis. 
As a result, Kant is able to distinguish explicitly the two models of demonstration that I suggested Locke conflates in his attempt to uphold the ideality thesis. Kant explains that mathematics is not concerned with analytic propositions because as a mathematician, “I must not restrict my attention to what I am actually thinking in my concept of a triangle”; “this” he says, “is nothing more than the mere definition”. Similarly, he goes on to say that it would be futile for the mathematician to philosophise upon the triangle, to
think about it discursively, for “I should not be able to advance a single step beyond the mere definition, which is what I had to begin with”. In order to gain mathematical knowledge, instead “I must pass beyond this definition to properties which are not contained in this concept but yet belong to it”. The only way to do this is to “determine my object in accordance with the conditions . . . of pure intuition” [A718/B746]: to construct it. So whereas Locke gave no explanation of how properties not contained in the mathematical concept nonetheless belonged to it, Kant claims that they do so in virtue of the concept’s relation to the conditions of pure intuition: “we can determine our concepts in apriori intuition, inasmuch as we create for ourselves, in space and time . . . the objects themselves”. The doctrine of the form of intuition thus enables Kant to integrate the appeal to spatial construction into his account of the mathematical method. Let me try to make clear now how the doctrine of pure intuition resolves the problems raised above regarding Kant’s early account of mathematical method. That Kant was worried about the formalist possibility left open in the Prize Essay, and that he saw the doctrine of pure intuition as eliminating the possibility, comes out clearly in many passages in the Critique. He says, for example, that geometrical knowledge would be nothing but playing with mere chimeras “were it not that space has to be regarded as a condition of the appearances which constitute the material for outer experience” [A157/B196]. The key idea here is that the content of the arbitrary concepts of mathematics is given a priori, by construction in pure intuition, whereas there is nothing given a priori which could correspond to the concept of a slumbering monad. Pure intuition thus constrains the arbitrariness of the definitions and gives content to the axioms and primitive concepts. 
The fundamental propositions of geometry assert the “universal conditions of construction” of figures, that is, the conditions imposed by the form of intuition. Construction in pure intuition is then simply construction according to the (Euclidean) postulates. The concept of a figure enclosed by two straight lines is not in accordance with the fundamental propositions (i.e., between any two points only one straight line can be drawn), and thus is not constructible in pure intuition. No similar constraints can be prescribed in advance regarding the existence of the objects corresponding to philosophical concepts. Note, though, that this account requires that those constraints be prescribed in advance. What we’ve got so far is a story about how the fundamental propositions and arbitrary concepts of mathematics have objective content. The mathematician does not simply spin out consequences of arbitrary theories, but rather spins out consequences of true theories. But it’s not enough that these theories simply be presupposed as true; they must be known to be true. Otherwise the door is left open for the axiomatic theory of slumbering monads. For Locke and Kant’s earlier self, the ideality thesis was supposed to explain how we do have certain knowledge of mathematics. Because Kant has rejected the ideality thesis, he now has to provide another explanation of the certainty
of mathematics. It is for this reason, I want to suggest, that Kant must see a role for his doctrine of intuition in explaining not only the content, but our knowledge, of the indemonstrable propositions of mathematics. Only in this way can he achieve what Locke and Kant’s earlier self couldn’t: a coherent account both of the content of mathematics and of the certainty of our knowledge of mathematics.
Notes
1. I argue for this in detail in Carson (1999).
2. This section draws on my paper Carson (2002).
3. Locke, An Essay Concerning Human Understanding, 4.8.8.
4. Cited in Schneewind, J., “Locke’s Moral Philosophy” in Chappell, Vere (ed.), The Cambridge Companion to Locke. Cambridge University Press, 1994; p. 223.
5. Untersuchung über die Deutlichkeit der Grundsätze der natürlichen Theologie und der Moral, 2:283.
6. Dohna-Wundlacken logic, 24:757.
7. Blomberg logic, 24:263.
8. Critique of Pure Reason, A713/B742.
References
References to the Critique of Pure Reason are given by the standard pagination of the A and B editions. References to Kant’s other writings are given by volume and page number of the Akademie edition of Kants gesammelte Schriften. The English citations from the Critique of Pure Reason follow the translation of Paul Guyer and Allen Wood in the Cambridge Edition of the Works of Immanuel Kant. Citations from the pre-Critical works follow the translation of David Walford in collaboration with Ralf Meerbote in Theoretical Philosophy 1755–1770. Citations from Kant’s lectures on logic follow J. Michael Young’s translation in the Cambridge Edition volume.
Carson, E. (1999), “Kant on the Method of Mathematics” in: Journal of the History of Philosophy 37 (4), 629–652.
Carson, E. (2002), “Locke’s Account of Certain and Demonstrative Knowledge” in: British Journal for the History of Philosophy 10 (3), 359–378.
Kant, I. (1992a), Lectures on Logic, ed. by J. Michael Young, Cambridge University Press.
Kant, I. (1992b), Theoretical Philosophy 1755–1770, translated and edited by D. Walford, Cambridge University Press.
Kant, I. (1998), Critique of Pure Reason, translated by P. Guyer and A. Wood, Cambridge University Press.
Locke, J. (1975), An Essay Concerning Human Understanding, ed. by P. H. Nidditch, Clarendon, Oxford.
Schneewind, J. (1994), “Locke’s Moral Philosophy” in: V. Chappell (ed.), The Cambridge Companion to Locke, Cambridge University Press, 199–225.
THE VIEW FROM 1763: KANT ON THE ARITHMETICAL METHOD BEFORE INTUITION∗
Ofra Rechter
Tel-Aviv University, Israel
∗ The present paper is the first of a bipartite treatment of the subject of neglected continuity between Kant’s views on arithmetic in 1763 and his philosophy of arithmetic in the Critique of Pure Reason. My work on this paper and its sequel were made possible by the generous support of the Edelstein center at the Hebrew University of Jerusalem. For the discussion of a talk in which this paper originated I am grateful to members of the audience at the conference on mathematical intuition held at McGill University in 1999. For discussion of earlier versions of this paper I’m grateful to members of the audience at HOPOS 2002, at the conference on Kant in the 20th-Century at Tel-Aviv University in 2002, and at the Edelstein Colloquium 2003, and to Daniel Sutherland, Yaron Senderowicz, William Demopoulos, Mark Steiner, Hilary Putnam, Carl Posy, and Sidney Morgenbesser. I am indebted to Charles Parsons for comments and conversations on earlier versions of this paper and its sequel.
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 21–46. © 2006 Springer. Printed in the Netherlands.
1. Preface
In the Critique of Pure Reason, Kant famously draws a sharp contrast between mathematics and metaphysics, held to be a contrast in “method” or “form” of knowledge (A714/B742). His principal aim is to establish the fallacious and self-defeating character of any attempt on the part of metaphysics to aspire to achieve certainty by way of imitating the methods of mathematics. Metaphysical judgments are drawn “in accordance with concepts . . . and never intuitively”. The mathematical sciences, and specifically, pure arithmetic, comprise “knowledge by reason from the construction of concepts”. On this basis, Kant claims, arithmetical knowledge can be intuitively evident. To construct a concept is to “exhibit” the concept in pure intuition. But this is tantamount to defining it, and “. . . in mathematics we do not have any concepts at all prior to the definitions”. The claims that arithmetical concepts can be defined and can be constructed constitute, therefore, both an epistemic thesis and a thesis about the content of these concepts, just as the term “definition” would suggest. The most systematic presentation of the relation of mathematical knowledge to the “mathematical method” of “definition” and “construction” in the
Critique of Pure Reason occurs in the Discipline of Pure Reason in Its Dogmatic Employment, where two forms of mathematical construction are discussed: “ostensive construction” and “symbolic construction”. This presentation does not include an elucidation of these notions in their arithmetical employment.1 “Ostensive construction”, said to be “of the objects themselves”, or their “forms”, is exemplified in geometrical constructions; in “symbolic construction” “mere magnitude (quantitatem)” is constructed. The symbolic construction “entirely abstracts from the constitution of the object”. Of course, the terms “definition” and “construction” have a place in the practice of Euclidean geometry. The absence of an analogous context of use for these notions in arithmetical practice makes the clarification of Kant’s usage of them in his philosophy of arithmetic all the more pressing.2 The Discipline, in effect, omits a connection between the possibility of defining arithmetical concepts and the symbolic method. But textual material that Kant denies us in the Critique he explicitly elaborates in the Inquiry Concerning the Distinctness of the Principles of Natural Theology and Morality, also known as the Prize Essay of 1763. This connection is presented in an argument for mathematical certainty — in a context which is universally recognized for its notable correspondence with the Discipline. Both passages engage a contrast between the mathematical method and the metaphysical method; and, in spite of the profound differences, primary parameters that frame this contrast have survived the Critical turn. Whether or not Kant holds the construction of arithmetical concepts to be a species of symbolic construction is a question on which many interpretive stands have been taken, directly or indirectly. 
But interpreters hold very different views of the significance of this question.3 The assumption that in arithmetical construction symbols are constructed out of other symbols has recently been contested in the literature.4 Interpretations that make this assumption do not address explicitly the fact that textual evidence for it is anything but direct or that Kant does not state it in the Discipline.5 But if the fact that there is no explicit statement of the symbolic aspects of arithmetical construction in the Critique cannot be ignored in interpreting Kant’s philosophy of arithmetic, neither can we ignore the only published text in which Kant states and articulates a relation between arithmetical concepts and their symbolization. Many interpreters turn for evidence to the Prize Essay. But appeals to the Prize Essay have derived very different conclusions from its comparison with the Discipline. For example, Jaakko Hintikka makes the provocative claim that [T]here is no reason to doubt that Kant’s statement in the Critique of Pure Reason that in mathematical reasoning one typically employs intuitions to represent general concepts is intended to formulate in the first place the very same precritical theory, no matter what superstructure Kant subsequently has built on top of it.6
By contrast, Charles Parsons concludes that [T]he prize essay suggests a position incompatible with the Critique of Pure Reason, namely that since in mathematics signs are manipulated according to rules
which we have laid down. . . operation with signs according to the rules, without attention to what they signify, is itself a sufficient guarantee of correctness.7
Whereas this comparison leads Hintikka to associate Kant’s early views more closely with logic, Parsons emphasizes the “ostensive” aspects implied by the notion that “symbolic construction is essentially a construction with symbols as objects of intuition” (ibid. p. 137). He argues that “the ‘symbolic’ construction in generating numerals is already enough to settle the question of their reference”.8 Disputed questions about the interpretation of Kant’s philosophy of mathematics in the Critique constitute the background for incompatibilities among appeals to the Prize Essay. But an analysis of its place in relation to the notion of mathematical symbolism in the Discipline — central to the interpretation of Kant’s philosophy of arithmetic — is a substantive but neglected task.9 The task of recovering the connection between arithmetic and the symbolic method in the Prize Essay is the task of this paper. Recovering the connections between arithmetical definitions and arithmetical symbolism — two of the most perplexing notions of Kant’s views on arithmetic — can lay the ground for addressing the question: What can be derived from the Prize Essay for a clarification of Kant’s Critical philosophy of arithmetic?
2. The Prize Essay and the Discipline: Neglected comparison
The primary site of the Critical discussion of mathematical construction is found in A713–738/B742–766. I reproduce the most relevant passages for our present purposes.10
In [the case of algebra, mathematics] chooses a certain notation for all construction of magnitudes in general (numbers), as well as addition, subtraction, extraction of roots, etc., and, after it has also designated the general concept of quantities in accordance with their different relations, it then exhibits all the procedures through which magnitude is generated and altered in accordance with certain rules in intuition; where one magnitude is to be divided by another, it places their symbols together in accordance with the form of notation for division, and thereby achieves by symbolic construction equally well what geometry does by the ostensive or geometrical construction (of the objects themselves) which discursive knowledge can never achieve by means of mere concepts. (A717/B745)
Philosophical cognition, on the contrary, must do without this advantage, since it must always consider the universal in abstracto (through concepts), while mathematics assesses the universal in concreto (in the individual intuition) and yet through pure a priori intuition, where every false step becomes visible. Since they can only be conducted by means of mere words (the object in thought), I would prefer to call the former acroamatic (discursive) proofs rather than demonstrations, which, as the expression already indicates, proceed through the intuition of the object. (A734–735/B762–3)
The passages above in A717/B745–6, the primary site of a Critical discussion of symbolic construction, can be compared with the passages from the First Reflection, § 2 in the Prize Essay:
. . . I first of all appeal to arithmetic, both the general arithmetic of indeterminate magnitudes as well as that of numbers, where the relation of magnitude to unity is determinate. In both there are first of all posited not the things themselves but their signs, together with the special designations of their increase or decrease, their relations, etc., thereafter one operates with signs according to easy and certain rules, by means of substitution, combination, subtraction and many kinds of transformation, so that the things signified are completely ignored in the process, until eventually, when the conclusion is drawn, the meaning of the symbolic conclusion is deciphered. (Ak. 2:279)
A contrast between arithmetical signs and the metaphysical use of words made in A734/B762–3 appears also in the Prize Essay, but in more detail. For instance, Kant says that . . . words can neither show in their composition the constituent concepts of which the whole idea, indicated by the word, consists; nor are they capable of indicating in their combination the relations of the philosophical thoughts to each other. (Ak. 2:278–9) . . . since signs in mathematics are sensible means to cognition, it follows that one can know that no concept has been overlooked, and that each particular comparison has been drawn in accordance with easily observed rules, etc. . . . By contrast, the only help which words, construed as the signs of philosophical cognition, afford is that of reminding us of the universal concepts which they signify. (Ak. 2:291–2)
Considerable differences divide the arguably related passages, over and above such differences as manifestly testify to the Critical turn that separates them. Given that the notion of construction is not yet present in the Prize Essay, one cannot simply import pre-Critical theses into the Critical setting.11 Of these differences, I want to point out those which should be kept in mind and referred to throughout the ensuing discussion.
1. The Critical framework, of course, contrasts concepts (in their transformed sense) not with objects but with intuitions. This fundamental difference is intimately linked to the significant difference in the senses in which definitions are held responsible for the content of the concepts of mathematics in 1763 and in the Critique, respectively. In the Prize Essay Kant holds mathematical concepts to have “no other significance [Bedeutung] apart from what is given in their definitions” (2:291). This claim does not reappear in the Discipline. Importantly, it is no longer equivalent to the claim that mathematical definitions “make the concept itself” (Ak. 2:276; 2:281). The latter claim, of course, reappears in A731/B759.12
2. The traditional13 tripartite division of logical moments into concepts, judgments or propositions, and reasoning informs the Prize Essay as well as the Discipline. The Prize Essay compares the methods of metaphysics, geometry, and arithmetic, in relation to each of its moments: concepts, indemonstrable propositions, and “analyses, proofs and inferences”. Specifically, in the case of the arithmetical method, definitions are associated with our epistemic access to the distinct cognition of arithmetical concepts; the principle that “the whole is equal to all its parts taken together” constitutes the arithmetically relevant one of the two indemonstrable propositions provided; and the manipulation of signs in accordance with rules in a notation belongs with the subject of reasoning. Later, in the Critique, the transformed moments appear under the headings “definitions”, “axioms” and “demonstrations”. The question of the role of these moments in arithmetic is now more delicate. Insofar as elementary equalities are concerned, only the first chapter of this method is, strictly, applicable, because “in arithmetic there are no axioms” in the mathematical sense, and equalities are best called “numerical formulae” because they are “indemonstrable”.14
3. Two significant differences divide the pre-Critical and the Critical remarks on symbolic reasoning. First, the perceptual expressions replete in the Prize Essay, whether in their literal or metaphorical employment, deserve a separate discussion. But an important difference has been overlooked between usages of an expression with perceptual connotations in the respective passages on reasoning and should be pointed out here. In the Critical passage A734/B762, evidence analogous to perception “secures all inferences against mistakes by placing each one before one’s eyes”.15 In the Prize Essay Kant also entitles us to assurance in reasoning with symbols and expresses the basis for this with a superficially similar perceptual metaphor.
But here, the “degree of assurance characteristic of seeing something with one’s own eyes” is attributed to our knowledge “that one has not left any concept out of account, that every equation has been derived by easy and certain rules, etc.” (2:291).16 That no error is possible — provided that the formalism is employed in accordance with the rules — is a metatheoretical claim about the formalism; our claim to know this is likened to and, strictly, distinguished from perceptual evidence. It is a claim about the sufficiency of the correct use of a system of rules in a system of numerical notation to ensure “that it is impossible for a cognition” established by the method in question that it should “be false” (Ak. 2:290).17 Finally, it should be noted that an allusion to the connection between algorithms for calculations and the arithmetical operations occurs only in A717/B745 and not in the Prize Essay. In the Critique the operations are associated with their symbolic “forms” in an adopted system of notation. The Prize Essay18 lists only rules for the manipulation of signs, such as “substitution”, “combination” or “transposition.” These significant differences go without notice if Kant’s pre-Critical views on mathematics are interpreted on the sole basis of remarks about geometry. In fact Kant’s conception of the basis for arithmetical certainty does not coincide with his conception of certainty in geometry.
3. Why arithmetic?
Kant’s Inquiry Concerning the Distinctness of the Principles of Natural Theology and Morality is known as “the Prize Essay” because of its “Being an answer to the question proposed for consideration by the Berlin Royal Academy of Sciences for the year 1763”. The Prussian academy’s question was this: One wishes to know whether the metaphysical truths in general, and the first principles of Theologiae naturalis and morality in particular, admit of distinct proofs to the same degree as geometrical truths; and if they are not capable of such proofs, one wishes to know what the genuine nature of their certainty is, to what degree the said certainty can be brought and whether this degree is sufficient for complete conviction.
Kant’s response to the question of the Academy implies that the method of the metaphysicians must undergo substantial reassessment and must accordingly be profoundly transformed if metaphysics is to board a secure path of its own. The knowledge that is derived in the mathematical disciplines can be derived with certainty because the methods of reasoning employed, “proofs and inferences” (Ak. 2:278–9), are adequate for the particular character of “their concepts” and of “their objects”. In both respects there are substantive differences between metaphysics and the mathematical disciplines. Accordingly, the respective methods of “arriving at these concepts” are dissimilar (Ak. 2:277–9). Kant does not, however, restrict his comparison to metaphysics and geometry, as formulated in the question. Why does Kant introduce a separate comparison with arithmetic? Why bring arithmetic in at all? Two possible reasons can be derived from the literature. I argue that neither is supportable by the texts.
a) As is well known, the question of the Academy arises against the background of conflicting approaches (associated respectively with the Newtonian “mathematicians” and the Leibniz-Wolffian “metaphysicians”) to questions about the nature of physical space and of bodies in space. Specifically, the geometrical notion of the infinite divisibility of space conflicts with a presupposition common among the “metaphysicians” about the unity of individual substances, viz.
that such substances are simple and indivisible, and are the elements of which the real is composed.19 Kant’s introduction of a place in the essay for arithmetic equal to that allotted to geometry may be thought to have to do with an implicit view about a resolution of the disputes surrounding the labyrinth of the continuum, perhaps because arithmetic deals with discrete magnitudes, or because arithmetical and algebraic combinatory methods were prevalently regarded in the period as more rigorous and therefore possibly more reliable than the constructions of geometry.20 But Kant makes no claim to this effect, as one would have expected had that been his motivation. In the Prize Essay Kant makes no argument from arithmetic to support a stand on these disputes. b) A hypothesis that may have more serious exegetical basis would be that Kant is taking issue with the Leibnizian ideal of universal characteristics.21 In
his lectures on mathematics of the same period (1762–4) recorded by Herder (Ak. 29:1–1) Kant remarks on Leibniz’s contribution to the rigor associated with arithmetical and algebraic notation, and on the combinatory method.22 These remarks in the lectures mention Leibniz’s “dyadic” system with admiration. But this mention is made in the context of Kant’s elaboration on the ingenuity manifested in the discovery of the generative character of the alphabet and of various systems of numerical notation, where he observes that both employ a principle for the generation of indefinitely many distinct combinations on the basis of an initial finite morphology. As we will shortly see, these remarks are very useful in clarifying what is otherwise implicit in the Prize Essay. But they do not include any indication as to Kant’s stand with respect to the Leibnizian ideal rigorization of conceptual analysis.23 Kant’s remarks in the Prize Essay, to the extent that they invoke the Leibnizian project at all, seem rather to have to do with the distinction with which Kant opens the essay: “There are two ways of arriving at a general concept”. The difficulties in “arriving at” a metaphysical concept can be made insurmountable if the metaphysician attempts to employ the mathematical “way of arriving at” a concept, that is, by defining it. Kant is concerned, therefore, with the differences in kind between metaphysical concepts and other types of equally abstract and complex concepts, that will explain the hazards of the imitation of the mathematical method of “arriving at” them. Kant’s question of our cognitive access to concepts is at right angles to the question with which Leibniz’s project is concerned. The allusions aim to bring out a contrast between linguistic representation and arithmetical systems of notation. It is this contrast that explains the place of arithmetic in the Prize Essay.
Recall that, in the Prize Essay, Kant argues that the obstacles in the way of metaphysical certainty derive from the inapplicability of mathematical methods in metaphysics. I propose that Kant has in mind sources of error in the method employed by the metaphysicians which the comparison with the method of arithmetic, and not with that of geometry, can recover. As I will argue, there are obstacles in the way of attaining metaphysical certainty that cannot be brought out merely by comparing metaphysics with geometry. What is of special interest is that by contrasting the metaphysical method with the arithmetical method as well as with the geometrical method, Kant implicitly advances his views on the nature of arithmetical knowledge. In the First Reflection of the Prize Essay § 2 Kant makes two claims that should be distinguished. His first claim is this: No replacement or proxy for epistemic access to the “things themselves”, the objects and concepts of metaphysics, can be forged in the form of a formalized “perspicuous” notation.
In the philosophical case, neither figure nor sensible signs are capable of expressing either the thoughts or the relations, which hold between them. Nor can abstract reflection be replaced by the transposition of signs in accordance with rules, the representation of the things themselves being replaced in this procedure by the easier representations of the signs. (Ak. 2:279)
But Kant is not arguing that no formal symbolic alternative to the representation of such concepts in natural language can be devised that would be sufficient for eliminating vacuity, ambiguity or equivocation. He is pointing out, rather, that the question of such an alternative is irrelevant to the major difficulty in the way of attaining certainty in metaphysics. The latticework of conceptual relations is not transparent to the human mind. The idea of a rigorous procedure for deciding the correctness of judgments could well exploit the mathematical structure of conceptual relations, provided that there is for us a procedure for assessing the correctness of purported conceptual analysis. But analysis is the product of successful reflection the nature of which “is difficult and involved” (Ak. 2:282). Since the presence of ambiguities and equivocations in the usage of natural language can get in the way of getting the word-concept relation right, we need to have access to the distinct concept, even in order to recover these linguistic imperfections or correct them. The second claim introduces a principled distinction between words and numerical expressions. It is the claim that arithmetical signs can do something that words cannot do. . . . words can neither show in their composition the constituent concepts of which the whole idea, indicated by the word, consists; nor are they capable of indicating in their combination the relations of the philosophical thoughts to each other. (Ak. 2:278) . . . since signs in mathematics are sensible means to cognition, it follows that one can know that no concept has been overlooked, and that each particular comparison has been drawn in accordance with easily observed rules, etc.. . . By contrast, the only help which words, construed as the signs of philosophical cognition, afford is that of reminding us of the universal concepts which they signify. (Ak. 2:291–2)
Linguistic representation provides no aid for verifying the correctness of conceptual analysis or of philosophical reflection. Kant argues that the numerical symbolism has the capacity to “show in its composition” enough to rule out the possibilities of ambiguity or equivocation and the possibility of vacuity present in the use of words. Furthermore, mathematical usage, unlike linguistic usage, securely preserves distinct representation. The point is that, in the case of arithmetic, no independent access to contents or “things signified” is required for assessing the correctness of an arithmetical cognition.24 In this, arithmetic is singled out by its contrast with metaphysics. As I will presently show, the contrast of metaphysical concepts with the geometrical concepts is not analogous. Unlike metaphysics, mathematics has no need for “carefully searching out what is certain in one’s object. . . by means of a certain inner experience, that is to say, by means of an immediate and self evident inner consciousness, seek out the characteristic marks which are certainly to be found in the concept of any general property” (Ak. 2:286). But the epistemic advantages of geometrical and of arithmetical “sensible means for cognition” are not the same.25
The View from 1763: Kant on the Arithmetical Method before Intuition
In the very next section Kant explains why the mathematical employment of “sensible signs” makes redundant any need for comparing the concepts with their purported objects. Kant says:
Because we are here treating our propositions only as conclusions derived immediately from our experience, I first of all appeal, with regard to the present matter, to arithmetic. . . In both kinds of arithmetic there are posited first of all not the things themselves but their signs. . . Thereafter, one operates with these signs according to easy and certain rules . . . so that the things signified are completely forgotten in the process, until eventually, when the conclusion is drawn, the meaning of the symbolic conclusion is deciphered. (Ak. 2:278)
Words play a mediating role of directing “attention” to the “immediate inner consciousness” (where one can be found). The “essential and substantive” difference in the role of arithmetical signs is that they are not playing the role of mediators at all. So, our question is, in what way do arithmetical signs serve as “means for cognition”? The general issue raised is the character of the mathematical concepts and the nature of their relations to their representations, or “signs”. In virtue of some properties of either, the mathematical case is free of the obstacles in metaphysical cognitions. Both cases involve highly abstract concepts and connections; in both cases these connections are modal and involve in one form or other an “immense” complexity. The question of certainty in the cognition of such concepts arises for a finite mind. How can such concepts be present to a non-divine mind, and be distinctly present, given that no concept whatever is transparent to the human mind? The core of Kant’s argument is that metaphysical concepts cannot be “defined”. “It is the business of philosophy to analyze concepts which are given in a confused fashion, and to render them complete and determinate” (Ak. 2:278). Now, the method of “arriving” at concepts that can be defined is creative and “synthetic”. A central claim of the Prize Essay is that metaphysical reasoning, reflection or analysis can be nothing other than “false and deceptive”, if it assumes it can “start from definitions”. In contradistinction, for both the mathematical methods of geometry and of arithmetic definitions provide a secure foundation for reasoning. Metaphysical analysis applies to “given” concepts represented by language in use. To say that these concepts are “given” is in the first instance to say that their content, distinctly perspicuous or transparent to a divine consciousness, is something the cognition of which by us cannot be taken for granted. 
Our epistemic access to the concepts of metaphysics must be possible by means beyond the language “through” which they are confusedly cognized and which we use in analysis and reasoning. Because the “only help which words, construed as the signs of philosophical cognition, afford is of reminding us of the universal concepts which they signify”, it is up to our “attention” to make sure that we do not “take different things to be the same thing” or, alternatively, mistake distinct things for one and the same.
Metaphysics “is constrained to represent the universal in abstracto without being able to avail itself of that important device which facilitates thought and which consists in handling individual signs rather than the universal concepts of the things themselves” (Ak. 2:279). Kant is not denying that the metaphysical depends on the use of signs. The point is that “the signs employed in philosophical reflection are never anything other than words . . . [h]ence, in reflection in this kind of cognition, one has to focus one’s attention on the things themselves” (ibid.). Both “kinds of cognition” present some dependence of cognition or reflection on the use of “signs”. However, not every “sign” is a “means for cognition”. Specifically, the words, which we use in metaphysics, are not, although they are signs. Words are ever at an epistemological remove from the objects of thoughts and the relations among thoughts (e.g., Ak. 2:280). The core obstacles in the way of metaphysical knowledge are in the reflection, which, if successful, may lead to the analysis of a concept. Kant does not believe that the language we use in the process of this reflection can yield a standard for the assessment of this success. Rather, what “immediate cognition” we have of conceptual or logical relations is available to us by a kind of “immediate experience” in “inner sense”.26 Kant discerns features of arithmetical formalism that are paralleled neither in the resources of metaphysical reflection nor by geometrical “synthetic” representation. We want to know: what is it that “essentially and substantively” distinguishes the abstract and complex concepts of arithmetic or their signs, and rules out the sources of error to which metaphysical cognition is vulnerable? What is numerical symbolism sufficiently perspicuous of ? What are the “things themselves” that can be replaced by arithmetical “signs”? Are these the things that the signs “show in their composition”? 
What does the fact that these signs are “sensible” have to do with securing our epistemic access to “the significance of” these signs? Kant offers no direct answers. A step toward answering these questions on Kant’s behalf will be made by articulating his views on mathematical definitions and by clarifying the difference between arithmetical definitions and geometrical ones.
4. Mathematical definitions in general
In mathematical definitions something new is introduced. Kant says that mathematical definitions are real definitions, and that the defined concepts, which are “created”, have “no other significance” beyond what the definitions express.27 In Kant’s usage the phrases “an individual sign” and “the [distinct] object of my concept” seem virtually interchangeable. As I explained above, Kant is interested in the certainty of mathematical knowledge, which he assumes we have, and he assumes that it is obtained by the methods of geometry and arithmetic. His question is: how are the methods of mathematics so related to the objects and concepts to which this knowledge
pertains that they can yield the certainty that is characteristic of mathematical knowledge? Kant is not aiming to justify the truths established in mathematics or to account for the character of the subject matter of these disciplines over and above what is required for discerning the basis for the illicit practice of imitating mathematics in metaphysics. Leaving to one side other sources of error for metaphysics or its imitation of mathematics, the discussion of definitions should clarify their role in ruling out in mathematics the possibility of error that metaphysics owes to its illicit employment of definitions. I will show that mathematical definitions are responsible for assuring us cognitive access to the objects of the concepts of mathematics.28 By characterizing the general form of mathematical definitions we will be able to isolate the differences between the respective relations of geometrical and arithmetical concepts and the “distinctness” of their objects. In order to clarify these differences we need to recover a distinction implicit in Kant’s terminology.
4.1 Defining and positing
In the Critique of Pure Reason A730/B758 Kant says29 that there is only one German word (Erklärung) for “exposition”, “explication”, “declaration”, and “definition”, and, hence, German usage does not sustain linguistic intuition for distinguishing precisely these different senses and does not always manifest the strictness of the requirement that philosophical Erklärungen be denied the “honorific title” of ‘Definitionen’.30 But “philosophical definitions are never more than expositions of given concepts, mathematical definitions are constructions of concepts. . . therefore mathematical definitions make their concepts [whereas] in philosophical definitions concepts are only explained”. (A730/B758; Ak. 4:479) Strictly speaking, then, there really are no philosophical definitions.31 These remarks in the Critique have their root in much earlier lectures on logic and are consistent with the views in the Prize Essay. It is therefore highly plausible that Kant made a systematic distinction between his uses of the terms Erklärung and Definition in 1763 as well, even if his usage is less than entirely consistent.
Analysis, then, really amounts to making confused cognitions distinct. The composition of cognition, however, or synthesis, is helpful and useful simply and solely to produce something new and at the same time, all at once, to make it distinct. (Blomberg Logic § 139, Ak. 24:131, dated in the early 1770s.)32
In mathematics I begin with an exposition [Erklärung] of my object, for example, of a triangle or a circle, or whatever. In metaphysics I may never begin with a definition. . . . In mathematics, namely, I have no concept of my object at all until it is furnished by the definition. In metaphysics I have a concept, which is already given to me although it is a confused one. (Ak. 2:283)
Philosophical expositions (Erklärungen) are explications of (given) concepts that purport to elucidate concepts in use and not to define them. In mathematics
there is no use for this sort of explication because these would be the “last thing I know” of a concept given for analysis, and in mathematics concepts are not given at all, but made. In definitions of concepts something new is introduced. Mathematical definitions “create a distinct concept”. Unlike the illusory purportedly “analytic definitions” for which Kant criticizes the dogmatists, mathematical definitions are neither mere grammatical stipulations33 nor mere inventions.34 . . . mathematics. . . can say with certainty that what is not intended to be represented in the object by means of the definition does not belong to that object. (Ak. 2:291)
However, that a definition “creates” a new, distinct concept does not make it mathematical. All mathematical synthetic definitions are real. Real definitions do not merely “contain everything that is equal to the whole concept that we make for ourselves of the thing”, like nominal definitions. A real definition contains “everything that belongs to the thing in itself” (Ak. 24:269). Kant does not purport in the Prize Essay to establish his thesis that any mathematical concept is definable or that any mathematical definition is real. Given a definition of a mathematical concept, that the definition is real is manifested in the sufficiency of the posited signs for establishing mathematical cognition with certainty. An explanation of the reality of the synthetic mathematical definitions would have to account for the sense in which such definitions are sufficient for verifying their own adequacy and how they provide for their own “objects”. The fact that no analogous device can be relied on for attaining the aims characteristic of metaphysical cognition is the basis for Kant’s denial that definitions can be used analogously in a secure metaphysical method. The character of this adequacy and, accordingly, the contribution of mathematical expositions of “signs” to mathematical certainty in arithmetic and in geometry are not the same. We can see how this distinction is manifested in Kant’s geometrical examples.
5. Geometrical definitions
Consider the following example of what Kant calls a definition. “Think arbitrarily of four straight lines enclosing a plane such that the opposite sides are not parallel to each other. Let this figure be called a trapezium” (Ak. 2:276). In accordance with the singularized version of a Euclidean definition,35 a concept is assigned to a figure by a performative act of ostension.36 Now the statement “a trapezium is a four-sided figure. . . etc.” would be true, and any and all diagrams constructed according to the instructions would be instances. Such a statement would presumably be an example of “the first thought I can have of my object”. We can now see how the reality of the concept can be guaranteed and how the mathematical definition is “real” in that it is sufficient for “its own correctness” — for the true thought that can be formulated to express it. A particular figure “posited”, or drawn, is regarded as a prototype or exemplar for any instance falling under the defined concept. Such an exposition
exploits the fact that the figure, having been drawn in accordance with the concepts combined in the definition, exhibits the properties sufficient for it to count as an instance. The figure is individuated as the product of the exposition. That it exhibits the properties in question is a consequence of its configuration as a construction-dependent object. The product of an exposition is a spatial product of a rule-governed activity: paradigmatically, a geometrical figure. However, the fairly clear sense in which such a figure plays the role of a “means for cognition” is one that seems to draw on the individuality and the “concreteness” of the figure, not in the first instance on its “sensibility”.37 Let me clarify. For Kant, early and late, in mathematical definitions “the universal” can be considered “in concreto”, as is well known. But Kant has two logical senses in which, invoking a figure, a concept can be considered in the concrete, neither of which implies the exhibition of a concept in a sensible presentation. The first sense in which a concept can be considered in concreto does not even imply the individuality of that in which the concept is considered, which is as general as the concept so considered, but simply stands to that concept as species to genus. A figure represents a concept that is “lower” in the conceptual hierarchy than the geometrical concept under which it mediately falls. Its concept, for instance, the concept of an isosceles triangle, is a species relative to the geometrical concept of a triangle. So considered, the higher concept is considered in the lower one “in concreto”. In another sense that is logical, the individual figure “under” which its concept “is considered” is concrete relative to that general concept’s plurality of possible instances, for which the figure stands as an exemplar. “. . . in order, for example, to discover the properties of all circles, one circle is drawn; and in this one circle. . . two lines only are drawn. . . and the universal rule, which governs the relations holding between intersecting lines in all circles whatever, is considered in these in concreto” (Ak. 2:278). In the geometrical method the “lower” concept is considered “under” the individual figure that its constructive definition “posits”.38 This sense of “in concreto” does imply the individuality of that in which the universal is considered: the individuality of an instance of a concept. What, if anything, is the contribution of the sensibility of figures to the certainty of the geometrical method? As can be seen from Kant’s claim that words are not “sensible” in their capacity as signs, it is not in the mere perceptibility of the inscribed diagram that the contribution of its “sensible” character is to be sought.39 But it does, I believe, have to do with its spatiality. Kant contrasts the “sensible cognition” of mathematical signs and figures with the immediate evidence of “inner sense”. As Kant’s examples show, the grasp of a concept entails the capacity to reflect on concepts or to “compare” them, which can be expressed by means of judgments. But the reflective ability necessary for carrying out the analysis of concepts has no resources for the individuation of its objects over and above its own “activity”. By contrast, the recognition of conceptual differences in geometry is not limited in this way, but depends on the grasp of spatial relations, because the recognition of spatial
properties and relations exhibited by figures underwrites our capacity to draw geometrical judgments. Our cognitive access to the distinctness of metaphysical concepts mandates the recognition of conceptual differences independently of the individuative contribution of space.40 It is in any case not in doubt that the properties cognized and attributed in geometrical judgments are spatial. In “setting up definitions at the beginning of proofs and inferences” the geometer’s assurance of the reality or the correctness of her definitions is free not only from faithfulness to “whatever the concept of a cone might ordinarily signify”,41 but also from metaphysical dependence on a relation between concept and construction-independent object. Geometrical definitions purportedly guarantee that no characteristic mark that does not belong to the object can be inferred to belong to it, and that no characteristic mark that belongs to the thing can thereby be denied of it. In this sense geometry is not open to the error of the metaphysician who misleadingly and deceptively takes “certain knowledge of a predicate of a thing” for a definition (Ak. 2:293). Kant is not claiming that “until now” no metaphysical knowledge has ever been established, but that no metaphysical system has been established on a secure path of its own. “Mistakes are made” when one takes something one knows to be a definition when it is in fact not fit to serve as one (Ak. 2:284).
6. Arithmetical definitions
Kant does not offer an example of a definition in arithmetic in the Prize Essay. We find something answering to the name in his lectures on mathematics of the same period (1762–4), recorded by Herder (Ak. 29:1–1). In the lectures Kant remarks in some detail on the ingenuity manifested in the discovery of the generative character of the alphabet and also characteristic of the principles for the generation of numerical signs. Both employ a principle for the generation of indefinitely many distinct combinations on the basis of an initial finite morphology. He then goes on to present the arithmetical formalism. “Definitions” are given only for the initial 9 numerals of the decimal Arabic notation. They are 1 + 1 = 2, 2 + 1 = 3, 3 + 1 = 4 . . . 8 + 1 = 9. Thereafter Kant gives what look like proofs for such formulae as 8 + 4 = 12. This seems to conflict with the Critical statement that such propositions as 7 + 5 = 12 are immediately certain and indemonstrable (A163/B204). They are, in the Critical period, regarded as postulates or as “numerical formulae” based on postulates, which are “immediately certain practical judgments . . . requiring no resolution or proof” (Jäsche Logic, Ak. 112. § 38). Had Kant changed his view?42 Let us consider these “proofs” in the context in which they are set forth. First, is it significant to observe that proofs are given only for the function m + n with arguments 1–9? These “proofs” in effect establish a connection between the ordering of the initial elements by the operation “+1” and equalities for n + m = k for n, m ≤ 9.43 Thereafter, one is to proceed according to the
usual grade-school algorithms for computation.44 The “proofs” are really an articulation of the (morphological) basis for the appeal Kant has alluded to in the preceding sections to “Einmaleines”, where Kant observes the elimination of the table in Leibniz’s Dyadik.45 In the text immediately preceding the section in which the “proof” is found (§ 38), Kant remarks on the binary notation and proceeds to offer a detailed elaboration of the properties of the decimal system, emphasizing its positional character. (§§ 32–37 are entirely devoted to the description of alternative systems of numerical notation.) The discussion reflects what we learn in grade school. We memorize the sums for arguments less than 10 and then learn an algorithm for computing the sum given any pair of Arabic numerals. If we put this in deductive form we can deduce an arbitrary equation m + n = k from the cases with arguments < 10, using rules for equality, associativity, commutativity and a rule that allows inference from p + q = r, where p, q, r are individual numerals, to p0 + q0 = r0, where “p0” is the concatenation of the numeral p with 0.46 But Kant does not seem to envision a quasi-axiomatic system. He emphasizes the “ingenious” generative character of numerical languages. In the language in question, the initial 9 elements of the Arabic numerals are ordered by the definitions and the rest generated by the iterated application of the algorithm for producing numerical strings out of previously generated ones in accordance with these principles. This way of conceiving of the generation of the numerals associates calculation in this language with two sets of rules for calculation: the primitive recursive operations of addition and multiplication, and the algorithms that constitute the formation rules for well-formed expressions in the language.
Then, in carrying out the algorithms for calculation we employ auxiliary heuristics in which visual aspects like the graphic arrangement of a configuration of numerals play a role. In this procedure . . . there are posited not the things themselves but their signs. . . .thereafter one operates with these signs according to easy and certain rules. . . [applying] all kinds of transformations, so that the things symbolized are completely forgotten in the process, until eventually, when the solution is drawn, the meaning of the symbolic deduction is deciphered. (Ak. 2:278, § 2)
In ordinary calculations in our canonical system of Arabic numerals, we place longer strings above the shorter ones to be added or multiplied, for instance, and begin to calculate individual components of the strings one by one from right to left. Whenever the result is greater than 9 we carry a digit and then add what we “carried” to the result of that next set. In so proceeding we appeal to the memorized table. Eventually, after a series of intermediate such steps, we arrive at a single string — the solution of our original problem. Deciphering the “significance” of a symbol is tantamount to locating its position in the series of numerals. But we have yet to clarify what “locating the position” is taken to involve.
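The procedure just described can be made concrete in a short sketch. It is offered only as an illustration of the grade-school method, not as a reconstruction of Kant's own "proofs" in the Herder lectures. The names `successor`, `build_table`, `TABLE` and `add` are hypothetical, and the sketch assumes that the "memorized table" of single-digit sums may itself be obtained by iterating the operation "+1" on numeral strings.

```python
DIGITS = "0123456789"

def successor(numeral: str) -> str:
    """The '+1' step, carried out on the numeral string itself."""
    digits = list(numeral)
    i = len(digits) - 1
    # Trailing 9s roll over to 0, as on an odometer.
    while i >= 0 and digits[i] == "9":
        digits[i] = "0"
        i -= 1
    if i < 0:
        # Every digit was 9: a new leading position is needed.
        return "1" + "".join(digits)
    digits[i] = DIGITS[DIGITS.index(digits[i]) + 1]
    return "".join(digits)

def build_table() -> dict:
    """Single-digit sums, obtained only by iterating '+1' (assumption)."""
    table = {}
    for a in DIGITS:
        total = a
        for b in DIGITS:
            table[(a, b)] = total      # the numeral for a + b
            total = successor(total)
    return table

TABLE = build_table()                  # stands in for the memorized table

def add(x: str, y: str) -> str:
    """Column addition from right to left, carrying a digit when needed."""
    width = max(len(x), len(y))
    x, y = x.rjust(width, "0"), y.rjust(width, "0")
    carry, out = "0", []
    for dx, dy in zip(reversed(x), reversed(y)):
        partial = TABLE[(dx, dy)]      # one or two digits
        if carry == "1":
            partial = successor(partial)
        partial = partial.rjust(2, "0")
        carry, digit = partial[0], partial[1]
        out.append(digit)
    if carry == "1":
        out.append("1")
    return "".join(reversed(out))

print(add("8", "4"))
print(add("47", "385"))
```

At no step does the routine consult anything other than strings and the table. In that limited sense the sketch mirrors the claim that one operates on the signs by rule and reads off the result only at the end.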
It is quite clear that the claim, that when we arrive at the symbolic solution there remains something to “decipher”, does not mean that in carrying out calculations we consider only meaningless strings of signs, mere syntactic entities and then have yet to verify what they signify against some “things signified”. In fact, in the procedure just described we apply the primitive recursive arithmetical operations of addition and multiplication, and these are defined for the numbers. We take the numerals before us to present these numbers. What we completely forget or ignore in the process is the original problem. In most of these steps we have to take numerals in isolation and hence as representatives of different numbers from those they stand for in the original strings to be multiplied. Thus, in multiplying n and m I am “ignoring” that in “n00” the number signified by “n” is n × 10². The heuristics of the calculation exploits the following properties of the formalism. The morphological composition of our canonical system of Arabic notation is such that, idealizing from the zero in initial position, no combination of numerals is ill-formed47 and well-formedness is sufficient for nonvacuity. Numeral strings preserve well-formedness under arbitrary transformation of order, but the position of a constituent numeral in a numeral string is a factor in determining its significance, along with the positions of numerals in the sequence. Moreover, the formation rules guarantee that no two combinations can be generated by the same rule. The composition of strings forms singular numerical expressions only. In arithmetical calculation a numerical symbol is what an algorithm for calculation can be applied to.
In the present context this comes to the claim that the signs used as arithmetical symbols are possible objects for the application of the rules for calculation: objects that we can calculate with.48 To spell this out in terms of Kant’s present apparatus we need to clarify the sense in which the arithmetical definitions “create distinct concepts” and are sufficient for “rendering distinct” the cognition of their “objects”. To this aim we should first consider the distinction between the arithmetical expositions and definitions. We have established that the mastery of the symbolism and of the rules for manipulating symbols is held sufficient for “distinctness” because the formalism exhibits the following properties: (1) no expression is ambiguous, (2) every expression generated is significant, (3) no two expressions have the same significance. What do the definitions of the arithmetical concepts owe to these properties of the formalism?
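The three properties can be checked mechanically for a finite fragment of the decimal notation. The following is a minimal sketch under stated assumptions: the names `successor`, `is_well_formed`, `significance`, `generate` and `check_properties` are hypothetical, "significance" is modelled as position in the sequence generated from "1" by "+1", and strings with an initial zero are set aside, as in the idealization mentioned above.

```python
from itertools import product

DIGITS = "0123456789"

def successor(numeral: str) -> str:
    """The '+1' step, carried out on the numeral string itself."""
    digits = list(numeral)
    i = len(digits) - 1
    while i >= 0 and digits[i] == "9":
        digits[i] = "0"
        i -= 1
    if i < 0:
        return "1" + "".join(digits)
    digits[i] = DIGITS[DIGITS.index(digits[i]) + 1]
    return "".join(digits)

def is_well_formed(s: str) -> bool:
    """Any nonempty digit string without an initial zero (assumption)."""
    return len(s) > 0 and all(c in DIGITS for c in s) and s[0] != "0"

def significance(numeral: str) -> int:
    """Positional reading: each step shifts one place and adds a digit."""
    value = 0
    for c in numeral:
        value = value * 10 + DIGITS.index(c)
    return value

def generate(count: int) -> list:
    """The first `count` numerals, generated from '1' by iterating '+1'."""
    numerals, current = [], "1"
    for _ in range(count):
        numerals.append(current)
        current = successor(current)
    return numerals

def check_properties(max_len: int = 3) -> bool:
    count = 10 ** max_len - 1
    generated = generate(count)
    all_strings = {"".join(p)
                   for n in range(1, max_len + 1)
                   for p in product(DIGITS, repeat=n)}
    well_formed = {s for s in all_strings if is_well_formed(s)}
    # (2) every well-formed string is generated, so none is vacuous
    every_string_generated = well_formed == set(generated)
    # no string is generated twice
    no_repeats = len(set(generated)) == len(generated)
    # (1) and (3): each string has one reading, and the readings
    # run through 1, 2, 3, ... without collision
    positions_match = [significance(s) for s in generated] == list(range(1, count + 1))
    return every_string_generated and no_repeats and positions_match

print(check_properties(3))
print(significance("700") == 7 * 10 ** 2)
```

The check covers only strings of bounded length, so it illustrates the properties rather than proving them for the whole notation.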
6.1 Defining and positing in arithmetic
How does the distinction between definition and exposition apply in this case? How does the application of the distinction between defining and positing to the arithmetical example accommodate their stipulative and yet creative character? As expositions (Erklärungen), 1 + 1 = 2, 2 + 1 = 3 etc. are stipulations of signs. Assuming an initial element,49 and an operation “+1”, they
assign the numeral “2” as the value of “n + 1” for n = 1. The expression “+1” expresses the “positing”, for any sign so generated, of a next. Kant, recall, takes definitions to be the “first thought I can have of my object”, and therefore they express truths. It is by means of the arithmetical expositions that “the thought of my object first becomes possible” (Ak. 2:280). Considered as meta-linguistic stipulations, the signs “posited” are really mentioned. But the numerals will appear in use as numeral names in formulae in use.50 So, putting the question of the individuation of numerical symbols to one side, we could for the moment grant that definitions may be sufficient to account for our grasp of the symbolic numerical expressions. But to account for the thoughts, that is, for our cognition of the truths that these signs are used for expressing, is another matter. This leaves us with an ambiguity in the account of definitions: between cognizing the distinct sign and cognizing the truth of a “thought” that the “composition” of signs purportedly “shows”. This ambiguity corresponds to an ambiguity in the notion of the “reality” of definitions. A “real definition”, recall, contains “everything that belongs to the thing in itself” (Ak. 24:269). But this is also supposed to make the real definitions sufficient for our confidence in their correctness, so that we “can say with certainty that what is not intended to be represented in the object by means of the definition does not belong to that object” (Ak. 2:291). A normative sense of syntactic well-formedness will not yield the sense of correctness we need if the notation is to deliver its promise to “show in its composition” the concepts it purportedly defines. The question posed earlier, “What do the signs ‘show’?”, can now be made more precise. That on Kant’s view the posited signs cannot be held to be mere pawns in a syntactic game is already indicated in our discussion of calculation.
One might have taken Kant’s view to suggest that they are, because Kant holds that the mathematical “significance” of our arithmetical “concepts” is exhausted by their definitions51 and because the operations “+1” and the equality sign are “significant” only for individual arguments presented by numeral signs or by (essentially singular) functional numerical expressions. However, that Kant’s view does not render otiose the notion that the signs refer is evident from his allusion to the “things signified” by these signs. On the one hand, there are the “things signified” for which the generated signs are mere “representatives” or “replacements”. But, on the other hand, these “things” seem to make no contribution to the basis for the correctness of our “symbolic conclusions”; in calculation the signs play the role of representatives for “things signified”. Nor does Kant seem to render the notion that numerals refer moot by identifying the “object of the concept” with the posited sign, regarding its reference reflexively, as it were. What Kant had said was that it is sufficient to replace these things with signs in order to establish correct results by calculation. Kant, therefore, does not deny the notion that these signs behave like names. Now, Kant uses the term “an arithmetical object” for something of whose complexity we have a distinct grasp [Faßlichkeit] or understanding [verstanden]. He does
not say what an arithmetical object is. But he does say that distinct arithmetical cognition involves “understanding an arithmetical object” and he does have something to say about what “understanding an arithmetical object” involves. Where does what we have recovered thus far leave us? Kant assumes that our only epistemic access to arithmetical concepts rests on the mastery and use of the formalism. Fundamental characteristics of the arithmetical symbolism play a role in the function of arithmetical “signs” as “means for cognition”. The generative principles of the notation guarantee that any well-formed numeral string is (uniquely) significant and that any sequence of such strings generated will be isomorphic with the sequence for which it purportedly stands. Numerical symbols owe their relation to that for which they stand to the structure to which they belong because they can be generated only in accordance with principles governed by a certain form. Numerals are singular representations partly in the sense that they can be correlated with a unique position in that structure; for the same reason they can also serve to model what they symbolize. What fixes the place of a numeral in a symbolic system, as indicated above, are the principles for generating symbols out of other symbols. These principles, in turn, are sufficient for “understanding52 an arithmetical object” distinctly, and this understanding really consists of “understanding a relation of a multiplicity to unity”, for example, “the relation of a trillion to unity” (Ak. 2:280). But of course this doesn’t mean that the “object” is identified with such relations. Symbolic composition must (at least) be adequate for representing sameness of “position” for different expressions, for the employment of the symbolism in the practice of calculation seems to turn precisely on this.
As the above discussion of the role allotted to “visible signs” as symbolic configurations in calculation suggests, Kant’s pre-Critical notion of a numeral-name “occupying” or being assigned a position in a structure seems to trade on an ambiguity between a spatial and a logical notion of position.53 Kant’s discussion in the Prize Essay proceeds on the premise that we possess arithmetical knowledge and purports to account for the conceptual cognition involved in this knowledge by the methodology of definition. Kant states that what the definitions define are “concepts”. But the texts provide insufficient basis for a description of what these concepts might be the concepts of. Would the arithmetical definitions be sufficient to yield a mathematical definition of the concept of number in general? If not, does this force the view that the general concept of number is a metaphysical concept? Would this imply that this concept must be held to stand to the defined concepts of the numbers as genus to species? Do the concepts arithmetically defined include the concepts of the arithmetical operations? Kant clearly includes among concepts arithmetically defined the concepts of the individual numbers. Kant’s usage seems, however, to equivocate between what would be the “objects of” these concepts, the referents for numerical equalities — of the arithmetical “thoughts”, and the “things signified”, on the one hand, and between (any of
these) and “things” which are not what the concepts defined signify, but to which they may apply. One difficulty arises from the fact that, notwithstanding the mention of the notion of counting in the lectures, the remarks seem to leave out of account the notion of cardinality. The account of a system of numerals exploits the successive composition of the generated strings of the notation-system. But Kant’s later emphasis on successive addition is entirely absent from the view from 1763. It is not clear how much we can derive from Kant’s words toward a resolution of the ambiguities to which they seem to give rise or to the questions posed. Here, however, is a suggestion that we can make. Because the numeral system in question in fact also provides a source of models for the numbers, forming, as it does, the standard order-type of the successor function, we could regard the numerals as mere sign-types for which “deciphering” the significance by verifying their reference is required, for example, by counting numerals.54 In this sense an arithmetically significant structure is instantiated by systems of signs.55 In arithmetic we can take the generative procedure to be that which is employed in the articulation of a canonical system of notation — a system of numeral names — for the numbers. This may bridge the ambiguity between our grasp of the “object” and our cognition of the arithmetical truths because the numeral-types we construct must themselves form a model of the arithmetical truths (for us, of the axioms).
7. Conclusion
In establishing that the introduction of arithmetic into Kant’s response to the Academy’s question in the Prize Essay serves to uncover obstacles in the way of achieving metaphysical certainty — in contrast to mathematical certainty — obstacles that could not be uncovered if metaphysics were contrasted only with geometry, I emphasized those aspects of “method” which are shared by geometry and metaphysics, but not by arithmetic. Specifically, in geometry and in metaphysics, our access to concepts turns — in one sense or another — on the “comparison” of a representation with something else. In metaphysics concepts are “compared” with other concepts or with compounds thereof. In geometry what individuates a figure is the recipe for its construction, but Kant also introduces the notion that the geometrical figure or sign “is similar to the things signified”, and this notion must play a role in our cognition of the adequacy of this recipe. This is also indicated in the logical form of the “first thoughts I can have of my object” — the linguistic articulation of the definitions in geometry, which are purportedly true predications. But in arithmetic the definitions are formulae in a notation. These formulae belong with the principles that constitute the notation. However, the correctness of the definitions or of the results of calculation does not depend on correspondence between signs and these things. The signs posited in arithmetic owe their perspicuity not to shared properties between sign and object, but rather to an affinity in their manner of generation.
In an important sense there would seem to be much in common between the “ostensive” aspects of geometrical and arithmetical signs. The generated figure, or “symbolic form”, is representative of precisely the procedures that are responsible for its individuation. In both cases what is “placed before the eyes” is imbued with the constructive principles to which its function as a sign or a representative is due. Of course, a geometrical figure stands in for an indefinite plurality of possible instances of the concept of, say, a triangle whereas a numeral is one of indefinitely many possible symbolic representations of the same number. That the arithmetical individual signs hold an inverse relation to what would be their “object” and the respective mathematical “signs” may now be construed in terms of a notion of mathematical form. For that is a result of mathematical facts about the respective content or subject matter of these disciplines. The relevant mathematical fact is that Euclidean space is homogeneous; every point is like every other. Given two congruent triangles, there is an automorphism of the space that takes one to the other.56 But in arithmetic the only automorphism is the identity. In considering the relation of this view to Kant’s position about arithmetical knowledge in the Critique of Pure Reason many questions remain open. The leading question is how Kant might take us from a conception of the finite ordinals qua positions in a sequence (the grasp of which he associates with the grasp of an arithmetical object, whose cognition is availed by symbolic composition, and consists of understanding a relation of a multiplicity to unity) to the Critical claim that the elementary numerical equalities are singular judgments. But the bounds of this paper are met in concluding that these connections should be explored.
Notes 1. A712/B740 to A738/B766. 2. Construction is not mentioned explicitly in the Critical passages on arithmetic, e.g., B14–16, A164–165/B205–206, A240/B299, or A140–141/B179–180. For allusions to arithmetic in the Discipline see A724/B752; A720/B748. But the remarks on symbolic construction mention algebra. As is well known, the relation between algebra and arithmetic in Kant’s view is a matter of dispute. These disputes do not center on the relevance of the relationship between the languages of arithmetic and algebra. 3. Examples include leading interpretations of Kant’s philosophy of mathematics, notably Charles Parsons (1969); Michael Young (1984); Charles Parsons (1984); Michael Friedman (1992); C. D. Broad (1941), (1978); Manley Thompson (1972–3); E. W. Beth (1956–7); Jaakko Hintikka (1973); Philip Kitcher (1975). I should note that the claim that arithmetical construction involves the construction of numerals does not settle the question whether arithmetical construction is “symbolic” or “ostensive”. The considerations that enter are essential for appreciating the connection between Kant’s views on the arithmetical “method” in the Discipline and his Critical views on the concepts of the numbers and on arithmetical equalities (and inequalities). For different ways of conceiving the relation between symbolic and ostensive notions of arithmetical construction see Parsons (1969), Young (1984), Parsons (1984). I can at most gesture here toward issues I address in the sequel to this paper. 4. This assumption was recently contested by Lisa Shabel (1998). Whether or not Shabel is right, her criticism brings out the difficulties in the way of advancing a direct exegetical argument for this assumption solely on the basis of textual evidence from the Critique of Pure Reason. 5. A713–738/B741–766. 6. Hintikka (1973), p. 145.
7. Parsons (1969), p. 138 of Parsons (1983); in Posy (ed.) (1992), p. 66. 8. I will not address the much discussed questions that divide Parsons’ and Hintikka’s approaches. Support can be found in what follows for Parsons’ observation that a “connection to the senses by way of symbols” is already made in the Prize Essay. (See Parsons (1969), p. 183 of Parsons (1983), or Posy (ed.) (1992), p. 66.) Such a connection is, in fact, made in several forms which I will distinguish and explore. 9. Even in such a focused discussion of the evolution of the Critical views on the mathematical method from the Prize Essay as is found in Carson (1999), the specific relation between the respective discussions of the symbolic methods of arithmetic in 1763 and in the Critical period is not addressed. Carson argues that in the Prize Essay Kant does not explain why definitions in mathematics are “real” and cannot be mere inventions. Whether or not this criticism is correct, it does not affect Kant’s conception of arithmetical definitions. 10. The different aspects of their comparison will call them to our attention as we proceed. 11. Pace Hintikka. 12. The Critique’s qualification that the “sense” [Sinn] and the “significance” [Bedeutung] of mathematical concepts and the standing of the knowledge discussed in the Discipline depends on the claim that the pure intuitions in which these concepts are exhibited are the pure forms of sensible intuitions of objects of possible experience is essential for appreciating the difference between the pre-Critical and the Critical conceptions of the mathematical method and on the relation between definitions and mathematical concepts. See particularly (A240/B299). For a discussion of the significance of this qualification for Kant’s Critical philosophy of arithmetic, see Parsons (1984), reprinted in Posy (ed.) (1992). 13. In the standard traditional source of the Logic of Port Royal, the three moments are “ideas”, “judgments” and “reasoning”. 14. 
A163–165/B204–206. 15. In these much discussed remarks, Kant emphasizes that the epistemological value of “symbolic construction in which one displays by signs in intuition the concepts, especially the relations of quantities” goes beyond its heuristic advantages. 16. Compare Friedman (1992) p. 85. 17. It associates the license for certainty in combinatory operations to our mastery of the notation and command of the operations performed. 18. Ak. 2:279. See also Ak. 2:278. In the context, the “subtraction” included in the detailed list evidently means the elimination of signs or symbols concatenated or otherwise “combined”. 19. Kant’s own contribution to these and related controversies in his early writings as well as the transformations of his views are the subjects of extensive study. These debates, in turn, have had notable impact in propelling the evolution of his pre-Critical thought. 20. For a helpful survey and discussion, see Pycior (1997). 21. This suggestion is made by Gottfried Martin (1985), p. 67. Martin reads Kant in the Prize Essay as “brusquely rejecting” the “approach” presented in “Leibniz’s projection of the ars characteristica universalis, at least in the form in which Kant was acquainted with it through Lambert”. As evidence he cites the passage I quote next, (Ak. 2: 278). The passage does not extend an argument and Martin does not use it to motivate one. Compare New Elucidation of the First Principles of Metaphysical Cognition (Ak. 1:385–416) of 1755 and Concerning the Ultimate Foundation of the Distinction of Directions in Space (Ak. 2:375–83), of 1768. Gottfried Martin reads the Critical passages at A724/B752 as making a similar allusion. 22. I will turn to these remarks and to the lectures in what follows. I use material Kant used in teaching to illustrate and clarify his statements in the published Prize Essay. 
In general, it should be borne in mind that what we have as interpretive evidence is notes for the lectures, and these may not be entirely faithful to them. 23. Pace Martin. See n. 21 above. 24. The clarification of this point and of the basis Kant has for claiming it, will engage us in what follows. 25. The difference is that the correctness of geometrical cognition does, whereas arithmetical cognition does not, depend on the relations among “characteristic marks. . . to be found in the concept”. This will shortly be clarified by turning to the differences in Kant’s conception of the respective notions of definition. 26. The subject of Kant’s conceptions in 1763 of “inner sense”, “inner experience” and the data or evidence that might be derived from them cannot be addressed here in any way beyond what will directly bear on the present topic. But see False Subtlety (Ak. 2:60–1). See also Negative Magnitudes (Ak. 2: 190–3).
27. At Ak. 2:284, Kant says that “die Bedeutung der Zeichen” is certain [sicher], and then at Ak. 2:291 that defined concepts have “keine Bedeutung als die, so ihn die Definition gibt”. The German term translated as “significance” in the Cambridge Edition is, consistently, “Bedeutung”. Kant uses this term in the mathematical cases consistently. For the geometrical case, see especially Ak. 2:276 and Ak. 2:292. The former passage distinguishes what “the concept of a cone may ordinarily signify (bedeuten)” and the significance of the concept of a cone “arrived at” in the mathematical way. The latter passage makes the claim that “in geometry, the signs are similar to the things signified”. Kant in some contexts uses Bedeutung in relation to metaphysical concepts, e.g., Ak. 2:291–2. 28. In what follows I propose that, whereas the account may be sufficient for showing why definitions cannot synthesize — and can therefore not “define” — empty concepts, they fall short of establishing the conceptual character of their product, let alone its mathematical character. To put the matter in Kantian terms, the account of “the universal” purportedly considered in and under the “synthesized” concrete sign is significantly incomplete. It is worth pointing out that a difficulty in accounting for the notion of a number-concept reappears in the interpretation of the philosophy of arithmetic of the Critique. 29. The translators of the Cambridge Edition appeal to A729–730/B757–758 in support of their translation of both “Definition” and “Erklärung” by the English “definition”. But this passage seems to me to go against their claim that “in the Prize Essay Kant uses the terms. . . as synonyms” (n. 6 to the Glossary in Kant (1992b)), if it can be used as evidence for such a matter at all. 30. Some technical delicacy would be expected if the Latin cognates were introduced late in the day into the German philosophical jargon. 31. 
The distinction recovered in this section is relevant to Mirella Capozzi’s analysis in Capozzi (1980), pp. 423–452. I regret that these views cannot be addressed in this paper. 32. There seem to me to be reasons for questioning the basis on which the translators of the Cambridge Edition date the Blomberg Logik around 1770. Briefly, since these are compiled transcripts, the “internal evidence”, for instance, that a passage from the Dissertation is found in them, is not sufficient for a conclusive verdict about the whole batch. The transcripts are attributed to Blomberg because they were found in his possession and his name appears on the title page. The translator’s comments suggest that they came into his possession, but it is not implausible that parts of them originated with him. Blomberg himself had completed his studies in 1764. On this basis, the translators conclude that the presence of later material entails that he could not have been responsible for the transcripts himself. But it is also acknowledged that the lectures on which the transcripts are based are earlier. It does seem to follow that his lecture notes cannot be taken to constitute everything that is included in the compiled text, since the transcripts include Dissertation material. But the evidence taken together, as well as the contents of the lectures, suggests that the transcript compiles material from various periods, including earlier than 1770. 33. Pace Polonoff, who takes Kant to hold that in mathematics definitions play the role of grammatical stipulations. According to Polonoff “the mathematician is expected to say with certainty that whatever his definition did not intend to represent in the concept of the object defined is not included in the concept of the object as far as his discussion is concerned.” (1973, p. 186) But the relevant passage says that “what is not intended to be represented in the object by means of the definition does not belong to that object” (Ak. 2:291). 
Polonoff is, however, clearly aware of the incompatibility of the weaker claim he attributes to Kant with Kant’s claim that the mathematical definition “is the first thought I can have of my object”. He concludes that for Kant in the Prize Essay “the definition is neither true nor false but merely stipulated” (ibid.). 34. “A case in point would be that of a philosopher arbitrarily thinking of a substance endowed with the faculty of reason and calling it a spirit. My reply, however, is this: such determinations of the meaning of a word are never philosophical expositions (Erklärungen). If they are to be called expositions at all they are merely grammatical expositions. For no philosophy is needed to say what name is to be attached to an arbitrary concept. Leibniz imagined a simple substance, which had nothing but obscure representations, and he called it a slumbering monad. But, in doing so he merely invented [erdacht] it.” (Ak. 2:277) 35. Kant’s example is an interesting choice if, as it seems, it is taken from Euclid’s Definition 22 (Elements, Book I) with significant alterations. First, Kant emphasizes the ostensive and singular, and in Euclid the trapezium is one of the few figures whose concept occurs in a definition in the plural. Second, Euclid’s definitions are characteristically stated in subject-predicate form. Definition 22 is this: Of quadrilateral figures, a square is that which is both equilateral and right-angled; an oblong that which is right-angled but not equilateral; a rhombus that which is equilateral but not right-angled; and a rhomboid that which has its opposite sides and angles equal to one another but is neither equilateral nor right-angled. And let quadrilaterals other than these be called trapezia.
36. Compare Beweisgrund (Ak. 2:95–6). 37. In Blomberg Logic § 157, Kant distinguishes between two types of connection between certainty and the senses. “Sensible certainty must be distinguished from certainty of the senses. Sensible certainty. . . does not really arise through actual intuition of objects by means of our senses but rather merely through the fact that we make cognitions distinct, convincing and capable of insight more easily and with less effort”. 38. It is often overlooked that it is not analytic to the “consideration of a concept in concreto” that it be considered in the individual. See, for example, Jäsche Logic § 16, Notes 1 and 2. In the Critique, it is a substantive claim that the concept is exhibited in concreto in intuition. In the Prize Essay, however, it is also in the individual that a concept is considered in concreto. This is what Kant emphasizes by contrasting the consideration of a universal concept merely “through” words but — in the mathematical cases — under signs. 39. It is, of course, one thing to hold Kant to be making a connection between the mathematical use of signs and the senses; it is quite another to hold him to be making such a connection between these signs and sensibility. We cannot pursue either direction here any further. 40. Kant had already rejected the Leibnizian principle of the identity of indiscernibles (New Elucidation of the First Principles of Metaphysical Cognition, Proposition XI, § 2, Ak. 1:409–410) and asserted that “things which are distinguished at least in virtue of place are not one and the same thing at all” (Ak. 1:409; see also Friedman (1992), p. 8, note 11). I do not, of course, consider this sufficient evidence for the supposition that Kant came close to attributing to space the individuation of construction-dependent objects, for this would be very close to holding space to provide the grounds for the claim that constructions can be carried out. 41. Ak. 2:276. 42. 
Beatrice Longuenesse views the Critical contention that in arithmetic “there are no axioms” as arising from a disillusionment with a Leibnizian conception of definitions, which she attributes to the pre-Critical Kant. See Longuenesse (1998), pp. 277–278. Gottfried Martin proposed that the discovery that such proofs as Leibniz’s of 2 + 2 = 4 from definitions in Leibniz (1765; 1996), Book IV, ii § 10, presuppose synthetic principles like the principle of associativity of addition convinced Kant of the syntheticity of arithmetic. But the premise of these views does not seem to find its support in the lectures. 43. Similarly for multiplication, subtraction and division. 44. Kant formulates a few general schematic principles for the operation signs “+”, “−”, “×” and “:”. 45. Leibniz and Wolff are known to have considered the binary system in a certain sense superior to the canonical decimal system with respect to its revelatory character. Kant does not seem to share such a view. See, e.g., remarks quoted in Martin (1985), pp. 90–91. The issues raised by these comparisons are important for the topic of this paper, but they cannot be pursued here. 46. We would only have to introduce the distributive law to deduce this if we define p 0 as p × 10, assuming a number like 165 is defined as 100 + 60 + 5. 47. Assuming that we have a criterion for sameness of numeral-type. 48. There may be an ambiguity in what I just said which I want to dispel. I am not raising the question of the constraints on individuating occurrences of numerical symbols, something that is surely necessary for clarifying the idea of the application of “substitutions and transformations” etc., but a different question. The relevant question is how the distinctness of the symbol might be construed so that we may conceive of the operations with these symbols in accordance with the rules of a specific system of notation as yielding the cognition of the results of arithmetical calculation. 49. 
In the lectures Kant indicates that Einheit is taken as primitive (Ak. 29:1-1, 52). 50. Kant’s pre-Critical usage of “positing” or positiones, associated with definitions by arbitrary combination (e.g., Blomberg Logic § 190, Ak. 230, Beweisgrund, False Subtlety), is used ambiguously in other pre-Critical contexts. 51. See Ak. 2:276, 2:284, 2:291 resp. 52. Compare Blomberg Logic § 143: “. . . to understand something [is] to cognize something distinctly through the understanding”. The Cambridge Edition translators use “understanding” for “Faßlichkeit” as well as for “verstanden”. The former is used in relation to the grasp of “an arithmetical object, which contains an immense multiplicity”. The latter is used for the grasp of the “relation of a trillion to unity” which amounts to the distinct cognition of that immense multiplicity “contained in” the arithmetical object of cognition. 53. This suggests that Kant’s resources for conceiving the mathematical notion of the iteration of a mechanical operation do not by themselves encourage the appeal to the sensible signs, but rather that the basis for the assurance that distinct applications of the same rule to the same argument necessarily yield the same value.
54. That Kant omits mention of counting in the Prize Essay is, of course, one of the pivotal contrasts with the Critical interest in the concept of number, but does not distinguish the Prize Essay from the Critical bounds of the remarks in the Discipline on the subject of symbolic construction. The omission is important for the pre-Critical identification of the arithmetical method with the use of signs within a canonical notation. In the lectures Kant makes clear that numerieren is not counted among the rules of the notation [52]. This underscores his concern with the notation-specific algorithms. 55. Albeit only an ordinal structure. 56. Though not of the plane; then they have to be oriented in the same way.
References All references to Kant’s Critique of Pure Reason are given by the standard pagination according to the original (A for first edition [1781], B for second edition [1787]). All references to Kant’s writings except the Critique of Pure Reason are given by volume and page number of the Akademie edition of Kants gesammelte Schriften (Berlin, 1902–). The English citations of the Critique of Pure Reason follow the translation of Paul Guyer and Allen Wood, in the Cambridge Edition of the Works of Immanuel Kant. All citations of pre-Critical works follow the translation of David Walford in collaboration with Ralf Meerbote in Theoretical Philosophy 1755–1770, in the Cambridge Edition. All citations from Kant’s lectures on logic follow J. Michael Young’s translation in the Cambridge Edition. Works of Immanuel Kant Kant, I. (1902–), Kants gesammelte Schriften, edited by the Royal Prussian (later German) Academy of Sciences (later the Göttingen Academy of Sciences), G. Reimer, subsequently Walter de Gruyter & Co., Berlin. Kant, I. (1992a), Lectures on Logic, translated by J. Michael Young, The Cambridge Edition of the Works of Immanuel Kant, Cambridge. Kant, I. (1992b), Theoretical Philosophy, 1755–1770, translated by D. Walford in collaboration with R. Meerbote, The Cambridge Edition of the Works of Immanuel Kant, Cambridge. Kant, I. (1998), Critique of Pure Reason, translated by P. Guyer and A. Wood, The Cambridge Edition of the Works of Immanuel Kant, Cambridge. Kant, I. (1999), Correspondence, translated and edited by A. Zweig, The Cambridge Edition of the Works of Immanuel Kant, Cambridge. General Bibliography Arnauld, A. and P. Nicole (1996), Logic or the Art of Thinking, edited and translated by J. V. Buroker, Cambridge. Baumgarten, A. G. (1779), Metaphysica, seventh edition, Halle. Beiser, F. C. (1992), “Kant’s Intellectual Development: 1746–1781” in: The Cambridge Companion to Kant, edited by P. Guyer, Cambridge, 26–61. Broad, C. D. 
(1941), “Kant’s Theory of Mathematical and Philosophical Reasoning” in: Proceedings of the Aristotelian Society 42, 1–24. Broad, C. D. (1978), Kant: an Introduction, edited by C. Lewy, Cambridge University Press. Capozzi, M. (1980), “Kant on Mathematical Definition” in: Italian Studies in the Philosophy of Science, edited by M. L. Dalla Chiara, Reidel, Dordrecht, 423–452. Carson, E. (1999), “Kant and the Method of Mathematics” in: Journal of the History of Philosophy 37 (4), 629–652. Frege, G. (1884), The Foundations of Arithmetic, translated by J. L. Austin, Basil Blackwell, Oxford, 1974. Friedman, M. (1992), Kant and the Exact Sciences, Harvard, Cambridge. Grier, M. (2001), Kant’s Doctrine of Transcendental Illusion, Cambridge University Press.
Hintikka, J. (1967), “Kant on the Mathematical Method” in: The Monist 5, 352–375; reprinted in Hintikka, Knowledge and the Known: Historical Perspectives in Epistemology, Reidel, Dordrecht, 1974, 160–183. Hintikka, J. (1973), Logic, Language Games and Information: Kantian Themes in the Philosophy of Logic, Clarendon, Oxford. Kitcher, P. (1975), “Kant and the Foundations of Mathematics” in: The Philosophical Review 84, 23–50; reprinted in Posy (1992), 109–134. Martin, G. (1985), Arithmetic and Combinatorics: Kant and His Contemporaries, translated and edited by J. Wubnig, Southern Illinois University Press, Carbondale. Leibniz, G. W. (1765), New Essays on Human Understanding, translated and edited by P. Remnant and J. Bennett, Cambridge, 1996. Longuenesse, B. (1998), Kant and the Capacity to Judge: Sensibility and Discursivity in the Transcendental Analytic of the Critique of Pure Reason, translated by C. T. Wolfe, Princeton. Mancosu, P. (1996), Philosophy of Mathematics & Mathematical Practice in the Seventeenth Century, Oxford. Parsons, C. (1969), “Kant’s Philosophy of Arithmetic” in: Philosophy, Science and Method: Essays in Honor of Ernest Nagel, edited by S. Morgenbesser, P. Suppes, and M. White, St. Martins, New York, 1969; reprinted, with Postscript, in Parsons (1983); reprinted in Posy (ed.) (1992), 43–79. Parsons, C. (1983), Mathematics in Philosophy: Selected Essays, Cornell University Press, Ithaca, New York. Parsons, C. (1984), “Arithmetic and the Categories” in: Topoi 3, 109–121. Polonoff, I. (1973), Force, Cosmos, Monads and Other Themes of Kant’s Early Thought, Kant-Studien Ergänzungshefte 107, Bouvier, Bonn. Posy, C. (1984), “Kant’s Mathematical Realism” in: The Monist 67, 115–134; reprinted in Posy (ed.) (1992), 293–313. Posy, C. (ed.) (1992), Kant’s Philosophy of Mathematics: Modern Essays, Kluwer, Dordrecht. Pycior, H. M. (1997), Symbols, Impossible Numbers, and Geometric Entanglements, Cambridge. Shabel, L. 
(1998), “Kant on the ‘Symbolic Construction’ of Mathematical Concepts” in: Studies in History and Philosophy of Science 29A (4), 589–621. Schultz, J. (1791), Erläuterungen über des Herrn Professor Kant Critik der reinen Vernunft, Hartung, Königsberg; reprinted in Aetas Kantiana. Bruxelles: Culture et Civilisation, 1968, 160–161; translated by J. C. Morrison and published as Philosophica series 47, University of Ottawa Press, 1995, 159. Schultz, J. (1792), Prüfung der kantischen Kritik der reinen Vernunft, Hartung, Königsberg 1789 (vol. I); 1792 (vol. II). Schönfeld, M. (2000), The Philosophy of the Young Kant: The Precritical Project, Oxford. Thompson, M. (1972), “Singular Terms and Intuitions in Kant’s Epistemology” in: Review of Metaphysics 26, 1972–3, 314–343; reprinted in Posy (1992), 82–101. Tonelli, G. (1959a), “Der Streit über die Mathematische Methode in der Philosophie in der ersten Hälfte des 18. Jahrhunderts und die Entstehung von Kants Schrift über die “Deutlichkeit”” in: Archiv für Philosophie 9, 37–66. Tonelli, G. (1959b), Elementi Metodologici e Metafisici in Kant dal 1745 al 1768, Torino. Wolff, C. (1713), Elementa Matheseos Universae, 2 vols., Halle, 1713–15; English trans. of selected parts by J. Hanna, A Treatise of Algebra, London, 1739. Wolff, C. (1719), Vernünftige Gedanken von Gott, der Welt und der Seele des Menschen, auch allen Dingen überhaupt, Halle; also in Gesammelte Werke, edited by J. Ecole et al., George Olms, Hildesheim, 1960–. Young, M. (1982), “Kant on the Construction of Arithmetical Concepts” in: Kant-Studien 73, 17–46. Young, M. (1984), “Construction, Schematism and Imagination” in: Topoi 3; reprinted in Posy (1992), 159–176.
de Vleeschauwer, H. J. (1939), The Development of Kantian Thought, translated by A. Duncan, London 1962.
THE RELATION OF LOGIC AND INTUITION IN KANT’S PHILOSOPHY OF SCIENCE, PARTICULARLY GEOMETRY
Ulrich Majer
Universität Göttingen, Germany
In his so-called dissertation of 1770 “De mundi sensibilis atque intelligibilis, forma et principiis” Kant put forward for the first time his celebrated distinction between the world of sense and intellect, between sensible intuition and thinking. This distinction, although by no means new and original, became not only the basis of his critical philosophy but also the object of heated disputes about its legitimacy. The reasons for these disputes rest to my mind not so much in the ambiguity of the distinction, but in Kant’s epistemological assertion that the two principal forms of intuition — space and time — not only are, but necessarily must be recognised as among the a priori sources of our empirical knowledge (of nature). This opinion was refuted almost immediately after it had been asserted by the development of the so-called non-Euclidean geometries by Gauß, Bolyai and Lobachevsky, that is by geometries in which the axiom of parallels does not hold, although these geometries are logically consistent. Nonetheless many scientists and philosophers still believe that there is a grain of truth in Kant’s epistemological assertion that we need besides pure thinking (logic) another non-empirical source of knowledge, called pure intuition, in order to understand how empirical knowledge is possible at all. So far so good. The difficult question is, however, what type of intuition it is that can or could serve as an a priori foundation for our empirical knowledge, which has experienced a very rapid growth during the last two centuries. A certain rough, yet unshakable answer can be given almost immediately: Because mathematics1 plays not just a pre-eminent but an indispensable role2 in our scientific understanding of nature, the kind of intuition we are looking for must be related somehow to our knowledge of mathematics.3 Unfortunately, however, this reflection alone does not suffice to determine precisely the type of intuition we are looking for. 
It only indicates that we have to take at least two further considerations into account in order to answer our main question. First, we have to ask: How do we come to know mathematics and the descriptive content of its different branches? Is it just a particular kind of reasoning (as Dedekind and Frege assumed), or do we rely in addition on some special type or form of intuition? Secondly, how is mathematics with its different branches related to our understanding of nature? Is it just a question of trial and error which branch of mathematics has to be applied in the analysis and presentation of certain natural phenomena, or is there a more intimate relationship between mathematics and its different branches on the one hand, and our understanding of nature and its phenomena on the other? Neither question, in my view, has been satisfactorily answered, either by the scientists themselves or by the philosophers of science. In this essay I will concentrate on the first question and postpone the second. I will treat the first question by engaging in a certain debate about Kant’s philosophy of mathematics, in particular his account of geometry and the kind of intuition which he supposes to be the foundation of our geometrical knowledge.
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 47–66. © 2006 Springer. Printed in the Netherlands.
1. Logical versus phenomenological interpretations of Kant
In the last two decades, a discussion has taken place about the correct, or perhaps better, the most “intelligent”, interpretation of Kant’s distinction between ‘sensibilia’ and ‘intelligibilia’ in general and the relation between intuition and logic in particular and their respective role in and influence on Kant’s epistemology of science. Two principal lines of interpretation can be discerned. The first line — put forward mainly by E. Beth, J. Hintikka and M. Friedman — argues that Kant’s concept of logic was so restricted that he needed to suppose (in addition to his narrow notion of logic) a second source of knowledge, the famous outer intuition of space, in order to present geometry in a logical deductive form as Euclid had exemplified. Yet today — and this is, of course, the main point of this line of interpretation — we have a much more powerful form of logic, namely mathematical logic, such that we can avoid any recourse to intuition. I have to say more about this line of interpretation, but first let me sketch the second line.4 The second line has been articulated as an alternative interpretation to the first line by Charles Parsons and Emily Carson. It is best characterised by the observation that geometry cannot be “reduced” to logic, at least not completely, because it has a descriptive content, which is more specific than the content of logically valid sentences (which are ‘true’, semantically speaking, in every correct interpretation of their descriptive vocabulary). In this sense, geometry goes definitely beyond first-order logic and discursive thinking in general abstract concepts, if we understand by the latter any conservative extension of first-order logic by introducing new concepts and relations. Consequently, Kant was right to insist on a second non-logical source of knowledge
(alongside logic and discursive thinking) which ensures us that the sentences of Euclidean geometry are true with respect to something non-logical with which we are acquainted as sensible beings and not just as consequences of a deductive system of axioms. The intricate question is, however, what this non-logical something is in respect to which the truth of geometrical sentences has to be judged. One answer is, of course, straightforward; it’s the empiricist answer that the truth of geometry has to be judged with respect to the results of geometrical measurements of angles, distances, etc. But this is, of course, not Kant’s answer, because this answer, despite its obvious circularity, could only corroborate but never demonstrate the truth of Euclidean geometry.5 For this reason we have to look for something quite different, something which renders the truth of Euclidean geometry intelligible. Kant calls this non-logical something “the form of our outer intuition”. But this is only a phrase, and the difficult point is therefore to spell out what Kant means by this phrase, in particular, what he means by “intuition”. This is the hard core of the recent debate. What I want to do is roughly the following. I will develop an argument in connection with Hilbert’s foundations of geometry which favours the second line of interpretation more than the first. I hope this will help to balance the debate, because I think the present state of the discussion is too much in favour of the first position. To begin with, let me stress that the difference between the two lines of interpretation is not the necessity of intuition in geometry — in this respect both parties agree in their Kant interpretation — but the reason why intuition is necessary on Kant’s view of a sound foundation of geometry. 
According to Friedman, intuition is necessary only because Kant’s conception of logic is too restricted: he cannot deal with sentences of the form ∀x ∃y xRy, in which a universal quantifier governs an existence claim. Such sentences are, however, crucial for an axiomatic, i.e. a deductive presentation of geometry (as I’ll show in a moment). For Parsons and Carson, on the other hand, recourse to intuition would remain necessary, even if Kant had had a more powerful logic than he in fact had. That is to say even if Kant had modern predicate logic available (with two- and more place relations, instead of monadic Aristotelian syllogistic) the recourse to intuition would remain necessary in order to recognise the truth of Euclidean geometry, in particular the truth of the axioms. In other words, the difference between the two lines of interpretation is one of relative versus absolute necessity. The first line deems intuition to be only relatively necessary, depending on the type of logic one uses for an axiomatic presentation of geometry. The second line takes intuition to be absolutely necessary, i.e. independent of the type of logic Kant (or anyone else) uses for an axiomatic presentation of geometry.
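The logical form at issue can be written out explicitly. What follows is only an illustrative sketch in modern notation: the predicate symbols S and P and the betweenness relation Bet are assumed here for the sake of the example and are taken neither from Kant nor from Friedman. The second formula renders an axiom of the kind found in Hilbert's Foundations of Geometry (axiom II 2: for any two points A and C there is at least one point B on the line AC such that C lies between A and B); the clause "on the line AC" is left implicit on the assumption that Bet holds only of collinear points.

```latex
% Universal-existential form: the witness y may depend on x.
\[
  \forall x\, \exists y\; R(x,y)
\]
% Example in assumed notation: Bet(a,c,b) reads "c lies between a and b".
% Beyond any two distinct points a and c there is a further point b.
\[
  \forall a\, \forall c\, \bigl( a \neq c \;\rightarrow\; \exists b\; \mathrm{Bet}(a,c,b) \bigr)
\]
% Contrast: a typical syllogistic (monadic) premise, "every S is P",
% which contains no relation symbol at all.
\[
  \forall x\, \bigl( S(x) \rightarrow P(x) \bigr)
\]
```

In the second formula the point b is not fixed once and for all but varies with a and c. It is this dependence of the existential claim on the universal one that, on Friedman's reading, a purely monadic logic cannot capture.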
2. Friedman’s interpretation of Kant and his notion of intuition
In this section I will deal with the first line of interpretation, more precisely with Friedman’s interpretation of Kant’s view of spatial intuition as outlined in his paper “Geometry, Construction, and Intuition in Kant and his Successors”.6 The second line of interpretation, as I understand it, will then emerge automatically in the next section as soon as I begin to develop my own and, I hope, more reasonable interpretation of Kant in the light of Hilbert’s foundational work in geometry. In his paper Friedman tries to defend his view of Kant’s theory of intuition as a stopgap in Kant’s deductive approach to geometry against some of the objections that were raised by Parsons and Carson (henceforth P&C) in respect of his constructivist interpretation of Kant’s notion of spatial intuition.7 Friedman’s strategy of defence, as I understand it, is roughly the following: First he accepts the criticism of P&C that there is a “second component” in Kant’s notion of spatial intuition, namely the immediate acquaintance with objects given in spatial intuition, which is not captured or sufficiently reflected in his own constructivist interpretation of the role of spatial intuition as a logical stopgap in Kant’s deductive theory of geometry. At this junction I should perhaps be more precise about the central idea behind Friedman’s logical interpretation. Kant introduced the notion of “intuition” according to Friedman in conscious opposition to the notion of “concept” just because Kant recognised correctly that Aristotelian syllogistic is not sufficient for an axiomatic foundation of geometry. In particular, Aristotelian syllogistic cannot deal with quantified sentences of the form ∀x ∃y xRy, in which the existence claim is dependent on the universal claim. Such sentences are, however, indispensable in an axiomatic presentation of geometry. 
Take for example Hilbert’s axiom II 2: “For any two points A and C there is always at least one point B on the line AC, such that C lies between A and B.”8 This is by no means the only example of a universal-existential claim; Hilbert’s Foundations of Geometry is in fact replete with such sentences. Now, if Kant could not handle such sentences in logical deductions (because he had only Aristotelian syllogistic at his disposal), he had to find a substitute in order to close the gap. Hence, Kant introduced — according to Friedman — the notion of “intuition” (in addition to that of concepts), which permitted him to construct singular objects recursively, piece by piece, in the pure intuition of space, instead of making “universal-existential claims”, such as Hilbert did in axiom II 2. In other words, Friedman regards Kant as an inadvertent representative of a constructive in distinction to an axiomatic approach to geometry in the modern sense of this opposition.9 This becomes particularly clear in the following consideration. Friedman refers to a passage in Kant’s handwritten notes (Ak. 20, pp. 419– 421) in which Kant distinguishes between two examinations of space: space as “generated” by geometry, and space as “given originally” in metaphysics.
The interesting point is now that Friedman, although he admits that Kant’s metaphysical consideration of space is “congenial to the phenomenological approach”, he nevertheless favours the geometrical approach as more fundamental and intelligible than the phenomenological approach. The reason for this rather puzzling interpretation is the circumstance, I presume, that Friedman wants to put Kant’s approach to geometry in close parallel to his view on arithmetic, which is a good deal more constructivistic than Kant’s approach to geometry. I think however one should not overemphasise this parallel, because in Kant’s time nobody had any idea what an axiomatic presentation of arithmetic would look like, and for this simple reason Kant was forced (like Leibniz) to take a more “genetic” or “constructive” route in the presentation of arithmetic than in geometry. Kant himself, by the way, was completely aware of this, because he stresses repeatedly that arithmetic has no axioms (because their number would have to be infinite, which makes no sense) but only postulates or “Zahlformeln” (CPR, B206). It would be very tempting to dig deeper into the differences between arithmetic and geometry, but I return instead to Friedman’s line of defense. As already mentioned, Friedman seems first to accept the criticism of P&C that there is a second important aspect in Kant’s notion of intuition, which is not reflected by the role of intuition as a logical stopgap in geometry, namely the remarkable fact that spatial intuition is the only way in which we become immediately acquainted with external objects, whereas concepts lead only to an indirect cognition of objects. Hence, outer intuition (and not concepts) makes us acquainted with spatial features of objects like form, size, dimension, etc. So far at least Friedman seems to agree with the phenomenological interpretation of Kant’s notion of spatial intuition. 
But suddenly, in a rather unexpected move, Friedman begins to attack the phenomenological approach as inferior to his own interpretation, instead of accepting both interpretations as complementary. In order to make this point clear beyond any possible doubt let me quote the decisive passage in full:
So far, then, the basic ideas of the phenomenological approach appear to be vindicated. The crucial question, however, concerns how exactly metaphysical space, “the pure form of the mode of sensible representation of the subject as a priori intuition” [Ak. 20, pp. 419–21], is supposed to accomplish this grounding [of geometrical space]. Is the given infinity of space as a pure form of sensible intuition supposed to be directly seen, as it were, in a simple act of perceptual or quasi-perceptual acquaintance? Are we supposed to have a direct perceptual or quasi-perceptual access to such infinity entirely independent of geometry? Both of these ideas appear to be very doubtful. For we are certainly not perceptually presented with an infinite space as a single given whole; and since the visual field is always finite, it does not even appear to be true that any perceived spatial region is directly given or perceived as part of a larger such region. The idea of independently given phenomenological facts capable of somehow grounding or justifying the possibility of geometrical construction can quickly appear to be absurd.10
After this warning-shot towards the phenomenological position, to the effect that the problem of the infinity of space (and the justification of the corresponding geometrical axioms such as Hilbert’s axiom II 2) is not straightforward, Friedman proceeds in two main steps towards his final goal: the demonstration of the superiority of the logical over the phenomenological interpretation.11 First, Friedman launches his own reading of Kant’s notion of “external intuition” by quoting at length a number of remarkable passages in which Kant explains what he understands by “figurative synthesis”, and how this is linked with the geometrical construction of objects in space such as a straight line or a triangle. The quotations given by Friedman are indeed all very impressive because they show that Kant had a dynamical conception of “figurative synthesis” and not merely a static notion of the representation of external objects.12 But whether the quotations show what Friedman wants them to show, namely, that geometrical space and, hence, geometry as the theory of space is the result of a dynamical process of our imagination, called figurative synthesis, is quite another question, one to which I will return at the end of the paper. In the second half of his paper, Friedman shows that some of the successors of Kant in the nineteenth century, in particular the mathematicians Helmholtz, Lie and Poincaré (HLP), developed a view with respect to the foundations of geometry which is very similar to Kant’s “constructive” approach as presented in the dynamical conception of “figurative synthesis”. This is, in my view, the most interesting and convincing part of Friedman’s paper, because he makes clear that the group-theoretical approach to geometry, as it was developed and ramified by HLP, is in two respects very similar to Kant’s characterization of the “figurative synthesis” as a dynamical process in our imagination.
First, the group-theoretical approach takes “linear transformations” of rigid bodies, i.e. translations and rotations, as the basic operations on which the theory of space has to be built. Quite similarly, Kant takes the “drawing of a straight line” and “the rotating of such a line around a fixed point” as “first constructions”, as he says, on which all other constructions of figures have to be built or to which they have to be reduced. The second point in which the group-theoretical approach comes close to Kant’s dynamical characterization of the figurative synthesis is the problem of understanding the infinity of space. According to the group-theoretical approach the infinity of space is not just a factum brutum of our outer intuition, but the result of an operation which can be repeated without any limit. This sounds very similar to what Kant said about the figurative synthesis of a straight line:
I can represent no line to myself, no matter how small, without drawing it in thought, that is gradually generating all its parts from a point . . . On this synthesis of the productive imagination in the generation of figures is based the mathematics of extension (geometry) together with its axioms, which express the conditions of a priori sensible intuitions under which alone the schema of a pure concept of outer appearance can arise. (A162–163/B203–204)
From this remark it seems to be only one step to the conclusion that the infinity of space is, like the infinity of the sequence of natural numbers, the result of an iterated operation, a successive synthesis in imagination, which can be repeated, at least in principle, without limit. This is basically, I suppose, what Friedman suggests when he relates Poincaré’s approach to arithmetic to the group-theoretical approach to geometry, because one aspect of the analogy is quite certain. The infinity of the number-sequence rests, according to Poincaré, on the principle of simple induction, the “immer noch eins” as Weyl calls it, and this iteration is an intuition par excellence, which cannot be reduced to pure logic. Consequently, if the same iteration is at work in the figurative synthesis of the productive imagination, then it is almost certain that the infinity of space does not rest on an immediate perceptual intuition, as the phenomenological view supposes, but is the result of a constructive procedure, which generates all geometrical objects piece by piece, instead of perceiving them at a single glance. In a final move Friedman links the group-theoretical approach of HLP to his own dynamical interpretation of Kant and his notion of (geometrical) space as the result of an iterated construction. By this, Friedman suggests, he has shown two things.
First, he has shown that the dynamical interpretation is superior to the phenomenological one in at least two respects: (1) it makes intelligible all those passages in Kant’s texts in which Kant obviously favours a more constructive or genetic approach, not only with respect to arithmetic but also with respect to geometry; (2) it does not leave the infinity of space a mere “mystery” (as does the phenomenological approach, which supposes that we can somehow perceive the infinity of space in an act of contemplating the pure form of outer intuition) but makes it the result of an inner process, an iterative synthesis, which is still “rational”, insofar as it is constructive, although it transcends the capacities of our conceptual thinking. Second, Friedman thinks he has shown that, and in what sense, a recourse to a pure form of outer intuition was not only necessary in Kant’s time but also reasonable and justified from a more modern point of view, although it could be dismissed in principle as soon as a more powerful logic than Aristotelian syllogistic was available. Kant’s successors, Helmholtz, Lie, and in particular Poincaré, had made it evident that the role of our outer intuition is a constructive one in the dynamical sense that certain groups of transformations generate spaces of constant curvature, spaces in which rigid bodies can move freely in every direction. In other words, “free mobility of rigid bodies” is the fundamental intuition on which geometry has to be built, and this is precisely what Friedman maintains at the end of part I:
The spatial intuition grounding the axioms of geometry is fundamentally kinematical, in my view, and it is expressed in the formal structure of translational and rotational motion (in modern terminology, the structure of the group of Euclidean motions).
Now let’s come to the main question: How shall we judge this retrospective interpretation? What should we think of this chain of systematic reasoning? It is, of course, very enchanting and very interesting as a reconstruction of the development of geometry and its philosophical implications since Kant. But is it also compelling as an interpretation of Kant’s philosophy regarding the epistemological foundations of geometry? Is Kant really an inadvertent but nonetheless pure constructivist, as Friedman’s interpretation suggests? Precisely in this respect a number of doubts and objections have been raised. What I intend to do in the second half of this paper is to add just another objection — a rather unexpected one, I believe — and to use its systematic core in support of the phenomenological interpretation.
3. In favour of the phenomenological interpretation
In order to prepare the ground for my objection, I have to point out that the story of the development of geometry after Kant, as it is told by Friedman, is by no means the only story one can tell; it is not even the most important and interesting one. There are indeed many different stories, or to express my point more appropriately, the one real history of modern geometry since Kant has many more facets and is a good deal more complex than Friedman suggests. In fact, if there is a lesson to be drawn from the history of modern geometry then it is exactly this: that there are several competing schools, different approaches, many distinct points of view, and last but not least that some schools and traditions died out and new ones emerged. This is not the place to recapitulate the history of modern geometry,13 but I have to draw a few broad strokes in order to make plausible my point that there exists another approach favouring the phenomenological interpretation. The nineteenth century witnessed one of the most complex developments of geometry since Euclid, if not the most complex. The reasons for the complexity have to be traced back to the eighteenth century. In it, two schools of geometrical research dominated the development: analytical and projective geometry. This promoted towards the end of the eighteenth century a reconsideration of the problem of parallels, which then led to the discovery and exploration of non-Euclidean geometries. So far, I think, the history is well-known. What is not so well-known is the fact that afterwards several rather different research programs emerged regarding the unification and foundation of the different branches of geometry. Of these we have to consider in the present context only the two most prominent and fundamental ones: (1) the so-called “Erlanger Programm” of Felix Klein and (2) the axiomatic approach by Pasch, Hilbert and the “Italian school”.14 These two research programs (or schools) stood, the friendly relation between Klein and Hilbert notwithstanding, in strong competition or even opposition to each other. And it is this opposition that I have to explain in order to make the phenomenological interpretation of Kant more intelligible.
The “successors” of Kant whom Friedman presents as standing in close relation to Kant’s constructive approach to geometry all belong, of course, more or less immediately to the Erlanger Program of F. Klein: Helmholtz, having studied Riemann’s work, opened the door, Klein designed the program, Lie executed it to a large extent, and Poincaré unveiled its philosophical implications. This is not the place to analyze the historical relations between the members of this school; it suffices to know that they all favoured a purely group-theoretical approach to geometry, based on the notion of “free mobility of rigid bodies”. But what about the competing program? How could the members of the axiomatic approach sympathise with the phenomenological interpretation taking space as (more or less immediately) given in intuition, as they in fact did? How could they support such an obsolete view? Was it not one of the cornerstones of Hilbert’s Foundations of Geometry that intuition plays no role in the axiomatic presentation of geometry? Before I explain why Hilbert’s position in fact favours the phenomenological interpretation much more than it does Friedman’s dynamical reconstruction of Kant, I have to clear away some popular prejudices regarding Hilbert’s axiomatic approach, prejudices which have prevented a clear comprehension of what Hilbert had achieved in geometry. Next, I’ll compare the group-theoretical with Hilbert’s axiomatic approach and show in what sense the latter penetrates more deeply into the epistemological foundation of geometry than the former. Once this is clarified, the essential argument in favour of the phenomenological interpretation can be easily stated. Finally, I will indicate where Friedman went wrong (in my view) in his interpretation of Kant’s philosophy of space and his corresponding meta-theory of geometry.
By far the worst prejudice is the opinion that Hilbert was a formalist in the sense that one can choose the geometrical axioms arbitrarily, “at will” so to speak; the only constraint on this kind of freedom is the requirement of consistency: that the axioms do not contradict each other or, more properly speaking, that the resulting axiom system (including a deductive logic) does not entail any contradictions. This opinion is reinforced by a second prejudice, which is almost as absurd as the first: We have the freedom of choice in geometry just because geometrical expressions like “point”, “straight line”, and “plane” have no meaning in the sense that these expressions do not denote anything specific which is determined independently of our will. We have the freedom to determine their meaning as we like, and we achieve this by choice of an appropriately designed system of axioms, “axioms” which are taken as “implicit definitions” of the respective geometrical expressions. That this is not Hilbert’s point of view should be evident to anybody who has ever taken a serious look into Hilbert’s book Grundlagen der Geometrie. Unfortunately, however, Hilbert did not always express his point of view regarding geometry clearly and unambiguously. Frequently he wanted to stress the new, exciting aspects of his axiomatic approach, which seduced him, quite naturally, into underemphasising the traditional Euclidean aspects of his axiomatic approach. But as we know from the lectures which immediately preceded the publication of the Grundlagen, he was a great admirer of Euclid and had no intention at all of abandoning Euclid. On the contrary, he wanted to improve Euclid’s Elements by supplementing Euclid’s approach with a new systematic means to prove the logical independence of axioms and other metalogical properties of axiom systems such as consistency and completeness. This means is the famous “axiomatic method”. I’ll turn to it in a moment.
But first let me point out (in a somewhat dogmatic fashion) what Hilbert’s point of view is by introducing a terminological distinction which cannot explicitly be found in Hilbert’s writings, at least not in full strength, although it is implicitly present. For Hilbert, geometry has a descriptive content, and this content has to be presented somehow, either in a “genetic” or in an “axiomatic” form. Hilbert prefers, for reasons which I’ll discuss later, the axiomatic form. Every such presentation of a content I will call an “axiomatic presentation” in clear and sharp distinction to the term “axiomatic method”, which denotes, as its name says, a method of inquiry into the logical relations among the axioms of an axiom system. An axiomatic presentation of a certain content has to fulfill, of course, certain requirements such as logical consistency, “material” adequacy and logical transparency, to name only the most important. Of these, only the first and third requirements can be investigated by means of the axiomatic method, whereas the second, the material adequacy of the presentation of a given content, has to be judged on a completely different ground. Which ground this is and in what it consists is in the case of geometry a difficult question, a question which I will come back to in the next section. But first I have to point out what the axiomatic method is (in distinction to the axiomatic presentation of a given content).
In which way, for example, can it be used to prove the logical independence of the axiom of parallels from the remaining axioms of Euclidean geometry? If we can answer this question, we have understood the central way in which Hilbert improved upon Euclid. The basic idea is this: the axiom of parallels is replaced by one of its two negations; if the resulting axiom system has a “model” (in the modern technical sense of the term), then the new system is relatively consistent. If that is the case, then the axiom of parallels is logically independent of the remaining axioms of Euclid, just because there is a “world”, in which its negation (together with the other axioms of Euclid) is valid.15 In other words the axiomatic method is a deliberate variation of a system of axioms in order to see whether a certain sentence is logically independent of the other sentences of the system in question. Although it cannot actually be separated from the axiomatic presentation of a content, because it is the optimal procedure for enhancing the logical transparency of an axiomatic presentation, it has to be distinguished painstakingly from the presentation itself in order to avoid total confusion between means and goal. Having clarified the worst misconceptions regarding Hilbert’s axiomatic approach to geometry let us turn to the main question in the present context: What
is the ground of our geometrical knowledge — insofar as not only logic and conceptual thinking are involved? What’s the source of our knowledge, if we know, let’s say, Pasch’s axiom?16 Hilbert’s principal answer is straightforward: it is our “spatial intuition”, or to quote Hilbert’s formulation in the introduction to the Grundlagen: Setting up the axioms of geometry and investigating their interconnections is a task that has been discussed in numerous excellent treatises in the mathematical literature since Euclid. This task amounts to a logical analysis of our spacial intuition.17
I admit that this formulation is not as clear and distinct as one would wish, but this is no reason to ignore it completely, as most modern interpreters of Hilbert’s axiomatic approach to geometry do. Instead, one should try to clarify it and to make it more precise. In order to begin with the obvious, let me stress that Hilbert’s task of “establishing the axioms of geometry” is precisely what we have called the “axiomatic presentation of geometry” in conscious distinction to the axiomatic method, which is nothing but a tool for performing the “logical analysis of our spatial intuition”. This means that the task of “establishing the axioms of geometry” according to Hilbert’s point of view, has to have recourse to our “spatial intuition” and to analyze this intuition logically by means of the axiomatic method in order to recognise the logical connections between these axioms. So far, I think, there can be no doubt or debate about Hilbert’s intentions. The decisive and difficult question however is: What does Hilbert mean by “spatial intuition”? In the rest of this paper, I’ll try to give an answer to this subtle question and argue that it is more in favour of the phenomenological than the constructive approach. In a first preparatory step I’ll compare Hilbert’s axiomatic point of view, i.e. his logical analysis of our spatial intuition (as presented in the Grundlagen) with the group-theoretical approach of HLP.
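The independence argument described earlier in this section can be put schematically as follows. This is a sketch in modern model-theoretic notation, which is not Hilbert's own: the symbols E^- (the remaining axioms of Euclidean geometry), P (the axiom of parallels), M (a model) and T (the background theory in which the model is built) are all assumptions introduced here for illustration.

```latex
% Requires amssymb for \nvdash.
% Step 1: replace P by its negation and exhibit a model of the varied system.
\[
  \mathcal{M} \models E^{-} \cup \{\neg P\}
\]
% Step 2: by soundness, whatever is derivable from E^- holds in every model of E^-.
% Since P fails in M, P is not derivable from the remaining axioms.
\[
  \mathcal{M} \models E^{-} \cup \{\neg P\}
  \;\Longrightarrow\;
  E^{-} \nvdash P
\]
% Relative consistency: if M is constructed inside a theory T,
% the varied system is consistent provided T is.
\[
  \mathrm{Con}(T) \;\Longrightarrow\; \mathrm{Con}\bigl(E^{-} \cup \{\neg P\}\bigr)
\]
```

Read this way, the "world" in which the negation of the axiom of parallels is valid is the model M. Independence in the full sense also requires that the negation of P is not derivable from E^-, which is witnessed in the same manner by a model of the original system.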
4. Hilbert’s axiomatic versus the group-theoretical approach of HLP
Fortunately, Hilbert himself prepared the ground for a profound comparison of the two approaches, which reveals their different epistemological stances. This facilitates my task considerably. In a long, sophisticated and ingenious essay, “Über die Grundlagen der Geometrie” (GG) of 1902, which has been ignored almost totally by philosophers (some remarkable exceptions notwithstanding18), Hilbert points out the fundamental differences between his and the group-theoretical approach. The opening section of this essay is worth quoting in full, because it contains a sketch of the essential points of divergence between the two approaches. Hilbert begins the essay with a very careful and polite but in the end fundamentally critical remark:

The investigations of Riemann and Helmholtz on the foundations of geometry prompted Lie to attack the problem of an axiomatic treatment of geometry by using the group concept as starting point, and they led this astute mathematician to a system of axioms of which he proved, on the basis of his theory of transformation groups, that they are sufficient for the development of geometry.
This was the polite part. Hilbert goes on:

Now, in the development of his theory of transformation groups Lie always assumed that the functions which define the groups are differentiable, and thus it remains undiscussed in Lie’s work whether the assumption of differentiability is indeed unavoidable when we ask what ought to be taken as axioms for geometry, or rather whether it appears that the differentiability of the functions concerned is just a consequence of the group concept together with the other geometrical axioms. Because of his way of proceeding, Lie is also compelled explicitly to adopt the axiom that every group of motions is generated by infinitesimal transformations. These requirements . . . can be expressed by purely geometrical means only in a contrived and complicated way, and moreover they appear to be required only by the analytic method used by Lie rather than by the problem itself. (In Hilbert (1899), Anhang IV, pp. 178–179.)19
The politeness in the first part of the remark cannot obscure the fact that Hilbert’s attitude towards Lie’s group-theoretical approach to geometry is highly critical. Nonetheless, two comments seem to be appropriate in order to avoid any misunderstanding. First, and most important, the remark is not so much an attack on the use of group theory in geometry, but a critique of the intuition underlying the approach. This becomes clear if one realises what Hilbert does in the paper. He shows that one can set up a system of axioms “which likewise rests on the concept of group, but which entails only simple and geometrically perspicuous assumptions and in particular in no way presupposes the differentiability of the functions that mediate the motions.” In other words, the whole idea of basing geometry on the intuition of differentiable transformation functions, namely infinitesimal rotations and translations of rigid bodies, presupposes from a strictly logical point of view much too much, and should be replaced by an approach which avoids the idea of motion [as a primitive concept of geometry] altogether and works instead with “pure geometrical” concepts and relations. This is exactly what Hilbert does in the paper, because he uses — besides group theory — only Jordan’s theorem according to which any continuous, closed curve in a plane without ‘Doppelpunkte’ (points of intersection) separates the plane into an inner and an outer domain. Before I go on to explain what Hilbert means by “geometrically perspicuous assumptions” (and why the differentiability of the transformation functions is not such an assumption), let me make a second remark. 
The “kind of representation” of geometry by means of group theory which Hilbert developed in his essay by no means represents his own axiomatic point of view as it is presented in the book Foundations of Geometry, but is already a compromise between the group-theoretical approach of Lie and his own axiomatic point of view, which includes (this is important to note) a logical analysis of our spatial intuition. This is made crystal clear by the concluding remark of the essay, in which Hilbert stresses the crucial difference between the “foundation [Begründung]” of geometry in the essay and that in the monograph.20 In the essay, the assumption of continuity is put in the first place before all other axioms. In the monograph, the order is exactly the opposite. Here the two axioms of continuity are quite consciously the last ones, so that one can recognise which of the well-known sentences of elementary geometry are logically independent of the assumption of continuity and which are not, e.g., Pascal’s theorem. Furthermore, only the (two) axioms of continuity turn the preceding axiom system into a “complete” theory, as Hilbert calls it, i.e., a theory whose domain of objects cannot be expanded while consistently maintaining all the other axioms. In modern terms, the axioms of continuity turn geometry into a “categorical” theory that has (up to isomorphism) exactly one model.

Bearing both remarks in mind, let us come to the question of what Hilbert means by “geometrically perspicuous assumptions” or, to put it more explicitly: Why is the (kind of) intuition underlying the group-theoretical approach of HLP not the best choice from Hilbert’s perspective? The answer to this question depends, of course, on the goal that one wants to achieve with an axiomatic presentation of geometry. If one wants to reveal the close inner relationship between geometry and physics (in particular with the kinematics of pre-relativistic physics) then the motions of rigid bodies, their infinitesimal rotations and translations, are presumably the best choice for an axiomatic presentation of geometry.21 But if the goal is a totally different one, if the goal is an analysis of the logical dependence and independence of geometrical concepts and sentences, as in the Grundlagen, then the best choice is a different one.
We have to strip off all non-geometrical properties and relations and concentrate our logical analysis on the “pure” geometrical facts. But which are the pure geometrical facts? Hilbert’s answer is straightforward: the geometrical facts are the facts of our “spatial intuition”, as he calls it. But this answer seems to be question-begging. Consequently, I have to explain why it is not, or to put it differently, I have to point out what Hilbert understands by spatial intuition in such a way that it becomes evident that motions (of rigid bodies) do not belong to geometry proper.
5. Hilbert’s notion of spatial intuition and the phenomenological interpretation of Kant
The task of making Hilbert’s notion of “spatial intuition” explicit is simultaneously easy and difficult: easy, in that only the structure of our spatial intuition is at stake; but difficult, if we wish to reveal the real reasons why Hilbert held such an apparently antiquated view, why he was not convinced that the HLP approach was the more appropriate, and therefore also the more modern, view. Regarding the question of the structure of our spatial intuition, the answer can be stated in one sentence: the form or structure of our spatial intuition is the same as the structure of the (infinite) 3-dimensional Euclidean space, which is (up to isomorphism) the unique model of Euclidean geometry if, as Hilbert assumes, Euclidean geometry includes the “Vollständigkeitsaxiom”. This is in perfect harmony with Kant’s view that the form of our outer intuition is the 3-dimensional space of Euclidean geometry. But there is, of course, a methodological difference: whereas Kant was acquainted only with Euclidean geometry22 and, hence, had no choice but to take Euclidean space as the form of outer intuition, Hilbert was in a completely different position: he, of course, knew a large number of models of all “kinds” of geometry (non-Euclidean, non-Archimedean, etc.) and could therefore choose among different models as representing the form of our outer intuition. Taking this into account, his choice of Euclidean space as the canonical form of our spatial intuition is much more astonishing than Kant’s.

This brings me to the more difficult task of explaining the reasons for Hilbert’s apparently outmoded choice. In his later writings Hilbert became a bit more explicit regarding the role of spatial intuition in the foundations of geometry than he was in his monograph Grundlagen der Geometrie of 1899. As I understand these later writings, Hilbert presents at least three distinct reasons for basing geometry on intuition, and in particular for taking Euclidean space as the structure of our outer intuition. The reasons are very different in style and character, but each can be taken as an implicit argument for not following the HLP approach and, hence, as an argument against Friedman’s constructive Kant interpretation.

The first reason concerns the question of the validity or invalidity of Euclidean geometry. At first glance, this question seems so simple that one may wonder what it has to do with our main concern of what kind of intuition geometry should be based on, if it is to be based on intuition at all, and not simply taken as an empirical matter.
In another contribution to this volume, written in collaboration with Tilman Sauer (in Part II of this volume), we deal with this question at length, and show that it is much trickier and subtler than the stubborn empiricist believes. Here it must suffice to remind the reader that at least until 1919 the question of the objective validity of Euclidean geometry was not settled, because the observations were not sufficiently precise to refute Euclidean geometry. Why is this relevant for our present concern? The answer is simple. If the observational means were not sufficiently precise to refute Euclidean geometry, then our knowledge of geometry (be it Euclidean or another one) must have a source different from mere observation and measurement. This “non-empirical” source,23 by which we have more or less “immediate” contact with external objects like houses and trees, is called “spatial intuition” by Hilbert, and on it rests our knowledge of geometry, not only in everyday life but also in science. So much for the role of intuition in geometry. But why should we not follow the HLP approach and take our awareness of motions of rigid bodies as the primary intuition on which geometry should be built? Let me summarise the argument of my other paper: Hilbert shows that Poincaré’s so-called conventionalism with respect to geometry is the result of a conceptual confusion, namely, taking as simple what, in fact, is more complicated. What is Hilbert’s argument? It is this: “free mobility” of rigid bodies is no presupposition for the application of geometry to the external world. Consequently the alleged “freedom of choice” among different “spaces of constant curvature”, and the inherent possibility of choosing Euclidean geometry as the simplest case, is nothing but a conceptual illusion. The real reason for the choice of Euclidean space is this: the observed deviations from the structure of Euclidean space are so incredibly small that nobody could conceive how we could ever improve the precision of our instruments in such a way that we could measure these minute differences. Poincaré’s view that it is simpler to stick to Euclidean geometry and to explain possible deviations by introducing appropriate force-functions is conceptually mistaken, because to explain such deviations by the introduction of force-functions is in fact much more complicated than to have no need for an explanation at all, as is the case if we are prepared to choose another, non-Euclidean geometry. (For more details, see the paper by Majer and Sauer in Part II of this volume.) It is important to note that Hilbert’s critique of Poincaré’s conventionalism, and of the latter’s adherence to free mobility of rigid bodies as a methodological presupposition for the application of geometry to the external world, is not totally new. Already in 1901, in his lecture before the Royal Academy of Science in Göttingen, Hilbert had criticised Lie’s group-theoretical approach to geometry for introducing “superfluous” elements:

On the basis of Riemann and Helmholtz Lie set up a system of axioms which differs fundamentally from those systems that are developed according to the Euclidean model. Lie’s axioms contain function-theoretic parts since he requires motion to be expressed by differentiable functions. . . .
The question arises whether the function-theoretic components are only necessary because of the desire to apply this (group-theoretic) method, or whether they are foreign to the subject matter itself and are thus superfluous. It turns out that in fact they are. Thereby we once again draw closer to the old Euclid, insofar as we don’t need to impose the additional infinitesimal properties on the concept of motion which Lie still thought necessary. Instead, the elementary postulates which are already contained in the Euclidean concept of congruence suffice, a concept with which we are all familiar, due to the theorems about the congruence of triangles known from school.24
There are two further reasons for Hilbert to base geometry, in principle, on spatial intuition in general and on Euclidean space in particular. In his later lectures on geometry (after 1899 until 1927) he always points out that his way of doing and presenting geometry is neither analytic nor synthetic but different from both. If we understand the reasons for this “neither-nor” we see much better why Hilbert does not follow the HLP approach, but is much closer to the phenomenological view according to which our spatial intuition is the unlimited, all-embracing form of extension, through which we (more or less) immediately become aware of external objects. Consequently, we have to ask: what are the reasons Hilbert has for rejecting not only analytic but also synthetic geometry? Let’s begin with analytic geometry.
In analytic geometry one presupposes the concept of numbers and the laws of arithmetic in order to move as quickly as possible to the different applications of geometry in the quantitative sciences. But this is exactly the reverse of what Hilbert wants to achieve: namely, a detailed logical analysis of the deductive relations between the principles and theorems of geometry. Such a logical analysis is seriously hampered (if not totally blurred) by introducing numbers into geometry right at the beginning, as is done in analytic geometry. Furthermore, numbers don’t intrinsically belong to geometry;25 they are, in fact, rather different from points on a line: each number has its own unmistakable character; almost exactly the opposite is true of geometrical points; they are as mere points indistinguishable, and can be discerned only by their relative position; numbers in geometry are like intruders from an alien star, the star of “counting and calculating” or, more generally, “recursive” reasoning. This means that it has to be shown that and how they can be incorporated into geometry without contradiction. For this reason, numbers should be introduced into geometry after and not before the logical analysis of spatial intuition has been accomplished, i.e., at the end and not at the beginning of an axiomatic presentation of geometry. The fundamental reason for this is that the two kinds of “infinity” are conceptually distinct, and rest on fundamentally different intuitions: The infinity of the natural number sequence is rooted in the “immer noch eins”, as Weyl calls the recursion which is the foundation of counting. 
The infinity of space, on the other hand, has nothing to do with such inductive procedures, but is rather the expression of a “part-whole” relation of our spatial intuition: every straight-line segment is part of a larger straight-line segment, or to put the same point another way: every straight-line segment can be expanded ad infinitum, i.e., in principle without any limit. In order to recognise that this relation has absolutely nothing to do with induction, we express it in a way that shows it has no inner connection with time: to any pair of points A, B on a straight line α there exists another point C on α, which does not lie between A and B. Although this axiom, first stated by Pasch, is not sufficient to prove the infinity of straight lines, it is nonetheless the core of it. We need only the axiom of Archimedes that a fixed segment of a straight line has an invariable length (speaking intuitively), in order to obtain the desired result. These reflections have, I think, made it clear that the assertion that the infinity of space rests on the same kind of induction as that from which the infinity of the number sequence results, as Friedman appears to believe, is purely wishful thinking. This has, of course, consequences for his Kant interpretation, to which I’ll come in a moment. First let me discuss Hilbert’s rejection of synthetic geometry.

Why does Hilbert not accept synthetic geometry? This ought, it would seem, to accord far more closely with his fundamental view of geometry. Hilbert’s reasons for rejecting synthetic (and not just analytic) geometry are much more subtle and, hence, more intricate to explain. The subtlety has to do with an implicit switch in the methodological point of view. In synthetic geometry spatial intuition, or an aspect of it (like operations with ruler and compass), is used to “construct” geometrical objects like lines, circles, triangles, etc. But this is not Hilbert’s point of view, at least not in an axiomatic presentation of geometry. He, instead, wants to analyze our spatial intuition, not to use it. This is important. It means that he makes intuition the object of his investigation, not the instrument, as in synthetic geometry. This change in perspective has dramatic consequences, quite similar to those changes in physics, when the instruments themselves were made the objects of physical inquiry, as happened to some extent in the theories of quantum mechanics and special relativity. The most dramatic consequence is, of course, that Hilbert does not rely, and does not need to rely as in synthetic geometry, on the validity of our spatial intuition. More precisely, he doesn’t need to rely on the unlimited validity of our spatial intuition in the sense that it represents up to an arbitrary degree of precision the objective structure of physical space. There may be utterly small differences between both structures, which have not been, and (as far as we know) cannot be, accounted for in the sense that we can find causes for these differences in the form of ‘special forces’ or such like. This gives us the freedom to study our spatial intuition for its own sake, just as we do with other faculties of our mind, like thinking and reasoning, which we can also make, and have made, an object of inquiry. The method of inquiry is even the same in both cases: namely, a deliberate “variation” of the basic principles, in the case of geometry, the axioms of (Euclidean) geometry. By means of the axiomatic method we can bring the sentences of geometry into a logical order, an order reflecting the deductive strength.
In this way we recognise which axioms are more central, and which more peripheral. This order should reflect the robustness of our spatial intuition.

Let me conclude with a remark regarding Friedman’s constructive interpretation of Kant. First, Friedman points out that Kant distinguished between two examinations of space, a metaphysical and a mathematical one. Next, he concedes that Kant’s metaphysical examination is “congenial to the phenomenological approach” of P&C. Finally, he nonetheless sides with the mathematical examination, because in the light of the HLP approach Kant’s “explanation” of the infinity of space by “figurative synthesis” makes more sense than the metaphysical explanation of the phenomenological approach. However, I think that Hilbert’s rehabilitation of the Euclidean view, which does not invoke “free mobility” of bodies, shows that the metaphysical explanation of the infinity of space is at least as reasonable as that in the constructive approach. I suspect that the deeper reason for Friedman’s rejection of the phenomenological approach has to do with his tendency throughout the paper to assimilate spatial intuition to spatial perception and vice versa. This, I think, is a serious mistake. But this is another topic, one that needs careful examination.
Notes

1. By “mathematics” I mean here not only arithmetic and analysis, but also geometry and topology, as well as all other branches of modern mathematics, like algebra and probability theory.
2. This has been questioned by some philosophers, such as Hartry Field. This position is, however, to my mind completely untenable and misguided. Field holds that numbers are not genuinely part of nature, but an “invention” of the human mind and, for this reason, have in the end to be eliminated from any description of nature. But this has to be realised in a completely different way, not by dispensing with mathematics, but rather by analyzing its relation to nature.
3. Here (and in the rest of this essay) I take it for granted that mathematics, in each of its branches, is not a purely logical but a substantive discipline, with a descriptive content going beyond first-order predicate logic and its conservative extensions.
4. This is more difficult, because this line was first articulated as a distinct position in reaction to the first line, although it is presumably much older than the first line. Indeed, it is almost the original interpretation, or something close to it, as I will show.
5. The distinction between corroboration and demonstration is meant to capture Kant’s distinction between contingent empirical and necessary universal truths.
6. This paper was not yet published when I presented my “second thoughts” on it for the first time at the conference “Intuition in Mathematics and Science” in Montreal in 1999. It has meanwhile been published as Friedman (2000).
7. See Carson (1997) and Parsons (1992). The characterization of Friedman’s view as “constructivist” is my own. P&C don’t use it, although it reflects Friedman’s Kant interpretation very well, as I am going to show. They restrict their critique to the point that Friedman’s interpretation of Kant’s notion of intuition is too one-sided and propose instead to supplement it by a more ‘contemplative’ view of intuition. This is again my term, and I’ll later explain the difference (respectively, contrast) the two expressions are intended to capture.
8. “Zu zwei Punkten A und C gibt es stets wenigstens einen Punkt B auf der Geraden AC, so daß C zwischen A und B liegt” (Hilbert (1899)). Lehrsatz 8 in Pasch’s Vorlesungen über neuere Geometrie (1882) has the same logical form. In fact Hilbert took Pasch’s Lehrsatz 8 and made it his Axiom II 2 in the Foundations of Geometry.
9. There is a striking resemblance between Weyl’s criticism of the axiomatic approach and Friedman’s Kant interpretation. Both take universal-existential sentences to be the core of the problem. The difference, however, is that Weyl regards them as senseless if they are not supported by a construction according to a corresponding law, whereas Friedman seems to think that Kant would have accepted them, if he had had modern logic. In any case, Friedman does not recognise that Kant might have rejected universal-existential sentences for reasons very similar to Weyl’s. See Majer (1988).
10. Friedman (1999), p. 189.
11. For the sake of brevity, I have to omit the details in the exposition of Friedman’s position and to concentrate on the main points in his sophisticated chain of reasoning.
12. Imagination and representation are, according to the quotations, two quite distinct capacities of our mind.
13. There are many good books on this subject; let me mention just three: Kolmogorov and Yushkevich (1996), Hartshorne (2000), and Scriba and Schreiber (2002).
14. For the sake of simplicity, I will neglect the Italian school of geometry and concentrate on the development in Germany.
15. The burden of proof of the logical independence of a sentence is, as one can infer from the above formulation, transmitted to the proof of the existence of a “model”. The inauguration of something like model theory is one of Hilbert’s achievements.
16. Pasch’s axiom says that a straight line cutting one of the sides of a triangle also cuts one of the two opposite sides of the same triangle if the trivial yet necessary conditions are fulfilled that the straight line and the triangle lie in the same plane, and the line does not pass through the third vertex.
17. “Die Aufstellung der Axiome der Geometrie und die Erforschung ihres Zusammenhanges ist eine Aufgabe, die seit Euklid in zahlreichen vortrefflichen Abhandlungen der mathematischen Literatur sich erörtert findet. Die bezeichnete Aufgabe läuft auf die logische Analyse unserer räumlichen Anschauung hinaus.”
18. To my knowledge, Husserl was one of the few philosophers who took notice of Hilbert’s essay and recognised its epistemological significance; see my forthcoming paper Majer (2003).
19. “Die Untersuchungen von Riemann und Helmholtz über die Grundlagen der Geometrie veranlaßten Lie, das Problem der axiomatischen Behandlung der Geometrie unter Voranstellung des Gruppenbegriffs in Angriff zu nehmen, und führten diesen scharfsinnigen Mathematiker zu einem System von Axiomen, von denen er mittels seiner Theorie der Transformationsgruppen nachwies, daß sie zum Aufbau der Geometrie hinreichend sind. Nun hat Lie bei der Begründung seiner Theorie der Transformationsgruppen stets die Annahme gemacht, daß die die Gruppe definierenden Funktionen differenziert werden können, und daher bleibt in den Lieschen Entwicklungen unerörtert, ob die Annahme der Differenzierbarkeit bei der Frage nach den Axiomen der Geometrie tatsächlich unvermeidlich ist oder ob die Differenzierbarkeit der betreffenden Funktionen nicht vielmehr als reine Folge des Gruppenbegriffes und der übrigen geometrischen Axiome erscheint. Auch ist Lie zufolge seines Verfahrens genötigt, ausdrücklich das Axiom aufzustellen, daß die Gruppe der Bewegungen von infinitesimalen Transformationen erzeugt sei. Diese Forderungen, . . . lassen sich rein geometrisch nur auf recht gezwungene und komplizierte Weise zum Ausdruck bringen und scheinen überdies nur durch die von Lie benutzte analytische Methode, nicht durch das Problem selbst bedingt.”
20. That this final remark is crucially important for a correct understanding of the essay (that the essay doesn’t represent Hilbert’s position) is made clear by the fact that in the first reprinting of the essay (in the second edition of the monograph), Hilbert points immediately to the final remark in a footnote to the title.
21. Even this assumption can be debated in the light of the General Theory of Relativity.
22. Some interpreters have argued that Kant was aware of geometries other than Euclidean. It is true that he speculated about the possibility of non-Euclidean geometry, but I would hesitate to call this knowledge. In any case, he did not know any models of non-Euclidean geometry and, hence, could not choose any such model as the form of our outer intuition.
23. By “non-empirical” I don’t want to exclude the possibility that our spatial intuition is the result of an evolutionary process over many generations, shaped by natural selection and cultural inheritance. All I want to exclude is the empiricist’s assertion that it is exclusively the result of individual experience and learning.
24. “Fussend auf Riemann und Helmholtz hat nun Lie ein Axiomensystem aufgestellt, welches sich von den nach Euklidischem Muster entwickelten Systemen wesentlich unterscheidet. Lie’s Axiome enthalten funktionentheoretische Bestandteile, indem Lie verlangt, dass die Bewegung durch differenzierbare Funktionen vermittelt wird. . . . Es fragt sich, ob die Funktionsbestandteile nicht bloß wegen des Wunsches diese (gruppentheoretische) Methode anzuwenden nöthig waren und nicht vielmehr der Sache selbst fremd und deswegen überflüssig sind. Es zeigt sich, dass sie es in der Tat sind. Wir nähern uns dadurch wieder dem alten Euklid, insofern wir dem Begriff der Bewegung nicht die weitergehenden infinitesimalen Eigenschaften, die Lie noch nötig hatte, brauchen aufzuerlegen, sondern mit den elementaren Postulaten auskommen, die bereits in dem Euklidischen Begriff der Kongruenz enthalten sind, und die uns allen aus den von der Schule her bekannten Sätzen über die Kongruenz von Dreiecken geläufig sind.”
25. By this assertion I do not mean that there is no structural similarity between the points on a line and the sequence of numbers. In fact there is such a structural similarity, even an identity in the order of points and numbers. But this does not imply that points and numbers, taken individually, are similar objects.
References

Carson, E. (1997), “Kant on Intuition in Geometry” in: Canadian Journal of Philosophy 27, 489–512.
Friedman, M. (1999), “Geometry, Construction, and Intuition in Kant and His Successors” in: Between Logic and Intuition, edited by G. Sher and R. Tieszen, Cambridge University Press, 2000.
Hartshorne, R. (2000), Geometry: Euclid and Beyond, Springer, New York.
Hilbert, D. (1899), Grundlagen der Geometrie, Thirteenth Edition, Stuttgart, Leipzig, 1987.
Hilbert, D. (1902), “Über die Grundlagen der Geometrie”, Anhang IV in: Hilbert (1899).
Kolmogorov, A. N. and A. P. Yushkevich (eds.) (1996), Mathematics of the Nineteenth Century, Birkhäuser, Basel.
Majer, U. (1988), “Über eine bemerkenswerte Differenz zwischen Brouwer und Weyl” in: Exacte Wissenschaft und ihre philosophische Grundlegung, edited by Deppert et al., Lang, Frankfurt am Main.
Majer, U. (2003), “Husserl and Hilbert on Geometry”, forthcoming.
Majer, U. and T. Sauer (2005), “Intuition and Axiomatic Method in Hilbert’s Foundation of Physics”, this volume.
Pasch, M. (1882), Vorlesungen über neuere Geometrie, Teubner, Leipzig.
Parsons, C. (1988), “The Transcendental Aesthetic” in: The Cambridge Companion to Kant, edited by P. Guyer, Cambridge University Press, 1992.
Scriba, C. J. and P. Schreiber (2002), 5000 Jahre Geometrie, Springer, Berlin.
EDMUND HUSSERL ON THE APPLICABILITY OF FORMAL GEOMETRY
René Jagnow
Middlebury College, Vermont, U.S.A.
1. Introduction
In his Grundlagen der Geometrie, David Hilbert treated the axioms of geometry in a radically new way.1 In the context of his investigation, he suggested that we understand axioms not in the traditional sense as sentences stating fundamental facts about spatial intuition, but rather as logical forms devoid of intuitive content. Accordingly, geometric terms like ‘point,’ ‘line,’ and ‘plane’ did not refer to intuitable objects, but rather functioned as purely syntactic elements whose interrelations were determined by the axioms. This view on the axioms of Euclidean geometry changed how mathematicians and philosophers of mathematics understood this science: pure geometry came to be seen as a formal deductive system. Edmund Husserl was well acquainted with Hilbert’s view, as he showed, for example, in the excerpts he copied out from the Hilbert-Frege correspondence, where he remarked: “Frege does not understand the meaning of Hilbert’s ‘axiomatic foundation’ of geometry, namely that it is concerned with a purely formal system of conventions, which is identical in form with the Euclidean.”2 For Husserl, one of the most important questions arising from Hilbert’s approach to geometry was whether such purely formal inquiry had any relevance beyond a pure theory of deductive systems. In his last work, Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie, Husserl demanded that we recover the original intention or meaning that led to the formation and development of the various mathematical sciences.3 He believed that the abstract and formal character of the modern sciences posed the danger of them becoming mere formal games; only by philosophically elucidating these sciences’ origins in the life-world could scientists assure that they were engaging in a meaningful practice. In the Krisis, Husserl suggests undertaking this task as a means of guiding us towards a phenomenological investigation into the constitution of the life-world.

E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 67–85. © 2006 Springer. Printed in the Netherlands.
Nevertheless, it is difficult to see why such an inquiry into the origins of the sciences is necessary. After all, Husserl admits that the sciences function well and make highly successful predictions. Husserl could simply say that formal geometry as established by Hilbert is a meaningful practice because it is successfully applied in physics. I think, however, that Husserl found such an answer insufficient, because, on its own, the mere application of a given formal science in physics does not ensure its conceptual continuity with the life-world. It is possible that a given formal concept of space, say an axiomatic system describing a non-Euclidean space, will not have any connection to our pre-scientific concept of space. If this were to be the case, we would lose any justification for saying that physics expresses a genuine concept of space. According to Husserl, we can give such a justification only by showing how the formal concept of space is grounded in a pre-scientific concept of space, that is, how the former applies to the latter. Thus, the formal sciences raise two applicability problems: one of explaining how a given formal concept can be applied in the physical sciences, and the other of showing how formal inquiry applies to pre-scientific experience. The inquiries into these two problems serve two radically different purposes: whereas the first ensures the correctness of certain scientific results, and is thus mathematical or physical, the second secures the meaningfulness of our formal concepts, and is thus essentially philosophical. I believe that one can reconstruct Husserl’s second applicability problem and his answer to it by diverting one’s attention from the Krisis and toward some of his early writings, specifically his notes to his so-called Raumbuch.4 In the following, I will use these writings in order to outline Husserl’s view on the relationship between formal inquiry and the life-world in respect to formal geometry.
I will begin by describing the specific form the problem of the applicability of geometry took on within Husserl’s genetic approach to the concept of space. I then show that Husserl was involved in answering two questions. He had to show first how the results of formal inquiry could apply to the space of geometry, and second, how the results of geometry could apply to intuitive space, i. e., the space of everyday experience. In the Krisis, Husserl addressed the second question. After outlining his answer to this question, I will argue that Husserl’s notion of the definite manifold answers the first question, since it was originally intended as a condition that, if fulfilled, would guarantee the relevance of the results of formal geometry for geometric space.5
2. The problem of the applicability of formal geometry
In the early 1890s, Husserl intended to write a book about space. In his notes to this so-called Raumbuch, he distinguished between different concepts of space: the space of intuition (which he also called “space of everyday life”), the space of pure geometry (or simply, geometric space), the space of applied geometry (which Husserl also called “space of natural science”), and
the space of metaphysics.6 After becoming acquainted with Hilbert’s approach to geometry, Husserl also used the concept of formal space.7 According to the typology of spaces he was developing at that time, the problem of the applicability of formal geometry had to be framed as a question about the relationship between formal inquiry and different concepts of space. More precisely, Husserl’s typology raised the two questions of (i) how formal geometry related to intuitive space (i. e., the space of everyday experience); and of (ii) how formal geometry related to physical space, that is, to the space constituted by the natural sciences. Yet because Husserl later abandoned the Raumbuch project in favour of his transcendental phenomenology, he never dealt with these questions extensively. In spite of this, his notes and the Krisis contain enough material to enable us to understand his general position with respect to the first question, that is, the applicability of formal geometry to intuitive space. In order to outline his answer to it, we have to consider not only formal and intuitive space, but also the space of pure geometry (which I will simply call “geometric space” in the following). I will justify this way of proceeding in due course. Both in his early investigations into descriptive psychology and his later phenomenological research, Husserl believed that any given entity was intelligible only as a correlate of a particular type of experience. In order to define clearly the different kinds of space, Husserl, therefore, had to analyse the experiences or forms of consciousness in which a subject became aware of them. Husserl provided us with the following characterization of intuitive space: By representation of space we may first mean the space of intuition, that is, the space of extra-scientific experience, the space which everyone, children or adults, scholars or laymen, are able to experience in lived perception and fantasy.8
Husserl here determines intuitive space as the spatial correlate of everyday experience, or, as he calls it alternatively, of everyday consciousness, which is the primary mode in which any person (independently of age or education) experiences the world. Husserl distinguishes everyday from other kinds of consciousness, notably from scientific and ethical consciousness, by the attitude, or interest a person takes towards the world.9 In contrast to the attitude of scientific consciousness, that of everyday experience takes the life-world unreflectively as an unanalysed whole. For pre-scientific consciousness space is an integral part of a larger array of experiences. As an intentional correlate of certain pre-scientific experiences, intuitive space has an objective character. Although Husserl elaborates this view more fully only in his lecture course Ding und Raum from 1907, in which he develops a theory of the constitution of intuitive space, some fundamental aspects of it are already present in his notes to the Raumbuch.10 There, Husserl remarks that one important general feature of intuitive space is that its elements, spatial objects like houses, trees, and landscapes, for example, are not entirely intuitively accessible. These spatial objects can be given to consciousness only as a continuous series of partial intuitions, with the objects themselves as the
ideal limits. The elements of intuitive space are, therefore, ideal objects, that is, unities of conceptual and intuitive elements.11 The same is true of intuitive space as a whole: it is an ideal object correlating to certain features of everyday consciousness. Husserl writes: “The space of ‘intuition’ is rather a complex of intuitions united, made into an identical unity, by [a] judgement, which accompanies uniform types of processes, thus fixing them.”12 In sum, intuitive space is a set of ideal objects and relations that are part of a larger spatial structure that in turn forms an integral part of everyday consciousness. According to Husserl, the space of intuition is distinguished from geometric space, which he describes as “a conceptual construct produced through logical treatment of the representation of space present in pre-scientific consciousness.” Geometric space is therefore no longer “represented in intuition or intuitable, but rather only thinkable.”13 The concept of geometric space is expressed by an axiomatic system that results from two idealizing processes: The origin of a geometric concept of space already presupposes the origin of basic geometric concepts. Because only on the basis of these idealizations of original concepts of objects, as we find them in intuition, can we make such quasi-inductions, which we can also call idealizations. The ideal concept that we call ‘geometric space,’ however, is defined through ideal concepts and ideal propositions.14
The first type of idealization constitutes the basic geometric concepts, whereas the second constitutes the axioms of a geometric theory such as Euclid’s. The first process of idealization departs from everyday perception, which enables human beings to form pre-geometric concepts such as points, lines, and planes, etc., through a process of partitioning: We can divide a physical body in fantasy multiple times, without destroying its unity. . . . Since each body is a physical part of the total space, it has to have a border to the remaining space. This border is its “surface.” Similarly, we can divide planes physically. We can divide each plane into two planes, which do not have any parts in common, without destroying the unity of the original plane. The spatial object, which is common to both of them, and “by means of which” they border on each other, is called a “line.” In the same way, the physical partition of lines leads to borders, which we call “points.” Points cannot further be divided spatially.15
The concepts resulting from the partitioning are inexact. Husserl later calls them “morphological” in order to distinguish them from geometric concepts, which he understands as ideal boundaries.16 Thus, the geometer must idealize the correlating morphological concepts in order to form the basic concepts of Euclidean geometry. This, Husserl believes, can be accomplished only by ascribing continuity to the space of everyday experience.17 Since we cannot actually see the continuity of a given extension, say, of a given line segment, in the sense relevant to geometry, Husserl concludes that continuity is an ideal concept constructed on the basis of certain perceptual processes familiar to everybody. He gives as his example the situation of two visible points bordering on each other. One can focus more sharply on these points by using
instruments or by diminishing the distance between oneself and the points. By improving viewing conditions in this way, one can discover a visible distance between the points that can then be filled in with a further point. Since this process can be repeated over and over again, we can arrive at the general rule that it is always possible to place a further point between any two points under ideal conditions of observation. Although one cannot actually complete an infinite process of adding further points, one can stipulate ideal limits as the products of such an infinite division and add them to the line. As a result, the points can be understood as extensionless, and the line which they are a part of as continuous.18 According to Husserl, this example shows us how the process of idealization departs from the objects and relations of intuitive, i. e., experiential, space. It ends up constituting a completely new kind of object, however: geometric objects like points, lines, and planes that cannot be perceived, but rather only thought.19 Once the basic concepts of geometry have been constituted in this manner, the geometer can construct an axiom system, thus constituting the geometric concept of space. This construction is guided by experience and based on the observation of relations between empirical objects. These facts do not justify the axioms, however: Pure deduction is a matter of pure theory. Deduction never asks where the basic assumptions come from, it assumes them. One may disagree about the cognitive value of the basic concepts of geometry and the basic assumptions; the geometric sentences are beyond suspicion. Naturally so, since their validity has no meaning other than the correctness of the consequences drawn from the basic geometric assumptions.20
The geometer accepts the axioms, which for the early Husserl include axioms of existence and axioms defining properties of the objects postulated by them, simply on the basis that they are independent of each other and that the resulting theory is consistent.21 Finally, in order to establish the concept of formal space, the geometer has to formalize the axiomatic system of geometry, substituting representations of concrete objects with representations of objects as such: In the highest and most comprehensive sense, mathematics is the science of theoretical systems as such; and, in abstraction, a science of that which is being theorized in the given theories of the various sciences. If in any given theory or deductive system, we abstract away from its material, or from the unique species of objects that it seeks to control, and if we substitute materially determined object representations with mere formulas, that is, with representations of the objects controlled by theories of such a kind, then we will have accomplished a generalization, which considers the given theories as a mere special case of a theory class, or, rather, of a theory-form (Theorienform).22
What does Husserl mean in this quote by representations of objects as such, or what he often calls ‘object-forms’? The following quote gives us a clue: This (the object-forms) is a domain, which is solely determined by the fact that it is governed by a theory of this form, that is, its objects can be connected only according to certain relations, which follow certain basic laws of a predetermined form. These objects are completely undetermined in respect to their matter; in order to indicate this, the mathematician speaks preferably of ‘Thought-Objects.’ They are neither determined directly as individual or specific singularities, nor indirectly through their inner kinds or species; but rather exclusively through the form of the relations prescribed to them.23
Object-forms are abstract objects whose properties are determined through the relations defined in the axiomatic system. As such object-forms are completely void of any material content. Yet Husserl also believes that an object-form in the sense of categorial form corresponds to the formal theory in its entirety: If we understand space as the categorial form of the space of the world (Weltraum) and, correlatively, geometry as the categorial theory-form (kategoriale Theorienform) of geometry in the common sense, then space is being subsumed under a species of categorially determined manifolds that is delimited by laws, in relation to which one will naturally speak of space in a broader sense.24
The phrase ‘space as [a] categorial form’ here simply means that the object ‘space’ is determined only by its relational properties. Accordingly, formal space is the correlate of a formal axiomatic system and has the ontological status of a general individual. Since the formal theory represents only the deductive form of the material theory from which it departed, it is not a theory in the proper sense, but rather a theory-form (Theorienform). This form can be common to an infinite number of material theories.25 Husserl draws two important consequences from his view. First, the fact that a formal theory constitutes its own domain allows the mathematician to ignore the material science and to restrict his/her interest to formal inquiry. Once mathematical inquiry has reached the formal level, it is free from its genetic roots in intuition. At this level, the mathematician is not only delivered from the responsibility of paying attention to content, but also from the constraints of the process by which the formal theory was generated. This freedom from the material domain finds its expression in the fact that it is legitimate to change the formal theory by adding or eliminating axioms that define new operations. In Husserl’s words: Once one ascends to the pure system of operations, leaving the original real domain of objects, be it line-segments or numbers, behind and considering in most general generality a domain as such which is determined by these forms of operation, one is able to modify the idea of such a system in various ways, in the sense of an extended or restricted system of operations, or of axioms.26
Second, the fact that the formal theory exhibits the form of a material discipline allows the geometer to consider the former a meta-theory of the latter. Considering the axiomatic form of a theory independently of content facilitates the process of optimizing its logical structure. In this sense, “mathematics is, according to its highest idea, theory of theories, the most general science of deductive systems in general.”27 In order for a deductive theory to have an optimal form, it had to fulfil a number of conditions. All the facts had to follow
from the axioms, and the axioms themselves had to be consistent and independent of each other.28
3. Husserl’s solution to the problem of the applicability of formal geometry
Let me now return to my original question of the relevance of formal geometry for intuitive space. We have seen that formal inquiry can function as meta-theory. Yet Husserl claims that formal inquiry achieves more; it is an instrument of mathematical discovery developed to replace material inquiry. He writes: Formal mathematics claims to be an instrument of concrete mathematical discovery. In the same way as the old mathematics of quantities was the instrument of research in the natural sciences, that is, the instrument of deductive theoreticization of the various domains of inquiry in physics (for which induction had derived the appropriate theorems), so the new formal mathematics does not only want to achieve the same, but much more. Formal mathematics wants to create methods of infinitely higher generality and power which render all methodical inquiries in real mathematics superfluous.29
Since formal geometry is a method of higher generality, its axioms can be changed so that new operations define new elements. Taking this into consideration, how is it still possible for the results of formal geometry to apply to intuitive space? According to Husserl, this question has to be answered in two steps. First, we have to explain how the results of formal inquiry can apply to geometric space, and then how the results of the geometry that constitutes geometric space can apply to intuitive space. We cannot apply the results of formal inquiry directly to intuitive space, because, for Husserl, the latter is not a model of the formal axiomatic system. In contrast to the objects and relations of geometric space, the objects and relations of intuitive space are not represented by ideal, but rather by inexact morphological concepts.30 As a result, if we were to interpret formal axioms and theorems with intuitive objects, we would produce falsehoods. Therefore, there is no direct path leading from abstract to intuitive space. Rather, we must proceed from formal through geometric to intuitive space. I will first consider the relationship between geometric and intuitive space. Although Husserl addressed this problem in the Krisis, he gave only a very sketchy idea of what he understood this relationship to be. In his historical analysis of the genesis of modern science, he argued that geometry originated from the practice of land surveying (Feldmeßkunst). He wrote: Geometry (of idealities) was preceded by the practice of land surveying which was not concerned with idealities. Such pre-geometric achievement, however, was a foundation for the meaning of geometry, a foundation for the important invention of idealization.31
Husserl’s dictum that land surveying represented the ultimate source of meaning for geometry implies that, at least historically, geometry was intended to improve surveying techniques in order to achieve more adequate results. Thus, he pointed to a cultural practice as the original field of application of geometry. It is important to note that, according to Husserl, the technique of land surveying remained entirely within the limits of the intuitive world, because it applied only morphological concepts of everyday language. Husserl viewed the relation between geometry and the intuitive world as one of approximation. Since geometric concepts were constituted in a process of idealization, they can be applied to intuitive space only by reversing the process. For Husserl, the possibility of applying the conclusions of geometric reasoning to the intuitive world was thus a result of transforming the technique of land surveying. He wrote: By affiliating itself with the art of measurement and by now governing this art, mathematics has shown that one can have universally an objective-real knowledge of a completely new kind about the things of the intuitive world. This kind of knowledge also accords with its special interest as mathematics of forms in which all things participate, namely a form of knowledge, which is related through approximation to its own idealities. Thus, mathematics must descend again from the world of idealities to the empirical intuitive world.32
The development of geometry transformed the practice of land surveying into an essentially mathematical technique that specified its own conditions of application. Husserl mentions two ways in which this was accomplished. On the one hand, the surveyor establishes certain rules that allow him/her to interpret the measurements in such a way that they approximate the idealized constructions of geometry. On the other hand, the surveyor strives to improve constantly his/her measuring-techniques to gain results that will approach as closely as possible those predicted by geometry. Unfortunately, Husserl does not elaborate this suggestion any further. The central notion in Husserl’s solution to the problem of the applicability of formal geometry to geometric space was his concept of a definite manifold, which he defined in Ideen I, Formale und transzendentale Logik, and in the Krisis. In these works, he intended this notion to characterize the ideal of a formal scientific theory, which consisted of sentences that formed a universal nomological or explanatory connection; that is, all the theorems of the ideal theory followed logically from a finite number of axioms. In Husserl’s words: The axiomatic system which defines a [definite manifold] is characterized by the fact that every sentence (sentence-form) which can be constructed by means of logic alone from the concepts (concept-form) of this manifold is either “true,” that is, an analytic (purely deductive) consequence of the axioms, or “false,” that is, an analytic contradiction: tertium non datur.33
According to Husserl, the property of definiteness of an axiomatic system had a consequence for its domain, that is, as he says, “its manifold.” If the axiomatic system is definite, then the elements in its domain are fully determined in respect to their mutual formal relations. We can say that, according to Husserl, an ideal formal theory defines a definite manifold. The context in which Husserl introduced the notion of a definite manifold in his major works, however, is partly misleading, since he originally introduced it as a condition ensuring the connection between formal and material arithmetic.34 In order to be able to appreciate the full significance of the definite manifold for the applicability of formal geometry, we have, therefore, to look at this context. Husserl developed the notion of a definite manifold in the lecture at the Göttingen Mathematical Society that I mentioned in the introductory section of this paper. In the following, I will first reconstruct the argument of this lecture with respect to arithmetic.35 Then, I will show that Husserl considered his results as a solution to the problem of the applicability of formal geometry. In his Göttingen lecture, Husserl was concerned with what he called the “problem of imaginary magnitudes” (imaginary mathematical objects). This problem has to do with the fact that in mathematics we are able to define imaginary objects such as negative, real, and irreal numbers. These objects are intuitively contradictory; that is, they are not part of an arithmetic that is based on the intuitive apprehension of numbers (Husserl uses the term Anzahlenarithmetik).36 Nevertheless, we can use these objects in our calculations and achieve results that are valid in such an arithmetic. This happens in all those cases in which the final results do not contain any of these imaginary objects. Husserl states the problem of the imaginary numbers as follows: Let us say, we are presented with a domain of objects in which the specific nature of the objects determines forms of connections and relations which are expressed in a certain axiomatic system.
Certain forms of connection have no real meaning; that is, they are contradictory, due to this system, that is, because of the specific nature of the objects. On what grounds can we apply the contradictory in our calculations; on what grounds can we treat the contradictory in axiomatic systems as non-contradictory. How can we explain, that we can operate with the contradictory according to rules, and that, if the contradictory later disappears from the sentences thus gained, these sentences are true?37
Let me illustrate Husserl’s problem of imaginary objects by means of an example. In Anzahlenarithmetik we cannot solve the following system of equations: (a) 7 + x = 2; (b) 6 + x = y. The reason is that any solution requires x < 0, so that x is not a number (Anzahl) but rather an imaginary object. Yet, in Anzahlenarithmetik, we can give a partial solution and solve this system for y. Since x = 2 – 7, y = 6 + (2 – 7), i. e., y = (6 + 2) – 7, i. e., y = 1. If we admit the negative numbers into our system of arithmetic, we can solve the system of equations in a different way. We get x = (–5) and again y = 1. What Husserl wants to explain is how it is possible that this application of imaginary objects yields a result that is valid in Anzahlenarithmetik. So far this problem seems to have no connection to our question about the relation between formal and material arithmetic. Yet, Husserl established this connection in the course of explaining why previous theories were unable to justify the use of imaginary or contradictory mathematical objects in obtaining
valid arithmetic results. He believed that previous theories had misconstrued the process by which these objects were constituted. Husserl wrote: It is clear that we cannot extend the concept of number (Anzahlbegriff) arbitrarily. Yet, we can leave the concept of number and define a new purely formal concept of positive numbers through the formal system of definitions and operations which hold for numbers (Anzahlen). And this formal concept of positive numbers, just as it is defined by definitions, is able to be extended by new definitions, that is, extended without contradiction.38
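The two-equation example given before this quotation can be written out step by step. The following LaTeX block is only a sketch of that calculation: the labels (a) and (b) and the symbols x and y come from the text, whereas the use of $\mathbb{Z}$ for the arithmetic extended by the negative numbers is a notational assumption added here for illustration.

```latex
% Requires the amsmath and amssymb packages.
% \mathbb{Z} is an illustrative label for the extended arithmetic,
% not notation used in the text.
\begin{align*}
\text{(a)}\quad & 7 + x = 2, \qquad \text{(b)}\quad 6 + x = y \\[4pt]
\text{Partial solution, with } x \text{ left as the uninterpreted term } 2 - 7\text{:}\quad
  & x = 2 - 7 \\
  & y = 6 + (2 - 7) = (6 + 2) - 7 = 8 - 7 = 1 \\[4pt]
\text{Solution in the extended arithmetic } \mathbb{Z}\text{:}\quad
  & x = -5 \\
  & y = 6 + (-5) = 1
\end{align*}
```

In both routes the intermediate term (2 – 7, or –5) is not an Anzahl, while the final result y = 1 contains no imaginary object. This is the kind of case in which, on the account above, the result is valid in Anzahlenarithmetik.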
In contrast to other mathematicians, Husserl thought that the construction of imaginary mathematical objects was a complex process, which involved a transition to formal arithmetic.39 More precisely, this process consisted of two steps. First, the material arithmetic of numbers (Anzahlenarithmetik) was formalized and thereby a new domain of formal objects (the positive integers) constituted. Second, the resulting system was consistently extended so as to include the negative numbers. Accordingly, the problem of imaginary mathematical objects had now to be stated as a problem of the relation between formal and material arithmetic. Husserl was asking in his Göttingen talk how it was possible for the results of calculations in this extended formal system to be true in material arithmetic. In order to answer this question, Husserl needed to consider only the relation between unextended and extended formal theories. As we have seen, the unextended formal axiomatic system represented for him the logical form of material arithmetic, which could be consistently axiomatized. As a result, all the logical consequences of the unextended formal axiomatic system were true sentences in material arithmetic. The important question then is under what condition the results of the extended theory could also be true sentences (i. e., logical consequences) of the unextended axiomatic system, and thus of the material discipline. We can specify this question even more precisely by looking at Husserl’s concept of the extension of an axiomatic system. In a fragment published as “Zur formalen Bestimmung einer Mannigfaltigkeit,” Husserl considered several ways in which an axiomatic system could be extended. Among them was the ideal of an axiomatic system that could not be extended by the introduction of a further independent axiom if it was to keep its domain of objects fixed.40 Husserl identified this type of definiteness with Hilbert’s notion of completeness.
A perfect axiomatic system of this kind defined its elements exhaustively with respect to their mutual formal relations. Husserl further stated that such an axiomatic system could only be extended by adding axioms in such a way that the consequences of the unextended system would form a subset of the consequences of the extended system.41 He wrote: The concept of extension entails, if we relate it to the axioms, that the new axioms comprise the old axioms and that the new axioms in addition to this admit operations excluded by the old axioms.42
Under the assumptions that (i) material arithmetic could be consistently axiomatized and that (ii) the formal theory derived from material arithmetic was perfect (in the sense that it could only be extended by further axioms that defined new relations between its elements), Husserl now asked what conditions the unextended formal axiomatic system would have to fulfil to guarantee that all those consequences of the extended formal theory, which contained only concepts defined in the unextended system, were derivable in it. In other words, Husserl was looking for a condition that would make a formal theory perfect to such a degree that it could only be extended conservatively.43 In order to find this condition, Husserl first considered the following justification for the applicability of the imaginary: We are allowed to apply the imaginary [that is, the formal theory is definite]: if 1) the imaginary is formally definable in a consistent extended deductive system and if 2) the original domain of deduction has the property that, if we formalize it, every sentence which falls into its domain is either true according to the axioms or false, i. e., contradicts the axioms.44
Since consistency is already entailed in Husserl’s concept of extension, the part relevant for us here is the second condition that demands completeness with respect to the original domain of an axiomatic system.45 In Anzahlenarithmetik, the original domain consists of the numbers (Anzahlen). Thus, Husserl demands that all sentences about the numbers be either true or false in the formalized axiomatic system of Anzahlenarithmetik. He calls this particular type of completeness “relative definiteness” and defines it as follows: “A system of axioms is ‘definite’ in a relative manner if any sentence that is meaningful in it is decided in the limits of the domain of this system.”46 As a number of commentators have pointed out, this notion of definiteness is identical to the contemporary concept of semantic completeness, according to which an axiomatic system is semantically complete if all true sentences about its domain follow from the axioms.47 This property of relative definiteness, however, does not solve the problem of imaginary magnitudes. Even if the unextended formal axiomatic system was definite relative to its domain of formal objects, this would not guarantee that every sentence derived in the extended system that is meaningful in the unextended system would also be a theorem in it. Since the extended system contains new objects and operations, it may allow a mathematician to derive sentences that are not provable in the unextended system, even if they are meaningful in it. In order to illustrate how this can happen, I will adapt an example given by Da Silva.48 Let us take formal arithmetic restricted to the positive integers as the unextended system A and an arithmetic containing the whole numbers as its extension B. In B we can prove the sentence “For any n, there is an m such that n + m = 0.” This does not hold in A, where we can show that no number is smaller than 0, i. e., “if n ≠ 0, then there is no m such that n + m = 0.” In order to exclude cases like this one, Husserl introduces a second, more comprehensive, notion of completeness by demanding that “if a
sentence is meaningful according to the axioms [of the unextended system], then the axioms also decide whether it is true or false.”49 He calls this concept of completeness “absolute definiteness.”50 As Da Silva has pointed out, absolute definiteness is not identical with the syntactic completeness of an axiomatic theory. Rather, absolute definiteness is a property of the domain of a formalized theory, a domain containing formal objects, which has the consequence that every grammatically correct sentence that can be formulated in the language of the theory is decided by its axioms. We can now summarise Husserl’s solution by saying that any extension of a formal axiomatic theory that defines a manifold definite in the absolute sense is a conservative extension. Under the conditions defined by Husserl this solution is indeed correct. In order to justify this, let us call the unextended axiomatic system (theory) A and its extension B. Husserl presupposes that A is perfect and can only be extended in such a way that the theorems of A form a subset of the theorems of B. Since B is a consistent theory, none of its theorems can contradict any of the theorems of A; that is, theories A and B are consistent with each other. As a result, every theorem of B either follows from the axioms of A or is left undecided by them. Finally, Husserl’s demand that A be definite in the absolute sense guarantees that A leaves undecided only those sentences that are meaningless in it. Consequently, an axiomatic system that defines a definite manifold can only be extended conservatively; this means that all the consequences of the extended axiomatic system that are meaningful in the unextended system will also be theorems in it and thus express truths about the formal objects of its domain.51 Up to this point, I have dealt with Husserl’s notion of definiteness only in the context of arithmetic.
However, he also believed that this notion provided a solution to the problem of the applicability of formal geometry. A first indication of this is that even in his notes to his Göttingen lecture he spoke about geometry and described Euclidean geometry as a special case of a broader concept of a formal space, namely, of Riemannian space. He wrote:
Euclidean geometry is a concrete theory which becomes, if formalised, the theory-form, which we call theory of a threefold Euclidean manifold, and which in turn is only a special case of the systematically connected class of manifolds with variable curvature.52
Further, as we have already seen, in the Prolegomena to his Logische Untersuchungen Husserl uses geometry in order to elucidate the distinction between material and formal theories.53 Finally, in the second study of his “Drei Studien zur Definitheit,” he dealt almost exclusively with geometry and gave an example showing that he did not restrict his search for definiteness to arithmetic. Let me outline this example here. Assume that a three-dimensional Euclidean manifold (ME) is an extension of the manifold (M0) consisting of a plane and a given line not lying in that plane. Thus, ME consists of the elements of M0 plus the additional elements contained in all the lines that are parallel to the given line. Husserl then demands: “The extension to ME must not disturb M0
as what it is, and, above all, must not specialize it, i.e., the defined determinations of ME must be a mere extension of those of M0.”54 In other words, ME is an extension of M0 only if a reduction of ME to M0 contains only those relations between the elements of M0 that belonged to it before the extension. Therefore, the elements of M0 have to be defined in such a way that an extension to ME does not add further determinations. For Husserl, this is the case if M0 is a definite manifold. We can thus conclude that the notion of a definite manifold solves Husserl’s problem of the application of formal geometry. The results of any extended formal geometry will be valid for formalized Euclidean geometry and thus also for Euclidean geometry itself, if they are meaningful therein.
Gödel’s incompleteness results restrict the applicability of Husserl’s notion of the definiteness of a manifold to relatively simple systems and render it wrong if applied to the whole of arithmetic or plane geometry. As many of his commentators have pointed out, Husserl’s solution to the problem of the applicability of the results of formal to material inquiry is thus only partly satisfactory. However, I think that, independent of his solution, Husserl’s genetic approach to geometry has two very interesting results. First, by raising the question of the applicability of formal geometry as one about the relationship between formal inquiry and intuitive space, he made plausible the idea that the applicability of formal geometry is not restricted to the space of physics. Rather, formal geometry can also tell us something about the structure of intuitive space. Thus, the standard answer that applied geometry is an interpretation of formal geometry solves only part of the applicability problem connected to formal geometry.
Second, by introducing his notion of the definiteness of a manifold, Husserl drew a distinction between concepts of space that were actually related to intuitive space and those that were not; only spaces that were defined by axiomatic systems that were extensions of the definite manifold of Euclidean geometry were spaces proper.55 According to this criterion, many axiomatic systems that contemporary mathematicians believe to define spaces are actually just mathematical constructs and do not express genuine concepts of space. Again, given Gödel’s incompleteness results, Husserl’s criterion fails. Nevertheless, asking at which point formal geometry moves from a literal to a metaphorical sense of the term ‘space’ is still important, if one does not want to give up a genetic relation between the different concepts of space.
In sum, in this paper, I have reconstructed Husserl’s view on the relationship between formal inquiry and the life-world. I think that this allows us to make plausible the thesis that the problem of the meaningfulness of the formal sciences as it is put forward in the Krisis is identical with the problem of the applicability of the formal sciences to the life-world. As we have seen, for Husserl this problem had to be framed as a twofold question about the relationships between formal inquiry and Euclidean space and between Euclidean geometry and intuitive space. Husserl’s remarks concerning the relation between intuitive space and Euclidean geometry were too sketchy to be properly evaluated. Likewise, Husserl’s notion of a definite manifold was not successful in saving the applicability of formal to material geometry. Nevertheless, it should be valued as an attempt to give a criterion that secures the connection between concepts of space at different levels of abstraction.
Notes
1. David Hilbert, Grundlagen der Geometrie, Leipzig: B. G. Teubner, 1899.
2. [“Ich merke dazu an. Frege versteht nicht den Sinn der Hilbertschen ‘axiomatischen Begründung’ der Geometrie, nämlich daß es sich um ein rein formales System von Konventionen handelt, das sich der Theorieform nach mit dem Euklidischen deckt.”], Husserl’s commentary on a passage excerpted from a letter from Frege to Hilbert of 27/XII/99. Edmund Husserl, Philosophie der Arithmetik. Mit Ergänzenden Texten (1890–1901), Lothar Eley (ed.), Husserliana XII, The Hague: Martinus Nijhoff, 1970, p. 448. All translations in this paper are my own. It must be pointed out here that Husserl’s view on the nature of a formal axiomatic system was not identical with Hilbert’s. As I will argue in the first part of this paper, according to Husserl, a formal axiomatic theory was about formal objects.
3. Edmund Husserl, Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie (1936), Walter Biemel (ed.), Husserliana VI, The Hague: Martinus Nijhoff, 1954.
4. These notes are published in: Edmund Husserl, Studien zur Arithmetik und Geometrie: Texte aus dem Nachlaß (1886–1901), Zweiter Teil. Philosophische Versuche über den Raum (1886–1901), Ingeborg Strohmeyer (ed.), Husserliana XXI, The Hague: Martinus Nijhoff, 1983, pp. 261–311.
5. Husserl originally developed the notion of a definite manifold in a lecture given at the Göttingen Mathematical Society in the winter term 1900/1 and in a number of studies. He never published the text of this lecture and it cannot be found in its final version in the Husserl archive. However, some materials containing notes and sketches of this lecture are published. See Das Imaginäre in der Mathematik, Husserliana XII, pp. 431–451 and Drei Studien zur Definitheit und Erweiterung eines Axiomensystems, Ibid., pp. 452–461. Husserl’s materials concerning his Göttingen lecture have also been published in a more complete and accurate form. Cf. Elisabeth Schuhmann and Karl Schuhmann, “Husserls Manuskripte zu seinem Göttinger Doppelvortrag von 1901,” Husserl Studies 17 (2001), 87–123. That Husserl ascribed a great significance to the ideas developed in this lecture is indicated by the fact that in both Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie and Formale und transzendentale Logik he mentions this unpublished text. See Edmund Husserl, Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie, Husserliana III, Walter Biemel (ed.), The Hague: Martinus Nijhoff, 1950, p. 137, Footnote 1, and Edmund Husserl, Formale und transzendentale Logik, Husserliana XVII, P. Janssen (ed.), The Hague: Martinus Nijhoff, 1974, p. 85.
6. Edmund Husserl, Mehrfache Bedeutung des Terminus Raum, Husserliana XXI, pp. 270–274, p. 270.
7. He wrote, for example: “If we conceive of the elements and their relations as formally defined by their laws and their points of relation as not further specified, then we have a formal geometry, i.e. the form of a geometry.” [“Denken wir uns die Elemente und ihre Relationen formell durch ihre Gesetze definiert und die Beziehungspunkte im übrigen unbestimmt, so haben wir eine formale Geometrie, d.i. die Form einer Geometrie.”], Edmund Husserl, Das Gebiet eines Axiomensystems/Axiomensystem-Operationssystem, Husserliana XII, pp. 470–488, p. 486. In the Prolegomena to his Logische Untersuchungen, Husserl calls the space defined by a formal theory a “categorial form.” Edmund Husserl, Logische Untersuchungen: Erster Band. Prolegomena zur reinen Logik (1900), E. Holenstein (ed.), Husserliana XVIII, The Hague: Martinus Nijhoff, 1975, p. 251.
8. [“Unter Raumvorstellung kann fürs erste gemeint sein der Raum der Anschauung, ich meine den Raum des außerwissenschaftlichen Bewußtseins, den Raum, wie ihn alle, ob Kinder oder Erwachsene, ob Gelehrte oder Laien, in lebendiger Wahrnehmung und Phantasie vorfinden.”], Edmund Husserl, Mehrfache Bedeutung des Terminus Raum, Husserliana XXI, p. 271.
9. Husserl uses the term ‘attitude’ (Einstellung) in Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie, Husserliana III, see, in particular, §27, pp. 49–50. In his notes to the Raumbuch, Husserl uses the term ‘interest’ (Interesse). Husserliana XXI, p. 264.
10. Edmund Husserl, Ding und Raum: Vorlesungen 1907, Ulrich Claesges (ed.), Husserliana XVI. Husserl describes the constitution of the space of experience as an ideal object in chapter 13.
11. Husserl writes: “We called landscapes, trees, houses, and so on spatial unities and showed that they were not contents of momentary intuitions, but rather ideal objects.” [“Wir nannten Landschaften, Bäume, Häuser usw. räumliche Einheiten und wiesen nach, daß sie nicht Inhalte von Momentanschauungen, sondern ideelle Objekte sind.”], Edmund Husserl, Der anschauliche Raum, Husserliana XXI, pp. 275–284, p. 281.
12. [“Der Raum der ‘Anschauung’ ist vielmehr ein Komplex von Anschauungen, geeinigt, zu einem identisch Einen gemacht, durch <ein> Urteil, welches sich an gleichförmige Weisen von Prozessen anschließt, sie fixiert.”], Husserliana XXI, p. 308.
13. [“Von diesem Raum der Anschauung ist zu scheiden der Raum des wissenschaftlichen Denkens, der geometrische Raum, ein begriffliches Gebilde logischer Bearbeitung der Raumvorstellung des außerwissenschaftlichen Bewußtseins, von dem nicht mehr gesagt werden kann, daß es anschaulich vorgestellt oder vorstellbar sei, sondern nur denkbar.”], Edmund Husserl, Mehrfache Bedeutung des Terminus Raum, Husserliana XXI, pp. 270–274, p. 271.
14. [“Der Ursprung der geometrischen Vorstellung vom Raume setzt bereits den Ursprung der geometrischen Grundbegriffe voraus. Denn erst durch diese Idealisierungen der ursprünglichen Begriffe von Gebilden, wie wir sie in der Anschauung finden, sind jene Quasi-Induktionen, die wir auch Idealisierungen nennen können, möglich. Der Idealbegriff, den wir auch geometrischen Raum nennen, ist aber definiert durch die Idealbegriffe und Idealsätze.”], Ibid.
15. [“Ein Körper läßt sich, ohne seine Einheit zu verlieren, in Wirklichkeit oder Phantasie mannigfach teilen . . . Da jeder Körper physischer Teil des Gesamtraumes ist, so muß er gegenüber dem übrigen Raum eine Grenze besitzen. Dies ist seine ‘Oberfläche.’ In ähnlicher Weise wie Körper lassen auch Flächen physische Teilungen zu. Wir können jede Fläche, ohne ihre Einheit zu stören, in zwei Flächen zerstückt denken, welche also keinen Flächenteil gemein haben. Das Räumliche, das ihnen gemeinsam ist, und ‘wodurch’ sie aneinandergrenzen, heißt ‘Linie.’ Ebenso führt die physische Teilung von Linien auf Grenzen, die man ‘Punkte’ nennt. Punkte sind räumlich unteilbar.”], Edmund Husserl, Der anschauliche Raum, Husserliana XXI, pp. 275–284, p. 278.
16. Cf. Edmund Husserl, Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie. Erstes Buch, Husserliana III, pp. 138–140.
17. Husserl writes: “Wir schreiben dem Raum, so wie den räumlichen Gebilden, Linien, Flächen, ferner Richtungsänderungen, Abständen, ‘innere Unendlichkeit’, d.h. ‘Stetigkeit’ zu.” [“We ascribe ‘inner infinity’, i.e. ‘continuity’ to space, as well as to spatial forms, lines, planes, changes in direction, and distances.”], Edmund Husserl, Zur Entstehung der idealen räumlichen Vorstellung, Husserliana XXI, pp. 286–290, p. 286.
18. I want to point out that continuity in the sense of denseness is not sufficient for Euclidean geometry.
19. Husserl writes: “Admittedly, one can doubt whether one can have actual intuitions of geometric objects; or, rather, it is certain that this is not the case. Intuition and the empirico-spatial attitude contain points of departure and leading motives for the geometric formation of concepts. Yet, the abstract objects belonging to the concepts and the attributes of these concepts are not to be obtained simply through ‘abstraction’ (in the presently common sense of attentively emphasizing singular features) from intuitions. The concepts are not embedded in intuition like the seen shape in the seen ‘plane.’ The triangle as an intuited abstract object is not a geometric figure. The triangle serves the geometer as a mere symbol whose characteristic type possesses a dispositional connection in the geometer’s mind with the correlating pure concept and its ideal, merely ‘thought’, object.” [“Freilich wird man es bezweifeln dürfen, ob von den geometrischen Objekten in Wahrheit Anschauung möglich ist, oder vielmehr ist es sicher, daß dies nicht der Fall ist. Die Anschauung und die empirisch-räumliche Auffassung enthält die Ausgangspunkte und die leitenden Motive für die geometrische Begriffsbildung, aber die den Begriffen zugehörigen abstrakten Gegenstände und deren Attribute sind nicht einfach durch ‘Abstraktion’ (in dem jetzt üblichen Sinn aufmerksamer Pointierung von Einzelzügen) aus den Anschauungen zu gewinnen, sie liegen in diesen nicht eingebettet wie die gesehene Gestalt in der gesehenen ‘Fläche.’ Das Dreieck als angeschautes Abstraktum ist keine geometrische Figur, es dient dem Geometer als bloßes Symbol, dessen charakteristischer Typus in seinem Geist dispositionelle Verknüpfung besitzt mit dem zugehörigen reinen Begriff und seinem idealen und bloß ‘gedachten’ Gegenstand.”], “Intentionale Gegenstände,” (1894) Edmund Husserl, Aufsätze und Rezensionen (1890–1910), Husserliana XXII, Bernhard Rang (ed.), The Hague/Boston/London: Martinus Nijhoff, 1979, pp. 269–348, p. 327.
20. [“Sache der reinen Theorie ist die reine Deduktion. Sie fragt überall nicht, woher die Grundsätze kommen, sie assumiert sie. Über den Erkenntniswert der geometrischen Grundbegriffe und Grundsätze mag Streit bestehen, die geometrischen Sätze sind über jeden Streit erhaben. Ganz natürlich, da ihre Gültigkeit keine andere Bedeutung hat als die Triftigkeit der Konsequenz aus den geometrischen Grundannahmen.”], Edmund Husserl, “Assumption der Axiome in Geometrie, Mannigfaltigkeitslehre und reiner Mechanik,” in Edmund Husserl, Aufsätze und Rezensionen (1890–1910), Husserliana XXII, pp. 430–431, p. 431.
21. Husserl expresses this in the following passage: “[The] geometer simply accepts the axioms and draws his consequences. At most, he restricts the number of basic concepts and axioms and attempts to fix a minimal number of them that are independent of each other and able to carry the entire system of deductions.” [“[Der] Geometer . . . nimmt die Grundsätze einfach hin und zieht seine Konsequenz. Höchstens, daß er die Zahl der Grundbegriffe und Grundsätze beschränkt und eine Minimalzahl zu fixieren sucht, welche, voneinander deduktiv unabhängig das ganze System der Deduktionen zu tragen vermögen.”], Ibid., p. 431. It is interesting to contrast Husserl’s view here with that of Moritz Pasch in his Vorlesungen über neuere Geometrie from 1882. Pasch believed that the geometric axioms were rational extensions of statements about observable facts. In his construction of geometry he therefore first formulated certain fundamental observational statements about visible objects, which he called “nuclear sentences” (Kernsätze). Nuclear sentences were about physical bodies. For example, the term ‘point’ referred to a physical body whose subdivision into parts is incompatible with the limits set by perception. On the basis of these nuclear sentences, Pasch then constructed an axiomatic system whose justifications, logical derivations, were entirely independent of the original meanings of these terms. By doing so, he changed the meanings of the original terms. They became relational concepts that were exhaustively defined by the axioms. No appeal to perception was required in order to justify the theorems. In other words, the extension of the nuclear sentences to the axioms of a geometric theory rendered the latter a formal theory similar to Hilbert’s. Such a formal theory could then be applied to physical reality by interpreting its terms with the original objects. Husserl also thought that the construction of a geometric axiomatic system began with observations of perceptual objects. Yet, in contrast to Pasch, he did not believe that the idealization, and thus the subject matter of geometry, was a result of the axiomatization. Rather, according to Husserl, the subject matter of geometry (in the sense of material geometry) had to be constituted, at least partly, before one could construct an axiomatic theory that captured its properties. For Pasch’s view see his Vorlesungen über neuere Geometrie, Leipzig: B.G. Teubner, 1882.
22. [“Mathematik im höchsten und umfassendsten Sinn ist die Wissenschaft von den theoretischen Systemen überhaupt und in Abstraktion von dem, was in den gegebenen Theorien der verschiedenen Wissenschaften theoretisiert wird; abstrahieren wir bei irgendeiner gegebenen Theorie, bei irgendeinem gegebenen deduktiven System von seiner Materie, von den besonderen Gattungen von Objekten, auf deren theoretische Beherrschung sie es abgesehen hat, und substituieren wir den materiell bestimmten Objektvorstellungen die bloßen Formeln, also die Vorstellung von Objekten überhaupt, die durch solch eine Theorie, durch eine Theorie dieser Form beherrscht wird, so haben wir eine Verallgemeinerung vollzogen, welche die gegebenen Theorien als einen bloßen Einzelfall einer Theorie-Klasse auffaßt oder vielmehr einer Theorienform.”], Edmund Husserl, “Das Imaginäre in der Mathematik,” Husserliana XII, pp. 430–440, p. 430.
23. [“Es ist also ein Gebiet, welches einzig und allein dadurch bestimmt ist, daß es einer Theorie solcher Form untersteht, d.h. daß für seine Objekte gewisse Verknüpfungen möglich sind, die unter gewissen Grundgesetzen der und der bestimmten Form (hier das einzig Bestimmende) stehen. Ihrer Materie nach bleiben die Objekte völlig unbestimmt — der Mathematiker spricht, dies anzudeuten, mit Vorliebe von ‘Denkobjekten.’ Sie sind eben weder direkt als individuelle oder spezifische Einzelheiten, noch indirekt durch ihre inneren Arten oder Gattungen bestimmt, sondern ausschließlich durch die Form ihnen zugeschriebener Verknüpfungen.”], Edmund Husserl, Logische Untersuchungen, Erster Band, Husserliana XVIII, p. 250. The same passage is also to be found in: Edmund Husserl, Formale und transzendentale Logik, Husserliana XVII, p. 94.
24. [“Nennen wir Raum die bekannte Ordnungsform der Erscheinungswelt, so ist natürlich die Rede von ‘Räumen’, für welche z.B. das Parallelenaxiom nicht gilt, ein Widersinn. Ebenso die Rede von verschiedenen Geometrien, wofern Geometrie eben die Wissenschaft vom Raume der Erscheinungswelt genannt wird. Verstehen wir aber unter Raum die kategoriale Form des Weltraums und korrelativ unter Geometrie die kategoriale Theorienform der Geometrie im gemeinen Sinn, dann ordnet sich der Raum unter eine gesetzlich zu umgrenzende Gattung von rein kategorial bestimmten Mannigfaltigkeiten, mit Beziehung auf welche man dann naturgemäß von Raum in einem noch umfassenderen Sinne sprechen wird.”], Edmund Husserl, Logische Untersuchungen. Erster Band. Prolegomena zur reinen Logik (1900), p. 251.
25. Cf. Logische Untersuchungen. Erster Band, Husserliana XVIII, p. 249; Formale und transzendentale Logik, Husserliana XVII, p. 95.
26. [“Erhebt man sich zum reinen Operationssystem, verläßt man das ursprüngliche reale Objekt-Gebiet, ob es Strecken oder Anzahlen sind, und faßt in allgemeinster Allgemeinheit ein Gebiet überhaupt ins Auge, das durch solche Operationsformen definiert ist, dann kann man die Idee eines solchen Gebietes verschiedentlich modifizieren, bald im Sinn eines weiteren, bald in dem eines engeren Operationssystems bzw. Axiomensystems.”], Edmund Husserl, Das Imaginäre in der Mathematik, Husserliana XII, pp. 430–440, p. 437. See also Husserl’s concept of Erweiterung eines Axiomensystems, ibid., p. 439.
27. Husserl writes: “Mathematik ist also ihrer höchsten Idee nach Theorienlehre, die allgemeinste Wissenschaft von den deduktiven Systemen überhaupt.” Ibid., p. 431.
28. Ibid., p. 431. The distinction between three different concepts of space (intuitive, geometric, and formal) is also still present in the Krisis. Cf. Edmund Husserl, Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie (1936), Husserliana VI, pp. 54–56.
29. [“Die formale Mathematik will das Instrument konkret mathematischer Entdeckungen sein; wie die alte Quantitätsmathematik das Größeninstrument der naturwissenschaftlichen Forschung, nämlich das Instrument deduktiver Theoretisierung der verschiedenen Gebiete physischer Erkenntnisse, für welche die Induktion die passenden Lehrsätze geliefert hatte, so will die neue formale Mathematik nicht nur dasselbe, sondern sehr viel mehr leisten. Sie will Methoden von unvergleichlich größerer Allgemeinheit und Kraft schaffen, welche alle methodischen Arbeiten real mathematischer Art entbehrlich macht.”], Edmund Husserl, Das Imaginäre in der Mathematik, pp. 430–440, p. 432.
30. Husserl writes: “Pure geometry is a purely a priori science. Whether it can find some application depends on whether an empirical concept of space can be correlated to, or rather, subsumed under, its pure concept of space. But, in fact, this is actually impossible. Among the objects of our experience there is no such thing as a point in the sense of pure geometry, that is, something that is spatially indivisible.” [“Die reine Geometrie ist eine reine apriorische Wissenschaft. Ob sie nun irgendeine Anwendung finden kann, das hängt davon ab, ob ihrem reinen Raumbegriff ein empirischer gegenübergestellt oder vielmehr subsumiert werden kann. Das ist aber tatsächlich nicht möglich. Einen Punkt im Sinne der reinen Geometrie, also ein räumlich Unteilbares gibt es nicht unter den Objekten unserer Erfahrung.”], Edmund Husserl, Philosophische Versuche über den Raum (1886–1901), Husserliana XXI, p. 296.
31. [“Der Geometrie der Idealitäten ging voraus die praktische Feldmeßkunst, die von Idealitäten nichts wußte. Solche vorgeometrische Leistung war aber für die Geometrie Sinnesfundament, Fundament für die große Erfindung der Idealisierung.”], Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie, Husserliana VI, p. 52.
32. [“In Konnex mit der Meßkunst tretend und nunmehr sie leitend, hat die Mathematik — damit von der Welt der Idealitäten wieder zur empirisch anschaulichen Welt herabsteigend — gezeigt, daß man universal an den Dingen der anschaulich-wirklichen Welt, und zwar auch nach der sie als Gestaltenmathematik allein interessierenden Seite (an der alle Dinge notwendig teilhaben), eine objektiv-reale Erkenntnis von einer völlig neuen Art, nämlich eine approximativ auf ihre eigenen Idealitäten bezogene, gewinnen kann.”], Ibid., p. 33.
33. [“Das eine [definite Mannigfaltigkeit] definierende axiomatische System ist dadurch ausgezeichnet, daß jeder aus den in diesem auftretenden Begriffen (Begriffsformen natürlich) rein logisch-grammatisch zu konstruierende Satz (Satzform) entweder ‘wahr’, nämlich eine analytische (rein deduktive) Konsequenz der Axiome, oder ‘falsch’ ist, nämlich ein analytischer Widerspruch: tertium non datur.”], Edmund Husserl, Formale und transzendentale Logik, Husserliana XVII, p. 100.
34. Cf. Ulrich Majer, “Husserl and Hilbert on Completeness,” Synthese 110 (1997), pp. 37–56.
35. My exposition of Husserl’s notion of a definite manifold relies on Jairo Jose Da Silva and Ulrich Majer. See Jairo Jose Da Silva, “Husserl’s Two Notions of Completeness,” Synthese 125 (2000), pp. 417–438 and Ulrich Majer, “Husserl and Hilbert on Completeness.”
36. The problem of imaginary mathematical objects, thus, arises from Husserl’s earlier project in Philosophie der Arithmetik of providing an intuitive foundation for arithmetic. The connection between Husserl’s original project and his Göttingen lecture is analyzed in Ulrich Majer’s article “Husserl and Hilbert on Completeness.”
37. [“Es sei ein Gebiet von Objekten gegeben, in welchem durch die besondere Natur der Objekte Verknüpfungs- und Beziehungsformen bestimmt sind, die sich in einem bestimmten Axiomensystem aussprechen. Aufgrund dieses Systems, also aufgrund der besonderen Natur der Objekte haben bestimmte Verknüpfungsformen keine reale Bedeutung, d.h. es sind widersinnige Verknüpfungsformen. Mit welchem Recht darf das Widersinnige rechnerisch verarbeitet, mit welchem Recht kann also das Widersinnige in deduktiven Systemen verwandt werden, als ob es Einstimmiges wäre. Wie läßt sich erklären, daß sich mit dem Widersinnigen nach Regeln operieren läßt, und daß, wenn das Widersinnige aus den Sätzen herausfällt, die gewonnenen Sätze richtig sind.”], Edmund Husserl, Das Imaginäre in der Mathematik, Husserliana XII, pp. 430–447, p. 433.
38. [“Daß wir nicht den Anzahlbegriff willkürlich erweitern können, ist sicher. Aber wohl können wir den Anzahlbegriff verlassen und durch das formale System der für Anzahlen geltenden Definitionen und Operationen einen neuen, rein formalen Begriff definieren, den der positiven ganzen Zahlen. Und dieser formale Begriff der positiven Zahlen läßt sich, so wie er selbst durch Definition abgegrenzt ist, durch neue Definitionen, und zwar widerspruchsfrei, erweitern.”], Ibid., p. 435.
39. Husserl considers the explanations of mathematicians like Boole and Dedekind.
40. Edmund Husserl, Zur formalen Bestimmung einer Mannigfaltigkeit, Husserliana XII, pp. 493–500. In particular, p. 495.
41. “The system of axioms is kept fixed for the old domain of objects. Yet, new objects are being defined and a new system of axioms is being constructed in such a way that if we restrict it to the old domain of objects, it transforms into the old axiomatic system. But we do not need to demand a perfection in which such extension is not possible.” [“Das Axiomensystem wird festgehalten für das alte Gebiet. Es werden aber neue Objekte definiert und ein Axiomensystem so konstruiert, daß dasselbe bei Beschränkung auf das alte Gebiet in das alte übergeht. Aber Perfektion in dem Sinn, daß eine solche Erweiterung nicht möglich sein soll ist nicht zu fordern.”], Husserliana XII, p. 474.
42. [“Im Begriff der Erweiterung liegt, wenn wir ihn auf die Axiome beziehen, daß die neuen Axiome die alten mitumfassen, und daß die neuen zudem Operationsfälle zulassen, welche die alten ausschließen.”], Edmund Husserl, “Das Imaginäre in der Mathematik”, pp. 430–440, p. 439.
43. Let A and B be theories in the languages L and L’. Then B is a conservative extension of A if B ∩ L = A (i.e. all theorems of B in the language of A are already theorems of A). See, for example, Dirk van Dalen, Logic and Structure, Berlin, Heidelberg, New York, Tokyo: Springer Verlag, 1983, p. 107.
44. [“Ein Durchgang durch das Imaginäre ist gestattet: 1) wenn das Imaginäre sich in einem konsistenten umfassenden Deduktionssystem formal definieren läßt und wenn 2) das ursprüngliche Deduktionsgebiet formalisiert die Eigenschaft hat, daß jeder in dieses Gebiet fallende Satz entweder aufgrund der Axiome dieses Gebietes wahr oder aufgrund derselben falsch, d.i. mit den Axiomen widersprechend ist.”], Ibid., p. 441.
45. Ibid.
46. [“Relativ definit ist ein Axiomensystem, wenn jeder nach ihm sinnvolle Satz in Beschränkung auf sein Gebiet entschieden ist.”], Ibid., p. 440.
47. See, for example, Dieter Lohmar, Phänomenologie der Mathematik: Elemente einer phänomenologischen Aufklärung der mathematischen Erkenntnis, Dordrecht/Boston: Kluwer Academic Publishers, 1989.
48. Jairo J. Da Silva, “Husserl’s Two Notions of Completeness,” p. 422.
49. [“Hat ein Satz Sinn gemäß den Axiomen, dann ist er durch die Axiome in Wahrheit oder Falschheit entschieden.”], Edmund Husserl, Drei Studien zur Definitheit und Erweiterung eines Axiomensystems, Husserliana XII, pp. 452–461, p. 454.
50. [“Absolut definit ist ein Axiomensystem dann, wenn jeder nach ihm sinnvolle Satz überhaupt entschieden ist.”], Edmund Husserl, “Das Imaginäre in der Mathematik”, pp. 430–440, p. 440.
51. For a more explicit account of Husserl’s solution to the problem of imaginary magnitudes see Ulrich Majer, “Husserl and Hilbert on Completeness,” p. 47.
52. [“Die Euklidische Geometrie ist eine konkrete Theorie, welche formalisiert die Theorieform ergibt, die wir als dreifache Euklidische Mannigfaltigkeit bezeichnen, und diese wieder ist nur ein Einzelfall aus der systematisch zusammenhängenden Klasse der Mannigfaltigkeit von variablem Krümmungsmaß.”], Edmund Husserl, “Das Imaginäre in der Mathematik”, pp. 430–440, p. 431.
53. Edmund Husserl, Logische Untersuchungen. Erster Band. Prolegomena zur reinen Logik, Husserliana XVIII, pp. B150–51.
54. [“Die Erweiterung zu ME soll M0 als das, was es ist, nicht stören, und vor allem nicht spezialisieren, d.h. die definitorischen Bestimmungen für ME müssen eine bloße Erweiterung sein für diejenigen von M0.”], Edmund Husserl, “Drei Studien zur Definitheit,” Husserliana XII, pp. 452–469, p. 460.
55. This is particularly important for Husserl’s analysis of the a priori structure of intuitive (perceptual) space. In Ding und Raum, for example, he argues that the fact that intuitive space is constituted in a kinaesthetic system necessitates its a priori structure. We can intuit “thing-like” objects only if the space in which they are located affords movement without deformation. And this is possible, according to the Helmholtz-Lie theorem, only if the space has a constant curvature. But since the Helmholtz-Lie theorem is based on an abstract concept of space, namely on the Riemannian manifold, this argument requires that the latter express a genuine concept of space, and be more than just a formal construct. See Edmund Husserl, Ding und Raum. Vorlesungen 1907, Husserliana XVI, p. 243.
References
van Dalen, D. (1983), Logic and Structure, Springer, Berlin.
Da Silva, J. J. (2000), “Husserl’s Two Notions of Completeness” in: Synthese 125, 417–438.
Hilbert, D. (1899), Grundlagen der Geometrie, Teubner, Leipzig.
Husserl, E. (1886–1901), Studien zur Arithmetik und Geometrie. Texte aus dem Nachlaß (1886–1901). Zweiter Teil. Philosophische Versuche über den Raum (1886–1901), Husserliana XXI, I. Strohmeyer (ed.), Martinus Nijhoff, The Hague, 1983.
Husserl, E. (1890–1901), Philosophie der Arithmetik. Mit Ergänzenden Texten (1890–1901), Husserliana XII, L. Eley (ed.), Martinus Nijhoff, The Hague, 1970.
Husserl, E. (1890–1910), Aufsätze und Rezensionen, Husserliana XXI, B. Lang (ed.), Martinus Nijhoff, The Hague, 1979.
Husserl, E. (1900), Logische Untersuchungen. Erster Band. Prolegomena zur reinen Logik, Husserliana XVIII, E. Holenstein (ed.), Martinus Nijhoff, The Hague, 1975.
Husserl, E. (1907), Ding und Raum. Vorlesungen 1907, Husserliana XVII, U. Claesges (ed.), Martinus Nijhoff, The Hague, 1973.
Husserl, E. (1936), Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie, Husserliana VI, W. Biemel (ed.), Martinus Nijhoff, The Hague, 1954.
Husserl, E. (1950), Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie, Husserliana III, W. Biemel (ed.), Martinus Nijhoff, The Hague, 1950.
Husserl, E. (1974), Formale und transzendentale Logik, Husserliana XVII, P. Jansen (ed.), Martinus Nijhoff, The Hague, 1974.
Lohmar, D. (1989), Phänomenologie der Mathematik: Elemente einer phänomenologischen Aufklärung der mathematischen Erkenntnis, Kluwer, Dordrecht/Boston.
Majer, U. (1997), “Husserl and Hilbert on Completeness” in: Synthese 110, 37–56.
Pasch, M. (1882), Vorlesungen über neuere Geometrie, Teubner, Leipzig.
Schuhmann, E. and K. Schuhmann (2001), “Husserls Manuskripte zu seinem Göttinger Doppelvortrag von 1901” in: Husserl Studies 17, 87–123.
THE NEO-FREGEAN PROGRAM IN THE PHILOSOPHY OF ARITHMETIC∗
William Demopoulos
University of Western Ontario, Canada
1. The methodology of Fregean logicism
Traditional “Fregean” logicism held that arithmetic would be shown to be free from any dependence on intuition once its basic laws were seen to follow from logic together with explicit definitions. It would then be clear that our knowledge of arithmetic is knowledge of the same character as our knowledge of logic, since an extension of a theory (in this case the “theory” of second order logic) by mere definitions cannot have a different epistemic status from the theory of which it is an extension. If the original theory consists of analytic truths, so also must the extension; if our knowledge of the truths of the original theory is for this reason a priori, so also must be our knowledge of the truths of its definitional extension. The uncontroversial point for traditional formulations of the doctrine is that a reduction of this kind secures the sameness of the epistemic character of arithmetic and logic, while allowing for some flexibility as to the nature of that epistemic character. Thus, it is worth remembering that in Principles (p. 457), Russell concluded that a reduction of mathematics to logic would show, contrary to Kant, that logic is just as synthetic as mathematics.1
The methodology underlying this approach to securing the aprioricity of arithmetic by a traditional logicist reduction has been questioned. For example, Paul Benacerraf,2 who focuses on Hempel’s3 classic exposition, tells us that
    . . . logicism was . . . heralded by Carnap, Hempel . . . and others as the answer to Kant’s doctrine that the propositions of arithmetic were synthetic a priori. . . . in reply to Kant, logicists claimed that these propositions are a priori because they are analytic — because they are true (or false) merely “in virtue of” the meanings of the terms in which they are cast. . . .
    According to Hempel, the Frege-Russell definitions . . . have shown the propositions of arithmetic to be analytic because they follow by stipulative definitions from logical principles. What Hempel has in mind here is clearly that in a constructed formal system of logic (set theory or second-order logic plus an axiom of infinity), one may introduce by stipulative definition the expressions ‘Number,’ ‘Zero,’ ‘Successor’ in such a way that sentences of such a formal system using these introduced abbreviations and which are formally the same as (i.e., spelled the same way as) certain sentences of arithmetic — e.g., ‘Zero is a Number’ — appear as theorems of the system. He concludes . . . that these definitions show the theorems of arithmetic to be mere notational extensions of theorems of logic, and thus analytic. He is not entitled to that conclusion. Nor would he be even if the theorems of logic in their primitive notations were themselves analytic. For the only things that have been shown to follow from the theorems of logic by [stipulative definitions] are the abbreviated theorems of the logistic system. To parlay that into an argument about the propositions of arithmetic, one needs an argument that the sentences of arithmetic, in their preanalytic senses, mean the same (or approximately the same) as their homonyms in the logistic system. That requires a separate and longer argument.
∗ I wish to thank the Social Sciences and Research Council of Canada for support of my research.
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 87–112. © 2006 Springer. Printed in the Netherlands.
Benacerraf is questioning whether the logicist can claim to have established any truth of arithmetic on the basis of a successful reduction. What is required, according to Benacerraf, is a supplementary argument showing that the logicist theorems have the preanalytic meanings of their ordinary arithmetical analogues. But Benacerraf’s demand for a further argument is not justified. To successfully dispel the belief that the laws of logic cannot have an arithmetical content it suffices to show that the concepts of logic allow for the explicit definition of notions which, on the basis of logical laws alone, demonstrably satisfy the basic laws of arithmetic. The philosophical impact of the discovery that the concepts and laws of logic have an arithmetical content in this sense would not in any way be diminished by the observation that the preanalytic meanings of the primitives of arithmetic are not the same as their logicist reconstructions. The sense in which the logicist thesis must be understood in order to be judged successful cannot therefore be the one for which Benacerraf claims Hempel must argue. Notice also that independently of one’s view of meaning and truth in virtue of meaning, it must be conceded that traditional logicism would have provided a viable answer to Kant if it had succeeded in showing that the reconstruction of arithmetical knowledge requires only an extension of logic by explicit definitions. Hempel’s appeal to these notions addresses a different issue: Frege left the problem of securing the epistemic basis of the laws of logic largely untouched. Benacerraf’s Hempel should be understood as proposing to fill this gap by suggesting that the laws of logic are true in virtue of the meanings of the logical constants they contain. Like Frege, Hempel sought to secure the aprioricity of arithmetic by an argument that proceeds from its analyticity. 
But Hempel’s version of logicism differs from Frege’s (for whom ‘analytic’ merely meant belonging to logic or a definitional extension of logic) by providing a
justification for the analyticity of logical laws: logical laws are analytic, not by fiat as on Frege’s account, but because they are true in virtue of the meanings of the logical terms they contain. From this it would follow that if logical laws are true in virtue of meaning, so also is any proposition established solely on their basis, where ‘established solely on their basis’ is intended to encompass the use of explicit definitions. The clarity of the thesis that the laws of logic are true in virtue of meaning is therefore central to Hempel’s presentation of the view. Also central is the substantive and further claim that the basic laws of arithmetic can be recovered within a definitional extension of logic. The implied criticism of Hempel’s appeal to truth in virtue of meaning gains its force from the difficulties that stand in the way of establishing the traditional logicist thesis that arithmetic is reducible to logic in the original sense of the doctrine. Certainly, the failure to sustain this thesis led to more ambitious applications of the notion of truth in virtue of meaning. But if the basic laws of arithmetic had been recovered as a part of logic — not merely shown to have analogues that are part of some formal system or other, but to be part of logic — what more would be needed to infer that they share the epistemic status of logical laws? Once Hempel is not represented as seeking to secure the truth of the basic laws of arithmetic by an appeal to the derivability of mere formal analogues or a blanket appeal to the notion of truth in virtue of meaning, it is clear that he simply doesn’t owe us the argument Benacerraf claims he does. The difficulties that attend traditional logicism are therefore not the methodological difficulties Benacerraf advances, but the simple failure to achieve the stated aim of showing arithmetic to be a definitional extension of logic. 
This point is obscured by Benacerraf’s suggestion that the reduction might proceed from second order logic with an axiom of infinity or from some version of set theory. Neither theory supports the truth in virtue of meaning account that underlies Hempel’s formulation of logicism. On the most natural interpretation, a reduction to second order logic with Infinity would mean a reduction to a system augmented with an axiom like Whitehead and Russell’s; but no one ever thought such a system was true in virtue of meaning. As for a reduction to set theory, set theory is properly regarded as the arithmetic of the transfinite. Why should a reduction of the natural numbers to set theory be regarded as a means of establishing its aprioricity on a less synthetic footing? The only coherent logicist methodology would therefore seem to be the one just outlined: to reduce arithmetic to a theory like Begriffsschrift’s. Unfortunately, such a theory is either too weak, or in the presence of Frege’s theory of classes, inconsistent.
2. The neo-Fregean alternative
In Frege’s Conception of Numbers as Objects,4 Crispin Wright showed that it is possible to extract from Frege’s Grundlagen a valid second-order proof of
the Dedekind infinity of the natural numbers using only a suitable formalization of the following “partial contextual definition” for numerical identity: For any concepts F and G, the number of Fs is identical with the number of Gs if and only if the Fs and the Gs are in one-one correspondence.5 Wright argued that this proof can form the basis of a defensible, if modified, formulation of Frege’s philosophy of arithmetic, one that relies on Frege’s thesis that numbers are objects, but makes no use of the problematic theory of classes. Bob Hale’s Abstract Objects,6 which appeared in 1987, is perhaps the next important landmark in the elaboration of a “neo-Fregean” philosophy of arithmetic along the lines of Frege’s conception. In an influential series of papers, George Boolos clarified and extended Wright’s discussion and its relation to the central mathematical argument of Grundlagen. Boolos’s designation of the partial contextual definition as “Hume’s principle” has since become established in the literature,7 as has his suggestion that the theorem that the second-order theory whose sole axiom is Hume’s principle (“Frege arithmetic”) has a definitional extension which contains Peano Arithmetic, be called “Frege’s theorem.”
The neo-Fregean program of Wright and Hale seeks to imbed this logical discovery into a philosophically interesting account of our knowledge of arithmetic by subsuming Hume’s principle under a general method for introducing a concept by an “abstraction principle.” This program explains the epistemological interest of the discovery that arithmetic is a part, not of second order logic, but of Frege arithmetic, by the program’s account of concept introduction by abstraction. The key to achieving this goal is the idea that abstraction principles have a distinguished status: they are a special kind of stipulation. Their stipulative character shows them to be importantly like explicit definitions even if their creativeness suggests an affinity with axioms; and it is a central tenet of neo-Fregean logicism that abstraction principles are sufficiently like definitions to yield an elegant explanation, along traditional lines, of why arithmetical knowledge is knowledge a priori.
By an “abstraction principle”, the neo-Fregean means the universal closure of an expression of the form Σ(X) = Σ(Y) ↔ X ≈ Y, where ≈ is an equivalence relation, the variables X and Y may be of any order, and the function Σ may be of mixed type. In the case of Hume’s principle, the equivalence relation is the (second order definable) relation on concepts of one-one correspondence, and the “cardinality function” is a map from Fregean concepts to objects. As we will see in greater detail later, it is important for this program that concept formation by means of an abstraction principle should be distinguished from the ordinary practice of axiomatization and, indeed, that the case be made that not all cases of concept formation by an abstraction principle are on a par.
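For reference, the general schema and its best-known instance can be written out as follows. This is a sketch in one common notation rather than the author's own: the symbol ≈ for the equivalence relation, # for the cardinality function, and the relation variable R are assumptions of this rendering.

```latex
% General form of an abstraction principle (universal closure over X and Y):
\[ \forall X\,\forall Y\,\bigl(\Sigma(X) = \Sigma(Y) \leftrightarrow X \approx Y\bigr) \]

% Hume's principle: \Sigma is the cardinality function \#, taking concepts to objects.
\[ \forall F\,\forall G\,\bigl(\#F = \#G \leftrightarrow F \approx G\bigr) \]

% One-one correspondence: some relation R pairs each F with exactly one G
% and each G with exactly one F.
\[ F \approx G \;:\equiv\; \exists R\,\bigl[\forall x\,(Fx \rightarrow \exists! y\,(Gy \wedge Rxy))
   \wedge \forall y\,(Gy \rightarrow \exists! x\,(Fx \wedge Rxy))\bigr] \]
```

The right-hand side of the last line uses only second-order quantification and identity, which is the sense in which one-one correspondence is second order definable.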
In the case of Hume’s principle, the basic idea appears to be the following: Hume’s principle is a stipulation which gives truth conditions for a restricted class of statements of numerical identity, namely, those of the form, the number of Fs is identical with the number of Gs. Although the specification of truth conditions is partial — it concerns only a restricted class of statements of numerical identity — the resulting explanation of the concept of number is complete in so far as it suffices for the (second order) derivation of the basic laws of arithmetic. As Wright has emphasized,8 whether or not Hume’s principle counts as an analytic truth is entirely subordinate to its status as a stipulation governing our application of the concept of number. Since concept formation by Hume’s principle, like concept formation in accordance with any other abstraction principle, involves the introduction of a new concept, it would be a mistake to view the principle as analytic of the concept of number in the way in which the notion ‘analytic of’ has traditionally been understood. Rather, the idea is that Hume’s principle is a trivial consequence of the nature of the concept of number because it is a stipulation governing the introduction of that concept (rather than the analysis of a pre-existing concept). The stipulative character of the principle is important, since it is this feature which allows it to fulfill (albeit in a restricted and modified sense) logicism’s promise to deliver the aprioricity of arithmetic from logic plus some species of stipulation — i.e., from assumptions that will uncontentiously secure arithmetic as a priori. Wright originally called his view “number-theoretic logicism” because it derives the basic laws of arithmetic from a stipulation governing the concept of number without, however, providing an explicit definition of the individual numbers or of ‘immediately precedes’ on the basis of a purely logical vocabulary. 
Although number-theoretic logicism falls short of the goal of explicitly defining the vocabulary of arithmetic in purely logical terms, it proceeds on the basis of a single stipulation, and the “explanation” of numerical identity by Hume’s principle is achieved in terms of a concept of pure (second order) logic. There can be no question of the importance of Hale’s and Wright’s contributions and of the vitality they have imparted to a subject long regarded as having at most historical interest. Wright’s rediscovery of Frege’s theorem has been the major impetus to the reevaluation of Frege’s philosophy of arithmetic, and his and Hale’s defense of a modified form of Fregean logicism has elicited a critical reaction that has prompted a substantial clarification of the neo-Fregean position.9
3. Is Hume’s principle correctly represented as a stipulation?
For reasons that will soon become evident, the first class of objections that have been urged against the neo-Fregean program are known as “bad company” objections. Wright addresses two such bad company objections to his
position. First, concept introduction by an abstraction principle can fail, and can fail spectacularly, as the case of the abstraction principle expressed by Grundgesetze’s Basic Law V showed. But (the objection goes) how can we accept a view which invites us to rely upon a methodology which is known to be seriously flawed? And secondly, given our freedom to stipulate one or another abstraction principle, it is necessary to supplement the account with a criterion capable of governing the choice of one abstraction over another.10
Apropos of the first objection, Wright makes a convincing case that the fact that a methodology sometimes leads to flawed conclusions does not mean that it is itself irremediably flawed, and he reminds us that Hume’s principle is consistent relative to analysis.11 Why should it be necessary to show that the procedure will work whatever abstraction principle is employed? One difficulty with this response is that it tends to assimilate concept introduction by abstraction to the case of ordinary axiomatization, and this makes it difficult to appreciate Wright’s insistence that what he is proposing is sharply different from what he calls “the mere axiomatic stipulation of existence.” The contrast with axiomatic stipulation is important and I will return to it. However, the consistency of Hume’s principle fails to settle the second “bad company” objection which effectively raises the question whether, and in what sense, it is true. For suppose we grant that Hume’s principle stipulates or lays down the truth conditions for certain statements of numerical identity. Is the truth of Hume’s principle merely a matter of stipulation? That this is how Wright intends to be understood is suggested by several passages:
    . . . [a] state of affairs is initially given to us as the obtaining of a certain equivalence relation . . . ; but we have the option, by stipulating that the abstraction is to hold, of so reconceiving such states of affairs that they come to constitute a new kind of thing . . . (p. 277).
That Wright views abstraction principles as stipulations is reinforced by a contrast he draws between the basis for our knowledge of the truth of Hume’s principle and our knowledge of the existence of the numbers:
    abstract objects are not creations of the mind, brought into being by a kind of stipulation. What is formed — created — by such an abstraction is rather a concept: the effect is merely to fix the truth conditions of identity statements concerning a new kind of thing, and it is quite another question whether those truth conditions are ever realized (p. 278).
According to Wright, the representation of Hume’s principle as a kind of stipulation does not prejudice the question whether the existence of the numbers is a matter of stipulation. In fact, Wright maintains that the existence of the numbers is something discovered rather than stipulated, while holding that our a priori knowledge of their necessary existence is ultimately derived from a principle whose truth is a matter of stipulation. This is entirely credible as a view of conventional explicit definitions, but it cannot be so easily maintained in the case of a contextual definition like Hume’s principle. It is precisely because ordinary definitions are not creative
that we can say that anything, expressed in unabbreviated notation, which is established or discovered on their basis does not really depend on them, and, in consequence, does not share the conventional character of their epistemic basis. But even if Hume’s principle can be regarded as a stipulation, it is certainly not a stipulation of this character, since it is creative over the theory to which it is added.12 Indeed, this is precisely the point where the analogy between “concept introduction via an abstraction principle” and the methodology of concept introduction by ordinary explicit definition can break down, with the result that concept introduction by abstraction becomes difficult to distinguish from axiomatic stipulation. Hale and Wright13 emphasize the difference between treating Hume’s principle as a stipulation — something which they hold to be unproblematic — and treating the existence of the numbers as a matter of stipulation, which they agree is problematic. Their point is that Hume’s principle merely lays down partial satisfaction conditions for the relation of numerical identity, thereby establishing the existence of the relation in analogy with the use of axioms as implicit definitions of the terms they introduce. Hale and Wright insist that it is not part of their position that the stipulation of such satisfaction conditions should secure the existence of the objects related by the relation of numerical identity; rather, their existence is secured by a proof, the proof that there are objects related by the relation thus introduced. 
Indeed, the discovery of this proof gives the sense in which they hold that the existence of the numbers is “discovered.” Thus understood, their methodology differs from the “axiomatic stipulation of existence” that is embodied in, for example, the set-theoretic practice of simply laying down a comprehension axiom and claiming, on its basis, to have secured the existence of whatever sets — whatever “objects” — the axiom allows, while excluding whatever sets the axiom disallows. But the methodology Hale and Wright are here reporting is a straightforward variation on another thoroughly familiar and unexceptionable set-theoretic practice. In standard axiomatic set theory, we freely lay down a defining condition and then proceed to verify — to prove, on the basis of our axioms of set existence — that our definition is not vacuous and that we have not simply redefined the empty set. For example, we are free to define the power set of a set A by the familiar use of class-abstract notation, P(A) = {B : B ⊆ A}. But the proof that the set we have defined is the intended one and not just the empty set requires an axiom — the power set axiom, ∃B ∀C (C ∈ B ↔ C ⊆ A) — to ensure the existence of the appropriate set. It should be clear that in the absence of an account of our basis for believing the set-theoretic axioms, this methodology does not give — and does not pretend to give — an account of the basis for our beliefs about the existence of sets with the intended membership, so that if the set-theoretic axioms are represented as stipulations, this is naturally assumed to transfer to the basis for our belief in the sets whose existence and non-vacuousness we are able to prove on their basis. In the absence of Hume’s principle, we can, following Frege, lay down defining conditions for all of the individual numbers. What we cannot do without Hume’s principle is prove that the numbers we have defined are distinct from zero and from one another. Now except for the fact that they are working in Frege arithmetic rather than Zermelo-Fraenkel set theory, this is exactly the methodological situation described by Hale and Wright. Why should their claim to have given us an account of the basis for our belief in the numbers be viewed any differently? In particular, if the set-theoretic situation supports the idea that the basis for our belief in the existence of sets with the intended membership is the same as the basis for our believing the truth of the axioms — the same, that is, in the sense that by stipulating the truth of the axioms, we leave the existence of the intended sets resting on a stipulation — then if Hume’s principle is represented as a stipulation, the existence of the numbers must also be regarded as resting on a stipulation.
Perhaps the stipulative character of Hume’s principle should be thought of on the model of a reference-fixing stipulation, after the manner of Saul Kripke’s example of the introduction of a term like ‘meter.’ The idea would be that we stipulate that at time t, bar b is a meter long, from which it follows that at t, the length of b falls under the concept meter. Although we have laid down a stipulation (a reference-fixing stipulation), in doing so we have also succeeded in making a factual assertion, namely that b has a particular length at t — something that obtained before the convention was laid down, and something that will continue to obtain even if the convention is withdrawn. 
Might not something similar hold for concept introduction by an abstraction, so that it, too, may be seen to consist in laying down a stipulation while at the same time having a factual content? Couldn’t Wright argue that the opposition — fact vs. convention — has been overdrawn, obscuring the fact that the stipulative character of Hume’s principle, on which he has correctly insisted, does not preclude it from having a factual character as well? The difficulty with this response is that while Kripke’s model of reference-fixing may suffice to show that it is not in general true that stipulations are incapable of a factual content which is independent of the stipulations themselves, it is of no help in clarifying how the stipulative character of concept-introduction by an abstraction is capable of accommodating the factual content of a principle like Hume’s. This is because the factual content of the assertion that bar b is a meter long is carried by the possibility of displaying or ostending, independently of the reference-fixing stipulation, the length to which the concept of a meter is to be applied. But this possibility is precisely what is lacking in the account of the number-theoretic case in terms of concept introduction by abstraction. The idea that the truth of Hume’s principle is a matter of a stipulation that is unconstrained by any “antecedently determinate truth” allows Wright to sidestep the difficulties which confront showing that it is true or that it correctly captures our pre-analytic notion of numerical identity. However this comes at
a cost: the problem is that there are many abstractions, all of them satisfiable, but relative to certain assumptions, not necessarily mutually satisfiable. For example, in “The Standard of Equality for Numbers,” Boolos introduced several sentences with very different model-theoretic properties from one another, but with an equal claim to being treated as abstraction principles that introduce a class of abstract singular terms. In “On the Philosophical Significance of Frege’s Theorem,” Wright considers a modification of one of Boolos’s examples in order to motivate a constraint on “admissible” abstraction principles. The example involves a type-lowering function which takes two concepts to the same object just in case their symmetric difference is finite. I.e., the mapping n from concepts to objects satisfies the condition: n(F) = n(G) if and only if [x: x is F but not G or G but not F] is finite. It can be shown that the resulting abstraction principle, which Wright calls NP (for Nuisance Principle), holds in finite domains but fails if the domain of individuals is infinite and the range of the concept variables is the full power set of this domain.14 To see the difficulty that such a “bad company” example poses, assume (for whatever reason) that we are committed to a standard interpretation of second order logic. Then were we to adopt the stipulation embodied in NP, we would be restricted to models having only finitely many objects. Since Hume’s principle holds only in infinite domains, our adoption of NP would preclude us from stipulating that numerical singular terms should be used in accordance with Hume’s principle. 
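The cardinality facts cited here can be checked by brute force on small finite domains. The sketch below is illustrative only. It takes the concepts to be all subsets of the domain (the standard interpretation), and it assumes that an abstraction principle is satisfiable on such a domain exactly when its equivalence classes do not outnumber the objects, since each class needs an object of its own. All function names are hypothetical.

```python
from itertools import combinations

def concepts(domain):
    """All subsets of a finite domain, standing in for the concepts over it."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def count_classes(domain, equivalent):
    """Number of equivalence classes that `equivalent` induces on the concepts."""
    representatives = []
    for concept in concepts(domain):
        # A concept starts a new class only if it matches no earlier representative.
        if not any(equivalent(concept, rep) for rep in representatives):
            representatives.append(concept)
    return len(representatives)

def satisfiable_on(domain, equivalent):
    """Assumed criterion: one distinct object per class, so classes <= objects."""
    return count_classes(domain, equivalent) <= len(list(domain))

def equinumerous(f, g):
    # Hume's principle: on finite sets, one-one correspondence is sameness of size.
    return len(f) == len(g)

def finite_symmetric_difference(f, g):
    # NP: every subset of a finite domain is finite, so this always holds here.
    return True

if __name__ == "__main__":
    for n in range(1, 5):
        domain = range(n)
        print(n,
              satisfiable_on(domain, equinumerous),
              satisfiable_on(domain, finite_symmetric_difference))
```

On each non-empty domain tried, equinumerosity yields one more class than there are objects (sizes 0 through n on an n-element domain), so Hume's principle has no model there, while NP collapses every concept into a single class and is satisfied. The infinite case, where the verdicts reverse, is of course beyond the reach of such a check.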
But if we take seriously the idea that abstraction principles are merely stipulations governing the use of the singular terms they introduce — i.e., that they are conventions which we freely lay down — we might easily defeat the truth of Hume’s principle by an “incorrect” initial choice of abstraction principle, one satisfiable only in finite domains. This would occur if, for example, we had first chosen the stipulation embodied in NP. But if abstraction principles are stipulations, and thus, Hume’s principle just one stipulation among many, how can we make sense of the idea that there is a right initial choice? If the truth of an abstraction is a matter of stipulation, then the existence of a domain sufficiently large to contain the numbers would seem to depend upon which abstraction happened to be laid down first, so that whether the domain of objects contains a sub-domain capable of modeling the basic laws of arithmetic would come to rest on an arbitrary decision of ours. This is clearly a conclusion Wright and Hale do not wish to endorse and they recognize that what might be called the “quasi-conventionalist” features of their approach make it incumbent upon them to articulate a principled division among abstractions. One proposal Wright has explored — there are many other conditions which Wright and Hale have investigated in their insightful search for a theory of good abstractions — is that an abstraction is acceptable only if it satisfies a “conservativeness” requirement according to which an abstraction principle is acceptable if it does not constrain the cardinality
of concepts with whose introduction it is not itself explicitly concerned. But to know that Hume’s principle constrains only the cardinality of the numbers, we must know of the cardinality function, not only that it associates concepts with objects, so that, in particular, non-equinumerous concepts are associated with distinct objects, we must also know that the objects thus associated are numbers. Otherwise, how are we to maintain, in accordance with the conservativeness condition, that it constrains only the numbers?15 The problem of demarcating Hume’s principle from “bad” abstraction principles on the basis of conservativeness would seem to presuppose a solution to the problem of demarcating the numbers from other objects on the basis of principles internal to number-theoretic logicism — the so-called “Julius Caesar problem.” Wright could respond16 to this charge by observing that it assumes we possess an unreconstructed concept of number to which Hume’s principle is responsible. But the whole point of concept introduction by abstraction, he will say, is to explain how numbers may be introduced without admitting to any such responsibility. There is therefore a certain internal coherence to Wright’s program that makes it resistant to criticisms of this sort. Nevertheless, even if it should turn out to be possible to draw a principled division between good and bad abstractions, the point of developing a theory of good abstractions rests on the assumption that an abstraction principle as rich as Hume’s is adequately represented as a stipulation.17 Recall that the key philosophical idea of Grundlagen — the idea that underlies the revival of Frege’s program — is the notion that numerical singular terms are “referential” because with Hume’s principle we are in possession of a clear criterion by which we can say when the same number has been “given to us” in two different ways, as the number of one or another concept. 
Once the truth of this criterion of identity has been secured, we are entitled by the context principle (Only in the context of a sentence does a word have reference [Bedeutung]) to infer that numerical singular terms refer. Our “access” to the numbers is therefore mediated by our recognition of the truth of Hume’s principle, and this same principle also serves as the only substantive premise in the proof of their infinity. Frege’s account of our access or reference to the numbers thus relies fundamentally on our recognition of the truth of Hume’s principle. What is needed to make this account viable is a satisfactory explanation of the sense in which Hume’s principle is true, and an account of how it is known to be true, since this is what is required to infer, by the context principle, that numerical singular terms refer. To my mind, number-theoretic logicism has not adequately addressed the consequences of treating Hume’s principle as a stipulation which we freely lay down. Instead it has concentrated on the question whether the surface grammar of the left-hand side of the principle can be taken seriously, whether, in other words, the apparent deployment of numerical singular terms as singular terms is justified. As Wright puts it, one has “to read the left-hand sides of the appropriate abstraction principles not merely as notational variants of the
right-hand sides, but in a way which is constrained by their surface syntax” (p. 271). Once this is established, Wright argues, no further question can arise over whether such terms genuinely refer, and there is “no good sense in which their reference could be stigmatized as semantically idle” (p. 270). But the difficulty, as I see it, is with the transition from the characterization of the truth of Hume’s principle as a simple stipulation to its deployment as a truth having significant existential implications. Since it is the latter characterization of Hume’s principle that its use in conjunction with the context principle relies upon, this transition is clearly required; nevertheless, it is deeply perplexing and constitutes the chief stumbling-block to accepting neo-Fregean logicism’s claim to have secured or explained the reference of numerical singular terms.
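The remark above that Hume’s principle serves as the only substantive premise in the proof of the infinity of the numbers can be illustrated in miniature. The following Python sketch is an illustration only and is not drawn from Wright or Hale. It makes three assumptions: concepts are treated extensionally as subsets of a small finite domain, equinumerosity is treated as sameness of size, and the cardinality operator is applied to every concept over the domain. The helper names `concepts`, `satisfies_hume` and `has_hume_assignment` are hypothetical and introduced here for the example.

```python
from itertools import combinations, product


def concepts(domain):
    """All subsets of the domain, standing in (extensionally) for concepts."""
    items = list(domain)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def satisfies_hume(assignment):
    """Check: #F = #G if and only if F and G are equinumerous.

    Assumption: for finite extensions, equinumerosity is sameness of size.
    """
    return all((assignment[f] == assignment[g]) == (len(f) == len(g))
               for f in assignment for g in assignment)


def has_hume_assignment(n):
    """Exhaustively try every map from concepts over n objects into those objects."""
    domain = range(n)
    cs = concepts(domain)
    return any(satisfies_hume(dict(zip(cs, values)))
               for values in product(domain, repeat=len(cs)))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, has_hume_assignment(n))
```

Under these assumptions the search finds no satisfying assignment for any of the small domains tried: a domain of n objects supports n + 1 sizes of concept, and so would require n + 1 distinct objects to serve as their numbers. The sketch says nothing about whether the principle is properly regarded as a stipulation; it only makes vivid why its truth carries existential weight.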
4.
Has the neo-Fregean explained the aprioricity of arithmetic?
The neo-Fregean program has a methodological dimension that parallels the role of the theory of definition in traditional logicism. Frege accords a statement the status of a proper definition if it meets conditions of eliminability and conservativeness. The classical Fregean theory of definition is supplemented by the neo-Fregean methodology of good abstractions. Thus, the theory of definition mandates that a definitional extension must be conservative in the familiar sense of not allowing the proof of sentences formulated in the unextended vocabulary which are not already provable without the addition of the definitions which comprise the extension. But “extensions by abstraction” need not be conservative in this sense; indeed interesting extensions are interesting precisely because they are not conservative in the sense of the theory of definition. The neo-Fregean theory of good abstractions allows for classically non-conservative extensions — extensions which properly extend the class of provable sentences — but imposes a constraint on the consequences an extension by good abstraction principles can have for the ontology of the theory to which they are added. This methodology is constrained and principled, it is just not constrained in the same way as the classical theory of definition. We can, perhaps, put the difference by saying that the constraints the classical theorist imposes on definition have a more purely epistemic motivation than do the constraints the neo-Fregean imposes on good abstractions. The reticence of the classical theory of definition to allow a mere definition to properly extend the theory to which it is added is well-founded, and it is at least arguable that it should also inform the epistemic basis of a principle as rich as Hume’s. 
But suppose we put these doubts to one side and ask whether the neo-Fregean account of Hume’s principle as a kind of stipulation can support the epistemological claim of neo-Fregeanism to have secured the aprioricity of our arithmetical knowledge as a consequence of its resting on a stipulation. The matter is taken up by Hale and Wright in their paper “Implicit Definition and A Priori Knowledge,” and by Wright in his “Is Hume’s Principle Analytic?” which, notwithstanding its title, is not concerned to secure the analyticity of Hume’s principle in any of several traditional senses of the notion but to address the question of its epistemic status within the neo-Fregean program and the light this can shed on our arithmetical knowledge. Wright and Hale use the stipulative character of Hume’s principle as a premise in an argument for the aprioricity of our arithmetical knowledge. In acquiring the concept of number, we acquire a criterion of identity for number: a criterion for saying when the same number has been given to us in two different ways as the number of one or another concept. This criterion of identity, Hume’s principle, is the only non-logical premise needed to derive the basic laws of arithmetic. Our arithmetical knowledge is secured, therefore, with our grasp of the concept of number and is based on nothing more than what we acquire when we are introduced to the concept. But since this knowledge rests on a stipulation, it is knowledge a priori. This is essentially the same account of the epistemological interest of Frege arithmetic that is elaborated by Fraser MacBride in two thoughtful papers18 that address the issue. For MacBride the neo-Fregean explanation of the aprioricity of our knowledge of arithmetic runs as follows: We first stipulate a criterion of identity for a special kind of objects; call them cardinal numbers. That certain fundamental truths about these objects are established on the basis of a stipulation guarantees that our knowledge of those truths is knowledge a priori. This is to be contrasted with an account which would seek to infer the aprioricity of our knowledge of arithmetic from theses about meaning or truth in virtue of meaning.
The neo-Fregean account does not depend on a traditional notion of analyticity; since neo-Fregeanism demands only the relatively uncontentious concession of the aprioricity of a stipulation, it can claim that its explanation of the aprioricity of arithmetic need not address the difficulties associated with defending traditional conceptions of analyticity. The fact that the reduction to Frege arithmetic requires more than a merely definitional extension of second-order logic suggests an objection to neo-Fregean logicism that is closely related to the one we saw Benacerraf urge against Hempel: how, one might ask, does the neo-Fregean account of the character of our knowledge of Frege arithmetic bear on our knowledge of ordinary arithmetic? In so far as the epistemological issues are issues concerning ordinary arithmetic, have they even been addressed by the neo-Fregean? There are two neo-Fregean answers to this objection. The first holds that it is because ordinary arithmetic can be “modeled” in Frege arithmetic that the epistemological status of the truths of Frege arithmetic is shared by the truths of ordinary arithmetic. Wright remarks (p. 322) that this answer is too weak. And although MacBride does not endorse this response, neither does he reject it as altogether unlikely. Nevertheless, I think it is worth recording exactly why such a straightforward answer, couched in terms of the relatively unproblematic relation of modeling, can’t be right.
It is clearly possible to stipulate the conditions that must obtain for the properties of a purely hypothetical and imaginary “abstract” physical system to hold without in any way committing ourselves to the existence (or even the dynamical possibility) of such a system. Our knowledge that such abstract systems are configured in accordance with our stipulations is no less a priori than our knowledge that, for example, the four element Boolean algebra has a free set of generators of cardinality one. But it sometimes happens that abstract configurations “model” actual configurations, in the sense that there is a correspondence between the elements of the two systems that preserves fundamental properties. It is clear that in such circumstances we take ourselves to know more than that an imaginary example has the properties we stipulate it to have: if the example is properly constructed, we know the dynamical behavior of a part of the physical world. But of course the fact that an actual system is “modeled” by our imaginary system, together with the fact that our knowledge of the properties of our imaginary system is a priori knowledge because it depends only on our free stipulations, are completely compatible with the claim, obvious to pre-analytic intuition, that our knowledge of the dynamical behavior of the actual system is a posteriori. Whatever role stipulation may have in fixing the properties of the abstract system by which the behavior of some real process is modeled, it lends no support to the idea (and would never be regarded as lending support to the idea) that our knowledge of the real process is knowledge a priori. There is a disanalogy between the number-theoretic case and our example that might seem to undermine its effectiveness as a criticism. In the number-theoretic case the existence of the correspondence between ordinary numbers and the “Frege-numbers” that model them is known a priori.
But the correspondence between the abstract system of our example and the actual system is not known a priori; it depends on the a posteriori knowledge that there are in reality configurations of particles having the postulated characteristics. This is of course entirely correct. However it is of no use to the neo-Fregean, since to know a priori that there is a mapping between the ordinary numbers and the Frege-numbers it is necessary to have a priori knowledge of the existence of the domain and co-domain of the mapping. To be of any use to the neo-Fregean, the fact that the Frege-numbers model the ordinary numbers therefore requires that our knowledge of the ordinary numbers be a priori. But if the modeling of the ordinary numbers by the Frege-numbers presupposes that our knowledge of the ordinary numbers is a priori, it cannot be part of a non-circular account of why ordinary arithmetic is known a priori. The general point may be put as follows: the fact that M models N , so that for any sentence s, s is true in M if and only if s is true in N , does not entitle us to infer that because the sentences true in M are known a priori, the sentences true in N are known a priori. Indeed it is perfectly possible that (with the obvious exception of the logical truths) our knowledge of sentences true in N is wholly a posteriori. So even if we grant that an assumption rich
enough to secure an infinity of objects is correctly represented as a stipulation, it remains unclear how the neo-Fregean can use this fact to answer the question which motivates his account of arithmetical knowledge; it remains unclear how it yields an account of our knowledge of the numbers, knowledge that we have independently of the neo-Fregean analysis. Let us turn now to the second neo-Fregean reply. This is an alternative to the response based on modeling. Here the idea is that since Frege arithmetic captures the “patterns of use” exhibited by our ordinary number-theoretic vocabulary, both in pure cases and in applications, we are justified in inferring not merely that the Frege-numbers model the ordinary numbers but that the Frege-numbers are the ordinary numbers, and that, therefore, the epistemic status of Frege arithmetic can illuminate the epistemic character of ordinary arithmetic. Frege arithmetic is justified as a reconstruction if it captures the fundamental features of the judgements, pure and applied, that we make about the numbers. But to support the epistemological claim of the neo-Fregean, the reconstruction must not only capture arithmetical concepts by recovering the patterns of use to which they conform, it must also illuminate the epistemic status of our pure arithmetic knowledge. Having the status of a stipulation is not, of course, a characteristic of Hume’s principle that is recoverable from our use of our arithmetical vocabulary, but is something the neo-Fregean reconstruction imposes on the principle in order to illuminate the basis for our knowledge of the propositions derivable from it. But it is unclear what follows from having captured the pattern of use of an expression by a principle that, in the reconstruction of the knowledge claims in which that expression figures, is regarded as a stipulation.
The implicit — and undefended — neo-Fregean assumption is that we may suppose that the epistemological characteristics of a stipulation are shared by the knowledge claims that are reconstructed as following from it. But to establish that the epistemic basis for our unreconstructed arithmetical judgements resides in the stipulative character the neo-Fregean analysis assigns to Hume’s principle, it is not enough to show that Frege arithmetic captures patterns of use. The essential point is not all that different from what we have already noted when discussing the response based on modeling and can be appreciated by an example that is not all that different from the one cited in that connection. To see this, suppose the world were Newtonian. We could then give a reconstruction of our knowledge of the mechanical behavior of bodies by laying down Newton’s laws as stipulations governing our use of the concepts of force, mass and motion.19 But the fact that in our reconstruction the basic Newtonian laws have the status of stipulations would never be taken to show that they are in any interesting sense examples of a priori knowledge. Why then should the fact that the neo-Fregean represents Hume’s principle as a stipulation be taken to show that arithmetic is known a priori? The neo-Fregean reconstruction of the patterns of use of expressions of arithmetic leaves the epistemic status of the basic laws of arithmetic as unsettled as
it was on the suggestion that Frege arithmetic merely models ordinary arithmetic. Neither reconstruction supports the epistemological claim of the neo-Fregean to have accounted for the aprioricity of our knowledge of arithmetic. Whether that account is put forward as a theory within which ordinary arithmetic can be modeled, or whether it is said to capture the patterns of use of our number-theoretic vocabulary, it fails to have the direct bearing on the epistemic basis of our arithmetical knowledge that the neo-Fregean supposes it to have. Showing that Hume’s principle is correctly represented as a stipulation may be one route to securing it as a truth known a priori, but it is questionable whether, proceeding in this way, the task of revealing the proper basis for the aprioricity of arithmetic is made any easier than it would be by general reflection on why its basic laws are plausibly represented as known truths. Hale and Wright have replied to this argument: Let ‘Newton’ denote the conjunction of Newton’s laws as ordinarily understood, and ‘NewStip’ denote the (perhaps typographically indistinguishable) conjunction of the corresponding stipulations taken as introducing certain concepts of force, mass and motion. Then Demopoulos’s claim is – or ought to be, if the parallel is to be damaging – that while we may, by laying down NewStip, acquire some a priori knowledge (in some sense, knowledge about (some things we are calling) force, mass and motion), we obviously do not thereby acquire a priori knowledge of Newton – as we ought to do, if we can, in just or essentially the same fashion, acquire a priori knowledge of truths of ordinary arithmetic by stipulating Hume’s Principle, etc. . . . 
[But clearly,] the mere possibility of regarding (the sentences which formulate) Newton’s laws as stipulations introducing concepts of force mass and motion (as distinct from generalisations to which bodies conform) does not, and cannot, by itself justify the claim that NewStip ‘captures a pattern of use’ exhibited by ‘ordinary’ statements of Newtonian dynamics.20
But it was never claimed that regarding Newton’s laws as stipulations is what justifies the contention that NewStip captures a pattern of use. Since what justifies the contention that Frege arithmetic preserves a pattern of use is that it recovers the deductive structure of a body of pure and applied unreconstructed knowledge claims, the point at issue is whether, by representing certain principles as recoverable from a stipulation, a reconstruction sheds any light on their epistemic status. The comparison with the Newtonian case makes it transparent that from the fact that we can recover a pattern of use from a stipulation, nothing follows regarding the aprioricity or otherwise of our knowledge of the principles being reconstructed. The situation is, of course, entirely different if, in accordance with the methodology of Fregean logicism, it had proved possible to recover arithmetic from logic plus explicit definitions.
5.
Securing the truth of Hume’s principle
However sound the contrast of the methodology of concept introduction by an abstraction principle with ordinary axiomatization, the characterization of a fundamental principle like Hume’s as a kind of stipulation is certainly reminiscent of Hilbert’s account of his axioms as implicit definitions. In his correspondence with Hilbert and in (the first of) his series of essays on the foundations of geometry,21 Frege argued that by treating axioms as definitions Hilbert precluded taking them to express truths in the intended and “generally accepted” sense, since if Hilbert’s axioms are genuine axioms, and therefore for Frege, are not merely true but known to be true, then the reference of their constituent expressions must be settled independently of the axioms themselves. Hilbert’s response consisted in resisting the notion that his axioms express truths, arguing that they need only be susceptible of an interpretation under which (as we might say) “they come out true.”22 A curious feature of this debate is that Frege’s own suggestion for approaching the question of reference to abstract objects in terms of the context principle and the use of one or another contextual definition seems vulnerable to the same circularity with which he charged Hilbert over the matter of treating axioms as implicit definitions.23 There is even a sense in which Frege’s application of the context principle to contextual definitions, requiring, as it does, that contextual definitions are true simpliciter (“absolutely true,” as opposed to merely “true under an interpretation”), is a more appropriate target of these criticisms than Hilbert’s use of the doctrine of implicit definition. And indeed, the role of Hume’s principle in Frege’s analysis of our reference to the numbers appears no less ambiguous than the role Hilbert assigned to axioms. Frege had argued, against Hilbert, that treating axioms as definitions puts us in the position of having a single equation with two unknowns; but we might equally well ask, regarding a comparable circularity in Frege’s own methodology, how it is possible to know of Hume’s principle that it is true if we have not first specified the reference of its constituent expressions. 
The difficulty confronting a development of Frege’s philosophy of arithmetic which relies on the contextual definition and the context principle is that it requires Hume’s principle to fulfill roles that are typically in tension with one another. It must, first of all, be a substantive truth, one which implies the basic laws of arithmetic and forms the philosophical basis of our knowledge of them, a point which may be put by saying that it is the principle’s account of our application of the numbers in cardinality judgements which underlies our knowledge of their infinity. Secondly, it is supposed to give the sense of the cardinality operator — a function usually reserved for definitions, or, in any case, for sentences which are stipulated or laid down. I believe there is a way of understanding the truth of Hume’s principle which addresses both the ambiguity of its status as a stipulation vs. a substantive claim and the apparent circularity in Frege’s methodology — without, however, having to accept the conventionalist features of Hale and Wright’s account. In order to motivate this suggestion, it will be worthwhile to reflect, for a moment, on a feature of our ordinary model-theoretic analysis of logical truth, according to which a logical truth is one that is true in all models. Even though this analysis appeals to truth-in-a-model, and therefore assumes the notion of interpretation or reference, in the case of the logical truths this dependence turns out to be inessential, since the holding of a logical truth in
some particular model does not distinguish that model from any other structure. By contrast, an ordinary mathematical truth, such as the commutative law of group theory, holds precisely in those structures which are commutative groups. Hence, the truth-in-a-model of a mathematical law, as opposed to a logical law, depends essentially on the reference of (at least one of) its constituent expressions. Thus, even though the use of the model-theoretic framework to explain the truth(-in-a-model) of a statement requires the notion of reference, it allows that the situation is importantly different according to whether we are considering logical or non-logical truths; briefly put, the logical truths make no “special” demands on reference. This is one way of spelling out the topic-neutrality of logical truth in a model-theoretic setting. To put the point another way: the referential demands of the logical truths are reflected in the model-theoretic framework as a whole, rather than in any part of it. It is in this sense that the topic-neutrality of the logical truths shows them to have a special status vis-à-vis the existential commitments of the model-theoretic framework: the account of the logical truths requires only the minimal existential commitment that any model would meet. Logical truths are sometimes said to be “constitutive principles” in the sense that, together with principles of inference, they are constitutive of the meanings of the logical operators. But the topic-neutrality of the logical truths that emerges within our model-theoretic setting affords a slightly different and somewhat more traditional way of viewing their a priori character. Since the logical truths fail to distinguish one model from another (since every model is constrained to satisfy them), their truth is independent of the particular model in which the truth or otherwise of sentences of whatever character is being evaluated. 
In particular, the holding of the logical truths in a structure is independent of whether or not it models the facts of experience. The aprioricity which the topic-neutrality of the truths of logic supports can be plausibly extended to the epistemic status of Hume’s principle without, however, requiring that we take a stand on the vexed question of whether Hume’s principle is a logical truth. Although there are of course models in which Hume’s principle fails, we will simply assume that any model which contains sortal concepts, interrelated as our sortal concepts are, allows for the “unrestricted” application24 of the cardinality operator, thereby generating the “skeleton” of concepts deployed by Frege in the proof of his theorem. By a skeleton of concepts we mean any family ℱ of sortal concepts defined over a domain D, together with an assignment to each F in ℱ of an element of D (“F’s number”), such that: (i) the concepts under which the numbers assigned to concepts themselves fall belong to ℱ; (ii) ℱ is closed under finite concepts, where the notion of finiteness may be any one of the several weak notions of finiteness (e.g., Kuratowski-finiteness) adapted to the case of concepts; and (iii) for any concepts F and G belonging to ℱ, the number of Fs is the same as the number of Gs if and only if F and G are in one-one correspondence.
Any model containing a skeleton of concepts also contains “the” cardinal numbers and the concepts associated with them in the development of Frege’s proof of his theorem. Even if we never achieve a complete specification of the model which the world comprises, it would suffice, so far as this account of our knowledge of arithmetic is concerned, that we should know of any candidate that it is constrained in this way. Under these circumstances Hume’s principle is true: not just consistent, not just true in some model or other, but true in that model, truth in which coincides with truth. But Hume’s principle is not merely true in a model that represents our absolute concept of truth: it would be true in any model containing a skeleton of concepts. Its generality consists in the fact that it holds in every such structure, whether or not the structure models the facts of experience. The correctness of Frege’s analysis of numerical identity in terms of the cardinality operator and the relation of one-one correspondence is a feature of every family of sortal concepts belonging to a possible model for our conceptual structure. Hume’s principle therefore enjoys a status in the class of models containing such skeletons of concepts which is entirely analogous to the status of the truths of logic in the class of all models. Just as a logical truth makes no demand on truth and reference which is not already implicit in every other truth, the referential demands of the truth of Hume’s principle are held in common by every truth, and in analogy with the truth of the laws of logic and the model-theoretic framework, its truth is reflected in our conceptual framework as a whole. It might be thought that if Hume’s principle is true, we are justified, by the context principle, in holding to a limited realism concerning the cardinal numbers. 
But the realism to which we are entitled fails to be a complete vindication of Frege’s program because the basic laws of arithmetic derivable from Hume’s principle allow us to capture the natural numbers at most only up to isomorphism; in particular, using only the resources of the context principle and the contextual definition, we are unable to characterize “the” cardinal numbers. There is therefore a sense in which the realism concerning the numbers which these considerations vindicate is attenuated relative to that which attaches to realism regarding the physical world: we do not think of the furniture of the world as specifiable only up to isomorphism but regard it as something we will have captured uniquely, should we succeed in characterizing the model which the physical world comprises. Frege seems also to have partially conceded this qualification when he sought an account of the numbers that would single out one natural sequence of numbers25 from all the rest by characterizing the numbers as extensions which comprehend all their applications. But if the numbers are captured only “structurally,” there is a certain conventionalism which attaches to the assertion of their existence: as far as the basic laws of arithmetic derived from Hume’s principle are concerned, any ω-sequence
will serve as the sequence of natural numbers; but this is a feature without parallel in our conception of the constituents of the physical world. A complete vindication of Fregean realism would require being able to distinguish “the” natural numbers, something demanding resources going beyond the contextual definition.26 Nevertheless, we have achieved something more than the conclusion that the contextual definition is merely true-in-a-structure. We have explained how it might be seen as true; as such, it would have a distinguished status among mathematical principles, as would the referential commitments of its constituent expressions.
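The point that the basic laws of arithmetic fix the natural numbers only up to isomorphism, so that any ω-sequence will serve, can be made concrete with a small sketch. The example below is an illustration under stated assumptions and is not part of the text’s argument: it uses two familiar set-theoretic progressions, conventionally called the Zermelo and von Neumann numerals (names not used in this chapter), builds a finite initial segment of each, and checks that the position-matching map between them is one-one and commutes with successor on that segment. A finite check of this kind cannot establish anything about the whole infinite sequences; all function names are hypothetical.

```python
def zermelo_succ(x):
    """Zermelo-style successor: the successor of x is {x}."""
    return frozenset([x])


def von_neumann_succ(x):
    """Von Neumann-style successor: the successor of x is x ∪ {x}."""
    return x | frozenset([x])


def initial_segment(succ, length):
    """First `length` terms of the progression that starts at the empty set."""
    seq = [frozenset()]
    while len(seq) < length:
        seq.append(succ(seq[-1]))
    return seq


def is_successor_preserving(seq_a, succ_a, seq_b, succ_b):
    """Check that a_i -> b_i is one-one and commutes with successor on the segment."""
    mapping = dict(zip(seq_a, seq_b))
    one_one = (len(set(seq_a)) == len(seq_a)
               and len(set(seq_b)) == len(seq_b))
    commutes = all(mapping[succ_a(a)] == succ_b(mapping[a])
                   for a in seq_a[:-1])
    return one_one and commutes


if __name__ == "__main__":
    z = initial_segment(zermelo_succ, 6)
    v = initial_segment(von_neumann_succ, 6)
    print(is_successor_preserving(z, zermelo_succ, v, von_neumann_succ))
    print(z[2] == v[2])
```

The two progressions agree structurally on the segment examined while disagreeing about which object occupies the third place, which is the sense in which a purely structural characterization leaves “the” natural numbers undetermined.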
6.
The philosophical interest of Frege Arithmetic
Leaving aside the historical question of Frege’s view in Grundlagen, I am claiming that a novel and plausible philosophy of arithmetic is at least suggested by the work, one which is both separable from his logicism and free of the conventionalism of later developments. On this interpretation of Hume’s principle, the significance of Frege’s theorem is this: the contextual definition is a constraint on the classes of models that are capable of adequately representing the structure of our sortal concepts, the relation of numerical identity and our absolute notion of truth, a constraint which bears a direct analogy to that imposed by the logical truths on the class of all structures. It is not, however, a constraint which is arbitrarily laid down, like a stipulation governing the introduction of a new concept. Rather, the sense in which the contextual definition is “definitional” is that it advances an analysis of a concept in use, namely our preanalytic notion of numerical identity. The argument for the correctness of this analysis depends on several important and non-trivial intermediate conclusions which are argued for throughout the course of Grundlagen. The most significant of these are: (i) the fundamental thought of §46, according to which a statement of number involves the predication of something of a concept; (ii) the assumption that the numbers are arguments to concepts of first level and that the cardinality operator is therefore properly interpreted by a type-lowering function; and (iii) the contention that as many as and, therefore, sameness of number are to be understood in terms of one-one correspondence. By Frege’s theorem, in order for any such structure to satisfy Hume’s principle, it must contain “the” numbers. 
The existence of the numbers (the “limited realism” of this view) is justified to the degree that we are justified in assuming the correctness of this analysis and of the supposition that our notion of truth is modeled by a structure in which it is reflected. Putting to one side the problem of establishing the aprioricity of arithmetic on a correct basis, a compelling argument that Frege arithmetic captures preanalytic intuitions about the numbers can be marshaled: Since the Dedekind-Peano axioms codify our pure arithmetical knowledge, their derivability constitutes a condition of adequacy which any account of our knowledge of number should fulfill. By Frege’s theorem, Frege arithmetic fulfills this condition
of adequacy. But what makes Frege arithmetic an interesting analysis of the concept of number is that it not only yields the Dedekind-Peano axioms, it derives them from an account of the role of the numbers in our judgements of cardinality, that is, from our foremost application of the numbers. As such, it is arguably a compelling philosophical analysis of the concept of number.27 Once the project of securing a correct analysis is divorced from the project of securing a body of truths as analytic or a priori, neither the fact that Frege arithmetic satisfies our condition of adequacy nor the fact that it connects the pure theory of arithmetic with its applications (essential as each is to securing it as a correct analysis of number) directly addresses the question of the epistemic status of our knowledge of arithmetic. This conclusion is not particularly surprising. The neo-Fregean program of Hale and Wright is a variant on the methodology of reconstruction associated with Carnap. For Carnap the thesis that arithmetical knowledge is non-factual, and therefore, a priori, was not in serious doubt. Since the aim of a Carnapian reconstruction is simply to delimit more precisely the extension of a predicate, it cannot be expected of such a reconstruction of our arithmetical knowledge that it will in any way justify the claim that our knowledge of arithmetic is a species of a priori knowledge. It is precisely in respect of their epistemological significance that Fregean logicism and neo-Fregean logicism (reduction to logic by explicit definition vs. the Hale-Wright reconstruction in which Hume’s principle occurs as a non-definitional stipulation) come apart. Although I have appealed to the notion “analytic of,” arguing that Hume’s principle is analytic of the notion of numerical identity, I have said nothing of its status as an analytic truth. 
Indeed, I have conceded that Hume’s principle need not be regarded as a truth of logic and have thereby renounced one means of securing its analyticity. I have also argued that it is not a mere convention because it fails to possess the triviality of a stipulation — something which the traditional use of the doctrine of analyticity was supposed to support. On the present account, Hume’s principle is a general truth, not just in the sense that it holds universally of everything in the relevant domain of quantification, but in the stronger sense that it is true in every “world,” i.e. in every model which is capable of representing our conceptual structure. While it is possible to introduce a notion of analyticity according to which a principle like Hume’s would come out analytic insofar as it satisfies this notion of generality, it is unclear what would be gained from doing so. Certainly there is nothing in such a notion of analyticity that would justify the usual conclusions that have been thought to follow from such a characterization: e.g., it would not follow that the principle is trivial and it would not follow that it is without existential commitment or “factual content.” The plausibility of the idea that Hume’s principle is analytic of the concept of numerical identity depends on the plausibility of a conceptual analysis; but the truth of the principle that expresses this analysis depends on the presuppositions of the framework of which the analysis is an analysis. These presuppositions are rather strong, and it is on
their satisfaction, rather than its analyticity of the relation of numerical identity, that the truth of Hume’s principle depends. How, then, are we to answer the question, ‘Is Hume’s principle analytic’? We should answer “Yes,” if our goal is to emphasize that the principle expresses the result of a conceptual analysis. The significance of such a positive answer is that it vindicates an important methodology, the method of analysis exhibited in Grundlagen, the first great work of the analytic tradition. If, however, the point of the question is to suggest that the principle is distinguished in some manner other than what is implied by its generality — that it is a convention of language or without significant existential presupposition — our answer must be “No.”
7. A foundation for arithmetic?
By way of conclusion, it might be appropriate to address a concern Dummett has expressed, if only to further clarify what is and is not achieved by the view defended here. According to Dummett, Frege’s approach to the numbers is fundamentally misguided. Russell’s discovery of the contradiction in Frege’s theory of classes served only to bring the difficulty into sharper focus. But Frege’s basic approach would have been problematic even if no inconsistency had been discovered, since there is an unacceptable circularity in Frege’s procedure: the abstraction principle which introduces the numbers contains an implicit first order quantifier, so the numbers introduced on the left occur within the range of the (first order) variables bound on the right in the explicit definition of one-one correspondence. Moreover, this feature is essential if Hume’s principle is to support the proof of the infinity of the number sequence, since that proof turns on the possibility of forming, for each n, a concept under which the numbers up to (and including) n all fall. But this is possible only if the numbers fall within the range of the first order variables; otherwise we will not have allowed for the formation of the concepts on which the success of the argument depends. Hence, if it is to support the proof of Frege’s theorem, Hume’s principle would seem to require that the numbers are “given to us” independently of the abstraction principle or contextual definition which introduces them, contrary to the promise of the neo-Fregean’s “abstractionist” methodology that it would replace the metaphor of the numbers as something “ostended in intuition” with an altogether different and nonmetaphorical account of our knowledge of them.
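The quantifier structure on which this objection turns can be made explicit by writing Hume’s principle out in full. What follows is a standard modern rendering offered only as a sketch, not a quotation from Frege or Dummett; the number-of operator N, the sign ≈ for one-one correspondence, and the relation variable R are notational choices assumed here.

```latex
% Hume's principle: the number of Fs equals the number of Gs
% if and only if the Fs and the Gs are in one-one correspondence.
\[
  \forall F\,\forall G\;\bigl( N x\,Fx = N x\,Gx \;\leftrightarrow\; F \approx G \bigr)
\]
% One-one correspondence, defined in pure second-order logic:
% some relation R pairs each F with exactly one G and each G with exactly one F.
\[
  F \approx G \;:\equiv\; \exists R\,\Bigl[\,
    \forall x\,\bigl(Fx \rightarrow \exists! y\,(Gy \wedge Rxy)\bigr)
    \;\wedge\;
    \forall y\,\bigl(Gy \rightarrow \exists! x\,(Fx \wedge Rxy)\bigr)
  \Bigr]
\]
```

The first order variables x and y on the right-hand side range over all objects, and so over the very objects N x Fx that the left-hand side introduces.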
Dummett concludes28 that not only Frege, but also neo-Fregeans who have succeeded him, are unable to solve the two problems to which logicism originally directed its attention: to secure the existence of the objects of number theory, and to show what our conception of a countably infinite domain rests upon, without in either case relying on an appeal to intuition or the facts of experience. Does our account fare any better? To see in what sense it does, it is necessary to distinguish two rather different ways of viewing the significance of Frege’s theorem. If we think of the proof as exhibiting a “construction”29 of the numbers, then in order to avoid circularity,
it would be illegitimate to assume at an “earlier” stage of the construction the existence of numbers which arise only at some “later” stage of the construction. If, however, we view the proof less ambitiously, as a verification that Hume’s principle implies the Dedekind infinity of the numbers, so that we are therefore justified in claiming of it both that it captures a central feature of our notion of number and that it reveals the assumptions on which our conception of their infinity may be based, then there can be no circle. In fact, we may take the practice of recovering a central feature of a concept in use by revealing the assumptions on which our use of the concept depends as a characterization of what traditionally passes for a conceptual analysis. Thus understood, Frege’s theorem confirms that his analysis of numerical identity in terms of Hume’s principle is a compelling solution to the second of the two problems Dummett poses, that of explaining how we attain the concept of a countably infinite domain. Of course the Dedekind infinity of the numbers can be obtained directly from Dedekind’s and Peano’s well-known axiomatizations. Indeed, if the concepts over which the variables of Hume’s principle range are restricted to (the Kuratowski-) finite concepts, Frege arithmetic is equivalent to the Dedekind-Peano axiomatization. What is the advantage of Frege’s development of arithmetic from Hume’s principle? The answer, I think, is that by contrast with Dedekind and Peano, Frege derives the number-theoretic or pure properties of the numbers from an analysis of our applications of them in judgements of cardinality. By being based on the fundamental thought (the thought that a statement of number involves the predication of something of a concept), Frege’s account shows how a mathematical analysis such as Dedekind’s or Peano’s arises out of the most common everyday applications we make of the numbers.
Such an account ties our conception of the numbers to our conception of families of sortal concepts and the interconnections of which they are capable, and in so doing, locates the origin of our conception of number in the character of our conceptual framework. But what of Dummett’s first problem? Have we given up on securing the existence of the numbers? If Hume’s principle is regarded as a conceptual analysis of our arithmetical knowledge, it presupposes rather than vindicates the existential commitments of what we take ourselves to know. Such an analysis can clarify the nature and extent of those commitments, and it can clarify the assumptions from which they derive, but it may still fall short of persuading someone who, for whatever reason, denies the existence of the numbers. We have simply not addressed the question whether it is coherent to deny the existential assumptions which, on the analysis on offer, the analysandum presupposes. Our account does not yield what Dummett has called a “suasive argument”30 for the existence of the numbers. At best, it affords an explanation of how it is possible to arrive at a conception of the natural number sequence, an explanation that is based on our applications of numbers to concepts. It is only in this sense that our account affords a foundation for arithmetic, a sense
which falls short of one that would secure our arithmetical knowledge against someone who would question the existential commitments it presupposes. But although such an explanation falls short of the traditional foundational justification Dummett is demanding, it may be said in its defense that it is not at all clear how anything stronger could be achieved.
Notes

1. In this connection, see J. Alberto Coffa, “Russell and Kant,” Synthese 46 (1981) 247–263.

2. “Frege: The Last Logicist,” reprinted in my collection Frege’s Philosophy of Mathematics, pp. 42 and 46.

3. C. G. Hempel, “On the Nature of Mathematical Truth,” Hilary Putnam and Paul Benacerraf (eds.), The Philosophy of Mathematics: Selected Readings, second ed. (Cambridge University Press, 1983).

4. Wright (1983).

5. This had been observed by Charles Parsons about 20 years earlier in his paper “Frege’s Theory of Number,” reprinted in his Mathematics in Philosophy, Cornell University Press, 1983 and in Frege’s Philosophy of Mathematics. So far as I am aware, the terminology “partial contextual definition” originates with Parsons.

6. Hale (1987).

7. So-called because in Grundlagen §63, Frege quotes from Hume:

We are possest of a precise standard, by which we can judge of the equality and proportion of two numbers; and according as they correspond or not to that standard, we determine their relations without any possibility of error. When two numbers are so combin’d as that one has always an unite answering to every unite of the other, we pronounce them equal; and ‘tis for want of such a standard of equality in extension, that geometry can scarce be esteem’d a perfect and infallible science (Treatise I, iii, I para. 5).

It should be noted that Hume used ‘unite’ in roughly our sense of element; and by “combining” (comparing) two numbers, Hume meant what we would today call the comparison of two sets; to “pronounce them equal” is therefore to pronounce the two sets equal in size (rather than to assert their identity). (Many thanks to John Corcoran for these observations.) This may remove some of the irony of calling so fundamental a principle of Grundlagen, “Hume’s principle,” since it would show that Hume was not committed to the “featureless units” theory of numbers attacked in Grundlagen §§29–39. However, this use of ‘number’ (in the sense of ‘plurality’) is not an exact match of our use of ‘set’ since, at the very least, it misses the empty set and singletons and is therefore subject to an objection exactly analogous to one raised (in Grundlagen §28) to the thesis that numbers, in our sense of ‘number,’ are sets, in the sense of ‘plurality.’ ‘Hume’s principle’ should, therefore, be taken with a grain of salt.

8. “Is Hume’s Principle Analytic?” (Wright (1999)), reprinted in The Reason’s Proper Study (Hale and Wright (2001)). Unless otherwise indicated, page references to Hale and Wright are to this volume.

9. In particular, both George Boolos and Michael Dummett have advanced a number of criticisms of the neo-Fregean program, prompting important clarifications of its interpretation of Hume’s principle and of the methodology of concept formation via “abstraction principles,” of which Hume’s principle and our concept of number are the primary examples. (References are given below.) Wright’s responses, “On the Philosophical Significance of Frege’s Theorem,” “On the Harmless Impredicativity of N= (‘Hume’s Principle’),” and “Response to Dummett,” are reprinted in The Reason’s Proper Study.

10. The first bad company objection was urged by Dummett in Frege: Philosophy of Mathematics, Harvard University Press, 1991; it is elaborated in his “Neo-Fregeans: In Bad Company?” (in Matthias Schirn (ed.), Philosophy of Mathematics Today (Oxford University Press, 1998)). The second (as well as the first) was presented by Boolos in “The Standard of Equality of Numbers” (reprinted in Boolos’s Logic, Logic and Logic, Harvard University Press, 1998, and in my collection).

11. As shown by Boolos in “The Consistency of Frege’s Foundations of Arithmetic,” reprinted in Logic, Logic and Logic and Frege’s Philosophy of Mathematics.

12. Whether this is the theory of pure second order logic or the theory of second order logic with a single function symbol N of mixed type, one whose intended interpretation is that of a map from concepts to objects. It is Hume’s principle that makes N a cardinality function capable of supporting definitions of immediately precedes and zero which satisfy the basic laws of arithmetic.

13. “Implicit Definition and the A Priori,” reprinted in The Reason’s Proper Study.

14. See Wright (pp. 289–293). The restriction to the full power set is necessary, since there is a Henkin model with an infinite domain in which NP is true; in this model the domain of the concept variables is also countable: Take as domain the natural numbers N and (identifying a concept with its extension) let the domain S(N) of the concept variables be {F: F is a finite or cofinite subset of N}. Observe, by the way, that S(N) is a field of sets just like P(N), the power set of N. Now define n(F) by the condition n(F) = 0 if F is finite and n(F) = 1 if F is cofinite. Clearly the equivalence relation of NP (F and G are equivalent if their symmetric difference is finite) makes F and G equivalent if and only if F and G are both finite or both cofinite, so they will be assigned different values by n only when one is finite and the other is cofinite. (Proof: Obvious that if F and G are both finite or cofinite, F and G are equivalent, since in both cases we are dealing with a union of intersections, each of which is finite. For the contrapositive, take F finite and G cofinite. Then the complement of F is cofinite. But the intersection of two cofinite sets is cofinite, and the union of a cofinite set with any set is infinite. Hence F and G are inequivalent, and thus must be assigned distinct objects.)

15. Essentially the same point has been made by Boolos, although in a different context, in his “Is Hume’s Principle Analytic?” (in Richard Heck (ed.) Language, Thought and Logic: Essays in Honour of Michael Dummett (Oxford University Press, 1998); see especially, p. 253).

16. See, for example, “Is Hume’s Principle Analytic?,” p. 318.

17. In this connection, compare the penultimate paragraph of Wright’s “Is Hume’s Principle Analytic?”

18. “Finite Hume,” Philosophia Mathematica 8 (2000) 150–159, and “Can Nothing Matter?,” Analysis 62 (2002) 125–134.

19. There are of course well-known historical examples along these lines. Cp. Ernst Mach’s The Science of Mechanics (Open Court: 1960, sixth American edition, translated by Thomas J. McCormack), whose famous definition of mass (p. 266) even has the form of an abstraction principle. Thanks to Peter Clark for calling my attention to Mach’s rational reconstruction of Newtonian mechanics.

20. “Responses to Commentators,” Philosophical Books 44 (2003) 245–263, pp. 248–249.

21. “On the Foundations of Geometry: First Series,” translated by Eike-Henner W. Kluge, and collected in Gottlob Frege: Collected Papers on Mathematics, Logic, and Philosophy, Brian McGuinness (ed.), Basil Blackwell, 1984.

22. See Hilbert to Frege, 29.xii.99, draft or excerpt by Hilbert. In Brian McGuinness (ed.), Gottlob Frege: The Philosophical and Mathematical Correspondence, Hans Kaal, tr., University of Chicago Press, 1980. For a discussion of the issues raised by this correspondence, see my paper, “Frege, Hilbert, and the Conceptual Structure of Model Theory,” History and Philosophy of Logic 15 (1994) 211–225.

23. This point is made by Dummett in Frege: Philosophy of Language, Duckworth and Harvard University Press, 1973, p. 654.

24. I.e., restricted in its application only by the demand that the concepts to which it applies are sortal concepts.

25. For Frege, the natural sequence of numbers is the sequence 0, 1, ..., ℵ0, which is contrasted with the sequence of natural numbers, 0, 1, ....

26. As is well-known, this is what led to Frege’s explicit definition of the numbers in terms of classes of equinumerous concepts.

27. Wright has observed that one can show that the Frege-number of Fs = n if, and only if, there are, in the intuitive sense of the numerically definite quantifier, exactly n Fs. See The Reason’s Proper Study, p. 251 and pp. 330ff.

28. See Dummett, “Neo-Fregeans: In Bad Company?,” pp. 369 and 381.

29. Perhaps after the manner of the method of “domain extension” which Mark Wilson has argued was suggested to Frege by the work of von Staudt and others. See his “To Err is Humean,” Philosophia Mathematica Series III 7 (1999) 247–257. The historical background is developed more fully in his “Frege: The Royal Road to Geometry,” reprinted in Frege’s Philosophy of Mathematics.

30. “The Justification of Deduction,” in Truth and Other Enigmas, Harvard University Press, 1978, 290–318, cf. esp. pp. 295ff.
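The Henkin model described in note 14 lends itself to a small computational spot check. The sketch below is an illustration only and not part of the original argument: the class name FinCofin and the functions sym_diff, n, np_holds and sample_concepts are hypothetical names introduced here. A finite or cofinite subset of N is represented by a finite set of listed elements plus a flag saying whether those elements are the members of the set or the elements missing from it. The rule used for the symmetric difference relies on the identity X Δ (N \ Y) = N \ (X Δ Y).

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class FinCofin:
    """A finite or cofinite subset of the natural numbers N (hypothetical helper).

    cofinite=False: the set is exactly `members`.
    cofinite=True:  the set is N with `members` removed.
    """
    members: frozenset
    cofinite: bool = False


def sym_diff(f: FinCofin, g: FinCofin) -> FinCofin:
    # The listed elements combine by symmetric difference; the result is
    # cofinite exactly when one argument is finite and the other cofinite.
    return FinCofin(f.members ^ g.members, f.cofinite != g.cofinite)


def n(f: FinCofin) -> int:
    # The assignment from note 14: 0 for finite concepts, 1 for cofinite ones.
    return 1 if f.cofinite else 0


def np_holds(f: FinCofin, g: FinCofin) -> bool:
    # NP: n(F) = n(G) if and only if the symmetric difference of F and G is finite.
    return (n(f) == n(g)) == (not sym_diff(f, g).cofinite)


def sample_concepts(bound: int = 3) -> list:
    # Every subset of {0, ..., bound-1}, used once as a finite set and once as
    # the list of elements missing from a cofinite set.
    base = range(bound)
    subsets = [frozenset(c) for r in range(bound + 1) for c in combinations(base, r)]
    return [FinCofin(s, flag) for s in subsets for flag in (False, True)]


concepts = sample_concepts()
all_pairs_ok = all(np_holds(f, g) for f in concepts for g in concepts)
```

A check of this kind covers only a finite sample of concepts and so is no substitute for the proof given in the note; it merely confirms that the two-valued assignment respects the equivalence relation on the cases tried.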
References

Benacerraf, P. (1995), “Frege: The Last Logicist,” in: Frege’s Philosophy of Mathematics, W. Demopoulos (ed.), Harvard University Press, Cambridge, MA, 41–67.

Boolos, G. (1987), “The Consistency of Frege’s Foundations of Arithmetic,” in: On Being and Saying: Essays in Honor of Richard Cartwright, J. Thomson (ed.), The MIT Press, Cambridge, 3–20; reprinted in Boolos (1998) and Demopoulos (1995).

Boolos, G. (1990), “The Standard of Equality of Numbers,” in: Meaning and Method: Essays in Honor of Hilary Putnam, G. Boolos (ed.), Cambridge University Press, 261–277; reprinted in Boolos (1998) and Demopoulos (1995).

Boolos, G. (1997), “Is Hume’s Principle Analytic?,” in: Language, Thought, and Logic, R. G. Heck, Jr. (ed.), Oxford University Press, New York, 245–261; reprinted in Boolos (1998) and Demopoulos (1995).

Boolos, G. (1998), Logic, Logic, and Logic, Harvard University Press, Cambridge.

Demopoulos, W. (1994), “Frege, Hilbert, and the Conceptual Structure of Model Theory,” in: History and Philosophy of Logic 15, 211–225.

Demopoulos, W. (1995) (ed.), Frege’s Philosophy of Mathematics, Harvard University Press, Cambridge.

Demopoulos, W. (1998), “The Philosophical Basis of our Knowledge of Number,” in: Noûs 32, 481–503.

Dummett, M. (1978), Truth and Other Enigmas, Harvard University Press, Cambridge.

Dummett, M. (1991), Frege: Philosophy of Mathematics, Harvard University Press, Cambridge.

Dummett, M. (1993), “Neo-Fregeans: In Bad Company?,” in: The Philosophy of Mathematics Today, M. Schirn (ed.), Clarendon Press, New York, 1998, 369–405; with a reply by C. Wright.

Frege, G. (1950), Die Grundlagen der Arithmetik. Eine Logisch Mathematische Untersuchung über den Begriff der Zahl, Basil Blackwell, Oxford.

Frege, G. (1980), Philosophical and Mathematical Correspondence, University of Chicago Press.

Frege, G. (1984), Collected Papers on Mathematics, Logic, and Philosophy, Basil Blackwell, Oxford; translated by M. Black, V. Dudman, P. Geach, H. Kaal, E. Kluge, B. McGuinness, and R. Stoothoff.

Hale, B. (1987), Abstract Objects, Oxford University Press.

Hale, B. and C. Wright (2001), The Reason’s Proper Study, Oxford University Press.

Hale, B. and C. Wright (2003), “Responses to Commentators,” in: Philosophical Books 44, 245–263.

Heck, R. G., Jr. (1999), “Grundgesetze der Arithmetik I §10,” in: Philosophia Mathematica 7, 258–292.

Hempel, C. G. (1983), “On the Nature of Mathematical Truth,” in: The Philosophy of Mathematics: Selected Readings, second ed., H. Putnam and P. Benacerraf (eds.), Cambridge University Press, 377–393.

MacBride, F. (2000), “Finite Hume,” in: Philosophia Mathematica 8, 150–159.

MacBride, F. (2002), “Can nothing matter?,” in: Analysis 62, 125–134.

Mach, E. (1960), The Science of Mechanics, sixth American edition, translated by T. J. McCormack, Open Court, New York.

Parsons, C. (1983), Mathematics in Philosophy, Cornell University Press, Ithaca.

Russell, B. (1903), The Principles of Mathematics, Allen and Unwin, London.

Wilson, M. (1998), “To Err is Humean,” in: Philosophia Mathematica 7, The George Boolos Memorial Symposium, Notre Dame IN, 247–257.

Wright, C. (1983), Frege’s Conception of Numbers as Objects, Aberdeen University Press.

Wright, C. (1997), “On the Philosophical Significance of Frege’s Theorem,” in: Language, Thought, and Logic: Essays in Honour of Michael Dummett, R. G. Heck, Jr. (ed.), Oxford University Press, New York, 201–244; reprinted in Hale and Wright (2001).

Wright, C. (1998a), “On the Harmless Impredicativity of N= (‘Hume’s Principle’),” in: The Philosophy of Mathematics Today, M. Schirn (ed.), Clarendon Press, New York, 339–368; reprinted in Hale and Wright (2001).

Wright, C. (1998b), “Response to Dummett,” in: The Philosophy of Mathematics Today, M. Schirn (ed.), Clarendon Press, New York, 389–406; reprinted in Hale and Wright (2001).

Wright, C. (1999), “Is Hume’s Principle Analytic?,” in: Notre Dame Journal of Formal Logic 40, 6–30; reprinted in Hale and Wright (2001).

Wright, C. and B. Hale (2000), “Implicit Definition and the A Priori,” in: New Essays on the A Priori, P. Boghossian and C. Peacocke (eds.), Oxford University Press, 286–319; reprinted in Hale and Wright (2001).
GÖDEL, REALISM AND MATHEMATICAL ‘INTUITION’∗
Michael Hallett
McGill University, Montreal, Canada
Gödel is perhaps the most notable modern mathematician whose writings on the philosophy of mathematics appeal both to a notion of mathematical intuition and also to the Critical Philosophy of Kant. (The others are Hilbert and Poincaré, whose views will not be treated here.) The aim of the present programmatic remarks is to approach answers to two questions: the first is to establish more clearly what role, for Gödel, ‘intuition’ is meant to play; the second is to establish, and perhaps delimit, the connection to Kant’s philosophy, about which Gödel is at best enigmatic. Part of the point of the present paper is to make clear what Gödel’s notion of intuition is not. In particular, it is important to say at the outset that, despite his use of the term ‘mathematical intuition’, what Gödel is appealing to is not Kantian sensible intuition or an extension of it; neither is Gödel (not to speak of this essay) attempting to provide a ‘reading’ of Kant’s Critical Philosophy. In particular, the important and interesting question of how precisely what Gödel says fits in with Kant’s theory of mathematics or knowledge in general will not be addressed. Various remarks Gödel makes (especially in his Dialectica paper of 1958 and its revision of 1972) show clearly that he regards Kantian sensible intuition (for example, in the way it was employed by Hilbert) as much too restrictive to be of service in attempting to understand the full scope of knowledge of modern mathematics. Furthermore, Gödel can hardly be described as a transcendental idealist in Kant’s sense, but is rather a straightforward realist about mathematics in a fairly clear sense. Gödel’s appeal to Kant is meant to underpin this realism, not to provide an alternative to it, and to this end he suggests extracting and modifying a particular element which he sees in Kant. Nevertheless, what Gödel says is only loose and suggestive, and the main task here is to make a little more sense of it.

∗ A relative of this paper was presented twice in the spring of 1983, at Oxford University and the former Chelsea College in the University of London, and then, in the spring of 1984, at McGill University. A version of the present incarnation was presented at the Dortmund Conference ‘Intuition in Mathematics and Physics’ in April, 2000. For comments, help and encouragement over the years, I am grateful to John Bell, Donald Gillies, Daniel Isaacson, Moshé Machover, Ian Gold, Olivia Hallett, Alison Laywine and Emily Carson. For comments at the Dortmund Conference, I am indebted to Robert DiSalle, Brigitte Falkenburg, Friedrich Fulda and Ulrich Majer. I wish to thank the Social Sciences and Humanities Research Council of Canada for their generous support of my work.

E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 113–131. © 2006 Springer. Printed in the Netherlands.
1. Gödel’s realism
The first part of our expository task will be to clarify the nature of Gödel’s realism.1 Gödel described himself as having been some form of ‘realist’ throughout his career. As against this, as Parsons has pointed out (Parsons (1995), 46), the explicit appeal to a relatively powerful notion of intuition does not emerge until much later in Gödel’s philosophical writings, in fact in 1964 in the Supplement to the revised version of the famous paper on the continuum problem, i.e., Gödel (1964), a revision of Gödel (1947). Intuition itself is hardly mentioned before that point, although realism, in one form or another, is ever present. However, we will see that the guiding idea behind the appeal to Kantian notions in 1964 exists already in the central discussion of Russell published in 1944. Gödel often presents realism in a fairly standard way. Consider these statements from his famous Gibbs Lecture of 1951, published only posthumously as Gödel (*1951). (1) ‘[M]athematical objects and facts (or at least something in them) exist independently of our mental acts and decisions’ (p. 311); (2) ‘. . . the objects and theorems of mathematics are as objective and independent of our free choice and our creative acts as is the physical world’ (ibid., p. 312); (3) the concepts in a mathematical proposition ‘form an objective reality of their own, which we cannot create or change but only perceive and describe’ (ibid., p. 320); and lastly (4), this time from the revised version of the paper on the continuum problem, i.e., Gödel (1964): ‘[T]he set-theoretical concepts and theorems describe some well-determined reality, in which Cantor’s conjecture must be either true or false’ (pp. 263–264, p. 260 in Gödel (1990)). Note that these statements defend what one might call conceptual realism, as well as realism about ‘mathematical objects’.
One thing that I will attempt to stress as the paper proceeds is that Gödel’s views on intuition are meant in particular to play a role supplementary to this aspect of his realism. By way of a first elucidation of Gödel’s realism, it is worth pointing out what he was reacting most strongly against. Perhaps the most important point of opposition in Gödel’s conception is to the strong constructivist tendency to be found in Russell (especially Principia Mathematica), according to which most of mathematics is concerned with objects which we construct and which are not built into the fabric of the universe. (According to Russell’s Axiom of Infinity, there are infinitely many individuals, but the numbers and other mathematical objects are constructed by us in the hierarchy of types.) This opposition comes out (in 1944) in his attitude to ‘the’ Vicious Circle Principle, which Gödel takes as correct only for one who
regards the objects of mathematics as being constructed. It is false, he argues, for one who, like himself, believes that the objects exist independently of us and who sees the purpose of mathematics as the description of the structures these objects partake in. This position is also opposed to the more general, although less precise, views of nineteenth century writers on the foundations of mathematics who promoted the view that mathematics is the result of our ‘free creation’. Certainly both Cantor and Dedekind come to mind. Neither of these writers saw themselves as constrained by what we would now call constructivist principles, although Cantor in particular always recognised certain constraints. In any case, the ‘free creation’ conception, vague as it is, does stress that the content of mathematical theories is in some sense created by us, that we have ‘free choice’ over it. To this, Gödel was certainly opposed. Another important strand in Gödel’s realism is hinted at in the second brief passage cited above, namely the comparison of mathematics with physics. One of the most striking (and famous) passages in Gödel’s philosophical writings is the following from his essay on Russell which states Gödel’s own view in opposition to Russell’s:

Classes and concepts may, however, also be conceived as real objects, namely classes as “pluralities of things” or as structures consisting of a plurality of things and concepts as the properties and relations of things existing independently of our definitions and constructions. It seems to me that the assumption of such objects is quite as legitimate as the assumption of physical bodies, and there is quite as much reason to believe in their existence. They are in the same sense necessary for a satisfactory system of mathematics as physical bodies are for a satisfactory theory of our sense perceptions and in both cases it is impossible to interpret the propositions one wants to assert about these entities as propositions about the “data”, i.e., in the latter case the actually occurring sense perceptions. (Gödel (1944), p. 137, p. 220 of Benacerraf and Putnam (eds.) (1964), or p. 128 of Gödel (1990))
It is precisely with respect to this last view (the indispensability of the abstract notion of mathematical object) that Gödel’s appeal to intuition plays a significant role, as we will see. Note that Gödel’s realism is of a very strong form; if we ask ‘Realism with respect to which range of statements?’, the answer is: ‘Across all of set theory’, where Gödel clearly regards set theory as embracing all of contemporary mathematics. When put together with what Gödel says about facts existing ‘independently of us’, it is clear that, for Gödel, any meaningful mathematical (set-theoretic) statement has a definite truth-value independently of our knowledge of that value. Let us call this, for want of a term, the phenomenon of completeness in truth-value. Not surprisingly, given this, Gödel’s realistic attitude emerges most clearly with respect to incompleteness phenomena, i.e., with respect to statements which are demonstrably (or plausibly) neither provable nor refutable in very basic axiom systems (say, axiom systems for set theory or for arithmetic), in other words, the problem presented by basic axiom systems whose consequence relation does not match completeness in truth-value. Incompletenesses are basically of two sorts, consistency statements on the one hand, and, on the other, statements concerning genuinely unresolved mathematical questions like that surrounding the Continuum Hypothesis. Incompletenesses surrounding consistency statements were first discovered by Gödel himself at the beginning of the 1930s; Gödel showed that no standard mathematical system containing a certain (specifiable) modicum of number theory can demonstrate its own consistency. Why is this a surprise? Gödel showed that, under these conditions, it is always possible (and this will be true even for highly complex and powerful systems, for example, the standard system for set theory) to code basic meta-mathematical notions (being a formula, being a proof) concerning the theory into basic arithmetic, and when this is done, one can easily form a statement which asserts its own unprovability in that theory. This statement, Gödel showed, will be neither provable nor refutable in that theory, but will nevertheless be an arithmetical statement of a rather simple form, namely: ¬∃x[r = t], where r and t are quantifier free terms standing for natural numbers, and x is a variable over n-tuples of natural numbers. For these systems, therefore, no matter how powerful, there will always be simple arithmetical statements the correctness of which they will not be able to decide. Thus, Gödel in effect showed that incompleteness phenomena of a very basic kind are virtually ubiquitous in mathematics. Secondly, the meta-mathematical statement of the consistency of a system T (call it Cons_T) also becomes an arithmetical one, closely related to the statement of self-unprovability.
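In the notation now standard in textbooks, which is not Gödel’s own and is offered here only as an illustrative sketch, the two statements just described can be written as follows. The provability predicate Prov_T, the corner quotes for the numeral of a formula’s code, and the name G_T for the self-referential sentence are all assumptions of this presentation.

```latex
% Requires amssymb for \ulcorner and \urcorner.
% The self-referential sentence: T proves that G_T holds
% exactly when G_T is not provable in T.
\[
  T \vdash\; G_T \leftrightarrow \neg\,\mathrm{Prov}_T(\ulcorner G_T \urcorner)
\]
% The arithmetized consistency statement: no proof in T of an absurdity.
\[
  \mathrm{Cons}_T \;:\equiv\; \neg\,\mathrm{Prov}_T(\ulcorner 0 = 1 \urcorner)
\]
% The close relation between the two, for provability predicates
% satisfying the usual derivability conditions:
\[
  T \vdash\; \mathrm{Cons}_T \leftrightarrow G_T
\]
```

Both sentences have the simple arithmetical form mentioned above: each says that no natural number codes a proof of a certain kind.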
Gödel’s Second Incompleteness Theorem shows that, no matter how powerful T is, assuming it is actually consistent and contains a modicum of arithmetic as before (certainly standard set theory does, and this is the theory which largely concerns us here), Cons_T will not be provable in T. Hence, for a very wide range of T, T cannot prove its own consistency. Moreover, Gödel clearly regards the consistency statements (and the statements of self-unprovability) as true even though unprovable; thus, for these systems, the arithmetical proposition Cons_T will be a true statement about numbers which is not provable in T, so T will be provability incomplete. Let us call incompletenesses of this form meta-mathematical incompletenesses. Mathematical incompletenesses are, on the face of it, much more complex. (The distinction between these two kinds is not necessarily hard and fast, but let us leave aside this complication here.) The problem of the continuum is not only a good illustration, but the most pertinent example in considering Gödel’s work. This problem may be simply posed as the question: How many points are there on a line? When framed within Cantor’s theory of infinite size or cardinality it becomes the question: What is the infinite cardinal number assigned to the continuum? Cantor put forward an hypothesis, known as the Continuum Hypothesis, which conjectures that the answer is the second infinite cardinal, the natural numbers being assigned the first.2 The Continuum Hypothesis is known to be independent of the most basic set of axioms for
set theory, the Zermelo-Fraenkel axioms (ZF), and its cognates, thus showing that the continuum problem is unresolved. Gödel’s own work in the late 1930s showed that the Continuum Hypothesis is consistent with these axioms, so not refutable from them; and work by Cohen in the early 1960s showed that the same holds for the negation, so it is not provable either. Moreover, the techniques developed by Cohen enabled it to be shown that it is consistent with the axioms of ZF to assign more or less any cardinal one wishes to the continuum, so, from a certain point of view, the continuum problem is as insoluble as it can be. One way in which one might express the difference between the meta-mathematical and the mathematical incompletenesses is as follows. In operating with a well-known and well-established theory T, there is a sense in which the consistency of that theory is implicitly assumed, and trouble-free operation with the theory (with perhaps growing understanding of how it avoids known contradictions) increases the degree of faith in its consistency. Thus, an obvious way to ‘close’ the meta-mathematical incompletenesses of the kind outlined is to find a natural way to add the statements of the form ‘Cons_T’ to form new theories T′. With the genuine mathematical incompletenesses, however, there is at first sight no immediately plausible candidate for expansion. (Do we add the Continuum Hypothesis or its negation, or a proposition which implies the Continuum Hypothesis or one which implies its negation? And how do we find/select such propositions?) But despite this surface difference, Gödel assimilated the two kinds of incompleteness. The method of assimilation is important, not to say clever. As mentioned, the simplest way to close the basic meta-mathematical incompletenesses is to add the sentence Cons_T itself to T as a new axiom.
But meta-mathematical examination of logical systems tells us that consistency is equivalent to the possession of a model (for first-order theories), so we could add instead a statement which expresses the existence of a model. In the case of set theory (ZF for the sake of concreteness), this idea is extended to that of adding a canonical model (adding to the so-called von Neumann cumulative hierarchy), which is the same as adding to the theory a statement asserting the existence of a large ordinal, an ordinal the existence of which clearly cannot be proved by the base theory ZF. This will mean that something which one takes to be true is provable in the new theory while not provable in the old; thus an incompleteness has been closed. (A new incompleteness will open up, but no matter.) But note that the original incompleteness in question is a meta-mathematical one, yet closing it has employed a powerful mathematical assumption (the existence of a large ordinal), indeed, much more powerful than is necessary.3 Gödel’s assimilation of the mathematical incompletenesses to these is then based on a conjecture, one which is at the root of Gödel’s Large Cardinals Programme, namely that mathematical incompletenesses can be closed systematically in just the same way. The power of this method of closure is not really required to deal with the meta-mathematical incompletenesses, but rather with the mathematical ones. Nevertheless, the method solves the meta-mathematical incompletenesses at the same time; thus, both kinds of incompleteness are to be closed by adding models to standard set theory, and adding models amounts to adding infinite ordinals.4 As Gödel points out, the meta-mathematical incompletenesses show that piecemeal extensions of the axiom system in question will do nothing to ‘complete’ the theory in any final sense. But: . . . this does not exclude that all these steps (or at least all of them which give something new for the domain of propositions in which you are interested) could be described and collected together in some non-constructive way. In set theory, e. g., the successive extensions can be most conveniently represented by stronger and stronger axioms of infinity. (Gödel (*1946), p. 151 of Gödel (1990))
Gödel goes on: It is certainly impossible to give a combinatorial and decidable characterization of what an axiom of infinity is; but there might exist, e. g., a characterization of the following sort: An axiom of infinity is a proposition which has a certain (decidable) formal structure and which in addition is true. (ibid.)
By considering all such extensions, we have an extended notion of demonstrability which . . . might have the required closure property, i. e., the following could be true: Any proof for a set-theoretic theorem in the next higher system above set theory (i. e., any proof involving the concept of truth which I just used) is replaceable by a proof from an axiom of infinity. It is not impossible that for such a concept of demonstrability some completeness theorem would hold which would say that every proposition expressible in set theory is decidable from the present axioms plus some true assertion about the largeness of the universe of all sets. (ibid.)5
What underlies Gödel’s position here is adherence to two of Hilbert’s central doctrines. The first is that the axiomatic presentation of mathematics is primary, and therefore deducibility within axiom systems is primary. Hence, what Gödel seeks when ‘closing’ an incompleteness is a new axiom system, and one which explicitly avoids the notion of ‘objective mathematical truth’, even though it is inspired by it. Thus, he does not regard the incompleteness phenomenon as warranting an abandonment of the ‘axiomatic method’, rejecting thus Zermelo’s later belief6 that axiomatisation represents a distortion of mathematics. Secondly, he adopts Hilbert’s principle of the ‘solvability of every mathematical problem’ in a strong and uniform sense,7 namely, that for every unsolved mathematical (set-theoretic) problem, one can find a new, true set-theoretic axiom (and thus an appropriate axiom system) which decides the problem. In the present context, the important thing about this is the belief in ‘objective mathematical truth’ which leads to the conviction that there are axiom systems to be found which will ‘close’ the various incompletenesses, and this in a uniform way. Hence, all mathematical problems have solutions in a unifying framework, the type theory of the hierarchy of sets. The central problem is then to be seen in the gap between ‘objective mathematical truth’
and our creations, i. e., the conceptual framework embodied in the axiom systems which attempt to capture the objective mathematical facts, and thus with the continual extension of our knowledge to match the objective mathematical facts, even though we will never be able to match them fully. It is important to note that in the early part of Gödel’s famous paper on the continuum problem, the paper where (in the revised version) there is the most extended discussion in his writings of both ‘mathematical intuition’ and how one might decide upon new axioms, Gödel writes as if the search for new axioms is akin to the search for new hypotheses in theoretical physics, and that the axioms themselves can/should be justified in something like a hypothetico-deductive way, i. e., by examining their consequences. It follows that Gödel cannot here be applying a notion of intuition directly to these sorts of statements, i. e., cannot be invoking intuition as some sort of touchstone of truth for new principles. Deciding on a new extension to set theory will be a deliberative and reflective matter in which one’s mathematical experience with a given system plays a part, and so also does our being informed by the specific mathematical developments in the theory. This is not too surprising. For Gödel, intuition does not function as a source of truth even at the lower levels of mathematics. In a paper stemming from 1958 (Gödel (1958)/Gödel (*1972)), Gödel dismisses Hilbert’s reliance on a version of Kantian intuition for elementary arithmetic, which he sees as a kind of quasi-spatio-temporal, concrete intuition. He does not say he thinks it necessarily wrong, but rather that it is so weak as to be of no service. It would not even justify arithmetic statements of the form Cons_T, the very type of statement to which Gödel turns for the closure of the meta-mathematical incompletenesses.
This is therefore one negative thing which we can say already about Gödel’s theory of intuition. But having emphasised the conjectural nature of higher mathematics, it is important to note that for Gödel it is conjectural only to a certain degree. The conjectures will be based on the fixed set-theoretic structure. And what underlies this is our concept of set.
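The two kinds of incompleteness discussed in this section can be set out schematically as follows. This is only a sketch in modern notation: the symbols G_T (the self-unprovability statement), Prov_T (the arithmetised provability predicate) and T′ are labels assumed here for convenience, and the ω-consistency hypothesis is the standard refinement under which the self-unprovability statement is also irrefutable; none of this notation is drawn from Gödel’s texts quoted above. T is taken to be a recursively axiomatised theory containing a modicum of arithmetic.

```latex
% Fragment for a document that loads amsmath and amssymb.
% G_T, Prov_T and T' are labels assumed for this sketch.
\begin{align*}
&\text{Self-unprovability:}
  && T \vdash G_T \leftrightarrow \neg\,\mathrm{Prov}_T(\ulcorner G_T \urcorner) \\
&\text{First theorem:}
  && T \text{ consistent} \;\Rightarrow\; T \nvdash G_T, \qquad
     T \ \omega\text{-consistent} \;\Rightarrow\; T \nvdash \neg G_T \\
&\text{Second theorem:}
  && T \text{ consistent} \;\Rightarrow\; T \nvdash \mathrm{Cons}_T \\
&\text{Continuum Hypothesis (CH):}
  && 2^{\aleph_0} = \aleph_1 \\
&\text{G\"odel (late 1930s):}
  && \mathrm{ZF} \text{ consistent} \;\Rightarrow\; \mathrm{ZF} + \mathrm{CH} \text{ consistent} \\
&\text{Cohen (early 1960s):}
  && \mathrm{ZF} \text{ consistent} \;\Rightarrow\; \mathrm{ZF} + \neg\mathrm{CH} \text{ consistent} \\
&\text{Simplest closure:}
  && T' = T + \mathrm{Cons}_T, \qquad T' \vdash \mathrm{Cons}_T, \qquad
     T' \text{ consistent} \;\Rightarrow\; T' \nvdash \mathrm{Cons}_{T'}
\end{align*}
```

The last line records why piecemeal extension never completes the theory in any final sense: each closure of a meta-mathematical incompleteness opens a new one of the same form.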
2.
The appeal to intuition
There are, I think, two distinct senses in which the notion of intuition is invoked by Gödel, and in neither is intuition a direct source of knowledge. The first sense is the notion of a basic, suggestive insight, fundamental to our mathematical activity. This is akin to the notion of geometrical intuition as it appears, for instance, in the work of Felix Klein; it is suggestive, helpful, creative, and, in the case of Klein, tied up with imaging, picturing. But intuition in this sense is not a source of truths. It is this sense of intuition to which Gödel appeals when he claims that the basic axioms of set theory are highly plausible, to be agreed to, say, by anyone with any direct experience of set theory, and intimate knowledge of the mathematical consequences of various set-theoretic principles and techniques. Intuition in this sense is something
which develops when one is educated in, and informed by, the subject. (Perhaps, above all, belief in the consistency of the axiom system has this as its source, and not arithmetic intuition.) Intuition in this sense is nothing like the a priori intuition of Kant, nor anything like the Raumanschauung which Frege takes as the source of knowledge of Euclidean geometry as a body of synthetic a priori truths. This sense of intuition is neither new, nor distinctive nor particularly enlightening, although it is important to recognise where it is playing a role in Gödel’s account. The second sense of intuition is far more important, however. Even in this second sense, intuition should not be regarded as itself providing knowledge of mathematical propositions; nevertheless it is to be regarded as something like a precondition for the acquisition of high-level mathematical knowledge, the acquisition itself being more of the conjectural and hypothetico-deductive kind discussed above. The central insight is given by Gödel’s analogy with perception of the material world, the clearest statement of which is the well-known passage from his essay on Russell quoted above. I repeat this passage here; recall that the context of Gödel’s remarks is his opposition to Russellian constructivism with respect to both mathematics and the physical world: Classes and concepts may, however, also be conceived as real objects, namely classes as “pluralities of things” or as structures consisting of a plurality of things and concepts as the properties and relations of things existing independently of our definitions and constructions. It seems to me that the assumption of such objects is quite as legitimate as the assumption of physical bodies, and there is quite as much reason to believe in their existence.
They are in the same sense necessary for a satisfactory system of mathematics as physical bodies are for a satisfactory theory of our sense perceptions and in both cases it is impossible to interpret the propositions one wants to assert about these entities as propositions about the “data”, i. e., in the latter case the actually occurring sense perceptions. (Gödel (1944), p. 137, p. 220 of Benacerraf and Putnam (eds.) (1964), or p. 128 of Gödel (1990))
This connects fairly directly with another famous remark of Gödel’s, from the 1964 Supplement to his paper on the continuum problem, where the notion of intuition begins to play a significant and explicit role. The passage is: . . . the objects of transfinite set theory, conceived in the manner explained on p. 262 and in footnote 14 [the iterative conception of set], clearly do not belong to the physical world and even their indirect connection with physical experience is very loose (owing to the fact that set theoretical concepts play only a minor role in the physical theories of today). But despite their remoteness from sense experience, we do have something like a perception also of the objects of set theory, as is seen from the fact that the axioms force themselves upon us as being true. I don’t see any reason why we should have less confidence in this kind of perception, i. e., in mathematical intuition, than in sense perception which induces us to build up physical theories and to expect that future sense perceptions will agree with them and, moreover, to believe that a question not decidable now has meaning and may be decided in the future. The set-theoretical paradoxes are hardly more troublesome for mathematics than deceptions of the senses are for physics. That new mathematical intuitions leading to a decision of such problems as Cantor’s continuum
hypothesis are perfectly possible was pointed out earlier. (Gödel (1964), p. 271, pp. 267–268 of Gödel (1990))
This latter passage can be a little confusing, since as Parsons has pointed out (op. cit., p. 65) it is hard to see how Gödel gets from the claim that the ‘axioms force themselves upon us as being true’, which, if it rested on intuition, would be an instance of ‘intuition that’, to the conclusion that we have a ‘perception of the objects of set theory’, which would be an instance of ‘intuition of’. Parsons suggests, and I think rightly, that by ‘the objects of set theory’ here what Gödel really means is the general concept of set, and not the (or even some) individual sets. And indeed note that in the passage from the essay on Russell just quoted Gödel speaks of ‘classes and concepts’. There is a good deal of textual evidence in Gödel’s other philosophical writing (for example, in the two versions of his paper on Carnap included in Volume 3 of the Collected Works, i. e., Gödel (*1953/9–III) and (*1953/9–V)) which suggests that Gödel construes the analogy of the perception of physical objects with the ‘perception’ (or perhaps better: the ‘discernment [erkennen, erfassen]’) by reason of concepts. Thus, if mathematical intuition is to be a ‘kind of perception’, then what it does is enable us to ‘perceive’ concepts. Following this, what ‘forces itself upon us as true’ are propositions which unfold the concept of set ‘revealed’ in ‘perception’. And Gödel has explained in the earlier part of the 1964 paper what he takes to be the basics of the concept, namely what is known as the iterative concept of set. The axioms which ‘force themselves’ upon us are therefore the basic axioms of the iterative conception of set, which are (or are equivalent to) the axioms of ZF set theory. What I would like to suggest about this analysis, therefore, is the following.
When Gödel talks about ‘new mathematical intuitions’ in the passage above, he is really referring to intuition in the first, vaguer sense of insight which I mentioned, that is, to plausible suggestions which emerge in the light of our experience with using the basic system, particularly experience in ‘unfolding’ the iterative concept of set, above all (in line with the large cardinal programme outlined in the 1946 address) in developing conjectures about large cardinals. It should be noted here that even the axioms which are ‘forced on us’ are not themselves the product of mathematical intuition itself, but rather the product of reflection on the concept, the ‘perception’ of which is the real product of the intuition. This squares with Gödel’s discussions in the earlier part of the paper of the precise role of the axioms, and how one might decide on them. The suggestions he puts forward are that the higher axioms of set theory are really to be thought of like physical hypotheses. As we would therefore expect, none of this really fits with mathematical intuition as ‘intuition that’. Gödel’s unique appeal to intuition is, therefore, concentrated on the ‘perception’ by reason of the concept of set. This provides the precondition, sets the framework, for the deliberation about the specific axioms and principles which is governed by the more speculative processes and hypothetico-deductive accounts to which Gödel
refers. It is this that I want to examine more closely, and which finally brings us to Kantian themes.
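The iterative concept of set mentioned in this section can be illustrated, for its finite stages only, by a short Python sketch. It builds the first few levels of the cumulative hierarchy (the empty stage, then at each step the collection of all subsets of the previous stage) and checks where a small von Neumann ordinal first appears. The function names `power_set`, `stage` and `ordinal` are hypothetical and chosen for this illustration; the sketch says nothing about the transfinite stages, which are the ones that matter for the argument in the text.

```python
from itertools import combinations


def power_set(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def stage(n):
    """Return the n-th finite stage: stage 0 is empty, stage n+1 is the power set of stage n."""
    v = frozenset()
    for _ in range(n):
        v = power_set(v)
    return v


def ordinal(n):
    """Return the von Neumann ordinal n: 0 is the empty set, n+1 is n together with {n}."""
    o = frozenset()
    for _ in range(n):
        o = o | frozenset([o])
    return o


# Each stage treats the previous collections as single objects
# that are available for further collection.
print([len(stage(n)) for n in range(5)])  # [0, 1, 2, 4, 16]
print(ordinal(3) in stage(4))             # True
print(ordinal(3) in stage(3))             # False
```

The point of the illustration is only that every member of a stage is a collection treated as a single object of the same logical type as its elements; stages beyond the fifth grow too quickly to enumerate in this way.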
3.
Sets and Kantian synthesis
The key to this is given in the paragraph which, in the 1964 version of the paper on the continuum problem, follows directly after the long passage quoted above: It should be noted that mathematical intuition need not be conceived of as a faculty giving an immediate knowledge of the objects concerned. Rather it seems that, as in the case of physical experience, we form our ideas also of those objects on the basis of something else which is immediately given. Only this something is not, or not primarily, the sensations. That something besides the sensations actually is immediately given follows (independently of mathematics) from the fact that even our ideas referring to physical objects contain constituents qualitatively different from sensations or mere combinations of sensations, e. g., the idea of object itself, whereas, on the other hand, by our thinking we cannot create any qualitatively new elements, but only reproduce and combine those that are given. (Gödel (1964), pp. 271–272, p. 268 of Gödel (1990))
Note again the indirect reference to the broadly constructivist view of ‘reproducing and combining’. The distinction Gödel states here between, on the one hand, sensation and, on the other, the contributed notion of object roughly corresponds to Kant’s distinction between matter and form in an appearance. Since Gödel is concerned with mathematics quite generally, he presumably has in mind something more abstract than spatio-temporal form. Indeed, just after the passage quoted where he claims that the notion of object is contributed, Gödel goes on to say: Evidently the “given” underlying mathematics is closely related to the abstract elements contained in our empirical ideas.
And in a footnote he continues: Note that there is a close relationship between the iterative concept of set . . . and the categories of pure understanding in Kant’s sense. Namely, the function of both is ‘synthesis’, i. e., the generating of unities out of manifolds (e. g., in Kant, of the idea of one object out of its various aspects). (Gödel (1964), p. 272, p. 268 of Gödel (1990))
The main text continues: It by no means follows, however, that the data of this second kind, because they cannot be associated with actions of certain things on our sense organs, are something purely subjective, as Kant asserted. Rather they, too, may represent an aspect of objective reality, but, as opposed to the sensations, their presence in us may be due to another kind of relationship between ourselves and reality. (ibid.)
Now we have an explicit connection between Gödel’s ideas and Kantian theory, for the theory of synthesis which Gödel mentions is the key to the most abstract part of Kant’s conception of objects. Most interestingly, Gödel claims
a close relationship between this abstract part of the form of objects and the set concept in mathematics. If Gödel is right, then even in our perception of physical objects there is a sense in which we are applying something very like the abstract set concept. This would mean, first, that whatever one makes of Kant’s particular claim that Euclidean geometry is synthetic, there may be a more fundamental level at which mathematics is synthetic. More precisely expressed, our understanding contributes to our notion of object in general by using abstract synthesising; at the core of this, says Gödel, is the set concept governed by the mathematical theory of sets. This would also give some grounds for the view that the conceptual machinery of both mathematics and physics has some common basis, as well as the claim that mathematical intuition is ‘something like perception’. However, before we can understand a little more clearly what Gödel is getting at, we must examine briefly Kant’s notion of synthesis. The spatio-temporal forms of intuition are of great importance in Kant’s theory of how it is that we contribute to the notion of physical object. But in the section of the Kritik der reinen Vernunft entitled ‘Transcendental Logic’, Kant seeks to explain how it is that we have not just experience of objects as spatio-temporal, but how we have experience of objects at all; that is, experience of coherent enduring individuals capable, for example, of being in causal relations with each other. Kant’s explanation leads to the Deduction of the Categories and the Analytic of Principles, but, as Gödel points out, what lies behind this is really his theory of synthesis. This is what, for Kant, governs the transition from the manifold of representations to the unity of a judgement that an object falls under a concept, which is ultimately how experience is expressed.
As Kant says: By synthesis in the most general sense, however, I understand the action of putting different representations together with each other and comprehending their manifoldness in one cognition. Such a synthesis is pure if the manifold is given not empirically but a priori (as is that of space and time). Prior to all analysis of our representations these must first be given, and no concepts can arise analytically as far as the concept is concerned. (Critique of Pure Reason, A77/B103.)8
And Kant goes on: The synthesis of a manifold, however, (whether it be given empirically or a priori) first brings forth a cognition, which to be sure may initially still be raw and confused, and thus in need of analysis: yet the synthesis alone is that which properly collects the elements for cognitions and unifies them into a certain content; it is therefore the first thing to which we have to attend if we wish to judge about the first origin of our cognition. (op. cit., A77–78/B103)
According to the position sketched here, what the understanding is presented with is a manifold of what we might call ‘raw’ representations. According to Kant, there are then two further and higher stages, perception proper and then formulation of judgements. For Kant, this is where the putting together or ‘association’ or ‘synthesis’ of representations takes place, forming out of the
raw manifold judgements about objects. As Kant says, experience (represented as such judgements) is ‘cognition through connected perceptions’ (op. cit., B161).9 It is this connection, this unification, which is finally responsible for the recognition of the object in the full sense of the word. Whatever elements of present sensation are involved in such recognition, certain abstract elements are necessarily involved as well. We might say, using a suggestive metaphor, that the concept of an enduring object is used to ‘smooth out’ an otherwise diverse and complicated manifold of representations. Kant is perfectly clear that this smoothing out is done by concepts contributed by the understanding, and is not anything directly received in sensation. There are numerous passages which confirm this. The following three are particularly striking: The question now is whether a priori concepts do not also precede, as conditions under which alone something can be, if not intuited, nevertheless thought as object in general, for then all empirical cognition of objects is necessarily in accord with such concepts, since without their presupposition, nothing is possible as object of experience. Now, however, all experience contains, in addition to the intuition of the senses, through which something is given, a concept of an object that is given in intuition, or appears; hence concepts of objects in general lie at the ground of all experiential cognition as a priori conditions; . . . (op. cit., A92–93/B125–126)
The second stresses also the automatic use of combination or synthesis: Yet the combination (conjunctio) of a manifold in general can never come to us through the senses, and therefore cannot already be contained in the pure form of sensible intuition; for it is an act of the spontaneity of the power of representation, and, since one must call the latter understanding, in distinction from sensibility, all combination, whether we are conscious of it or not, whether it is a combination of the manifold of intuition or of several concepts, and in the first case either of sensible or non-sensible intuition, is an action of the understanding, which we would designate by the general title synthesis in order at the same time to draw attention to the fact that we can represent nothing as combined in the object without having previously combined it ourselves, and that among all representations combination is the only one that is not given through objects but can be executed only by the subject itself, since it is an act of its self-activity. (op. cit., B129–130)
There is a similar remark in the Fortschritte from 1793, this time concerning the concept of putting together: All representations which make up an experience can be ascribed to sensibility, with a single exception, i. e., that of the putting together. . . Since the putting together cannot be given through the senses, but we must make it ourselves, it does not belong to the receptivity of sensibility, but to the spontaneity of reason as an a priori concept. (Kant (1793/1804), ‘Erste Abteilung: Von dem Umfange des theoretisch-dogmatischen Gebrauches der reinen Vernunft’.)10
Although Kant uses the phrase ‘putting together’ here instead of ‘combination’ or ‘synthesis’, he is clear, at least in the Critique of Pure Reason, that
the putting together of a manifold is not enough to yield an object. Putting together, by itself, would yield only a plurality, but a plurality, even if ordered, is not a single thing. What is needed in addition is that such putting together results in recognition of a single object; thus a concept of unity is needed, too. Indeed, Kant seems to stress that it really is such a concept of unity which allows us, or compels us, to combine or associate what is given through intuition at all. As he says in the Critique: But in addition to the concept of the manifold and of its synthesis, the concept of combination also carries with it the concept of the unity of the manifold. Combination is the representation of the synthetic unity of the manifold. The representation of this unity cannot, therefore, arise from the combination; rather by being added to the representation of the manifold, it first makes the concept of combination possible. (Critique of Pure Reason, B130–131)
Kant goes on to say that this unity is not the category of unity, but is presupposed by all categories, and thus ‘precedes a priori all concepts of combination’ (ibid.). Thus only with the concept of unity does the sorting, putting together, awareness of a manifold, lead to awareness of an object. And it is this which, for Kant, is crucial in our making sense and order out of the manifold of related sensations. The idea of assemblages being unities is the abstract conceptual foundation of Kant’s notion of objects, and the ultimate reason why our knowledge of objects or even of the form of objects (as in pure mathematics) can be synthetic. As Kant says, to repeat a passage quoted earlier: [Synthesis] is therefore the first thing to which we have to attend if we wish to judge about the first origin of our cognition. (op. cit., A77–78/B103)
4.
G¨odel and Kant
Let us now try to connect some of the various threads. The first thing to emphasise is that Gödel echoes Kant’s claim that the additional notion of ‘synthetic unity’ is essential in our reproducing and combining the elements given in sensibility (Gödel (1964), p. 272, p. 268 of Gödel (1990)), to produce, for example, the ordinary notion of a physical object. It is at this point that he remarks: Evidently the “given” underlying mathematics is closely related to the abstract elements contained in our empirical ideas. (ibid.)
The footnote to this asserts the close relationship between the set concept and Kant’s notion of synthetic unity. One respect in which this association makes very good sense is the following. In set theory, the notion of a collection’s being a unity is needed over and above the mere aggregation of elements; the principal reason why the theory is so powerful is precisely its assumption that collections can be single objects of the same logical type as the elements that go to make them up, and thus available for further collection. Moreover, it was the assumption that the question of the specialness of aggregative unity can be ignored, or in other words that any collection is automatically a unity and
therefore a set, which led to the set-theoretic paradoxes and to the inconsistent Axiom of Comprehension in its various forms.11 It was part of the task of early set theory to delineate the set concept as the concept of collection unity by explaining when collections can be taken as unities. The axiom system for set theory is designed to say when ‘collection’ or ‘aggregation’ under a predicate will yield a set, and the axioms are arrived at ‘transcendentally’, as it were, by analysing accepted mathematical experience or practice, particularly the nature of continuity and the theory of real functions. The analogy we have, then, is this. According to Kant, we contribute (through the understanding) the notion of (physical) object, and the attendant notion of ‘synthetic unity’ is indispensable in this. This creates a range of meaningful statements with truth-values which cannot be ascribed those truth-values on the basis of sensation (or even experience) alone. It is not clear that the concept of ‘synthetic unity’, indispensable as it is, is to be regarded as a concept in the Kantian system, and thus how exactly it is to fit into the theory of categories or how it is to be integrated with the notion of the ‘unity of apperception’. What Gödel suggests, however, is that we can extract this element from Kant’s thought, and, in so far as it is pertinent to mathematics at least, make it the central concept in our theory of what mathematical knowledge is knowledge of.12 This theory is the theory of sets. The set concept is what enables us to assert that there is one object (say, the infinite ordinal ω) where we are only ‘given’ a collection (in the example, the collection of natural numbers). The use of the set concept as Gödel understands it does not consist in conscious or deliberate processes of construction or formation. In particular, there is no suggestion of the ‘perception’ of individual sets,13 and then of the construction of their aggregation.
Rather its use resides in the recognition that certain collections are enduring objects which are composed of elements, elements not all of which we need to have ‘experienced’ or constructed individually, or indeed elements which are impossible for us to describe individually, just as we cannot describe the real numbers individually. In this way, the set concept mirrors the smoothing out function which Kant attributes to the synthetic capacity. But, as with our notion of physical object, this does not just involve smoothing out what we observe into perception of enduring objects; it is rather an attribution of independence in the sense that we carry over this picture to what we are not currently observing, and thus the attribution of truth-values across a much wider range. In particular, we conceive of the world we do not observe or the parts of the world we could not observe (the world in the very distant past, for example) as being, at its elemental level, much like the world we do observe. This global smoothing out, the assumption of uniformity, as Gödel says (and as is implicit in Kant) is an essential prerequisite to our theorising, to our formulating universal laws about the world. For Gödel, then, classical set theory (mathematics) is based on just the same kind of global smoothing out, and the fact that we unquestionably do this with physics makes the fact that we naturally treat mathematics in the
Gödel, Realism and Mathematical ‘Intuition’
same way at least more explicable. We might think of the ‘aspects’ of a set u as either its elements, or the properties of these elements, i.e., the power set of u, or the sets to which u belongs. In each case, we can only be familiar in general with a limited sample of these, but in each case we assume that all the ‘aspects’ are present. Perhaps our first mathematical experience only concerns such collections as small natural numbers, regarded as collections which must be both aggregations and unities, and then certain well-described subsets of the natural numbers. But the automatic nature of the use of the set concept is such that we smooth this out to a full unexperienced universe of sets, which contains, for example, the set ω of natural numbers, and, as a single object, the power set of the natural numbers, thus the set of all properties of natural numbers. Most importantly, perhaps, to connect this to Gödel’s realism, we apply the law of excluded middle to statements about the universe and about the sets in it. Thus, there either is or is not an infinite set of real numbers which cannot be mapped one-to-one either onto the natural numbers or onto the set of all real numbers.
There are disanalogies with Kant in at least the following respects. Kant’s theory is that the synthetic unity is contributed by the understanding, and, as far as our judgements are concerned, in an automatic way. We cannot but see the world as composed of enduring, causally connected objects. But the set-theoretic reading of mathematical experience is not automatic. For example, when faced with a certain mundane experience of the physical world (say about the room I am currently in), it is automatic, indeed involuntary, that I understand this experience as one involving enduring objects with which I stand in causal relationships.
But when faced with a mundane fact from analysis or arithmetic, it is certainly not automatic that I understand this (or ‘see’ it) as expressing a complex fact about sets, and it is certainly not involuntary.
The second, and connected, substantial difference with Kant is the following. There are basically two possible explanations for why the understanding regards our experience of the world as experience of enduring objects and all that this entails. The first is that our understanding contributes the requisite conceptual framework for construing the world this way. The second is that the conceptual structure is substantially present in the world and that we apprehend it. The first is basically Kant’s explanation, while Gödel’s account of the mathematical world seems to suggest much more the latter. This is attested to by various remarks about the ‘subjectivism’ (i.e., mind-dependence) of Kant’s theory, by his opposition to the various theories (like Russell’s) which hold that we ‘construct’ large parts of the mathematical world, and in addition by his theory of the ‘perception’ of concepts.
5. Conclusion
If what I have said in the last section is approximately right, although what Gödel says is not clear enough at first sight to make this certain, then Gödel’s
explanation of why our mathematical understanding is capable of ‘seeing’ the mathematical world in basic outline as it actually is, is Kantian only to a limited extent. That is, Gödel extracts a central element from Kant’s theory of knowledge and forms from it a fundamental component of our explanation of what mathematical knowledge is knowledge of, of why we have knowledge of what we do not immediately experience, and indeed can extend that knowledge substantially. It is, however, not Kantian in the sense that the fundamental concept of ‘synthesis’ is not taken to be contributed by the understanding, but rather ‘perceived’. Having said this, however, it must be part of Gödel’s belief that the mind has a certain capacity which allows it to ‘perceive’ the synthesising concept, a capacity which must be more fundamental than its capacity to ‘perceive’ other concepts. In this case, Gödel’s view of mathematics is based on a recognition of a central conceptual, indeed organisational, contribution of the mind, and its ability to develop theoretical and linguistic frameworks based on this which are capable of mirroring the objective mathematical facts. In other words, what Gödel is concerned with, assuming that we do have objective knowledge of mathematics which is uniform, governed by a principle of truth-value completeness and extendable, is to explain why we are capable of this kind of cognition. In this sense, then, what he hints at is a kind of Kantian theory, a theory which attempts to explain, as Kant puts it,
. . . namely how subjective conditions of thinking should have objective validity, i.e., yield conditions of the possibility of all cognition of objects: . . . (Critique of Pure Reason, A89/B122)
I want to end this paper by quoting one further passage where Gödel refers directly to Kantian philosophy and its application to the philosophy of mathematics. It comes in an unpublished shorthand draft of a paper first written in 1961, which is a survey of some trends in modern philosophy of mathematics. Gödel suggests towards the end of the paper that Husserlian phenomenology represents a promising direction for philosophy of mathematics to explore; it also suggests that Husserl’s ideas are more precise formulations of ideas which are found embryonically in Kant. In the course of this, Gödel writes the following:
I would like to point out that this intuitive grasping of ever newer axioms that are logically independent from the earlier ones, which is necessary for the solvability of all problems even within a very limited domain, agrees in principle with the Kantian conception of mathematics. The relevant utterances by Kant are, it is true, incorrect if taken literally, since Kant asserts that in the derivation of geometrical theorems we always need new geometrical intuitions, and that therefore a purely logical derivation from a finite number of axioms is impossible. That is demonstrably false. However, if in this proposition we replace the term “geometrical” by “mathematical” or “set-theoretical”, then it becomes a demonstrably true proposition. I believe it to be a general feature of many of Kant’s assertions that literally understood they are false but in a broader sense contain deep truths. (Gödel (*1961/?), p. 9, in Gödel (1995), p. 385)
At the end of the paper, Gödel notes:
[I]f the misunderstood Kant has already led to so much that is interesting in philosophy, and also indirectly in science, how much more can we expect it from Kant understood directly? (op. cit., p. 10, in Gödel (1995), p. 387)14
Notes
1. The best general review of Gödel’s realism, and the way his appeal to intuition fits with it, is to be found in Parsons (1995). Parsons stresses the following:
I will suggest, however, that Gödel aims at what other philosophers (in the tradition of Kant) would call a theory of reason rather than a theory of intuition. (Parsons (1995), p. 45)
This is very much in line with the tentative remarks made in the present paper, which is perhaps best seen as a supplement to that of Parsons.
2. One can state the continuum problem independently of the framework of cardinal numbers, as Cantor first did: Does there exist an infinite subset of the points on a line which can be put into one-to-one correspondence neither with the natural numbers nor with the whole line? The Continuum Hypothesis asserts that the answer to this is negative.
3. See Feferman (1987) or Feferman (1989).
4. The idea was clearly expressed in programmatic form in 1946, but the germ of it goes back to 1933. Gödel wrote (in English):
A special case of the general theorem about the existence of undecidable propositions in any formal system is that there are arithmetic propositions which can be proved only by analytical methods and, further, that there are arithmetic propositions which cannot be proved even by analysis but only by methods involving extremely large infinite cardinals and similar things. (Gödel (*1933), p. 13, p. 48 of Gödel (1995))
Thus meta-mathematical incompleteness is seen as one example of a certain species of mathematical problem, those which pose questions about what Hilbert termed the ‘Reinheit der Methode’.
5. Note the similarity between Gödel’s proposal and Zermelo’s ‘axiom’ of ‘meta set theory [Meta-Mengenlehre]’ proposed in his (1930) that ‘there exists an unbounded sequence of boundary numbers’ (p. 46).
6. It was also Poincaré’s and Brouwer’s.
7. For Hilbert, the ‘solvability of a problem’ included producing a proof of its insolubility from a given axiom system. The principle Gödel adopts is clearly stronger than this.
8. This translation, like others from the Kritik der reinen Vernunft, is taken from Guyer and Wood (1998).
9. See also the Fortschritte, i.e., Kant (1793/1804), p. 101 in Kant (1921), or p. 275 of the Akademie-Ausgabe of Kant’s collected writings, Band XX.
10. The text is on p. 102 of Kant (1921), and pp. 275–276 of the Akademie-Ausgabe of Kant’s collected writings, Band XX. The translation is mine.
11. Note that Cantor did not make this mistake, but always stressed that only collections which can be thought of as wholes or unities can be taken as sets. See Hallett (1984). There is here also a further general analogy with Kant, in that the set-theoretic antinomies show that there must be limits or constraints on set-theoretic or mathematical ‘reason’. The iterative conception of set, which Gödel supports, certainly accepts such constraints.
12. Recall again that Kant says at A77–78/B103 of the Critique that ‘[Synthesis] is therefore the first thing to which we have to attend if we wish to judge about the first origin of our cognition.’
13. This, incidentally, marks another connection with the modern axiomatic view.
14. The quotations are taken from the English translation in Gödel (1995), by Köhler, Wang, Dawson, Parsons and Craig.
References
Items marked with an ‘*’ were originally unpublished.
Benacerraf, P. and H. Putnam (eds.) (1964), Philosophy of Mathematics: Selected Readings, First Edition, Basil Blackwell, Oxford.
Benacerraf, P. and H. Putnam (eds.) (1983), Philosophy of Mathematics: Selected Readings, Second Edition, Cambridge University Press.
Davis, M. (ed.) (1965), The Undecidable, Raven Press, New York.
Di Francia, G. T. (ed.) (1987), L’infinito nella scienza, Enciclopedia Italiana.
Ewald, W. (ed.) (1996), From Kant to Hilbert: A Source Book in the Foundations of Mathematics, Volumes 1 and 2, Clarendon Press, Oxford.
Feferman, S. (1987), “Infinity in Mathematics: is Cantor Necessary?” in: Di Francia (1987), 151–209; a shorter version is to be found in Feferman (1989).
Feferman, S. (1989), “Infinity in Mathematics: is Cantor Necessary?” in: Philosophical Topics 17, 23–45.
Gödel, K. (*1933), “The Present Situation in the Foundations of Mathematics”, first published in Gödel (1995), 45–53.
Gödel, K. (1944), “Russell’s Mathematical Logic” in: Schilpp (1944), 125–53; reprinted in: Benacerraf and Putnam (1964), 211–232, Benacerraf and Putnam (1983), 447–469, and also Gödel (1990), 119–141.
Gödel, K. (*1946), “Remarks Before the Princeton Bicentennial Conference on Problems in Mathematics” in: Davis (1965), 84–8; also in: Gödel (1990), 150–153.
Gödel, K. (1947), “What is Cantor’s Continuum Problem?” in: American Mathematical Monthly 54, 515–525, errata, ibid. 55, 151; reprinted in: Gödel (1990), 176–187.
Gödel, K. (*1951), “Some Basic Theorems on the Foundations of Mathematics and their Implications”, first published in: Gödel (1995), 304–323.
Gödel, K. (*1953/9–III), “Is Mathematics Syntax of Language?”, first published in: Gödel (1995), 334–356.
Gödel, K. (*1953/9–V), “Is Mathematics Syntax of Language?”, first published in: Gödel (1995), 356–362.
Gödel, K. (1958), “Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes” in: Dialectica 12, 280–287; reprinted, with an English translation by S. Bauer-Mengelberg and J. van Heijenoort in: Gödel (1990), 240–251.
Gödel, K. (*1961/?), “The Modern Development of the Foundations of Mathematics in the Light of Philosophy”, first published in: Gödel (1995), 374–387.
Gödel, K. (1964), “What is Cantor’s Continuum Problem?” in: Benacerraf and Putnam (1964), 258–73; reprinted in: Benacerraf and Putnam (1983), 470–485, and also in: Gödel (1990), 254–270.
Gödel, K. (*1972), “On an Extension of Finitary Mathematics That Has Not Yet Been Used” in: Gödel (1990), 271–280; Gödel’s revision, with extensive notes, of an English translation of Gödel (1958) by L. Boron.
Gödel, K. (1990), Collected Works, Volume 2: Publications 1938–1974, Oxford University Press.
Gödel, K. (1995), Collected Works, Volume 3: Unpublished Essays and Lectures, Oxford University Press.
Guyer, P. and Wood, A. (1998), Critique of Pure Reason, Cambridge University Press. English translation of Kant (1781/1787).
Hallett, M. (1984), Cantorian Set Theory and Limitation of Size, Clarendon Press, Oxford.
Kant, I. (1781/1787), Kritik der reinen Vernunft. Zweite Ausgabe, Johann Friedrich Hartknoch, Riga; first edition (‘A’ pagination), with second edition emendations (‘B’ pagination), edited by J. Timmermann, Felix Meiner Verlag, Hamburg 1998.
Kant, I. (1793/1804), Welches sind die wirklichen Fortschritte, die die Metaphysik seit Leibnizens und Wolfs Zeiten in Deutschland gemacht hat?, edited by F. T. Rink; Goebbels und Unzer, Königsberg 1804; reprinted in: Kant (1921), and the Akademie-Ausgabe of Kant’s collected writings, Band XX.
Kant, I. (1921), Zur Logik und Metaphysik, herausgegeben von Karl Vorländer. Dritte Abteilung: die Schriften von 1790–93, Felix Meiner (Philosophische Bibliothek, Bd. 46c), Leipzig.
Parsons, C. (1995), “Platonism and Mathematical Intuition in Kurt Gödel’s Thought” in: Bulletin of Symbolic Logic 1, 44–74.
Schilpp, P. A. (ed.) (1944), The Philosophy of Bertrand Russell, Open Court Publishing Co., Evanston, Illinois.
Zermelo, E. (1930), “Über Grenzzahlen und Mengenbereiche” in: Fundamenta Mathematicae 16, 29–47; English translation by M. Hallett in Ewald (1996), Volume 2, 1208–1233.
INTUITION, OBJECTIVITY AND STRUCTURE∗
Elaine Landry
University of Calgary, Canada
1. Introduction
Frege’s argument for the objectivity of arithmetical knowledge, in the first instance, is used to challenge the Kantian claim that, in contrast to philosophical knowledge which arises from the analysis of concepts, arithmetical knowledge must arise from the construction of concepts and, consequently, requires pure intuition. Frege redirects this opposition with his suggestion that “the question of how we arrive at the content of a judgment should be kept distinct from the other question, Whence do we derive the justification for its assertion?”. (Frege (1884), p. 3) This separation of the context of discovery from the context of justification is thus intended by Frege to do both mathematical and philosophical work. On the one hand, it is used to show that arithmetic is essentially logical: since the justification of the truth of its propositions arises through “gapless” proofs, no appeal to either intuition or ‘intuitive axioms’ is required. On the other hand, it also allows Frege to redefine the Kantian terms a priori, a posteriori, synthetic and analytic so that
[w]hen a proposition is called a posteriori or analytic in my sense, this is not a judgement about conditions . . . which have made it possible to form the content of a proposition . . . it is a judgement about the ultimate ground upon which rests the justification for holding it to be true. (Frege (1884), p. 3)
Consequently, for Frege, demonstrating, by conceptual analysis, that the justificatory grounds for arithmetical knowledge are purely logical leads to the conclusion that the propositions of arithmetic are analytic and so neither concept construction nor pure intuition has any role to play in the explanation of the grounds for their objectivity.
∗ A longer version of this paper is published as “Logicism, Structuralism and Objectivity”, in Topoi 20: 79, 2001. Thanks to the editors of Topoi and to Kluwer Academic Publishers for their permission to rework these ideas.
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 133–153. © 2006 Springer. Printed in the Netherlands.
In this paper, I consider whether we ought still to heed Frege’s warning that “we are all too ready to invoke inner intuition, whenever we cannot produce any other ground of knowledge”. (Frege (1884), p. 19) I will argue, much as Frege did, that the objectivity of mathematical knowledge can be accounted for by a conceptual analysis of the content and structure of what its propositions say. My aim will be to show that if we replace logicism by structuralism, then we can account for the objectivity of mathematical knowledge by a category-theoretic analysis of the content and structure of what its propositions say. I will argue that if we take a “semantic turn”, and thereby shift our philosophical focus from what mathematical propositions are about to what they say, i.e., to what their truth conditions warrant us to talk about in an interpreted system, then we can use the notion of “shared structure” (in much the same way as Frege used the notion of “inherent logical form”) to account for the objectivity of mathematical statements. Moreover, I will argue that if we shift our mathematical focus from talk about objects qua “independently existing things” to talk about objects qua “positions in structured systems”, then we can use category theory as an organizational tool for our analysis of the content and structure of what mathematical propositions say, and thereby claim that the justification of the truth of mathematical statements comes down to what we can say about the structure of both objects and systems and not what we can know about their essence. Bringing these two shifts together is the assumption that mathematical propositions can meaningfully and objectively talk about an object without being about any object in particular and that they do so by construing objects as positions in structured systems.
That is, what is objective about a statement is not what it is about but, rather, lies in the shared structure that allows us to talk about any object that satisfies the corresponding semantic constraints, i.e., that satisfies the truth conditions that determine what is to count as an interpretation for the structured system in question.
2. The Semantic Turn
I begin with Coffa’s characterization of the “semantic tradition”, which will allow me to describe the “semantic philosopher” as one who realizes that “between our subjective representations and the world of things we talk about, there is a third element, what we say”. (Coffa (1991), p. 77, italics added.) In this section, I argue that we have yet to appreciate the significance of what Coffa (1991) has termed ‘the semantic tradition’1 and, consequently, we have yet to recognize the position and role of the semantic philosopher. Lacking this awareness, rather than concerning ourselves with what can be objectively said, we continue to assume that the conditions for meaningful assertions must be grounded in what we can know or what exists. In either case, whether we see ourselves as mathematical philosophers or as philosophers of mathematics, we continue to separate the aim of mathematical philosophy (the clarification of the meaning and use of mathematical concepts) from the aim of philosophy of mathematics (accounting for the referents of, and the truth conditions for, mathematical statements). As a consequence, we, as philosophers of mathematics, continue to read those philosophical issues that relate to the semantics of mathematical discourse as strictly epistemological (either naturalized or psychologized) or strictly ontological (either physicalized or platonized). The semantic philosopher, in contrast, is one who realizes that any conceptual analysis of the content and structure of what mathematical propositions say need not involve reference (direct or mediated) either to the psychological or constructive acts, or the ontological “things” through which we come to say them. (See Coffa (1991), p. 1)
The semantic tradition began as a reaction to the psychologistic interpretation of Kant.2 In particular, philosophers and mathematicians belonging to the semantic tradition sought to maintain the objective character of mathematical knowledge while at the same time avoiding the seemingly necessary appeal to pure intuitions. As Coffa notes,
[t]he semantic tradition consisted of those who believed in the a priori but not in the constitutive powers of the mind. They also suspected that the root of all idealist confusion lay in misunderstandings concerning matters of meaning. Semanticists are easily detected: They devote an uncommon amount of attention to concepts, propositions, senses . . . (Coffa (1991), p. 1)
To recognize the value of this move from psychology to semantics we must first understand Kant’s distinction between philosophical knowledge and mathematical knowledge. More precisely, we must understand that for Kant: [p]hilosophical knowledge is knowledge gained by reason from [the analysis of] concepts; mathematical knowledge is the knowledge gained by reason from the construction of concepts. To construct a concept means to exhibit a priori the intuition which corresponds to the concept. For the construction of a concept we therefore need a non-empirical [pure] intuition. (Kant, CPR, A713/B741)
Confronted with this Kantian account of mathematical knowledge, the question the semantic philosopher sought to answer was: Is there any way that the objectivity of mathematical knowledge can, like that of philosophical knowledge, be accounted for through an analysis of its concepts? More specifically, the question at hand was: Is there any manner of both clarifying the meaning and justifying the use of mathematical concepts so as to provide a justificatory ground for mathematical assertions, without having to appeal either to construction or pure intuition? Frege, in attempting to answer this question, denied that arithmetical judgements were synthetic. They were analytic, not in the usual Kantian sense of the meaning of the predicate being contained in that of the subject,3 but in the sense that, in virtue of the logical form inherent in any proposition which talks about an arithmetical concept, as it is used in the context of a given statement, “objects fall under concepts”.4 In the Begriffsschrift, by creating a language of concepts, he thus shifted the basis of conceptual analysis from issues relating
to pure forms of intuition and concept construction to those of objective logical representation and “concept-writing”. Moreover, in his Die Grundlagen der Arithmetik, he upheld the distinction between subjective and objective representations (see Coffa (1991), pp. 22–26) by demanding that we “always separate sharply the psychological from the logical, the subjective from the objective”. (Frege (1884), p. x) Furthermore, in appreciation of the linguistic nature (as opposed to the possibly psychological interpretation of talk of representations) that such an analysis need abide by, he added that we “never ask for the meaning of a word in isolation, but only in the context of a proposition”. (Frege (1884), p. x) Finally, he attempted to demonstrate that, together with this “context principle”, his concept-writing, which implicitly directs us “never to lose sight of the distinction between concept and object” (Frege (1884), p. x), allows for the objective, logical, representation of the concept ‘number’ by writing the propositions of arithmetic in such a manner as they may be demonstrated as being analytic. Given these “three fundamental principles” (Frege (1884), p. x), arithmetic could proceed, as philosophy, through the analysis of the content and structure of its concepts, modulo the additional demand that, when written in the language of concept-writing, such an analysis need only consider what can be said of the logical relations between concepts and objects. Unfortunately, Basic Law V,5 even when restricted so as to render it consistent, has an existential content which is conceivably false, i.e., which cannot be justified from within Frege’s concept language, and, therefore, he could not reach his conclusion of having demonstrated that the propositions of arithmetic are analytic.
It is clear that Frege can be characterized as a semantic philosopher: he attempted to justify the inclusion of mathematical concepts as legitimate objects of philosophical study by offering an analysis of the content and structure of what we say, as opposed to considering the psychological and/or constructive acts by which we come to say it. What is not clear, however, is whether the problems he encountered imply the demise of this semantic approach itself. If we are to take philosophical history as our guide, then it appears as if we must supplement our aim of conceptual analysis with the presupposition that either a stronger “logic” (or a different set-theory) or a more robust “reality” (physical or metaphysical) provides a pre- or meta-linguistic basis for what we say about objects. Does this history truly capture all of our options? To answer this question, we must first make explicit the assumption enabling such a drastic conclusion, viz., that the only way one can use a statement to talk about objects is if the statement is about objects. To rest with this assumption, however, is to conflate the semantic claim that meaningful statements talk about objects, with the ontological claim (or platonistic assumption) that meaningful statements must be about objects. Thus, it is here that we must apply the lessons of the semantic tradition, before we take our cue from the philosophical history. For, as Coffa advises,
[f]ew things have proved more difficult to achieve in the development of semantics than recognition of the fact that between our subjective representations and the world of things we talk about, there is a third element, what we say . . . many of the best philosophical minds . . . were [and are!] unable to understand that what we say, sense, cannot be constituted either from psychological content or from real-world [or logical] correlates of our representation . . . They all attempted [and continue to attempt] to understand sense by forcing it into a world to which it does not belong. (Coffa (1991), pp. 76–77)
In this light, we may now see that while a semantic philosopher such as Frege had realized that sense need not be constituted from the psychological content or the pure form of thought, and hence need not rely on construction or pure intuition, he, and philosophers following him, had not realized that it need not come from either the content of pure thought, the referents of “logical concepts” (from “logical objects”), or from any “real-world” correlates. Post-Fregean philosophers, for example, attempted to legitimize mathematical and/or theoretical concepts by offering up a logical analysis, not of what we say, but, in effect, by reducing such concepts and/or their referents to logical and/or empirical “atoms of meaning”. To cite instances of such attempts one only need consider Russell’s program in both Principia Mathematica and The Philosophy of Logical Atomism and the standard interpretation of Carnap’s program as set out in his Aufbau. The non-logical status of the Axiom of Infinity scuttled Russell’s mathematical atomism. And, as is well-known, his physical atomism was stymied by Wittgenstein’s observation that a logical analysis of what we say about the world forces us to conclude that “the world is the totality of facts, not of things”,6 that is, is not composed of either logical or empirical “objects”. Similarly, Quine’s semantic holism7 challenged Carnap’s view that one could distinguish between those cases in which meaning is determined a priori, or logically, from those in which it is determined a posteriori, or linguistically. Through Quine, the semantic claim that “meanings are what concepts become when wedded to the word”, (Coffa (1991), p. 8) becomes read as the nominalist claim that “meanings are what essences become when wedded to the word”. (Coffa (1991), p. 8) In either case, as a result of the observations of both Wittgenstein and Quine, conventionalism was held to be the consequence of using a logical and/or conceptual analysis as a basis for determining the semantic and/or epistemic warrant for what we say. Thus, their observations have led philosophers to believe that unless we restrict ourselves to “the world of facts”, or have access to “real essences”, as opposed to their nominalistic counterparts, then what we say will be so intimately bound to the conventions we choose that any analysis of its constituent concepts will show nothing.
The question that I now turn to consider is: Is conventionalism a necessary consequence of a linguistic (as opposed to a purely logical) analysis of what mathematical propositions say about objects and systems? That is, in light of the failure of Frege’s logicism, is there any means of maintaining a notion of our speaking objectively about mathematical objects and systems that does not
rely on any appeal to intuition or other constructive processes as a means of grasping the real essences of objects and systems as “things”? In the remaining sections, I show that, if we adopt a structuralist (as opposed to a logicist) philosophy of mathematics and consider category theory as the language of mathematical structuralism, then we can justify the inclusion of both mathematical objects and systems, qua positions in structured systems, as legitimate objects of philosophical study on the basis of an analysis of what we say from within a category-theoretically presented “linguistic framework”.
3. Logical form yields objectivity
Frege’s semantic turn is marked by the realization that the meaning of an arithmetical statement does not arise from reference to independently existing arithmetical objects, nor does it arise from the conditions for the possibility of constructing its constitutive concepts. Rather, it is expressing our talk about arithmetical objects, in the language of concept-writing, as concepts saturated in the context of a proposition which allows us to objectively grasp the meaning of a sentence by analyzing the logical relations between concept and objects. That Frege took the semantic turn is thus to a great extent witnessed by his use of the context principle: as Carl notes, Frege raised the question “How are numbers given to us?” and answered it via the context principle by directing us to sentences containing number words. (Frege (1884), Sec. 62) I propose the following interpretation of his answer: Numbers are given directly to our reason. But the elements of reason are judgements, since we do not reason with words but rather with sentences and in doing so make judgements. Thus, to determine the nature of numbers we must look at judgements of numbers and about numbers. Do not ask what number words mean outside the context of a sentence. That is like asking what numbers are like independently of judgements, which in turn is to ask what they are like independently of reason. And to answer that would be to “judge without judging”. (Carl (1994), p. 166)
Understood in this way, a logical analysis of what we say (or, more precisely, a conceptual analysis of the “logical form” inherent in arithmetical assertions) seems to provide for the type of semantic realism — a realism that is a consequence of the fact that talk about mathematical objects is, in an objective sense, meaningful. That is, we speak, in arithmetic, with significance by expressing, in the language of concept-writing, what falls under the concept ‘number’. And, since ‘number’ applies to everything thinkable, what we express when we make an assertion about any number is objective. As Frege explains: [v]irtually everything that can be an object of thought may in fact be counted; the ideal as well as the real, concepts as well as things . . . even numbers themselves can in turn be counted . . . those fundamental principles must extend to everything thinkable; and a proposition that is in this way of the greatest generality is justifiably assigned to logic. (Frege (1885), in Dummett (1991), p. 43–44)
The objective status of an arithmetical assertion, then, relies solely on the logical form inherent in the content of pure thought, and hence, has no need for any other source for its justification. Frege says, we see how pure thought, irrespective of any content given by the senses or even by an intuition a priori, can, solely from the content that results from its own constitution, bring forth judgements that at first sight appear to be possible only on the basis of some intuition . . . (Frege (1879), p. 163)
Frege’s version of semantic realism, as that which arises out of his account of objective content, thus, seems to avoid the poles of both idealism and platonism. By making objectivity, first and foremost, an epistemological notion, considered as such from within the context of justification, he appears to avoid reference to both subjective representations (to ideas) and elements of the world which we are talking about (to independently existing arithmetical objects). That is, . . . Frege does not attempt to explain objectivity by reference to the assumption of the independent existence of objects. What is objective is explained by reference to intersubjective accessibility, as far as it is based on reason. Frege’s notion of objectivity has epistemological, but not ontological, presuppositions. (Carl (1994), p. 79)
It is in uncovering these “epistemological presuppositions” that we see that the answer to the question of whether Frege remained within the scope of the semantic tradition is not, however, straightforward. For while it is clear that he dismissed subjective representations and the world of things we talk about, it is not clear that he successfully avoided reference to “the world of things we think about”. That is, Frege’s account of the objectivity of (the statements which talk about) number appears nonetheless to have platonistic consequences — it results in the reification of thoughts, or at least a reification of the “logical objects” that arise from the analysis of what it means to say that logical form is inherent in the content of such thoughts. He says: [u]nlike ideas, thoughts do not belong to the individual mind (they are not subjective), but are independent of our thinking and confront each one of us in the same way (objectively). They are not the product of thinking, but are only grasped by thinking. In this respect they are like physical bodies . . . Logic is concerned with the laws of truth, not with the laws of holding something to be true, not with the question of how men think, but with the question of how they must think if they are not to miss the truth. (Frege (1897), p. 148–49, italics added)
4. Conceptions of logic
The question that needs be considered now is: Do we require an independent, Fregean, “world of thoughts” to mediate between what possible propositions mean and what actual statements say? Given Frege’s universalist conception of logic it would appear as though we do. Goldfarb’s account of the differences between Fregean logic and modern logic allows us to see better why this is the case. Goldfarb characterizes the modern conception of logic as schematic in that
. . . the subject matter of logic is logical properties of sentences and logical relations between sentences. Sentences have such properties and bear such relations to each other by dint of their having the logical forms they do . . . logic deals with what is common to and can be abstracted from different sentences, in different quantificational languages. Logical forms are not mysterious quasi-entities, à la Russell. Rather, they are simply schemata: representations of the composition of the sentence, constructed from the logical signs (quantifiers and truth-functional connectives, in the standard case) using schematic letters of various sorts (predicate, sentence, and function letters). Schemata do not state anything, and so are neither true nor false: but they can be interpreted: a universe of discourse is assigned to the quantifiers, predicate letters are replaced by predicates or assigned extensions . . . over the universe, sentence letters can be replaced by sentences or assigned truth-values. Under interpretation, the schema will receive a truth-value . . . to say that a sentence can be schematized by a schema is just to say that there is an interpretation under which the schema becomes the sentence. (Goldfarb (2001), p. 3)
Thus, “logic [on the schematic conception] is tied to no particular subject matter; it deals with these [schemata] rather than with particular contents”. (Goldfarb (2001), p. 5) In contrast, Fregean logic is intended “to express a content [of pure thought] through written signs” (Frege (1882b), p. 91) and consequently, . . . there are no parts of his logical formulas that await interpretations . . . on the universalist conception, the concern of logic is the articulation and proof of logical laws, which are universal truths . . . not in being about nothing in particular, but in using topic-universal vocabulary to state truths about everything. (Goldfarb (2001), pp. 5 and 7)
This difference, Goldfarb suggests, engenders another: The schematic conception is meta-linguistic. The claims of logic are claims about schemata or about sentences . . . In contrast, on the universalistic conception, logic sits squarely at the object level; issuing laws that are simply statements about the world [of thought]; what they describe are not phenomena of language or of representation. (Goldfarb (2001), p. 7)
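The schematic conception just described can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: the function name `all_F_are_G`, the particular schema chosen, and the three interpretations are hypothetical and are not drawn from Goldfarb. The point is simply that the schema states nothing by itself and receives a truth-value only once a universe of discourse and extensions for its predicate letters are supplied.

```python
# Illustrative sketch (names are assumptions, not Goldfarb's):
# a schema with two schematic predicate letters, F and G.
# The schema states nothing; it only receives a truth-value once an
# interpretation (a universe plus extensions for F and G) is supplied.

def all_F_are_G(universe, F, G):
    """Truth-value of the schema 'for all x, if F(x) then G(x)'
    under the interpretation (universe, F, G)."""
    return all((x not in F) or (x in G) for x in universe)

# Interpretation A: the universe is the numbers 1..12,
# F is 'multiple of four', G is 'even'.
numbers = set(range(1, 13))
multiples_of_four = {n for n in numbers if n % 4 == 0}
even_numbers = {n for n in numbers if n % 2 == 0}
value_a = all_F_are_G(numbers, multiples_of_four, even_numbers)

# Interpretation B: same universe, extensions of F and G swapped.
value_b = all_F_are_G(numbers, even_numbers, multiples_of_four)

# Interpretation C: a different universe altogether.
shapes = {"square", "rectangle", "circle"}
squares = {"square"}
rectangles = {"square", "rectangle"}
value_c = all_F_are_G(shapes, squares, rectangles)

print(value_a, value_b, value_c)
```

On the universalist conception, by contrast, there would be no parameters left to fill in: the quantifier would range over everything, and the formula would already be a statement.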
So, while for Frege, the universe of discourse is not “what exists”, independently of thought, it is, nonetheless, fixed by “what is thinkable”, independently of linguistic expression. And, more significantly, it is this fixity which secures the objectivity of what we say. What appears to underpin Frege’s universalist conception of logic is his view of logic as language. His concept script is intended to provide the language of concepts by providing the framework for our analysis of their content and structure. The context principle then acts as the objectifying bridge between what we say about concepts and their extensions and what we say about numbers qua objects by demanding that we not proceed with our analysis of what arithmetical propositions say until we have actually said something, i.e., until we have made an assertion (or expressed a thought). Only then can we conclude that arithmetical judgements are purely logical, i.e., only then can we argue that, since
. . . numbers are given to us as extensions of concepts . . . [they are] given directly to our reason and . . . utterly transparent to it . . . [and] for that very reason these objects are not subjective fantasies. There is nothing more objective than the laws of arithmetic. (Frege (1884), p. 115)
Frege’s view of logic as language is thus held as opposed to Boole’s view of logic as calculus, or as a “calculating procedure for carrying out deductive inferences”. (Frege (1882a), p. 88) For Frege, “the language of arithmetical formulas is a “conceptual notation” [for thought itself] it represents the facts immediately and not by means of a sound [or a formula]”.8 (Frege (1882a), p. 88) According to the view of logic as language, the logical form of arithmetical propositions must somehow inhere in the content of pure thought. As Frege explains, [r]ight from the start I had in mind the expression of a content. What I am striving after is a lingua characterica in the first instance for mathematics, not a calculus restricted to pure logic. But the content is to be rendered more exactly than is done by verbal language. (Frege (1880/81), p. 12)
On the universalist’s reading of logic, the views of logic as language and logic as calculus may best be understood in the following manner. Logic as language attempts an analysis of what we say by considering logical form as inhering in the content of an expression so that logical form is instantiated within the context of a proposition which expresses the corresponding content. Logic as calculus presents logical form as something superadded to content, that is, as something which is recoverable when the sentence is interpreted. That Frege rejected this latter view is clear: The word ‘interpretation’ is objectionable; since a thought, correctly expressed, leaves no room for different interpretations. A proper sentence expresses a thought and this is true or false. (Frege (1906a), in Resnik (1980), p. 177)
What needs now be noted, however, is that this distinction between logic as language and logic as calculus does not necessarily match the distinction between Goldfarb’s universalist and the schematic conceptions of logic. Thus, we need to take care to keep separate the (Fregean) universalistic logicist claim that mathematical propositions are about everything thinkable, the universalistic formalist claim that mathematical propositions are about nothing, and the schematic structuralist claim that mathematical propositions can talk about an object without being about any object in particular. On a structuralist reading of the schematic view of logic, we may say that an uninterpreted statement talks about something in virtue of its shared structure, but is about nothing in particular until we have provided an interpretation for the schemata of the corresponding structured system. For example, if we consider the schema of the Natural Number structure, as defined by the Peano axioms, then the concept ‘number’ occurs as a position in the uninterpreted system, and so a proposition which talks about number could be about either Zermelo cardinals or von Neumann ordinals. Yet, when we interpret the system in, say, ZF, ‘number’ means cardinal, i.e., any statement in which ‘number’ occurs is about cardinals.
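The Natural Number example can be sketched in code. This is a minimal illustration under assumptions of my own: the class `NaturalNumberSystem`, the helper names, and the use of Python frozensets to stand in for pure sets are all hypothetical, and only a finite fragment of the Peano axioms is checked. The two interpretations below are the familiar Zermelo and von Neumann constructions.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class NaturalNumberSystem:
    """An interpretation of the schema: a choice of zero and successor."""
    zero: Any
    successor: Callable[[Any], Any]

    def numeral(self, n):
        """The object occupying position n in this interpretation."""
        x = self.zero
        for _ in range(n):
            x = self.successor(x)
        return x

EMPTY = frozenset()
# Zermelo: S(x) = {x}.  Von Neumann: S(x) = x U {x}.
zermelo = NaturalNumberSystem(EMPTY, lambda x: frozenset({x}))
von_neumann = NaturalNumberSystem(EMPTY, lambda x: x | frozenset({x}))

def satisfies_peano_fragment(system, k=6):
    """Check, on the first k positions only, that zero is not a successor
    and that successor is injective. A finite check, not a proof."""
    nums = [system.numeral(n) for n in range(k)]
    zero_not_successor = all(system.successor(x) != system.zero for x in nums)
    successor_injective = len({system.successor(x) for x in nums}) == len(nums)
    return zero_not_successor and successor_injective

def zero_is_element_of_two(system):
    """A non-structural question: its answer depends on the interpretation."""
    return system.zero in system.numeral(2)

print(satisfies_peano_fragment(zermelo), satisfies_peano_fragment(von_neumann))
print(zero_is_element_of_two(zermelo), zero_is_element_of_two(von_neumann))
```

Both interpretations agree on every statement framed in terms of zero and successor alone, which is the sense in which such statements talk about number in virtue of shared structure. They disagree only on questions, such as whether zero is an element of two, that cannot be posed until an interpretation has been fixed.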
5. Structure yields objectivity
Let us now return to our initial question: Does replacing logicism by structuralism allow us to retain a notion of objectivity that is required for the justification of the truth of mathematical propositions that talk about objects and systems? What I will show is how, if we replace the Fregean claim that logical form yields objectivity by inhering in the logical representation of the content of thought with the claim that structure inheres in the category-theoretic presentation of a system, then we can make use of the Carnapian idea that “shared structure” is the bearer of objectivity. To this end, I now attempt to situate and defend the claim that category theory provides the language for organizing the content and structure of what we say in and about mathematically structured systems and categorical logic provides, respectively, the calculus and language for our meta-theoretic analysis of what we say in and about logically structured systems. It is in this sense that I think it appropriate to call category theory the language of mathematical structuralism: it provides a “linguistic framework” for the claim that structure, both mathematical and logical, yields objectivity. I now turn to consider, then, Carnap’s changing conceptions of logic with the aim of using this investigation to show that, despite the many alterations of the meaning of both ‘logical form’ and ‘structure’, what remains, and can still be defended, is Carnap’s view that structure yields objectivity9. To read Carnap as a semantic philosopher, and thereby situate him with respect to both Coffa’s characterization of the semantic tradition and my semantic realist interpretation of mathematical structuralism, we note Richardson’s account of the neo-Kantian10 backdrop out of which Carnap’s views arose: [for the neo-Kantians, e.g., Cassirer, Natorp, Bauch] objects do not simply exist as such outside of their logical relations to one another, nor are they collections of sensory representations. 
Objective knowledge is found not in pure experience but in the laws of the mathematical sciences of nature . . . This view clearly contrasts with any naive realism that speaks of objective knowledge as objective not because of the systematic interrelations of the objects in the system but by relations to transcendent objects outside the system. Similarly, it is inconsistent with any idealism that founds objectivity in the subjective experience of any one individual, or denies objectivity to knowledge in general . . . Ontological questions about the existence of objects presupposes a structured framework of relations in which logical form has already discharged its epistemological function. In this sense, all existential questions must be interpreted as internal to the system of the sciences itself. (Richardson (1998), p. 123 and 124)
While it is true, then, that in the Aufbau Carnap, as a logicist, believed that there could only be one constitutional system for mathematics, namely that provided by the (Russellian) universalist conception of logic, it is also true that he believed that in science there is a multitude of constitutional systems whose objectivity is licensed by their “shared structure”. How, then, can we manage a reconstruction of Carnap’s attempts at conceptual analysis which brings together his scientific structuralism with his mathematical logicism, so that we may claim that in mathematics too structure yields objectivity? What we must first clarify is Carnap’s neo-Kantian scientific structuralism. Richardson accurately sums this up as follows: . . . Carnap differs from Kant in finding that logical form alone is sufficient to play the role that guarantees objectivity. No room is found for the forms of “pure intuition” in the Kantian sense. On the one hand, he constructs a notion of objectivity within the system of scientific concepts itself via the construction of the intersubjective world of science (Sec. 142-7). On the other hand, he also endorses the project of objectivity as pure logical structure through his notion of a ‘purely structural definite description’ (Sec. 14-16). Both of these projects can be viewed as methods of accounting for objectivity as a purely structural notion, but they differ in their notion of ‘structure’. In the first case, the classical mathematical structure of physics is seen as the crucial objectifying structure for science. In the second, Carnap seeks to deploy the resources of logic to give a structural account for concepts that does not rely on this structure of the mathematized sciences but on the structure of type theory itself . . . The goal seems to be . . . the elucidation of the unique role that the superadded mathematical structure of the world of physics provides for the question of the objectivity of science. (Richardson (1998), pp. 29 and 76)
What we note here is the extent to which, in the Aufbau, Carnap, as regards mathematical structure, held a universalistic logicist view of logic as language (as opposed to a schematic structuralist view of logic as calculus). Recognizing this, we realize why one could not, by analogy, claim that in mathematics, as in science, logical form is superadded to mathematical structure: logical form is thought to inhere in mathematical propositions themselves. That is, what Carnap does is rely on the reductive success of Principia Mathematica to give him whatever notion of structure he needs to account for objectivity in the (non-mathematical) sciences. He takes for granted the objectivity (that is, the indubitability and, therefore, the intersubjectivity) of type theory and, hence, of mathematics, and indeed (in Sections 10 and 16) claims that this is what Russell has shown. Or, again, as Richardson explains: [a]s the discipline that presents the formal conditions for any judgement on any subject matter, logic [and, therefore, mathematics] is neither subject to epistemological doubt nor in need of further grounding . . . The assimilation of the constitutive definitions of empirical terms to logical truths is the key to the purely structuralist account of objectivity . . . But Carnap’s conception of logic, which he has taken over from Bertrand Russell and Gottlob Frege, is a fully general language in which everything that can be said at all can be said. (Richardson (1998), pp. 192–193)
Before going further, we must recognize the manner in which Carnap’s logicism is distinct from Frege’s: “Carnap no longer seeks to assign logical form to the mind”. (Richardson (1998), p. 180) Since logical form is taken as inhering in the type-theoretic structure of mathematical propositions (as opposed to inhering in the constitutive content of thought), the epistemic indubitability, and, therefore, the justification of any assertion is considered a purely logical matter. So, to Richardson’s previous account of the differences between Carnap and Kant, we may also add that Carnap differs from both Kant and Frege
in finding that logical form (read now as type-theoretic structure) alone is sufficient to play the role that guarantees objectivity in mathematics. No room is found, contra both Kant and Frege, for either the form or the content of pure thought. That is, it is Carnap’s belief that nothing is needed to mediate between logical form and mathematical expression: even at this early stage Carnap had rejected a justificatory role for thought (pure or otherwise). But what is to replace the semantic role that Frege afforded to (the content of) thought? To answer this question, and explain why Carnap shifted from his universalist conception of logic as language to a schematic conception of logic as calculus, we must note two “turns”. The motivation for the first is appreciated when we recall the strain between Carnap’s scientific structuralism, which takes logical form as superadded to scientific propositions, and his mathematical logicism, which takes logical form as inhering in the type-theoretic structure of mathematical propositions. This tension was brought to the fore in The Logical Syntax of Language when, in an attempt to account for the logical form of both mathematics and Hilbertian meta-mathematics, Carnap shifted to a purely syntactic notion of meta-mathematical logical form as that inhering in the deductive structure of axiom systems. The resulting lack of fit between taking the logical form of mathematical propositions as inhering in their type-theoretic structure and this purely syntactic analysis of meta-mathematical logical form was, as Coffa notes, too great: [t]he [meta-mathematical] notions, Carnap insisted, are defined only for ASs [Axiom Systems]. But logic is not an AS [it is a language according to Carnap’s interpretation of Russellian type theory]. Hence, it would appear, metamathematical concepts make no sense when referred to logic . . . 
The Hilbertian temptations to which Carnap was ready to succumb were bursting out of his Russellian framework. (Coffa (1991), p. 280)
Carnap was thus faced with two options: move to the semantic notion of logical form (as that inhering in the truth conditions for the assertion of a proposition) or find a new universal language that could replace type theory and also accommodate his notion of meta-mathematical logical form. Under Tarski’s influence, Carnap chose the former and, in doing so, finally gave the Fregean semantic turn its needed schematic twist: Carnap’s endorsement of Tarski’s doctrine . . . [led to] the recognition of the old Fregean distinction between the content of a statement and its assertion. Carnap’s “Truth and Confirmation” is the first carefully worked out presentation . . . in which a clear distinction is made between saying something and claiming that it is true . . . (Coffa (1991), p. 373)
Finally, persuaded by the observations of Wittgenstein, in addition to this “schematic semantic turn”, Carnap also made a “linguistic turn”, so that the language of any system (as opposed to its logic) is seen as inherently structured. The result was a purely linguistic notion of structure as that inhering in the semantically considered schemata (inhering in the truth conditions) of a given system. This, combined with his underlying assumption that structure
yields objectivity, is what allows him to conclude, in “Empiricism, Semantics, and Ontology”, that the structure inherent in the linguistic presentation of a system is itself the bearer of objectivity. Or, more precisely, that claims made from within “linguistic frameworks” are inherently structured and, therefore, necessarily objective so that “[t]he construction of so-called reality depends however, as we know, on the structure of the language being used at the time.” (Carnap (1935) in Coffa (1991), p. 373) But does this linguistic notion of structure — as that inhering in the truth conditions of the language of a system — yield objectivity? If Quine is right, the answer must be no.
6. Categories as the schemata of structured systems
To reconsider Quine’s response, we now ask: Did Carnap have to force talk of both mathematical and logical structure into talk of a purely linguistic conception of structure and a completely “tolerant” view of language, and thereby consign mathematical objectivity either to Quinean naturalism — where the Fregean distinction between discovery and justification is blurred, if not altogether lost — or set-theoretic reductionism and/or foundationalism — where again the appeal to a justificatory role either for intuitions or intuitive axioms, or for mental representation or abstraction, resurfaces? If not, then we must ask how Carnap’s semantic program, as presented in “Empiricism, Semantics, and Ontology”, can be supported by a conceptual analysis of the content and structure of what we say both in and about mathematical systems without having to introduce a meta-linguistic (set or structure-theoretic) notion of ‘logical form’ to underpin our meta-theoretic analysis of their schemata. If, as I propose, we restrict our conceptual analysis to the constituents of (the propositions of) mathematically structured systems, then we can use category theory as an organization tool to justify the inclusion of mathematical objects and systems by presenting them both as positions in category-theoretically structured systems. Before detailing what I mean, we must pause here for a moment to make perspicuous the notion of system that is being appealed to. That is, we must now distinguish between what may be called the Bourbaki and the categorical notion of a system. What must first be appreciated is that we are not beginning with what Awodey has termed, the “Bourbaki description of mathematical objects as sets-with-structure” (Awodey (1996), p. 211) and, hence, there is an important distinction to be made between categorical and modern approaches to mathematical structuralism. 
I begin, then, with the categorical notion of a system, since, as we will see, this is where we find our corresponding notion of a cat-structured system. In its most general sense, a system is a schema (in Goldfarb’s sense of the term) of our talk of ‘sorts’. A cat-structured system, then, has ‘objects’ and ‘morphisms’ as its sorts which are structured by the category-theoretic axioms. So that the schema for a kind of structured system qua a category
. . . is anything satisfying these axioms. The objects need not have ‘elements’, nor need the morphisms be ‘functions’ . . . We do not really care what noncategorical properties the objects and morphisms of a given category may have; that is to say, we view it ‘abstractly’ by restricting to the language of objects and morphisms, domains and codomains, composition, and identity morphisms. (Awodey (1996), p. 213)
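One way to picture a category as a schema, in the sense of the passage just quoted, is the following sketch. It is offered only as an illustration: the class `FiniteCategory`, its method names, and the example of a three-element order are assumptions of mine, not Awodey's, and the check covers finite presentations only. Note that the objects here are bare labels with no ‘elements’, and the morphisms are pairs of labels rather than functions.

```python
from itertools import product

class FiniteCategory:
    """A finite presentation in the language of objects, morphisms,
    domains, codomains, composition and identities (hypothetical names)."""

    def __init__(self, objects, morphisms, compose, identity):
        self.objects = objects      # a set of labels
        self.morphisms = morphisms  # dict: morphism -> (domain, codomain)
        self.compose = compose      # compose(g, f) means "g after f"
        self.identity = identity    # dict: object -> its identity morphism

    def dom(self, f):
        return self.morphisms[f][0]

    def cod(self, f):
        return self.morphisms[f][1]

    def satisfies_axioms(self):
        ms = list(self.morphisms)
        # Each identity goes from its object to that same object.
        for a in self.objects:
            i = self.identity[a]
            if i not in self.morphisms or self.dom(i) != a or self.cod(i) != a:
                return False
        # Composable pairs compose to a morphism with the right ends.
        for f, g in product(ms, ms):
            if self.cod(f) == self.dom(g):
                gf = self.compose(g, f)
                if gf not in self.morphisms:
                    return False
                if self.dom(gf) != self.dom(f) or self.cod(gf) != self.cod(g):
                    return False
        # Identity laws.
        for f in ms:
            if self.compose(f, self.identity[self.dom(f)]) != f:
                return False
            if self.compose(self.identity[self.cod(f)], f) != f:
                return False
        # Associativity.
        for f, g, h in product(ms, ms, ms):
            if self.cod(f) == self.dom(g) and self.cod(g) == self.dom(h):
                left = self.compose(h, self.compose(g, f))
                right = self.compose(self.compose(h, g), f)
                if left != right:
                    return False
        return True

# Example: the order 0 <= 1 <= 2, one morphism (i, j) whenever i <= j.
points = {0, 1, 2}
arrows = {(i, j): (i, j) for i in points for j in points if i <= j}
order = FiniteCategory(points, arrows, lambda g, f: (f[0], g[1]),
                       {i: (i, i) for i in points})

# The same data with the composite (0, 2) deleted is not a category.
broken_arrows = {m: ends for m, ends in arrows.items() if m != (0, 2)}
broken = FiniteCategory(points, broken_arrows, lambda g, f: (f[0], g[1]),
                        {i: (i, i) for i in points})

print(order.satisfies_axioms(), broken.satisfies_axioms())
```

Nothing in the check asks what the objects or morphisms are; it asks only whether the axioms are satisfied, which is all the schematic reading requires.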
At once we see important differences: on the category-theoretic view, not only are there no objects as either sets-with-structure (see Dummett (1991), p. 295) or places-with-structure (see Shapiro (1997), pp. 73 and 93), there are no structures as either (equivalence types of) systems-with-structure or “the abstract form of a system, highlighting the interrelationships among the objects . . . ” (Shapiro (1997), p. 74). There are only objects as positions in a structured system and structured systems. What this means is that the Bourbaki conception of a system (of a system whose objects are positions in a set-structure) is to be considered as a kind of structured system: it is not the archetype of either the concept ‘system’ or the concept ‘structure’. Likewise, a category is neither a privileged system nor structure: it is a schema for what we can say about the shared (or same) structure of the various (or same) kinds of mathematically structured systems. That is, we can present the underlying structure of a Bourbaki system, or equivalently present the set-structure, by taking our objects to be sets and our morphisms to be functions. The result is the kind of category called Set. But this does not mean that objects are sets and morphisms are functions; it means that in this kind of category propositions that talk about objects and morphisms can be interpreted as talking about sets and functions. Indeed, the value of this schematic notion of a system is that it can be used to capture many kinds of structured systems, independently of its specific set-structure (independently of what its ‘sorts’ are). For example, in the kind of category called Top, we present the topological-structure by taking objects as topological spaces and morphisms as continuous mappings, independently of what topological spaces are. As Awodey explains: . . . suppose we have somehow specified a particular kind of structure in terms of objects and morphisms . . . 
Then that category characterizes that kind of mathematical structure, independently of the initial means of specification. For example, the topology of a given space is determined by its continuous mappings to and from the other spaces, regardless of whether it was initially specified in terms of open sets, limit points, a closure operator, or whatever. The category Top thus serves the purpose of characterizing the notion of ‘topological structure’. (Awodey (1996), p. 213)
So, Shapiro is simply mistaken to claim that [t]he category theorist characterizes a structure or type of structure in terms of structure-preserving functions, called “morphisms”, between systems that exemplify the structures. (Shapiro (1997), p. 93)
for indeed what such a category theorist who takes structure-preserving morphisms as functions characterizes is the set-structure. For the category theorist,
however, this is not a privileged notion of “structure”, indeed, there is no notion of structure over and above that of kinds of structured systems. Moreover, once we realize this, we see how Shapiro’s mistake defeats his argument for the necessity of a “structure theory” (see Shapiro (1997), especially pp. 93–97). Simply put, to talk about the “shared structure” of mathematical systems, there is no need for structure theory over and above category theory. Given this categorical presentation, the question of the existence of mathematical objects and the justification of the truth of propositions in which they occur, are considered to derive solely from the fact that these objects are required to talk about how things are in a specific category11. In this way a specific category acts as the schema for what we say about a kind of mathematically structured system: it is the “linguistic framework” in which what we say about objects as positions in a kind of mathematically structured system is justified12. For example, the category of sets, Set, allows us to talk about the concept ‘set’ as a position in the set-structured system, which can then be variously interpreted as being about an object in the models offered by, say, ZF or GB, or Grothendieck universes. More than allowing us to talk about the “same structure” of the same kinds of mathematically structured systems, category theory allows us to talk about the “shared structure” of different kinds of structured systems. Simply, we take such structured systems as objects and the functors13 between them as morphisms. For example, we can talk about the shared structure of group-structured systems and set-structured systems by considering Group and Set as our objects and the forgetful functor14 as our morphism. 
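The forgetful functor from Group to Set can be illustrated on a few small finite groups. The sketch below is hedged accordingly: `FiniteGroup`, `forget_object`, `forget_morphism` and the other names are hypothetical, homomorphisms are represented as Python dictionaries, and the functor laws are checked only on the examples given rather than proved.

```python
class FiniteGroup:
    """A finite group given by its elements and a binary operation
    (group axioms are assumed, not verified, in this sketch)."""
    def __init__(self, elements, op):
        self.elements = frozenset(elements)
        self.op = op

def is_homomorphism(f, G, H):
    """f is a dict from G's elements to H's elements preserving the operation."""
    lands_in_H = all(f[x] in H.elements for x in G.elements)
    preserves_op = all(f[G.op(x, y)] == H.op(f[x], f[y])
                       for x in G.elements for y in G.elements)
    return lands_in_H and preserves_op

def compose(g, f):
    """'g after f' for maps represented as dicts."""
    return {x: g[f[x]] for x in f}

def identity_on(elements):
    return {x: x for x in elements}

# The forgetful functor: keep the underlying set and the underlying map,
# forget the operation and the requirement to preserve it.
def forget_object(G):
    return G.elements

def forget_morphism(f):
    return dict(f)

Z4 = FiniteGroup(range(4), lambda x, y: (x + y) % 4)  # cyclic group of order 4
V4 = FiniteGroup(range(4), lambda x, y: x ^ y)        # Klein four-group via xor
Z2 = FiniteGroup(range(2), lambda x, y: (x + y) % 2)

mod2 = {x: x % 2 for x in Z4.elements}    # homomorphism Z4 -> Z2
double = {x: 2 * x for x in Z2.elements}  # homomorphism Z2 -> Z4

# Functor laws, checked on these examples only.
preserves_composition = (forget_morphism(compose(double, mod2))
                         == compose(forget_morphism(double), forget_morphism(mod2)))
preserves_identity = (forget_morphism(identity_on(Z4.elements))
                      == identity_on(forget_object(Z4)))

# What is forgotten: Z4 and V4 differ as groups but share an underlying set,
# and a set map such as 'swap' need not be a homomorphism.
same_underlying_set = forget_object(Z4) == forget_object(V4)
swap = {0: 1, 1: 0, 2: 2, 3: 3}
swap_is_homomorphism = is_homomorphism(swap, Z4, Z4)

print(preserves_composition, preserves_identity,
      same_underlying_set, swap_is_homomorphism)
```

The design choice here is deliberately minimal: the functor does nothing to the data except discard the operation, so what Group and Set share is exactly what survives that discarding.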
In this way, then, category theory acts as the language that frames our talk about the mathematical structure of both objects and systems — categories act as “linguistic frameworks” for what we say about objects as positions in structured systems and for what we say about systems as positions in structured systems. Thus, if we recall the central claim of the structuralist, namely, that the subject matter of mathematics is “structured systems and their morphology”, then category theory provides the most general framework for what we say about the mathematical structure of both mathematical objects, qua positions in structured systems, and mathematical systems themselves, again qua positions in structured systems. Let us stop here to make a more concrete connection with Carnap. Carnap says [i]f someone wishes to speak in his language about a new kind of entities, he has to introduce a system of new ways of speaking, subject to new rules; we shall call this procedure the construction of a linguistic framework for the entities in question. (Carnap (1956), p. 242)
We now note that when specific categories are taken as schemata for what we say about the mathematical structure of objects and systems, i.e., are taken as Carnapian “linguistic frameworks”, we can recapture Carnap’s claim that “the reality of anything is nothing else than the possibility of its being placed in a
certain system”. (Carnap (1935) in Coffa (1991), p. 229) And, thus, we come to frame a semantic realist interpretation of mathematical structuralism — that is, we come to frame a type of realism that results by considering what we can say from within a category-theoretically presented structured system. We must now, however, consider the “external” question, that is, “the question of the existence of the system of entities as a whole” (Carnap (1956), p. 242) That is, we must show how we can include talk of the structure, both mathematical and logical, of categories themselves, without having to take categories as “independently existing objects”. For as Carnap warns: from these [internal] questions we must distinguish the external question of the “reality” of the thing world itself. In contrast to the former questions, the question is raised neither by the man in the street nor by scientists, but only by philosophers. Realists give an affirmative answer, subjective idealists a negative one, and the controversy goes on for centuries without ever being solved. And it cannot be solved because it is framed in a wrong way. To be real in the scientific sense means to be an element of the system; hence this concept cannot be meaningfully applied to the system itself. (Carnap (1956), p. 243)
What we must determine is how we can talk about the content and structure of categories themselves, i.e., how to present such talk in terms of ‘sorts’ (‘objects’ or ‘morphisms’) satisfying the category-theoretic axioms. From the present standpoint we consider an ‘object’ (called a general category15 ) in the category of categories (typically referred to as Cat), which has categories as objects, and translations (interpretations of one structured system in another represented by functors) between cat-structured systems as its morphisms. So considered, we can include talk about categories qua objects as positions in structured systems by taking them as ‘objects’ in the cat-structured system Cat. It is in this sense that we may claim that Cat acts as a schema for what we can say about the shared (or same) structure of the various (or same) kinds of cat-structured systems. The reason, then, why we do not, when considering “external” questions, run into the “philosophical” problems that Carnap anticipated is that we can describe the content and structure of linguistic frameworks, i.e., of categories, in category-theoretic terms. We can account for the “existence” of categories in terms of general categories, without having to invoke an object-language/metalanguage distinction. Thus, we accept Carnap’s claim that . . . the acceptance of the thing language leads, on the basis of observations made, also to the acceptance, belief, and assertion of certain statements [such as internal existence claims] . . . (Carnap (1956), p. 244)
Yet, we deny the associated “pragmatic” claim that this requires us to be “tolerant” of all languages, that is, we deny his conclusion that . . . the thesis of the “reality of the thing world” cannot be among these statements, because it cannot be formulated in the “thing language” or, it seems, in any other theoretical language. (Carnap (1956), p. 244)
Category theory can act as the other theoretical language16 because it permits us to talk about categories, without our having to claim that it is about cat-
egories, that is, without our having to claim that category theory is either a “thing language” or that Cat is a “thing world”. Thus, we can safely be internal semantic realists (realists on the basis of what we can say as warranted by the truth-conditions of the statements that are made from within a linguistic framework) without having to make reference to categories-as-objects that exist in some meta-linguistic structure, such as the category of all categories. For Cat is the schema for our talk of cat-structured systems — it is neither a structured system, nor is it a structure. Speaking metaphorically, then, and to answer Lawvere (1966), Shapiro (1997) and Mayberry (1994), categories do not swim about in a foundational sea of either categories, structures or sets waiting to be rescued either by Cat, structure theory or by naive set-theory. So far I have spoken to Carnap’s “pragmatic” worry, that an external language for linguistic frameworks is untenable, by showing that our talk about the mathematical structure of cat-structured systems requires no reference to categories as extra-linguistic objects. I now turn to address Quine’s “semantic” concern that a purely linguistic conception of structure, as framed by the schemata of a linguistically presented system (i.e., as inhering in the truth conditions of a given kind of mathematically structured system), cannot include a meta-theoretic analysis of the “logical form” of these schemata themselves and so cannot be used to distinguish between those cases in which meaning is determined a priori, or logically, from those in which it is determined a posteriori, or linguistically. And that as a consequence, Carnap’s pragmatism cannot avoid falling victim to Quine’s (1951) “more thorough pragmatism”. 
In response, I note that17 when it comes to the meta-theoretic analysis of the “logical form” of the schemata of structured systems, i.e., when we come to discuss logical structure of structured systems, such talk can be interpreted using the tools of categorical logic. Indeed, we may say that categorical logic provides the calculus for mathematical structure and the language for logical structure because it allows us to talk about the “logical form” of structured systems as something superadded to the content of mathematically structured systems and as inhering in the content of logically structured systems. For example, it allows us to talk about the specific, proof- theoretic, structure of any mathematically or logically structured system qua axiom system by considering formulas as ‘objects’ and rules of inference as ‘morphisms’. (See Lambek and Scott (1989)) Moreover, we can also consider the general logical structure of any kind of structured system, itself, as a kind of category, and thereby use the resources of categorical logic to provide a “linguistic framework” for our talk of the general logical structure of any mathematically or logically structured system. For example, we can make use of a species of category, called a ‘topos’ wherein “an object X of a topos will be viewed as a type, or sort, or species, or generalized set, or class of things — the X’s.” (Awodey (1996), p. 224) What this means is that we do not have to give up “truth in virtue of meaning” in favor of superadding a meta-linguistic (structure-theoretic or set-theoretic) account of the “essence” of mathematical or logical schemata
because the “logical form” of the schemata of structured systems can be linguistically presented, i.e., presented in the language of category theory, as both superadded to the content of mathematically structured systems and as inhering in the content of logically structured systems. In this way, a topos provides the schema for what we say about the logical structure of the various kinds of structured systems. And so, to answer both Quine and the later Wittgenstein, if we take the structuralist stance that “mathematics deals with objects and systems as positions in structured systems” and agree with Carnap that “structure yields objectivity”, then we have answered how, without invoking Kantian intuition as the form of mathematical construction, Fregean logical form as the content of pure thought, or Russellian logical objects as atoms of meaning, we can, when we talk about the mathematical and logical structure of mathematical objects and systems, both make sense and speak with objective significance.
Notes
1. Dummett (1994) gives a similar account of the semantic tradition which he terms “the linguistic turn”. However, Dummett uses the linguistic turn to characterize analytic philosophy in general, whereas Coffa means for the semantic tradition to be used to distinguish semantic philosophers from both Kantian philosophers and logical positivists.
2. The “psychologistic interpretation” of Kant can be characterized as the view that the categories, and the forms of space and time, are aspects of our psychological make-up that arise from our physical constitution. That is, while the forms of space and time arise from the constitution of our sensory faculty, the categories arise from that of our intellect. In either case, they are the psychological (as opposed to logical and/or transcendental) conditions for the possibility of the objective representation of our subjective knowledge. The “logical interpretation”, on the other hand, sees the categories, and the forms of space and time, as logical requirements for knowledge. They are not grounded in the way we are constituted, but are taken to be features of our epistemic situation, that is, conditions or rules that have to be adhered to if we are to gain knowledge. That is, they are the logical and/or transcendental conditions for the possibility of the objective presentation of our subjective knowledge. (See Friedman (1994) for a reading of Kant along this logical theme.)
3. See Kant, CPR, especially A6 and A7.
4. See Frege (1884), especially §70–§78.
5. For an explicit definition of Basic Law V, see Frege (1903), p. 72.
6. See Wittgenstein (1921), p. 5.
7. For a more detailed account of Quine’s semantic holism and its implications for Carnap’s position see his “Two Dogmas of Empiricism” in Hart (1996), pp. 31–51. In particular, note his claim that “Carnap, Lewis, and others take a pragmatic stand on the question of choosing between language forms, scientific frameworks; but their pragmatism leaves off at the imagined boundary between the analytic and the synthetic. In repudiating such a boundary, I espouse a more thorough pragmatism. Each man is given a scientific heritage plus a continuing barrage of sensory stimulation; and the considerations which guide him in warping his scientific heritage to fit his continuing sensory promptings are, where rational, pragmatic.” (Quine in Hart (1996), p. 51) In this light, I hope to show that, when talking about both mathematical and logical structure, one does not have to take a pragmatic stand on the question of choosing between linguistic frameworks. One can objectively present mathematical systems as ‘objects’ of linguistic frameworks, without having to rely on an imagined boundary between the analytic and the synthetic, and without having to rely on the webbed-belief that pragmatic concerns outweigh linguistic analyses.
8. See Frege (1882c).
9. For a thorough and insightful analysis of Carnap’s belief that structure yields objectivity, as such belief is intended and used in the Aufbau, see Richardson (1998), especially Chapter 2, Sec. 2.
10. The Neo-Kantian position may best be understood as advancing the “logical interpretation” of Kant. See footnote 2.
11. Definition: A specific category C is a two-sorted system, the sorts being called objects of C, denoted by X, Y, . . . , and morphisms of C, denoted by f, g, . . . , such that
i) Each morphism f has an object X as domain and an object Y as codomain, indicated by writing f : X → Y.
ii) If g is any morphism g : Y → Z whose domain Y is the codomain of f, there is a morphism k = g ∘ f : X → Z called the composition of f and g.
iii) For each object X there is a morphism 1_X : X → X called the identity morphism of X.
iv) The following axioms are satisfied for all morphisms f : X → Y, g : Y → Z, h : Z → W: a) identity: f ∘ 1_X = f and 1_Y ∘ f = f; b) associativity: h ∘ (g ∘ f) = (h ∘ g) ∘ f.
Examples: Set (sets as objects, functions as morphisms), Vect (vector spaces as objects, linear maps as morphisms), Group (groups as objects, homomorphisms as morphisms), Top (topological spaces as objects, continuous functions as morphisms).
12. One question that may arise at this point is: Even if we accept that category theory provides us with the language of mathematics, that it allows us to talk about the structure of mathematical objects and systems, how is mathematical content to be legitimated? The answer to this question is provided by the following observation: if we take the meaning of a mathematical object or statement to be fixed from within an interpretation of a specific structured system, an interpretation of, say, set theory or group theory, then we get all the meaning that we want or need. Or, as Bell puts it: this [category-theoretic approach] does not necessarily require that we espouse the extreme formalist view [of superadding content to form] . . . No, I think the answer lies in recognizing that the meaning (or reference) of these concepts is determined only relative . . . to the local frameworks of interpretation. (Bell (1986), p. 4) The content of mathematical discourse, the sense of a mathematical concept, is not determined by its role within a specific category. Rather, the sense of a mathematical concept is determined by its definition within an interpreted theory. This is what is meant by the claim that “under the local interpretation a mathematical concept possessing a fixed sense now inevitably has a variable reference”. (Bell (1986), p. 4)
13. Definition: A functor F : A → B is a ‘morphism’ from one category, A, to another, B, which preserves the cat-structure of a specific system, i.e., is a ‘structure-preserving’ interpretation of a specific category, A, into another, B, by taking objects to objects and morphisms to morphisms and respecting domains and codomains, composition and identity.
14. The forgetful functor from Group to Set takes a group to the set of its elements and a group homomorphism to the corresponding function.
15. Definition: A general category C can be thought of as an object in the category of categories Cat, whereby objects and morphisms within C are thought of as functors 1 → C and 2 → C respectively, where 1 is a terminal category and 2 is a category with exactly two global elements 0 : 1 → 2 and 1 : 1 → 2. This approach to defining what I have called ‘general categories’ begins with an axiom that says that categories and functors collectively form a category; that is, functors have domains and codomains, and compose and so on. (See McLarty (1995), p. 110) Thus, we have a connection between general and specific categories in the sense that “A category has objects and morphisms . . . ” becomes shorthand for “We begin by looking at the functors to a category from 1 and 2” (See McLarty (1995), p. 110).
16. Elsewhere (Landry (1999)) I provide a more explicit account of how my view of category theory as the language of mathematics can be seen as providing a framework for mathematical structuralism. In this sense, then, category theory is neither taken as an object-language nor as a meta-language. While propositions, duly interpreted from within some mathematical theory, are taken as providing what might be called the object-language of mathematics, category theory, insofar as it aligns itself with the structuralist belief that we must start our analysis with structured objects and systems and their morphology, ought to be seen as providing the “structure-language” of mathematics. Thus, the statements of an interpreted theory provide the object-language only by virtue of their talking about interpreted “positions in structured systems”.
17. For example, an Algebraic Theory may be characterized as a category so its models are functors. In general, categorical logic has the resources to treat theories T as categories. An interpretation of T in another theory T’ becomes a functor T → T’, so a model in terms of sets becomes a functor T → Set, a model in terms of vector spaces a functor T → Vect, and so on. (See McLarty (1995), p. 255).
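As an illustrative aside, the definitions in notes 11, 13 and 15 can be checked mechanically on very small finite examples. The following Python sketch is not part of the original argument. The names FiniteCategory and is_functor, the string labels for objects and morphisms, and the particular encodings of the categories 1 and 2 are all assumptions introduced only for illustration. Composition is written in the order of note 11, so that the table entry for (g, f) stands for g ∘ f.

```python
from itertools import product


class FiniteCategory:
    """A finite 'specific category' in the sense of note 11 (illustrative encoding)."""

    def __init__(self, objects, morphisms, compose, identity):
        self.objects = set(objects)
        self.morphisms = dict(morphisms)  # name -> (domain, codomain)
        self.compose = dict(compose)      # (g, f) -> name of g ∘ f
        self.identity = dict(identity)    # object -> name of its identity morphism

    def composable(self, g, f):
        # g ∘ f is defined exactly when the codomain of f is the domain of g.
        return self.morphisms[f][1] == self.morphisms[g][0]

    def satisfies_axioms(self):
        m = self.morphisms
        # (i) every morphism has a domain and a codomain among the objects
        for dom, cod in m.values():
            if dom not in self.objects or cod not in self.objects:
                return False
        # (iii) every object X has an identity morphism X -> X
        for x in self.objects:
            if m.get(self.identity.get(x)) != (x, x):
                return False
        # (ii) every composable pair has a composite with the right domain and codomain
        for g, f in product(m, repeat=2):
            if self.composable(g, f):
                k = self.compose.get((g, f))
                if k is None or m[k] != (m[f][0], m[g][1]):
                    return False
        # (iv a) identity laws
        for f, (x, y) in m.items():
            if self.compose[(f, self.identity[x])] != f:
                return False
            if self.compose[(self.identity[y], f)] != f:
                return False
        # (iv b) associativity: h ∘ (g ∘ f) == (h ∘ g) ∘ f
        for h, g, f in product(m, repeat=3):
            if self.composable(g, f) and self.composable(h, g):
                left = self.compose[(h, self.compose[(g, f)])]
                right = self.compose[(self.compose[(h, g)], f)]
                if left != right:
                    return False
        return True


def is_functor(on_objects, on_morphisms, A, B):
    """Check that the two maps form a functor A -> B in the sense of note 13."""
    # respects domains and codomains
    for f, (x, y) in A.morphisms.items():
        if B.morphisms[on_morphisms[f]] != (on_objects[x], on_objects[y]):
            return False
    # respects identities
    for x in A.objects:
        if on_morphisms[A.identity[x]] != B.identity[on_objects[x]]:
            return False
    # respects composition
    for (g, f), k in A.compose.items():
        if on_morphisms[k] != B.compose[(on_morphisms[g], on_morphisms[f])]:
            return False
    return True


# Hypothetical encoding of the terminal category 1: one object, one identity.
ONE = FiniteCategory(
    objects={"*"},
    morphisms={"id*": ("*", "*")},
    compose={("id*", "id*"): "id*"},
    identity={"*": "id*"},
)

# Hypothetical encoding of the category 2: two objects and one non-identity arrow.
TWO = FiniteCategory(
    objects={"0", "1"},
    morphisms={"id0": ("0", "0"), "id1": ("1", "1"), "a": ("0", "1")},
    compose={
        ("id0", "id0"): "id0",
        ("id1", "id1"): "id1",
        ("a", "id0"): "a",
        ("id1", "a"): "a",
    },
    identity={"0": "id0", "1": "id1"},
)

# Search all candidate assignments for functors 1 -> 2.
functors_one_to_two = [
    (obj, mor)
    for obj in sorted(TWO.objects)
    for mor in sorted(TWO.morphisms)
    if is_functor({"*": obj}, {"id*": mor}, ONE, TWO)
]
```

Under these assumptions the search finds exactly two functors from 1 to 2, one picking out each object of 2, which is the sense in which note 15 treats the objects of a category C as functors 1 → C. The sketch says nothing about large categories such as Set or Cat; it only makes the axioms concrete for finite cases.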
References
Awodey, S. (1996), “Structure in Mathematics and Logic: A Categorical Perspective” in: Philosophia Mathematica, 3 (4), 209–237.
Bell, J.L. (1986), “From Absolute to Local Mathematics” in: Synthese 69, 409–426.
Carl, W. (1994), Frege’s Theory of Sense and Reference: Its Origins and Scope, Cambridge University Press.
Carnap, R. (1928), (Aufbau) The Logical Structure of the World, translated by R. A. George (1967), Routledge & Kegan Paul, London.
Carnap, R. (1934), The Logical Syntax of Language, translated by A. Smeaton (1937), Kegan Paul, London.
Carnap, R. (1956), “Empiricism, Semantics, and Ontology” in: Benacerraf, P. and H. Putnam (eds.) (1991), Philosophy of Mathematics, 2nd ed., Cambridge University Press.
Coffa, J.A. (1991), The Semantic Tradition from Kant to Carnap: To the Vienna Station, Cambridge University Press.
Dummett, M. (1984), Truth and Other Enigmas, Harvard University Press, Massachusetts.
Dummett, M. (1991), Frege. Philosophy of Mathematics, Harvard University Press, Massachusetts.
Dummett, M. (1994), Origins of Analytic Philosophy, Harvard University Press, Massachusetts.
Frege, G. (1879), Begriffsschrift, in Conceptual Notation and Related Articles, Bynum, T.W. (trans. and ed.) (1972), Oxford University Press, 101–203.
Frege, G. (1880/81), “Boole’s Logical Calculus and the Concept-script” in: Long, P., and R. White (eds.) (1979), Posthumous Writings, Blackwell, Oxford, 9–46.
Frege, G. (1882a), “On the Scientific Justification of a Conceptual Notation” in: Bynum, T. W. (trans. and ed.) (1972), Conceptual Notation and Related Articles, Oxford University Press, 83–89.
Frege, G. (1882b), “On the Aim of Conceptual Notation” in: Bynum, T. W. (trans. and ed.) (1972), Conceptual Notation and Related Articles, Oxford University Press, 90–100.
Frege, G. (1882c), “Boole’s Logical Formula-language and my Concept-script” in: Long, P., and R. White (eds.) (1979), Posthumous Writings, Blackwell, Oxford, 47–52.
Frege, G. (1884), (Die Grundlagen der Arithmetik) The Foundations of Arithmetic, Austin, J.L. (trans.) (1986), Northwestern University Press, Illinois.
Frege, G. (1885), “On Formal Theories of Arithmetic” in: McGuinness, B. (ed.) (1984), Collected Papers on Mathematics, Logic, and Philosophy, Blackwell, Oxford, 112–121.
Frege, G. (1893), The Basic Laws of Arithmetic, Furth, M. (trans. and ed.) (1967), University of California Press.
Frege, G. (1897), “Logic” in: Long, P., and R. White (eds.) (1979), Posthumous Writings, Blackwell, Oxford, 126–151.
Frege, G. (1906), “Reply to Mr. Thomae’s Holiday Causerie” in: McGuinness, B. (ed.) (1984), Collected Papers on Mathematics, Logic, and Philosophy, Blackwell, Oxford, 341–345.
Frege, G. (1924/25), “Numbers and Arithmetic” in: Long, P., and R. White (eds.) (1979), Posthumous Writings, Blackwell, Oxford, 275–277.
Friedman, M. (1994), Kant and the Exact Sciences, 2nd. ed., Harvard University Press, Massachusetts.
Goldfarb, W. (2001), “Frege’s Conception of Logic”, in: Floyd, J. and S. Shieh (eds.), Futures Past: Reflections on the History and Nature of Analytic Philosophy, Harvard University Press, Massachusetts.
Hart, W. D. (ed.) (1996), The Philosophy of Mathematics, Oxford University Press.
Kant, I. (1781 & 1787), The Critique of Pure Reason, N. Kemp Smith (trans.) (1992), The Macmillan Press Ltd., London.
Lambek, J. and P. J. Scott (1989), Introduction to Higher Order Categorical Logic, Cambridge University Press.
Landry, E. (1999), “Category Theory: The Language of Mathematics” in: Philosophy of Science 66 (Proceedings), 14–27.
Lawvere, F. W. (1966), “The Category of Categories as a Foundation of Mathematics” in: Proc. Conference Categorical Algebra, La Jolla 1965, Springer, New York, 1–20.
Maddy, P. (1992), Realism in Mathematics, Clarendon Press, Toronto.
Mayberry, J. (1994), “What is Required of a Foundation for Mathematics?” in: Philosophia Mathematica 3 (2), Special Issue, “Categories in the Foundations of Mathematics and Language”, Bell, J. L. (ed.), 16–35.
McLarty, C. (1993), “Numbers Can Be Just What They Have to Be” in: Noûs 27, 487–498.
McLarty, C. (1995), Elementary Categories, Elementary Toposes, Clarendon Press, Toronto.
Quine, W. V. (1951), “Two Dogmas of Empiricism” in: Hart, W. D. (ed.) (1996), The Philosophy of Mathematics, Oxford University Press, 31–51.
Resnik, M. D. (1980), Frege and the Philosophy of Mathematics, Cornell University Press, Ithaca.
Resnik, M. D. (1981), “Mathematics as a Science of Patterns: Ontology and Reference” in: Noûs 15, 529–550.
Resnik, M. D. (1996), “Structural Relativity” in: Philosophia Mathematica 3 (4), Special Issue, “Mathematical Structuralism”, Shapiro, S. (ed.), 83–99.
Richardson, A. (1998), Carnap’s Construction of the World, Cambridge University Press.
Russell, B. (1918), The Philosophy of Logical Atomism, Pears, D. (ed.), (1993), Open Court, Illinois.
Russell, B. and A. N. Whitehead (1925), Principia Mathematica, 2nd. ed., Cambridge University Press.
Shapiro, S. (1996), “Space, Number and Structure: A Tale of Two Debates”, in: Philosophia Mathematica, 3 (4), Special Issue, “Mathematical Structuralism”, Shapiro, S. (ed.), 148–173.
Shapiro, S. (1997), Philosophy of Mathematics: Structure and Ontology, Oxford University Press.
Tait, W. W. (1986), “Truth and Proof: The Platonism of Mathematics” in: Hart, W. D. (ed.) (1996), The Philosophy of Mathematics, Oxford University Press, 142–167.
Wittgenstein, L. (1921), Tractatus Logico-Philosophicus, Pears, D. F. and B. F. McGuinness (trans.) (1992), Routledge & Kegan Paul, London.
II
PHYSICAL ASPECTS
INTUITION AND COSMOLOGY: THE PUZZLE OF INCONGRUENT COUNTERPARTS
Brigitte Falkenburg
Universität Dortmund, Germany
Kant’s theory of intuition emerged from an intriguing puzzle concerning the mathematical foundations of his pre-Critical cosmology, the puzzle of incongruent counterparts. However, the puzzle does not reduce to a choice between space-time substantivalism or relationalism, as suggested in current philosophy of science. Kant brought intuition up since he thought that in the face of incongruent counterparts, neither a relationalist nor a substantivalist account of space was tenable. Indeed for Kant the puzzle of incongruent counterparts indicates limitations of what we now call the axiomatic method. In the following I will try to show that these limitations are due to irreducible non-extensional features of physical magnitudes. Kant’s 1768 point about incongruent counterparts has often been misunderstood. Most modern interpretations neglect the historical background, namely Kant’s pre-Critical attempt at laying the grounds for a system of metaphysics and cosmology in face of the famous Leibniz-Clarke debate. His puzzle concerned the question of how a spatio-temporal object can be individuated by means of spatio-temporal relations only, if the object is a possible world. Kant always believed in Leibniz’s famous invariance arguments against Newton’s absolute space and time. In his 1755/56 cosmology and metaphysics, he presupposed that space and time are relational. But he had already partially dispensed with Leibnizian metaphysics, namely with Leibnizian monads as distinguishing marks of individual objects. In 1768, he realized the existence of incongruent counterparts, i.e. of left-handed and right-handed objects which are mirror-symmetric to each other while they are mirror-asymmetric in themselves; and he argued that they are incompatible with a relational theory of space. So what could space and time be, if they were neither relational nor absolute? He cut the Gordian knot by assuming that space and time are not objective but subjective. 
His Critical philosophy suggests that they are neither
157 E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 157–180. © 2006 Springer. Printed in the Netherlands.
relations of things and events nor a vacuous background of all things in the world, but representations a priori, namely pure intuitions. Thus genetically, Kant’s theory of intuition cannot be separated from his 1768 paper on incongruent counterparts. But is this historical background related to any issue that might interest us today, armed with modern mathematics and in the context of modern physics? To answer this question, I will proceed in the following steps. First, I will analyse Kant’s 1768 puzzle of incongruent counterparts (1) in its neglected cosmological context and (2) I will show that it concerns the question of how to individuate mirror-symmetric possible worlds. (3) Next, I will investigate in detail how Kant resolves the puzzle in 1770 on the grounds of his Critical theory of space. (4) Finally, I discuss my results in the light of current physics and show that the puzzle of incongruent counterparts has epistemological aspects which cannot be resolved in purely formal terms (5).
1. The background: Kant’s early cosmology
The argument of incongruent counterparts has often been misunderstood to be a pure mathematical argument.1 In fact it belongs to what we now call physical geometry. In Kant’s words, it belongs to the “real use of reason” in cosmology as a part of metaphysics. The argument relates closely to Kant’s pre-Critical project of establishing a system of metaphysics by means of the analytic method. In this section, I will show that it stems from a cosmological re-interpretation of Leibniz’s project of an analysis situs. In a nutshell, the argument confronts its modern interpreters with the following puzzle: taken together with Kant’s pre-Critical (Wolffian) theory of ideas, it proves that a relational concept of space has no real possibility. From the point of view of Kant’s later theory of space and time as pure forms of intuition, however, it proves only that a relational view of space is counterintuitive. In both cases, no modern reader will fail to note its logical weakness. But in neglecting Kant’s own background, any logical criticism will miss the point. Let us start to unravel the story by explaining the background, Kant’s early cosmology and metaphysics. It is well known that they aim at reconciling the central metaphysical principles of Leibniz and Wolff (comprising the principle of sufficient reason), and a Newtonian view of nature (comprising the laws of Newtonian mechanics as well as atomism). It can be shown in detail that Kant’s three most important 1755/56 writings are conceived to establish a system of metaphysics in Wolff’s style.2 In the Nova Dilucidatio (1755), Kant analyses the domain of the principle of sufficient reason, and he derives the basic features of the fundamental entities underlying his ontology. They are monads interacting with each other, where the interactions are mediated by God’s omnipresence. 
In the Theory of the Heavens (1755), he explains the systematic constitution of the world (“systematische Verfassung des Weltbaus”) which is observed in the solar system and the Milky Way. He sketches the development of the solar system, of galaxies and of the universe as a whole. In addition, he
offers a physico-teleological proof of the existence of God. This proof is based on the hypothesis that the present order of the universe developed from chaotic initial conditions according to the laws of Newton’s mechanics. In the Monadologia Physica (1756), he argues on the grounds of a dynamic (Boscovich type) atomism that matter consists of point-like “physical monads” as its ultimate parts, even though space is infinitely divisible and matter is extended, i.e. spatial. Taken together, these three writings provide the principles of a metaphysica generalis et specialis comprising ontology, theology, psychology and cosmology. These principles should provide the foundations of a Wolffian system of metaphysics unifying Leibniz’s principle of sufficient reason with Newtonian ideas about the atomistic constitution of matter, about the mechanical laws which govern the celestial bodies, and about God’s omnipresence in the physical world. Even though Kant never exposed this system, his pre-Critical philosophy differs from 18th century eclecticism in its systematic, that is, non-eclectic approach. The Prize Essay published in 1764 deals with the adequate method of philosophical reasoning which belongs to the “real use of reason”. Here, Kant argues that philosophy should not rely on the synthetic method of mathematics because this method results in arbitrary definitions. Philosophy should only rely on the analytic method of Newtonian science. At that time, Kant is still convinced that the use of Newton’s analytic method results in adequate metaphysical concepts and principles that derive from our empirical ideas about the world. 
As an example of analytic reasoning, the Prize Essay repeats the argument of the Monadologia physica according to which bodies consist of indivisible substances.3 It can be shown, however, that Kant’s pre-Critical metaphysics does not in fact rely on a coherent methodology, but rather on several variants of the analytic method which are only loosely connected. (In his pre-Critical period, he was not really able to escape from eclecticism. And in 18th century philosophy, many variants of the analytic method were floating around.4) In particular, he combines something like conceptual analysis in the Cartesian sense and the analysis of the phenomena in the Newtonian sense. The latter is based on empirical phenomena and phenomenological laws (experimental and causal analysis), whereas the former is partially logical (analysis of concepts), partially psychological (analysis of our ideas). For the “real use of reason” in metaphysics, Kant matches both variants of the analytic method in a very specific way. The analytic method proposed in his Prize Essay is an analysis of our ideas which represent the external world.5 Around 1764, Kant still hoped that the analysis of our ideas of the external world would yield adequate real definitions of our metaphysical concepts. At this time as well as after his Critical turn, his goal was to make our knowledge of the world certain and coherent, by providing it with unique metaphysical foundations. For Kant’s pre-Critical project of integrating Newtonian science into a Wolffian system of metaphysics, the analysis of our ideas of space and time was crucial. The analysis of our idea of space had not only been one of the cornerstones of the proof of atomism he gave in the Monadologia physica. Indeed it was also the starting point of his 1768 writing Concerning the Ultimate Ground of the Differentiation of Directions in Space. But the argument on incongruent counterparts given there is the turning point of Kant’s philosophical development. In Kant’s view it demonstrated that the analytical method, when applied to our idea of space, does not give rise to adequate real definitions, but to the destruction of the foundations of his pre-Critical cosmology and metaphysics. Kant’s early attempts at providing the physics of his day with unique metaphysical foundations relied on a relational theory of space and time. The argument of incongruent counterparts, however, should convince the reader that a relational theory of space is untenable, and that we need some concept of an absolute space, be it in a Newtonian or in another sense. What had happened? The weak spot of the pre-Critical cosmology is the way in which Kant uses Leibniz’s principle of indiscernibles. Indeed he makes an incoherent use of the principle. In Leibniz’s own interpretation it says that two objects are identical if they have identical internal properties. This interpretation excludes the existence of two objects in the world which are only numerically different, such as atoms, or the parts of an empty space inside or outside the world. In his 1755/56 writings, Kant does not thoroughly accept this strict interpretation of the principle. He accepts its use in Leibniz’s refutation of absolute space and time, but he rejects its use in Leibniz’s refutation of atomism. Indeed Kant’s pre-Critical cosmology attempts to unify the following features of Leibniz’s and Newton’s views of space, time and matter. (1) From his very first writing, the True Estimation of Living Forces, Kant presupposes a relational theory of space and time.
In his writings up to 1756, this relationalism was tacitly based on a realistic interpretation of Leibniz’s view of space and time as orders of the coexistence and succession of the monads (or of their perceptions, respectively). Only in 1758 did Kant give an explicit argument for his relationalism, by claiming that the existence of an empty space or a vacuum within the world is absurd. The argument appealed indirectly to Leibniz’s invariance arguments against Clarke’s defence of Newton’s absolute space, which are indeed based on the principle of sufficient reason, or on the principle of indiscernibles as its consequence.6 At that time, Kant had probably not yet read the Leibniz-Clarke correspondence, but he did so in the 1760s. Indeed he supported Leibniz’s arguments against absolute space and time throughout his life.7 In 1755, in the Nova Dilucidatio, he supported a Leibnizian theory of space and time, though one re-interpreted in terms of really interacting entities. This re-interpretation stands in manifest opposition to Leibniz’s genuine relational theory, according to which space and time are relational phenomena which are well-founded by noumenal monads, i.e. relationless substances. The Nova Dilucidatio explains that space and time are the relations of coexistence and succession of physically interacting monads. (2) But in addition, the Nova Dilucidatio defends a Newtonian theory of individuation. Numerically identical objects such as crystals are observed in
nature. To ask for unobservable distinguishing marks behind the phenomena would mean “to search for knots in a bullrush”.8 Here, Kant criticizes Leibniz’s well-known interpretation of the principle of indiscernibles. He argues that this interpretation does not derive from the principle of sufficient reason but exceeds the range of the principle. He defends Clarke’s Newtonian way of reading the principle, namely that position is sufficient to individuate objects of the same kind (such as crystals, or atoms). Obviously, he cannot consistently defend his early atomism of the Monadologia physica while applying Leibniz’s version of the principle of indiscernibles to atoms. But how can both features be combined? How can relationalism and the existence of numerically identical objects within the world go together? Only if Kant’s relational theory of space and time is strong enough to explain the position of each physical monad on relationalist grounds, that is, from the spatio-temporal relations of all monads in the world alone. In short, the 1755/56 cosmology is only tenable if relationalism is sufficient to individuate objects. Unfortunately this becomes highly problematic if we deal with metaphysical objects such as possible worlds. In 1768, Kant detects a consistency problem in his pre-Critical metaphysics. In his view, the existence of incongruent counterparts proves that relationalism does not suffice to individuate objects on their own, by means of their internal relations only. Detailed analysis of the argument reveals the following. In 1768, Kant seems to draw three conclusions. (i) The 1755/56 theory of individuation is inadequate. (ii) The relational concept of space has no real possibility. (iii) A Leibnizian theory of space and time as relations of substances is (unfortunately) as untenable as Newton’s theory of absolute space and time. In 1770, he escaped the puzzle by arguing that the individuation of objects is based on space as pure intuition.
2. The puzzle of incongruent counterparts
Concerning the Ultimate Ground of the Differentiation of Directions in Space starts with a criticism of Leibniz’s analysis situs. This was an inner-mathematical project which Leibniz suggested but never completed. It aimed at establishing geometry by means of spatial relations alone. From a modern point of view, it aimed at giving geometry axiomatic foundations in terms of a few primitive relations. Kant’s criticism is harsh and polemical. His harsh tone has misled several modern interpreters to think that his point on incongruent counterparts is primarily mathematical. As we shall see, it is not. It belongs to the foundation of cosmology and deals with possible worlds. In Kant’s view, Leibniz’s program of giving conceptual foundations to geometry has to prove successful for physical geometry. His 1768 criticism of Leibniz’s analysis situs can only be understood in the light of his own pre-Critical project of establishing a system of metaphysics by means of the analytical method. Remember the Prize Essay: in 1762, Kant explicitly argued that the correct method of mathematics is synthetical but the correct method of metaphysics is analytical. According to his view of the distinction between mathematics and metaphysics, it is misconceived to restrict the program of an analysis situs to pure mathematics. For him, the analysis of the conceptual contents of our idea of space does not aim at constructing arbitrary mathematical concepts. It aims at the metaphysical foundations of a cosmology comprising Newtonian physics. Concerning space, Kant makes the following distinction. The geometrical concept of space is evident. According to Kant’s pre-Critical epistemology, it belongs to the logical use of reason and gives rise to “synthetic” mathematical definitions. The cosmological concept of space, however, belongs to the real use of reason. Since it is not evident at all whether our logico-mathematical concepts are adequate for the real use of reason, the cosmological concept of space needs an “analytic” metaphysical foundation. In his 1755/56 writings, Kant had used the analytic method only for establishing his atomistic view of matter and his Newtonian view of the structure, natural history and origin of the universe. In 1768, he apparently became aware that he had never made any effort to establish his relational concept of space in the same way. His methodological considerations in the Prize Essay should make it clear that for him an inner-mathematical project of analysis situs makes no sense. Kant’s 1768 argument deals with a cosmological problem! Correspondingly, after harshly criticizing Leibniz’s project of an analysis situs he confesses that he does not really know to what extent his own subject is related to this very Leibnizian project. Subsequently, he reinterprets the latter as follows in metaphysical terms:
But to judge by the meaning of the term, what I am seeking to determine philosophically here is the ultimate ground of the possibility of that of which Leibniz was intending to determine the magnitudes mathematically.9
According to this passage, his goal is to determine the “ultimate ground”, or sufficient reason, of the geometrical concepts which are subject to Leibniz’s analysis situs. In contradistinction to Leibniz, he wants to determine this ground philosophically, not mathematically. The “possibility” of geometrical concepts he deals with in philosophy can not be the logical possibility which he attributed to mathematical concepts in his pre-Critical philosophy. According to the Prize Essay, mathematical concepts are constructed according to the synthetic method. Their construction results in arbitrary combinations of concepts and in nominal definitions. The possibility of philosophical, i.e. metaphysical, concepts is real possibility. Kant’s task of 1768 is to give a real definition of the cosmological concept of space, which is the metaphysical foundation of the geometrical concept of space. His method for finding such a definition must be the analytic method, which the Prize Essay had declared to be the only adequate philosophical method. In the central passages of his 1768 writing he analyses our idea of space by considering the geometrical features of its phenomenological contents, namely of the geometric objects we know. His goal is to take them as a touchstone for
the question of whether Leibniz’s analysis situs is adequate for the real use of reason, that is, for physical geometry. His analysis reveals a spatial property of geometric objects which (in his view) does not derive from Leibniz’s relational account of geometry. Our idea of space embraces left-handed or right-handed objects such as hands and screws, which exhibit mirror-asymmetry. They have an incongruent counterpart, i.e. they are not congruent with their counterpart obtained by spatial reflection. In doing so they exhibit (in Kant’s view) an intrinsic property which is not relational in the sense of Leibniz’s analysis situs, namely their left- or right-handedness. In Kant’s view handedness can only be defined in terms of a relation to the “directions” in space, that is, to the distinctions of right and left, above and below, in front of and behind. These “directions” are defined in relation to “absolute space” in the sense of an abstract totality of extension,10 or, as we would say today, to a given coordinate system with a given left- or right-handed orientation. (In Kant’s account it remains unclear whether handedness is an intrinsic property of objects, or whether it is a relational property in the sense of the relation of an object to a surrounding background space.) What are the achievements of Leibniz’s original, mathematical analysis situs, and how do they fail in Kant’s view with regard to incongruent counterparts? Leibniz’s project aims at establishing geometry in an axiomatic approach which is based on a few primitive relations. According to his project, Euclidean geometry can be established by means of only two primitive relations: equality (“Gleichheit”) or length identity, and similarity (“Ähnlichkeit”) or angle identity. A third central concept of geometry, congruence, is not primitive but derived. It results from their combination.
Now Kant turns our attention to the fact that a right hand and a left hand are incongruent, even though they are identical in lengths and angles. Thus his analysis of our idea of space shows that for physical geometry we need identity under reflection, or mirror symmetry, as an additional primitive relation. That is, Leibniz’s project of an analysis situs leaves the axiomatic basis of geometry underdetermined. From a mathematical point of view, two remarks have to be made. First, Kant’s observation strengthens Euler’s mathematical criticism of a Leibnizian relational view of geometry. Euclidean 3-space not only has an affine structure which is not captured by Leibniz’s (merely topological) foundations of geometry (Euler 1748); it is also orientable (Kant 1768). Second, this defect of Leibniz’s relationalist project of an analysis situs might be cured by adding a third primitive relation, namely mirror symmetry. But this is not what Kant wants to show. His next step is non-mathematical, or metaphysical, and crucial. He argues that the orientation of a right-handed or a left-handed object is an absolute property which derives from absolute space:
What we are trying to demonstrate, then, is the following claim. The ground of the complete determination of a corporeal form does not depend simply on the relation and position of its parts to each other; it also depends on the reference of that physical form to universal absolute space, as it is conceived by the geometers.11
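The purely mathematical side of this observation, that a figure and its mirror image agree in all lengths (and hence in all angles) and yet differ in orientation, can be illustrated by a small numerical sketch. It is only an illustration under stated assumptions and not part of Kant’s or Leibniz’s own apparatus: the “hand” is replaced by a hypothetical irregular tetrahedron, and the names LEFT, RIGHT, reflect, rotate_z90, pairwise_distances and signed_volume are introduced here for the example. The sign of the signed volume plays the role of the orientation, which a rotation leaves unchanged and a reflection reverses.

```python
from itertools import combinations
from math import dist, isclose

# Hypothetical stand-in for a "hand": an irregular tetrahedron whose six
# edges all differ in length, so it has no mirror symmetry of its own.
LEFT = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]


def reflect(points):
    """Mirror image through the plane x = 0."""
    return [(-x, y, z) for x, y, z in points]


def rotate_z90(points):
    """Rigid rotation by 90 degrees about the z-axis."""
    return [(-y, x, z) for x, y, z in points]


def pairwise_distances(points):
    """The purely relational data: every distance between two vertices."""
    return [dist(p, q) for p, q in combinations(points, 2)]


def signed_volume(points):
    """Signed volume of the tetrahedron; its sign records the orientation."""
    a, b, c, d = points
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return det / 6.0


RIGHT = reflect(LEFT)

# The internal relations (all vertex-to-vertex distances) coincide ...
same_relations = all(
    isclose(d1, d2)
    for d1, d2 in zip(pairwise_distances(LEFT), pairwise_distances(RIGHT))
)

print("same internal distances:", same_relations)
# ... but the orientation differs in sign, and a rotation does not change it.
print("signed volume, original:", signed_volume(LEFT))
print("signed volume, mirrored:", signed_volume(RIGHT))
print("signed volume, rotated: ", signed_volume(rotate_z90(LEFT)))
```

The sketch only shows that the list of internal distances does not fix the orientation. Whether the orientation must therefore be referred to an absolute space is the metaphysical question at issue, and the computation is silent on it.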
His central argument is based on a thought experiment which runs as follows:
. . . imagine that the first created thing was a human hand. That human hand would have to be either a right hand or a left hand. The action of the creative cause in producing the one would have of necessity to be different from the action of the creative cause producing the counterpart. . . . However, there is no difference in the relations of the parts of the hand to each other, and that is so whether it be a right hand or a left hand; it would therefore follow that the hand would be completely indeterminate in respect of such a property. In other words, the hand would fit equally well on either side of the human body; but that is impossible.12
From a (modern) logical point of view, Kant’s conclusion looks like a non sequitur. God might create the rest of the world in accordance with the orientation of the single hand he created first. If he did so, he would end up either with our world or with the incongruent counterpart of our world. From Leibniz’s relationalist point of view, there is no internal difference between the two worlds. According to his principle of indiscernibles, both worlds are indeed not only physically equal but identical. This may be counterintuitive but it is obviously not logically impossible. Some modern interpreters accuse Kant of committing such a non sequitur. However, whoever does so misses the point of his argument, as it reads against its historical background (and that is, in 1768, for Kant himself). In 1768, Kant still uses a Leibniz-Wolffian logic and semantics. In addition, he deals with a cosmological, or rather metaphysical, problem. For him, “impossible” in the passage quoted above can only mean that the conclusion has no real possibility. Therefore I suggest reconstructing the logical structure of his argument roughly as follows:
Premise 1: According to our idea of space, a single hand imagined as a first piece of creation is necessarily a right or left hand. That is, in relation to our idea of space it has definite handedness.
Premise 2: According to relationalism, a hand on its own is undetermined with respect to the property of being a left or right hand. That is, it has no definite handedness.
Conclusion: From a relationalist point of view, a single hand created on its own can be arbitrarily embedded in our idea of space. That is, in a creation that goes on to produce our bodies, it should fit on both sides of the body, which (according to our idea of space) seems absurd. Therefore, relationalism has no real possibility.
The main point of this argument is: The spatial properties of incongruent counterparts show that relationalism is incompatible with our idea of space.
Our idea of space is not relational, it is absolute as far as it determines the orientation of hands or screws. This point is fatal for Kant’s pre-Critical cosmology. If we put together our relational concept of space and our idea of space there
results an incompatible criterion for the individuation of objects. A relational concept of space does not suffice to individuate the left respectively right hand, which we imagine in our idea of space. Note, however, that the objection only works if a hand on its own, which makes up the whole universe, is conceived as an object. Only under this assumption, i.e. if objects within the world and objects which make up a world have the same individuation criteria, it turns out that the pre-Critical cosmology is based on an inconsistent theory of individuation. (In the Critical period, i.e. in the CPR, Kant obviously no longer holds this assumption.) According to Kant’s pre-Critical criticism of an unrestricted use of Leibniz’s principle of indiscernibles, numerically different objects such as atoms or crystals are possible. They are individuated in terms of position alone, within a relational system of physical monads. A left hand or a right hand on its own (as the only object in respect of the universe), however, is not part of a bigger relational system. Therefore it can not be individuated in terms of position. The relational theory of space is not sufficient to individuate a single hand on its own which is the only object in the world or, from a relationalist point of view, which makes up a world. According to relationalism, the parts of this hand are not sufficiently individuated to distinguish them in any terms. According to their internal relations, a left and a right hand exhibit no differences at all. Considered as two possible worlds, they are indiscernibles. (According to Leibniz, God would not have created one rather than the other since he had no sufficient reason to distinguish them.) Indeed relationalism enables us only to distinguish a left hand relative to a right hand, and vice versa. 
In Kant’s view, however, a definite handedness, or orientation, is an internal or “absolute” property of an object, which can only be explained in terms of a relation to some kind of absolute background space of physical geometry. Obviously the absoluteness of this property is due to our idea of space, i.e. to the orientation of our internal spatial coordinate system. According to Kant’s 1770 theory of space as a pure intuition, this orientation is a priori, even though each one of us may have learned it (. . . not easily at all, as we remember . . .) as a child.13 In view of the argument of incongruent counterparts, on the one hand, and Leibniz’s well-established arguments against absolute space, on the other hand, Kant confronts a puzzle: both contemporary candidates for an “objective” account of space and time seem to be refuted. He raises a serious objection against his own former Leibnizian relational theory of space. But he cannot reject it in favour of a Newtonian absolute concept of space. He had good Leibnizian reasons for rejecting space-time substantivalism. He always believed that Leibniz’s objections against Newton’s concept of absolute space are substantial, and throughout his life he maintained the famous invariance arguments of the Leibniz-Clarke debate. The world and its translational “counterpart” shifted by some distance in space and time make no discernible difference, thus there is no such difference. Therefore to talk about the location of the world in space and time is meaningless.14 In fact Kant makes incoherent use of Leibniz’s principle of indiscernibles. He subscribes to it in accepting the famous invariance arguments of the Leibniz-Clarke debate, according to which it makes no sense to ask for the location of the universe in space and time. But he insists that we should ask for the absolute orientation of a single hand which makes up a possible world. That is, he applies the principle of indiscernibles to continuous spatio-temporal symmetries, but he refuses to apply it in the same way to a discrete symmetry such as what we now call parity. As his early criticism of Leibniz’s use of the principle against atomism shows, he treats mirroring and permutations with the same intuitive approach.15
3. Solution of the puzzle: Space as intuition
After two years of wild attempts at finding a solution,16 Kant resolves the puzzle by making the distinction between concepts and intuitions. The 1770 Dissertation lays the foundations for the Critical theory of space and time as pure intuitions. Concepts and intuitions are two different kinds of representations. Concepts are general representations, whereas intuition is a singular representation. Concepts contain things under themselves, whereas space and time embrace all things within themselves.17 Space and time are singular representations a priori. In this way Kant replaces his pre-Critical “objective” account of space and time as relations of substances by the “subjective” theory of space and time as pure intuitions. If our idea of space is a pure intuition rather than a concept, the argument of incongruent counterparts is no longer fatal for cosmology. Intuition enables us to imagine a possible world which consists of a single hand only, and intuition lets us draw the conclusion that it is “impossible” to create a single hand that “would fit equally well on either side of the human body”. Obviously here we deal neither with logical possibility, nor with real possibility in the sense of asking for a real definition of physical space. Indeed the distinction of concepts and intuitions led to two crucial semantic shifts in Kant’s account of real possibility. The first can be found in the 1770 dissertation, when Kant still believed that unrestricted metaphysical knowledge, i.e. pure intellectual knowledge of the world, is possible. The second is found in the Critique of Pure Reason (CPR). In 1781, Kant denies the possibility of any cosmological knowledge that exceeds the limitations of possible experience. Before 1770, within the framework of a Leibniz-Wolffian doctrine of ideas, Kant had a Leibnizian concept of real possibility, according to which a concept has real possibility if it has a non-contradictory real definition. 
Leibniz’s own account of real possibility was indeed stronger. For Leibniz, real possibility means the compossibility of substances. The complete concepts of all things in the world must be compatible with each other. This requirement is similar to semantic consistency in a modern sense. For Kant, who never believed in Leibnizian monads, a weaker condition remained: the real definition of a concept must not give rise to contradiction. (Here indeed real possibility means no more than logical possibility plus adequacy of a concept.) In 1762, Kant thought that the analytic method gives rise to non-contradictory real definitions of metaphysical concepts. In 1768, he saw that the analytic method, or more precisely the analysis of our idea of space plus Leibniz’s arguments against Newton’s absolute space, gave rise to a contradictory concept of space. According to Leibniz’s invariance arguments, space is not absolute. According to the argument of incongruent counterparts, it is. In terms of real possibility, the immediate conclusion of Kant’s 1768 puzzle is: the non-coincidence of incongruent counterparts shows that a relational concept of space has no real possibility. No concept of space has real possibility. (Only space as a pure intuition has real possibility. This is the truth that dawned on him in 1769, according to the famous Reflexion 5037.18) But what, then, does “real possibility” mean? The Critical answer (of 1781!) to this question is: Only concepts which are representable in pure intuition have real possibility (or, respectively, objective reality). In 1768, Kant was not yet able to make a clear distinction between two kinds of representations, namely concepts and intuitions. After his 1770 dissertation, he was! Without that distinction, he inevitably confused two semantic requirements: (i) the intuitive requirement that a concept should be representable in space and time; and (ii) the logico-conceptual requirement that a concept should have real possibility in the sense of admitting of a non-contradictory real definition. In 1770, however, Kant was not yet at the point of identifying real possibility with representability in space and time.
In the opening passages of the 1770 dissertation, he mentioned in a critical tone that “unrepresentable and impossible are commonly treated as having the same meaning”, giving rise to a misleading concept of the mathematical infinite.19 At that time, he still believed that cosmology in a traditional, Wolffian style could go together with his new theory of space and time. He thought that symbolic intellectual knowledge of an infinite world was possible, even though such a concept is not representable in our idea of space. In the 1770 dissertation he emphasized that there is no intuitive but only symbolic knowledge of intellectual things.20 He still thought that the “real use of reason” in metaphysics gave rise to a symbolic concept of the world as a sum total of all substances. Now he draws a sharp distinction between the real possibility of symbolic metaphysical concepts on the one hand, and the representability of spatio-temporal concepts in intuition, on the other hand. With this distinction between spatio-temporal representability and real possibility in mind, we see that the conclusion of the 1768 argument about incongruent counterparts has to be re-written as follows:
. . . it would therefore follow that the hand would be completely indeterminate in respect of such a property. In other words, the hand would fit equally well on either side of the human body; but that is impossible to imagine.
Here, “imagine” is to be understood in the sense of our idea of space. In the light of Kant’s 1770 distinction between real possibility and spatio-temporal representability, the 1768 argument confuses “(really) impossible” and “unimaginable in our idea of space”. Eleven years later, however, Kant’s concept of real possibility is no longer distinct from spatio-temporal representability. In the CPR, it is indeed bound to intuition. To be more exact: in contradistinction to the 1768 argument about incongruent counterparts it is no longer tacitly, but explicitly bound to the condition of being representable in space and time. This is the second, crucial semantic shift in his account of real possibility. On the basis of Kant’s theory of intuition, be it in 1770 or in 1781 and later, his pre-Critical thought experiment about the creation of a single hand (or the orientation of a possible world) is no longer a puzzle. It reveals the non-conceptual character of space as a pure intuition. The internal difference of incongruent counterparts cannot be depicted by concepts. It indicates that space is an intuition. In the 1770 dissertation, incongruent counterparts reappear as distinct singular representations which indicate that space is a pure intuition. Incongruent counterparts are “perfectly similar and equal” but they do not coincide.21 That is, they fall exactly under the same (relational) concepts, even though they exhibit a difference. The difference between a left hand and a right hand cannot be expressed by means of concepts, it can only be intuitively grasped: between solid bodies which are perfectly similar and equal but incongruent, such as left and right hands . . . there is a difference, in virtue of which it is impossible that the limits of their extensions should coincide . . . It is, therefore, clear that in these cases the difference, namely the incongruity, can only be apprehended by a certain pure intuition.22
It has been argued that this conclusion is a non sequitur.23 Indeed it is true that the mere incongruity of left and right hands is only relational. A left hand is not congruent to a right hand, and vice versa. From a relationalist point of view, the left hand and the right hand are intrinsically identical (see premise 2 of his central argument in sect. 2). For Kant, however, they are not, since according to our idea of space their difference is evident (see premise 1 of his argument in sect. 2). Kant’s point about the difference of left and right hands is indeed stronger than the relation of incongruity he mentions. (In the passage quoted above, incongruity comes into play “in virtue” of the difference, and not vice versa.) We cannot intellectually grasp the intrinsic property of being a left respectively a right hand. That is, without intuition we cannot definitely say which one we’re dealing with, if we conceive of a single hand; but with intuition, we can, as required. In Kant’s view this property, which we intuitively grasp, is absolute (in the sense of being defined in relation to an absolute intuitive background space). The difference between left-handedness and right-handedness is as irreducible to a mere relation between objects, as is the direction of the arrow of time, respectively, the temporal difference of past
events and future events.24 The same is the case for empirical qualities, say, the properties of being green or red. Their qualitative difference is irreducible.25 To see what kind of difference Kant has in mind, let us compare the case of a left hand and a right hand with the case of a green hand and a red hand. If the difference of left and right hands is reduced to the relation of being incongruent, then the difference of green and red hands is reduced to the bare numerical relation of the wavelengths of the light they reflect. Any strong physicalist who claims such reducibility obviously dispenses with the specific marks of the qualia of our perception. At first glance, this involves only replacing the secondary qualities of our perception by the primary qualities of natural kinds, from a physicalist point of view. However, no physicalist can claim that the differences of empirical properties are reduced to mathematical relations. Physicalism means reducing qualia to physical properties, that is, to primary qualities of natural kinds. Different colours correspond to different wavelengths, and these correspond to light quanta of different energies. The physical difference between light quanta of different energies is mathematically described in terms of a numerical relation but obviously it is not reduced to such a relation. Light quanta of different energies have different causal powers. (For example, the photo effect only works if there are light quanta of sufficient high energy, respectively if there is light of a sufficiently small wavelength.) Physical properties and their causal differences are more than bare mathematical relations. They are dynamical. They have irreducible intensional aspects which are not grasped by the axiomatic method. We cannot intellectually grasp them (and their qualitative differences) completely. 
An analogous observation concerning the spatial properties of being left-handed or right-handed is Kant’s point in the 1770 dissertation. Qualia are grasped in intuition. The physical properties that give rise to qualia such as visual shape, weight, colour, warmness etc. can be measured. In physics, they are expressed in terms of irreducible physical dimensions such as length, duration, mass or temperature. Kant’s point in 1770 is that left-handedness and right-handedness are something like formal qualia of our perception, which represent irreducible intrinsic spatial properties of objects. In the case of left hands and right hands, the qualitative difference is grasped by pure intuition. The difference between green and red hands is empirical. In both cases, however, we deal with intrinsic properties of objects even though we grasp them only in relation to our cognitive capacities. Kant’s Critical turn teaches that our cognitive capacities do not enable us to grasp the intrinsic properties of things in themselves. In pure intuition, we grasp phenomena rather than noumena, that is, secondary qualities rather than primary qualities of the substances in nature. For Kant, space as a pure form of intuition is our cognitive capacity to individuate spatial objects such as left and right hands. This cognitive capacity is not ontologically but epistemically absolute. It is an epistemic successor of Newton’s ontological concept of absolute space. The intrinsic property of hands and screws to be left-handed or
right-handed is defined in relation to our intuition. Our idea of space, namely spatial intuition, is a cognitive absolute background space of physical geometry. In addition to this epistemically absolute intuition of space, Kant maintains a Leibnizian relational concept of space.26 This solution cuts the Gordian knot of the puzzle of incongruent counterparts. It unifies a Newtonian and a Leibnizian view of space and time at the price of a subjective account of both views. Space and time are secondary rather than primary qualities of things. Indeed this insight led not only to the two substantial semantic shifts in Kant’s account of real possibility mentioned above. It changed his whole approach to metaphysical problems. Kant realized that with space and time as pure intuitions, metaphysical realism is no longer tenable. In the years after 1770, he realised that his criticism of the traditional spatio-temporal foundations of cosmology also affects the rest of cosmology, i.e. the theory of the world and the substances which constitute the world. In the CPR, he criticized any metaphysics which is not grounded in transcendental philosophy. According to the CPR, traditional cosmology gives rise to the antinomy of pure reason.27 Indeed the recent discussion of handedness and incongruent counterparts falls back into positions which Kant had defended only before his Critical turn. Modern philosophy of science focuses on the question of whether handedness commits us to substantivalism or whether it is compatible with relationalism.28 In Kant’s 1768 view, however, the problem is that neither relationalism nor substantivalism of space are tenable. In his view, the puzzle of incongruent counterparts commits us to dispense completely with metaphysical realism. This conclusion may be too strong. Indeed from his Critical position of an empirical realism, there is no longer a puzzle. But this proves only that transcendental idealism is sufficient for resolving the puzzle. 
It may not be a necessary condition for doing so. A weaker conclusion is that incongruent counterparts commit us to admitting irreducible non-extensional aspects of physical properties. As I want to show in the following, this conclusion remains valid for modern re-formulations of the problem.
4. Indiscernible worlds in modern physics
Let us recapitulate. In Kant’s 1768 view, incongruent counterparts show that a purely relational account of space has no real possibility. At that time, “real possibility” means for him that a real definition of a metaphysical concept can be given, that is, a definition which is non-contradictory and adequate. From his later, Critical point of view the 1768 argument leads to quite another result. After 1770, it indicates for him that there are two kinds of representations, namely concepts and intuitions. The distinction between a right hand and a left hand is not ontic and relational, but epistemic and absolute. Handedness is an intrinsic property of an object, respectively of the way in which it is represented in our idea of space. This property as well as the difference between a left-handed and a right-handed object cannot be expressed by concepts but can only be apprehended by intuition. The mere relation of being incongruent neglects this property. Our idea of space, which is an intuition rather than a concept, remains underdetermined from a relationalist point of view.
In the last section I argued that Kant’s substantial issue is the intensional aspect of physical properties. Physical properties are intrinsic to objects. Their differences cannot be reduced to mere mathematical relations, either in the case of green and red objects, or in the case of left and right hands, or in the case of past and future events. Neither Kant’s puzzle of 1768 nor its resolution in 1770 can be understood if this point is neglected.
But how are we to understand the puzzle in view of modern physics? Does there remain any puzzle? To make this point clear, let us look again at Kant’s incoherent ways of using Leibniz’s principle of indiscernibles. Then, let us ask what we would make out of Leibniz’s principle in these cases, in view of modern physics. Kant uses Leibniz’s principle in three ways, or with regard to three kinds of symmetries. These uses do not fit together.
1. Translation symmetry: When applied to spatio-temporal translations of the world, the principle says: given that the world is invariant under translations in space and time, it makes no sense to ask for the location of the world in space and time. That is, space and time are not absolute but relational. Regarding this application, Kant follows Leibniz in his pre-Critical as well as in his Critical periods. There is no absolute space (except as an abstract idea of a logical totality of all empirical frames or coordinate systems). It makes no sense to ask for the position of the world in space and time, since the world is not the kind of entity that can have a position. There are only positions within space and time.
2. Mirror symmetry: When applied to two possible worlds which are incongruent counterparts of each other, Leibniz’s principle tells us, as in the case of translation symmetry, that it makes no sense to ask for a difference between them. Leibniz never explicitly makes this application. However, in his third letter to Clarke he argues that if space is absolute, there is no sufficient reason why God should have placed the bodies in the world in a preferred way, and not exchanged everything from east to west.29 Thus he argues that space is not absolute because the world is invariant under exchange of east and west. However, the question of whether “exchange of east and west” means rotation or mirroring of the world remains open. Rotation and translation are continuous transformations which belong to the same invariance group. Applied to the world, both transformations have the same physical meaning: they make no difference, they result in one-and-the-same world. But if Leibniz means invariance of the world under mirroring, he claims that the world is identical with its incongruent counterpart. Kant is not willing to draw this conclusion, as his 1768 argument shows.
3. Permutation symmetry: When applied to hypothetical objects (such as atoms) with identical intrinsic properties, the principle says that there are no
such objects in the world. According to Leibniz, God has no sufficient reason for creating intrinsically identical objects in different places. Since they can be arbitrarily permuted, it makes no difference to arrange them in any order, and thus there are no indiscernibles in nature.30 Regarding this application, Kant never followed Leibniz, either in his pre-Critical or in his Critical period.31
According to Leibniz’s principle of indiscernibles, two objects within the world as well as two possible worlds are identical if and only if they show no intrinsic dynamic differences. According to Kant’s incoherent use of the principle, (1) two possible worlds related by a continuous transformation such as spatial or temporal translation are identical: space and time cannot be absolute in a Newtonian sense; (2) two possible worlds related by a discrete transformation such as mirroring or time reversal are different: space and time cannot be merely relational in a Leibnizian sense; (3) two objects within the world may be intrinsically identical but numerically different: atoms are possible. Their permutation is a discrete transformation; it makes a difference. (Here, all modalities have again to be understood in the sense of real possibility.) Obviously Kant has different intuitions of continuous and discrete symmetry transformations. In his view a translation or rotation of the world makes no difference, whereas mirroring of the world or a permutation of intrinsically identical objects results in a different world, respectively, state of the world.
From a modern axiomatic point of view, we have to look at these uses of Leibniz’s principle in terms of symmetries and their physical meaning. The first two cases concern the symmetries of space-time theories, the third case concerns the symmetries of physical systems such as crystals or atoms. In modern physical cosmology, the symmetries of space-time theories come together with the gauge symmetries of particle physics.
With regard to both kinds of symmetries we have to ask: is there any intrinsic difference between our world and a possible world resulting from an active symmetry transformation?32 If NO, Leibniz is right, that is, such a possible world is identical with ours. If YES, we may further ask whether the difference can be intellectually grasped in formal terms and put on axiomatic grounds within modern physics. If this is the case, there is no puzzle. Both worlds exhibit different formal structures, and Leibniz’s principle does not apply to them for that reason. But if this is not the case, something is strange and unclear from the axiomatic point of view. This is the case I am after. Is it possible to find any intrinsic difference between two possible worlds related by a symmetry transformation, a difference which we can not intellectually grasp? If we find something like that, we have some modern correlate of Kant’s puzzle of incongruent counterparts.
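The contrast between continuous and discrete transformations drawn above can be illustrated by a small numerical sketch. It is only an illustration under stated assumptions, not part of the argument: the "hand" is a hypothetical tetrahedron of four points, and the helper names (`signed_volume`, `rotate_z`, `translate`, `mirror_x`) are invented here. The signed volume is unchanged by a rotation followed by a translation, but changes sign under mirroring. Note that this is a purely formal bookkeeping device; which sign is called "left" remains a convention.

```python
import math

def signed_volume(a, b, c, d):
    """Six times the signed volume of the tetrahedron with vertices a, b, c, d."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    # Scalar triple product u . (v x w), written out as a 3x3 determinant.
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def rotate_z(p, angle):
    """Rotate point p about the z-axis (a continuous transformation)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)

def translate(p, shift):
    """Shift point p by a fixed vector (a continuous transformation)."""
    return tuple(p[i] + shift[i] for i in range(3))

def mirror_x(p):
    """Reflect point p in the y-z plane (a discrete transformation)."""
    x, y, z = p
    return (-x, y, z)

# Hypothetical asymmetric "hand": four points with unequal edge lengths.
hand = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
moved = [translate(rotate_z(p, 0.7), (5, -1, 2)) for p in hand]
mirrored = [mirror_x(p) for p in hand]

print(signed_volume(*hand))      # reference orientation
print(signed_volume(*moved))     # same sign after rotation and translation
print(signed_volume(*mirrored))  # opposite sign after mirroring
```

The sign of the determinant distinguishes the two figures only relative to each other, which is exactly the relational information that, on Kant's view, leaves the intuitive difference between left and right undetermined.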
Let me start with the continuous space-time symmetries. Physical geometry is characterised in terms of the Galilei group of non-relativistic mechanics, the Poincaré group of special relativity, or general covariance plus Einstein’s field equations of general relativity. Two possible worlds related by a spatio-temporal symmetry transformation belong to the same class of equivalent frames. In non-relativistic mechanics and special relativity, all inertial frames are equivalent. They have the same physical meaning, that is, they give rise to the same kind of dynamic effects, respectively, physical phenomena. Is there any intrinsic difference between our world and a possible world resulting from a Poincaré transformation? The Leibnizian answer is NO, any possible world resulting from an active Poincaré transformation of our world is identical with ours. The principle of indiscernibles tells us that Poincaré transformations have no physical meaning (at least as long as we exclude mirroring and time reversal).
But is this really the case? Indeed a class of inertial frames, say, Minkowski space, is not yet a world. Remember the twin paradox. To resolve it, we need some preferred inertial system with a preferred time arrow, light cone, causal structure and history, where the travellers meet again for comparing their relative times, their clocks, and their private causal stories.33 Neither Minkowski space nor Leibniz’s principle tells us which frame is a preferred one. The only restrictions are: (i) it is acceleration free, and (ii) it is a frame where the travellers are able to exchange repeated signals for synchronizing their clocks. (I leave it open here whether it is possible to construct such a world within Minkowski space alone, in terms of several observers travelling around in inertial frames, meeting each other and synchronizing their clocks. In the case of a travelling twin who comes back to her sister, we obviously have to take into account acceleration effects and switch to some model of general relativity.34) In any case, an inertial frame in which the twins meet again, note their age difference and measure the time difference of their clocks, is intrinsically different from the rest frame of any observer who is moving outside the light cone of their meeting. In a certain sense, Minkowski space is an unphysical abstraction, and so is the claim that Leibniz’s principle holds. The twin paradox teaches, YES: two subsequent active Poincaré transformations of the travelling twin make a difference in time. (Within general relativity, however, the difference can be put on axiomatic grounds.)
Now, what about general relativity? General relativity gives us many possible worlds. Each one of them is a model of general relativity, it represents a solution of Einstein’s field equations. These models are obviously not physically equivalent. They differ substantially in their space-time structures. Poincaré invariance is locally valid in each one of them, but (except in a flat universe) it is no longer a global symmetry. Some possible worlds have a strange topology, some of these are even causally bizarre. The Gödel universe, for example, admits time travelling and violations of causality. Other possible worlds may correspond to cosmological models with a non-orientable space, or with unobservable higher dimensions. In a non-orientable space as well as in a space of
higher dimension, left-handed objects transform into right-handed objects by a sequence of translations and/or rotations. That is, a global discrete symmetry of Euclidean 3-space may become local and continuous by embedding Euclidean 3-space (respectively some part of it) into another space. In such a space, the distinction between a left hand and a right hand is no longer an absolute internal property which an object exhibits in relation to some absolute space. It becomes a local relational property which depends on the space-time region where the hand is and/or has passed through.35
Indeed Kant’s puzzle about a single hand which makes up a world is not affected by the modern discussion of exotic models of general relativity, in which left or right hands are embedded. Obviously, a left-handed or right-handed universe as such (that is, as a universe) is not embedded into some bigger space with a bizarre topology, in which handedness turns out to be a relational property.36 Odd topological models are unintended models of general relativity, or insane candidates for cosmological models. The only candidates for a genuine distinction between a left-handed or right-handed universe can be found in particle physics, in processes with parity and/or CP-violation.
Thus let us assume an unweird model of the cosmos in which the handedness of objects is a well-defined global property, in relation to the background metric of space-time. A corresponding physical property is the parity of elementary particles. Operationally, parity is a relational property which is attributed to particles according to the spatial distributions of their reactions and decays. If the distribution of a particle decay is mirror symmetric, the particle has parity zero; if it is mirror asymmetric, it has a positive or negative sign (even respectively odd parity).
Parity is a quantum number which has been attributed to elementary particles according to the conservation laws which have been observed in their reactions and decays since the early days of cosmic ray physics and the first accelerator experiments. Operationally, parity is only determined up to an arbitrary sign. As an absolute property, the parity of an elementary particle cannot be measured. Thus in a Leibnizian relational account of the physical world, it is only a relational property.
However, in modern particle physics parity is an internal dynamic property of elementary particles respectively quantum fields. In quantum field theory, different kinds of elementary particles (respectively field quanta) correspond to the solutions of different (quantized) field equations. The solutions of their field equations are classified according to values of mass, spin and parity. The classification of these solutions according to mass, spin and parity corresponds to the irreducible representations of the Poincaré group.37 Conceptually, or axiomatically, the physical particles of quantum field theory are nothing else than the irreducible representations of the Poincaré group, classified in terms of mass, spin and parity. This formal particle concept of modern particle physics has its empirical basis in the particle tracks and reactions which are observed in bubble chamber pictures, drift chamber signals etc. According to this formal concept and the underlying empirical observations, parity is an intrinsic property or an absolute distinguishing mark of elementary particles.
In 1956, it was first observed that parity is violated. The laws of nature are not invariant under a parity transformation P. Weak interactions such as the famous decay of $^{60}$Co exhibit a left-right-asymmetry. In addition, in the early 1960s it turned out that the laws of nature are also not invariant under a combined transformation of parity and charge conjugation. A charge conjugation C transforms a particle into its anti-particle, say an electron into a positron: $C e^- = e^+$. There are electroweak interactions such as the decay of neutral Kaons ($K^0$, $\bar{K}^0$) which exhibit an asymmetry under a CP transformation. The decays of these particles and the decays of the corresponding anti-particles of opposite parity exhibit different decay probabilities, respectively, life times.
Charge is like parity. It is an internal property of elementary particles, even though its operational meaning is only relational. (The absolute sign of a charge cannot be measured, it is a mere convention to attribute to electrons a negative charge, and to positrons a positive charge. But does an internal dynamic property such as charge reduce to our denotation?) The corresponding symmetry group, of which a certain charge picks out an irreducible representation, is the Lie group of a gauge invariant quantum dynamics. The crucial point here is: if you accept that charge is an intrinsic property of a particle, you have also to accept that parity is, because both are intimately related over the irreducible representations of the symmetry groups of particle physics. CP-violating processes violate both symmetries conjointly. A modern discussion of Leibniz’s principle and Kant’s incongruent counterparts has to take into account these symmetries of elementary particles.
Obviously, to a world with CP-violation Leibniz’s principle does not apply: The world and its counterpart resulting from a CP transformation are dynamically not identical. They differ intrinsically, the lifetimes of particles with CP-violating decays differ in both worlds.
However, according to current particle physics the laws of nature are invariant under a PCT-transformation. T is time reversal. Understood as an active symmetry transformation, T maps any physical process to an opposite process which runs backwards, that is, which begins at the end of the first one and vice versa. (Only reversible processes have a time-reversed counterpart in nature. The time-reversed counterpart of a falling stone is a stone rising up to my hand. Most processes in nature are irreversible, for example thermodynamic processes and quantum measurements.) According to the PCT-theorem the laws of nature are conserved under the combined symmetry transformations C (charge conjugation), P (parity transformation) and T (time reversal). The PCT-theorem says that any particle reaction has a counterpart in nature which happens to the corresponding anti-particles of opposite parity as a reaction of reversed time order. If the PCT-theorem is not violated, the universe and its PCT-transformed counterpart obey the same laws of nature. (If PCT is conserved, any CP-violating process has a time-reversed CP-counterpart in nature. This can be empirically tested. Up to now, no PCT-violating process has been found.)
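How the three discrete transformations act on a single particle can be sketched with a toy record. This is a deliberately crude classical bookkeeping device, not quantum field theory, and every name in it (`State`, `P`, `C`, `T`, `PCT`, `helicity`) is a hypothetical label introduced for illustration. It assumes the standard textbook behaviour: P reverses momentum but not spin, C reverses the sign of the charge, and T reverses both momentum and spin. The sketch says nothing about whether the laws of nature are invariant under these maps; it only shows what the maps do.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    """Toy particle record: charge sign, momentum direction, spin direction."""
    charge: int
    momentum: tuple
    spin: tuple

def neg(v):
    """Reverse a vector componentwise."""
    return tuple(-x for x in v)

def P(s):
    """Parity: momentum is reversed, spin (an axial vector) is not."""
    return State(s.charge, neg(s.momentum), s.spin)

def C(s):
    """Charge conjugation: particle becomes anti-particle."""
    return State(-s.charge, s.momentum, s.spin)

def T(s):
    """Time reversal: both momentum and spin are reversed."""
    return State(s.charge, neg(s.momentum), neg(s.spin))

def PCT(s):
    """Combined transformation (the order is immaterial in this toy model)."""
    return P(C(T(s)))

def helicity(s):
    """Sign of spin . momentum: -1 is 'left-handed', +1 is 'right-handed'."""
    dot = sum(a * b for a, b in zip(s.spin, s.momentum))
    return (dot > 0) - (dot < 0)

# Hypothetical left-handed electron moving along +z.
electron = State(charge=-1, momentum=(0, 0, 1), spin=(0, 0, -1))

print(helicity(electron), helicity(P(electron)))  # P flips the handedness
print(helicity(C(electron)), helicity(T(electron)))  # C and T do not
print(PCT(electron))  # right-handed record with opposite charge
print(PCT(PCT(electron)) == electron)  # applying PCT twice restores the record
```

In this bookkeeping the PCT-counterpart of a left-handed particle is a right-handed anti-particle, which is the sense in which the counterpart world is both "mirrored" and made of antimatter.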
5. Conclusions
Thus a modern correlate of a Kantian incongruent counterpart world reads as follows. Imagine the PCT-counterpart of our universe. In it, all particles are replaced by anti-particles; all processes are Kantian spatial incongruent counterparts of the processes in our universe; and the time order is reversed, i.e. (in a Big Bang cosmology) the universe contracts towards a Big Crunch. According to a strictly interpreted Leibnizian principle, both universes are not only physically equal but they are identical. Thus a modern Leibniz should claim that the world and its PCT-transformed incongruent antimatter-counterpart running backwards in time is identical with ours. This is highly counterintuitive. Even Leibniz would say that it simply makes no sense to ask for the difference between two such possible worlds. Somehow the discrete spatio-temporal symmetry transformations, namely mirroring and time reversal, seem to change the intrinsic properties of our world. Charge is an intrinsic property of objects and processes in nature, and so are parity and time order. Kant seems to be right: even if we disregard parity violation, it is completely counterintuitive to believe that our world is identical to its incongruent counterpart. It is as absurd as to think that our world is identical with a world in which the temporal order of all events is reversed. (Even Leibniz would not have believed that, in the case of time reversal. He simply did not think through all the consequences of a possible change of east and west in the world, including the generalisation of the non-effect of a rotation of the world to mirroring or time reversal. Surely he would not have admitted that a world in which all internal states of the monads develop in reversed time order is identical with ours.) In spite of these counterintuitive features of relationalism, logically a merely relational account of the discrete symmetries is possible. 
Handedness (or, today, the associated internal parity of elementary particles), time reversal, even a dynamical symmetry transformation such as charge conjugation can be expressed in combinatoric terms, i.e. in terms of permutations of space or time coordinates, or of matter and antimatter. There is no way of distinguishing the relation between physical objects in our world from their incongruent counterparts in a counterpart world. It has been argued that therefore relationalism is a natural ontological attitude towards the existence of such counterparts. Relationalism obeys ontological parsimony. There are no reasons to assume absolute (or non-extensional) physical properties (which cannot be intellectually grasped), except (perhaps) in our imagination.38
But is such an ontological attitude convincing? More than fifty years ago, in light of modern mathematics and physics, but before the discovery of parity violation, Weyl emphasized that physical geometry cannot be reduced to mathematical geometry.39 He insisted that this is also true for the difference of left and right. Weyl shows how in an n-dimensional affine space, the distinction of a left-handed and a right-handed coordinate system is reduced to the relative order of the basis vectors. An even-numbered permutation of the basis vectors conserves the orientation of the coordinate system, whereas an odd-numbered permutation changes the orientation, that is (in Euclidean 3-space), transforms a left-handed into a right-handed system and vice versa.40 In this way, the mathematician identifies the difference between a left-handed and a right-handed system with the combinatoric relational property of being related by an odd-numbered permutation. In Weyl’s view, however, the puzzle of left and right in nature does not reduce to this relational mathematical property:
The discrepancy between the questions of the philosophers and the mathematicians regarding the origins of the phenomenon, which nature presents us with, can hardly be illustrated more conspicuously.41
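The combinatoric reduction that Weyl describes can be checked mechanically for Euclidean 3-space. The following is a minimal sketch under stated assumptions: the function names (`permutation_sign`, `det3`, `permute_basis`) are invented here, and the orientation of a basis is taken to be the sign of the determinant of its vectors written as rows. It enumerates all six orderings of the standard basis and compares the parity of each permutation with the orientation of the reordered basis.

```python
from itertools import permutations

def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one, by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def permute_basis(basis, perm):
    """Reorder the basis vectors according to perm."""
    return [basis[k] for k in perm]

standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Map each ordering to (parity of the permutation, orientation of the reordered basis).
table = {
    perm: (permutation_sign(perm), det3(permute_basis(standard, perm)))
    for perm in permutations(range(3))
}

for perm, (sign, orientation) in table.items():
    print(perm, sign, orientation)
```

In every row the two numbers agree: even permutations keep the orientation and odd ones reverse it. The check confirms only the mathematician's relational property; it is silent on Weyl's further point that the difference between left and right in nature is not exhausted by it.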
Kant’s transcendental philosophy restricted the structure of empirical science to our cognitive capacities. His theory of intuition, which explains the puzzle of incongruent counterparts, is anthropomorphic. Kant insists on the intensional aspects of physical properties such as handedness, which cannot be grasped intellectually but only intuitively. According to scientific realism (that is, metaphysical realism regarding the content of our best accepted theories), sciences aim at de-anthropomorphization. Planck emphasized this point in his famous lecture on the unity of a physical view of the world.42 According to Kant’s 1768 argument, de-anthropomorphization in mathematical terms comes to an end as far as the irreducible qualia of space and time are concerned. Should we give no credit to Kant’s solution of his 1768 puzzle, namely transcendental idealism, in view of scientific realism? Is substantivalism convincing in face of Leibniz’s famous invariance arguments? On the other hand, would we really accept a Leibnizian relational view of the world according to which the universe and its “incongruent” PCT-counterpart are indiscernibles? In a Kantian spirit, Weyl is obviously not willing to draw one of these two conclusions — still believing that mirror symmetry, respectively parity, is not violated in nature. My own conclusion concerning the whole dilemma is a natural epistemological attitude: to ask for the results of active PCT-transformations also makes sense only within our world, but not for our world in total. However, we see that still today Kant’s argument points at deep issues concerning certain non-extensional, or intensional, aspects of physical properties, and the limited domain of our axiomatic knowledge of them.
Notes
1. Buroker (1981), (1991). 2. Cf. Falkenburg (2000), chapters 1 and 2. 3. Kant’s Prize Essay: Akad. 2:279, 2:286 f., Monadologia physica: Akad. 1:477 ff. 4. See Engfer (1982).
5. Falkenburg (2000), chapter 2. 6. Leibniz (1715/16), Leibniz’s third and fourth letter. 7. See Leibniz’s (1715/16), fourth letter to Clarke, par. 14; Gerh. 737. Indeed the argument reappears in the antinomy of pure reason, cf. CPR, B 455 ff., B 459. 8. Akad. 1:410; nodos in scirpo quaerere, “Knoten an einer Binse suchen”. 9. Akad. 2:377. 10. Akad. 2:377. 11. Akad. 2:381. 12. Akad. 2:383 f. 13. Kant argues that space and time are a priori, even though they are “without any doubt” acquired rather than innate; they have “been acquired, not, indeed, by abstraction ..., but from the very action of the mind”; Akad. 2:406. 14. See footnote 7. 15. Indeed mirroring can be reduced to another kind of permutation, namely to an odd-numbered permutation of the coordinates in which an object is represented. See Weyl (1949), chapter 14, and end of sect. 4 below. 16. The Reflexions on metaphysics of that period exhibit that Kant probed any theory of space and time. In the late 1770s, in the famous Reflexion 5037, he confessed that in 1769 the truth began to dawn on him when he made serious attempts at probing opposite propositions; cf. Akad. 18:069. 17. Akad. 2:399, 2:403. 18. See Akad. 18:069 and note 14 above. 19. Akad. 2:388. 20. Akad. 2:396, beginning of § 10. 21. Akad. 2:403. 22. Ibid. — Similarly in the Prolegomena. — In the CPR, incongruent counterparts are not mentioned. This is due to the fact that the CPR develops the principles of transcendental philosophy according to the synthetic method, in contradistinction to the analytic presentation in the Prolegomena; see my remarks in Falkenburg (2000), 134. 23. Pooley (2003), 5. 24. The argument for the a priori character of time in the Dissertation relies on the meaning of the word “after”: “. . . the concept of time . . . is very badly defined, if it is defined in terms of the series of actual things which exist one after the other. 
For I only understand the meaning of the little word after by means of the antecedent concept of time.” (Akad. 2:399) Thus the mere relation between events tells us nothing about their temporal order. 25. Conceptually, however, for the Critical Kant the substantia phenomenon is a sum total of relations; cf. CPR B 322 and the concept of matter which is put forward in the Metaphysical Foundations of Natural Science. Intensive magnitudes are perceived non-intellectually and irreducibly as the matter of intuition, whereas they are conceived intellectually and formally in terms of the “anticipations of perceptions”. 26. In the full-fledged exposition of the Critical theory of space and time, i.e. in the Transcendental Aesthetic of the CPR, the transcendental concepts of space and time as pure intuitions are complemented by an empirical concept of space, respectively, time as a sum total (“Inbegriff”) of empirical relations. Applied to the empirical contents of pure intuition, Leibniz’s relational concept of space remains valid as a (logical) concept (i.e. a binary predicate), which expresses our empirical knowledge of space and which applies to the internal and external relations of empirical bodies. Such relations are given within the absolute background space of our intuition. According to the Metaphysical Foundations, absolute space is a speculative idea of pure reason, namely the sum total of all empirical frames or relative spaces, which is a logical concept rather than an empirical object; cf. Akad. 4:482. 27. Kant’s epistemological solution of the 1768 puzzle is the first, crucial step of the Critical turn. Identifying space with a subjective form of intuition aimed primarily at restoring the consistency of Kant’s cosmology and theory of individuation. 
The 1768 puzzle substantially differs from the antinomy of pure reason in the CPR, which is an internal conflict of pure reason concerning the completability of our empirical knowledge about space, time, matter or causality. 28. Earman (1971), (1989), Nerlich (1994). 29. Leibniz (1715/16), third letter; Gerh. 364. 30. Leibniz (1715/16); Gerh. 372.
31. In the Nova Dilucidatio, he insisted there are intrinsically identical objects such as atoms or crystals, which differ only in their position; see Akad. 1:409 f. and above section 1. According to the CPR, space as a pure form of intuition is sufficient to individuate identical, but numerically different objects; cf. B 319. 32. Passive symmetry transformations affect only the coordinate systems in which physical systems are represented. They make no difference for the systems. 33. Cf. e.g. Newton-Smith (1981); or Mittelstaedt (1989). 34. Mittelstaedt (1989). 35. See Earman (1989); several contributions in van Cleve (1991); Nerlich (1994); and Pooley (2003). 36. Indeed modern physical cosmology excludes models of the universe which are topologically too strange. There is a substantial constraint: a cosmological model corresponds to a possible world. Whence the spatio-temporal and causal unity of a world? According to the big bang model, the universe has a finite age. The concept of the age of the universe presupposes that all parts of the universe have unambiguous histories which track back to a common origin. (In the standard model of cosmology, this is granted by the cosmological principle which says that the large scale structure of the universe is homogeneous.) The Gödel universe violates this requirement, since it admits causality violations. But a non-orientable universe is also weird, especially with regard to time measurement and to a relational concept of time. If it is a relational local property whether a clock is running clockwise or counter-clockwise, what should a relationalist believe about time measurement and unambiguous histories in such a universe? 37. Wigner (1939). 38. Pooley (2003) argues in this spirit, on the basis of a detailed formal analysis of the quantum fields of weakly interacting particles with maximal parity violation. He disregards, however, that according to the irreducible representations of the Poincaré group parity is an intrinsic particle property like mass or spin. 39. At first, he argues with the absolute character of length. Atomic constants such as the electron charge and mass or Planck’s quantum of action teach us that there is a significant divergence between mathematical and physical symmetries. The laws of nature are not invariant under dilatation. Cf. Weyl (1949), chapter 14. 40. Ibid. 41. “Die Diskrepanz zwischen der Fragestellung des Philosophen und des Mathematikers nach den Wurzeln des Phänomens, das uns die Natur stellt, kann kaum auffallender beleuchtet werden.” Ibid., last sentence of chapter 14. 42. Planck (1908).
References Buroker, J. V. (1981), Space and Incongruence, Kluwer, Dordrecht. Buroker, J. V. (1991), “The Role of Incongruent Counterparts in Kant’s Transcendental Philosophy” in: van Cleve (1991), 315–319. Earman, J. (1971), “Kant, Incongruous Counterparts, and the Nature of Space” in: Ratio 13, 1–18. Earman, J. (1989), World Enough and Space-Time: Absolute versus Relational Theories of Space and Time, MIT Press, Cambridge. Engfer, H. J. (1982), Philosophie als Analysis, Frommann-Holzboog, Stuttgart. Falkenburg, B. (2000), Kants Kosmologie, Klostermann, Frankfurt am Main. Kant, I. (1787), Critique of Pure Reason, translated by P. Guyer and A. Wood, Cambridge University Press, Cambridge. Kant, I. (Akad.), Kants gesammelte Schriften, ed. by the Preußischen Akademie der Wissenschaften (later: Deutsche Akademie der Wissenschaften zu Berlin); Vol. 1–23 (Werke, Briefe, Handschriftlicher Nachlaß) Berlin 1900–1955, Vol. 24 ff. (Vorlesungen) Berlin 1966 ff. Leibniz, G. W. (1875–1890), Die philosophischen Schriften, ed. by C. J. Gerhardt, Vol. VII, Weidmannsche Buchhandlung, Berlin. Mittelstaedt, P. (1989), Der Zeitbegriff in der Physik, 3rd ed., BI-Wissenschaftsverlag, Mannheim. Nerlich, G. (1994), The Shape of Space, 2nd ed., Cambridge University Press, Cambridge. Newton-Smith, W. H. (1981), The Rationality of Science, Routledge & Kegan Paul, Boston.
Planck, M. (1908), “Die Einheit des physikalischen Weltbildes” in: M. Planck, Physikalische Abhandlungen und Vorträge, Vol. III, Vieweg, Braunschweig 1958, 6–29. Pooley, O. (2003), “Handedness, Parity Violation, and the Reality of Space” in: E. Castellani, K. A. Brading (eds.), Symmetries in Physics: Philosophical Reflections, Cambridge University Press, Cambridge. van Cleve, J. (1991) (ed.), The Philosophy of Right and Left. Incongruent Counterparts and the Nature of Space, Kluwer, Dordrecht. Weyl, H. (1949), The Philosophy of Mathematics and Natural Science, Princeton University Press, Princeton. Wigner, E. (1939), “On Unitary Representations of the Inhomogeneous Lorentz Group” in: Annals of Mathematics 40, 149.
CONVENTIONALISM AND MODERN PHYSICS: A RE-ASSESSMENT∗
Robert DiSalle
University of Western Ontario, Canada
Conventionalism implied that physical geometry must be fixed by an arbitrary choice among equivalent alternatives. In the last half-century, this view has retreated before arguments that allegedly equivalent geometries are not at all equivalent on decisive empirical and methodological grounds.1 Yet such arguments were familiar to, and even proposed by, the conventionalists themselves. Poincaré, Schlick, and Reichenbach — to take just three prominent examples — aimed not to deny that one could rationally choose among physically possible alternative geometries, but to articulate an epistemological theory of the origins of geometrical postulates. According to this theory, the empirical application of geometry depends on principles that are not themselves empirical, principles which were characterized as stipulations. But this view certainly allowed that some stipulations were better than others for the analysis of natural phenomena. Thus Reichenbach, Schlick, and Carnap could maintain that Einstein’s general theory of relativity had revealed the arbitrary element in physical geometry, while at the same time demonstrating the superiority of non-Euclidean geometry.

A more recent2 challenge to conventionalism is that the very idea of a geometrical stipulation does not even make sense in the context of a theory that, like general relativity, relates geometrical structure to the distribution of matter. On Friedman’s view, conventionalism presupposes the nineteenth-century view of geometry as a fixed and uniform background against which the laws of physics are framed. But according to general relativity, physical geometry varies with material circumstances, and so cannot be settled in advance by convention. Thus geometry can no longer be interpreted as part of an a priori background for physics, settled by an initial choice of a theoretical language.

∗ I would like to thank William Demopoulos for many conversations on the topics of this paper, and for his advice and comments on several drafts. I also thank Michael Friedman for many helpful suggestions. Work on this paper was supported by the Social Sciences and Humanities Research Council of Canada.
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 181–211. © 2006 Springer. Printed in the Netherlands.
Friedman’s assessment brings out two conflicting aims behind the conventionalism of the logical positivists: on the one hand, to understand the a priori foundations of science to be not synthetic, but analytic — their analyticity arising from their dependence on the definitions of fundamental concepts; on the other hand, to understand general relativity as an insight into the empirical nature of spacetime geometry. However, Friedman’s analysis overlooks a theme of conventionalism that is not incompatible with general relativity, but on the contrary, is indispensable to our understanding of how spacetime geometry has become an empirical science. In their reaction against the Kantian and empiricist accounts of the foundations of geometry, the conventionalist tradition of Poincaré and the positivists saw correctly that the postulates of physical geometry are neither uniquely determined by the form of outer intuition, nor inferred by induction from experience. The postulates of physical geometry do not express factual claims at all, but have an inescapably interpretive character: they connect abstract geometry to physical space by assigning physical meanings to geometrical concepts. This is the insight that the positivists attempted to capture through the notions of “coordinative definition,” “correspondence rule,” and the like. It seemed clear to them that this insight requires us to view alternative theories of physical geometry as no more than equivalent linguistic frameworks, and to view the definitions that connect the frameworks with experience as arbitrary choices, justified by pragmatic a posteriori considerations. The assimilation of the choice of geometry to the choice of linguistic framework invited the now-familiar challenges from holism and empiricism: if those fundamental principles are contingent upon, and revisable in the face of, empirical knowledge, in what interesting sense could they possibly be analytic truths? 
If the choice of physical geometry is clear on empirical grounds, in what interesting sense could it be conventional? For Poincaré and the logical positivists, failure to acknowledge the constitutive role of definitions in physical geometry was the source of naive empiricism about the subject; that the “definitional” character of the postulates implies conventionalism seemed obvious to them. The implication has also seemed obvious to their critics, who, in rejecting conventionalism, have seen no need to address the conventionalists’ concern with definitions or to offer an alternative account of their origin and status. I will argue that the conventionalists’ focus on the role of definitions in the foundations of geometry contains an important insight. However, my purpose is not to defend conventionalism, but to understand that insight from a completely different perspective. That the constitutive principles of physical geometry are somehow constitutive of meaning seems to distinguish them from ordinary empirical claims or physical hypotheses, but it need not mean that they are fixed by convention. I will argue that they are, instead, discovered by a process of conceptual analysis. The main task of this paper is to characterize this process of conceptual analysis, and to exhibit the role that it has played in the development of theories of space and time.
Of course, the positivists saw a crucial role for the destructive analysis of ill-defined concepts — as in Einstein’s exemplary critiques of the Newtonian concepts of space and time. What they failed to perceive was the positive role of conceptual analysis in the construction of concepts, believing that this was adequately captured by their notion of a stipulation. As we will see, however, new foundations for physical geometry have emerged, not from arbitrary new coordinations between geometry and physics, but from novel analyses of what is implicit in established physical principles. By seeing how such analyses have worked in the most decisive historical cases, we can understand something that eluded both conventionalism and traditional empiricism: how principles that are in some sense definitional could nonetheless arise directly from empirical arguments, and how theories of space and time that arise from those definitions are genuinely empirical theories. We will also gain a clearer perspective on the relationship between spacetime theories and the philosophical concerns of logical positivism.
1. Poincaré and the logical positivists on space, spacetime, and general relativity
According to Friedman (1999c), the positivists’ difficulties with conventionalism arose from taking Poincaré’s doctrine out of its original scientific and philosophical context: Poincaré’s geometric conventionalism was plausible in the context of classical physics, and on the basis of his theory of the synthetic a priori. It is well known that Poincaré, while rejecting Kant’s views of space and geometry, held to the view that arithmetic has an irreducible basis in temporal intuition; the principle of mathematical induction has no justification save an appeal to our intuition of the iterability of temporal processes. What is noteworthy in Friedman’s account is an analysis of the role that this view of intuition plays in Poincaré’s account of space, and its importance to his conventionalism about spatial geometry. The very notion of space, Poincaré pointed out, is an idealization derived from our sense of the free mobility of our own bodies. Therefore the “group of rigid motions,” identified by Helmholtz and Lie as the foundation of geometries of constant curvature, is an idealization of the primitive experience that acquaints us with the properties of space in the first place. As such, this group is the necessary and sufficient foundation of geometry as an empirical science. But the idealization involved in passing from our own local motions to the group of motions, which is the necessary foundation for our conception of space as globally homogeneous, makes crucial use of the form of temporal intuition: our conception of the large-scale structure of space derives from the presupposition that our local displacements are infinitely iterable, a presupposition whose sole basis is our temporal intuition. And the science of synthetic geometry is just an abstraction from these displacements, as classical geometrical constructions are based on the rigid motions of an idealized compass and straightedge.
Since the principles underlying geometrical construction are just these procedures derived from the “ancestral experience” of local free mobility, and idealized as indefinitely iterable, they are necessarily bound to the geometries of uniform curvature, in which the principle of free mobility holds. It follows that only the geometries of constant curvature can be properly regarded as synthetic, for only in those geometries is constructive proof possible. The much larger class of Riemannian geometries of variable curvature must be regarded as analytic, since no “classical” constructive procedure can yield their propositions. Poincaré’s account of the intuitive foundations of geometry thus differs from Kant’s in two crucial respects: on the one hand, Poincaré has recognized the existence, and mathematical legitimacy, of infinitely many geometries that are not constrained by the form of intuition, while granting that intuition constrains which of these geometries may claim to constitute synthetic knowledge; on the other hand, Poincaré has found that the constructive procedures licensed by intuition license, in turn, a more general class of geometries than Kant would have admitted, i.e. all the geometries of constant curvature.

According to Friedman, the difference between geometries of constant curvature and general Riemannian geometries is essential for assessing conventionalism. On the empiricist view of Helmholtz, which was Poincaré’s starting-point, the kind of experience that convinces us that space is Euclidean — the outcomes of measurement based on the displacements of rigid bodies and the paths of light rays — could equally convince us that space has a non-Euclidean geometry of constant curvature, just in case rigid bodies and light rays behave in ways compatible with such a geometry. 
But Poincaré placed the exact sciences in a hierarchy of “conditions of possibility.” The physics of rigid bodies could not lead one to give up Euclidean geometry, because that physics is possible only within a framework provided by some geometry, which therefore must be presupposed before any physical laws can be developed. Similarly, no geometry can be known unless a general theory of magnitude is assumed, and no theory of magnitude can be known unless the fundamental principles of arithmetic are assumed. Thus no result in physics can be a compelling reason to revise geometry. And since the legitimate candidates for physical geometry — the geometries of constant curvature, which alone are properly synthetic — are equivalent from a mathematical point of view, we can choose any one of these, provided that we adapt our physics to the choice. Therefore the choice of a geometry can only be a matter of convention, and Euclidean geometry is the simplest possible choice.

This interpretation of Poincaré’s position explains why the logical positivists’ attempts to apply it to the interpretation of general relativity would end in confusion. As we have seen, the conventions intended by Poincaré fix the entire (homogeneous) structure of space, and thereby provide an “a priori” framework for the formulation of physical laws. In general relativity, no such convention is possible, as the geometry of spacetime is everywhere dependent on the distribution of matter, and therefore can’t be specified in advance of the
laws of physics; in this respect general relativity realizes the vision of Riemann in direct contrast to that of Poincaré, and the positivists’ efforts to maintain its compatibility with both views could not have succeeded. It should be noted, however, that the appropriateness of the positivists’ appeals to Poincaré depends on how general relativity is understood. From our standpoint, the positivists’ understanding of it rested on some fundamental errors. In particular, general relativity is not, as the positivists thought, a theory of the “general relativity of motion,” but a theory of the structure of spacetime; as such, it no more satisfies positivistic strictures against “unobservable entities,” or “absolute” distinctions among states of motion, than did special relativity or even Newtonian mechanics. This understanding was encouraged, however, by Einstein himself, and the same may be said of the positivists’ account of the arbitrariness of geometry. For Einstein had identified the objective empirical basis of geometry as the determination of “space-time coincidences,” or “verifications of . . . meetings of the material points of our measuring instruments with other material points.” (Einstein (1916), p. 117). And this implied that all geometrical structures that agree on these coincidences are equivalent, and that a choice among them is an arbitrary stipulation (an implication that Schlick called “the geometrical relativity of space,” cf. (1917), chapter 3). Such a view is not obviously untenable, regarded merely as a philosophical explication of the concept “spatiotemporal measurement”; if the legitimate meaning of that concept is exhausted by Einstein’s analysis, then the theory of the geometrical structure of spacetime has to be regarded as imposed by a conventional choice, motivated by the search for the simplest possible laws of physics. 
On the logical positivists’ interpretation of general relativity as “relativizing” space, time and motion, the analogy between their version of conventionalism and Poincaré’s is not so implausible. It fits neatly with their belief that hypotheses about the states of motion of bodies — e.g., the Copernican and Ptolemaic hypotheses — are on a par with geometrical conventions, and distinguished from one another only by their relative simplicity; for example, if the centrifugal force in a rotating system is equivalent to a particular gravitational field in a resting system, then we have a conventional choice about which bodies are rotating, and can choose the simplest hypothesis. More particularly, it fits with their conventionalistic account of spacetime curvature in general relativity: applying what they took to be the lesson of Einstein’s equivalence principle, they regarded curved spacetime as equivalent to flat spacetime with a gravitational field, which renders the choice between the two hypotheses a matter of convention. These examples are not unreasonably regarded as parallel to those of Poincaré, where the choice is between, say, non-Euclidean spatial geometry and Euclidean geometry along with an additional force-field. The choice is in one case between a homogeneous non-Euclidean geometry and a homogeneous force-field, in the other case between a variable geometry and a variable force-field. In all such cases, the competing hypotheses were regarded as merely different languages for expressing, with varying degrees of
convenience or simplicity, the same physical situation. The positivists’ main criticism of Poincaré was only that he had a narrow conception of simplicity, which would always single out Euclidean geometry over all others; if we apply the criterion to the total system of geometry and physics, we may find that non-Euclidean geometry is simpler (cf. Carnap (1966), pp. 161–162), as we find with general relativity.

The foregoing leads to an important qualification of Friedman’s account. We can see now that some version of Poincaré’s view could be made compatible with general relativity, at least as the logical positivists understood the theory, in spite of the passage from homogeneous to inhomogeneous geometry. To do so is to recast conventionalism to reflect the passage from space to spacetime. Poincaré’s conventionalism about geometry is essentially bound to the context of three-dimensional space, because the a priori constructive principles of geometry are just those connected with our spatial intuition, and, as we saw, these restrict us to geometries of constant curvature. But Poincaré was also a conventionalist about the laws of mechanics: that bodies free of force move uniformly in straight lines, and that force is proportional to mass times acceleration, were for him mere definitions rather than factual claims. From here to a conventionalist view of inhomogeneous geometry there are only two steps. The first is to recognize that the laws of motion, and in particular the principle of inertia, serve as constructive principles for spacetime geometry; the inertial motions identified by the laws are represented by the geodesics of the spacetime structure. This means that the synthetic (in Poincaré’s sense) geometries of spacetime will include those whose geodesics correspond to some set of inertial trajectories identified by some possible physical theory. 
The geodesics of Newtonian spacetime, to use a familiar example, are identified with the trajectories of particles not subject to gravitational or other forces, while those of general-relativistic spacetimes are the trajectories of freely-falling particles. The Newtonian inertial trajectories have no relative accelerations, and so correspond to the geodesics of a flat spacetime; the latter do typically have relative accelerations that vary with the local distribution of matter, and so correspond to the geodesics of a spacetime of variable curvature. But those same relative accelerations are interpreted, in the Newtonian theory, as caused by the gravitational field, and therefore as deviations from geodesic motion. Thus this seemingly innocuous reasoning leads immediately to the second step: to assert that whether we understand free-fall trajectories as accelerated by gravitational force, or as inertial trajectories, is simply a matter of convention. That is, some physical trajectories must be arbitrarily stipulated to be the inertial ones, and force correspondingly defined by deviation from such trajectories. The statement “Falling bodies travel on geodesics of spacetime” is therefore not an empirical claim, but has the stipulative character of a definition; adopting it permits us to make empirical claims about the curvature of spacetime that would otherwise be meaningless. One need not agree with this view to accept it as a reasonable analogue to Poincaré’s: according to both
we have an a priori framework in which we can formulate an equivalence class of physically possible geometries (in space, the geometries of constant curvature; in spacetime, the Riemannian geometries that agree on “spacetime coincidences”), and any further determination of geometry depends on a conventional choice about how the laws of physics are to be framed, and which physical trajectories are to represent the straight lines of space or spacetime. The logical positivists did not necessarily articulate this viewpoint clearly, especially since they themselves frequently appealed to examples concerning spatial curvature, or the effects of gravitational fields on spatial lengths, whereas the fundamental issues in general relativity are the curvature of spacetime, and the effects of gravity on spacetime geodesics. With a clear focus on these issues, their assimilation of general relativity to conventionalism turns out to be, at least, coherent. It should be clear now, however, how much the coherence of this assimilation depends on the positivists’ particular interpretation of general relativity. By the same token, the more modern interpretation of general relativity places the conventionalism of the positivists in a clearer light. If that version of conventionalism does make sense, it does so at some cost: on the standard modern interpretation, in which general relativity describes the “real” curvature of spacetime and its connections with matter and energy, the positivists’ view of geometry would appear to make it difficult to describe the physical content of the theory, or to explain why the theory of spacetime curvature should have been preferred over its predecessors. For one could not say that Newton or Minkowski was wrong to attribute a certain spatiotemporal structure to “the absolute world” — since that structure is a matter of arbitrary choice — but only that they were wrong to think of that structure as “absolute” rather than as a useful convention. 
The theory that geometry is arbitrary is, on this view, objectively better than its predecessor; the theory that spacetime is curved is merely more convenient. Overcoming this philosophical limitation, if it is one, does not by itself motivate the theory that (e.g.) gravitation and inertia are aspects of the same physical field, that spacetime is therefore curved in the presence of matter, and that therefore certain phenomena enable us to measure the spacetime curvature. That any particular phenomena are taken to be indicative of curvature, or of any other geometrical property, is precisely what must be settled by convention. In effect, the distinctive physical content of the theory, as opposed to that of its predecessors, is precisely its conventional part. And this is a significant disanalogy with Poincaré’s conventionalism. For Poincaré, the synthetic a priori principles that define the equivalence class of physical geometries are themselves physical principles which impose, at least, constant curvature on space. In the positivists’ interpretation of general relativity, the analogous equivalence class is defined by the general theory of Riemannian manifolds, which then serves as a mathematical metatheory within which any physical theory of spacetime geometry might be formulated, but whose basic principles contain no physical claims. But if we understand general relativity as a theory, not of relativity and covariance, but of the relation between
spacetime curvature and mass-distribution, it is difficult to square with conventionalism. In short, it was not unreasonable of the positivists to maintain Poincaré’s conventionalism in the face of general relativity; it would be unreasonable only to think that Einstein’s theory, as a theory of spacetime geometry, is true. If the only justification for any particular choice of the constitutive principles of physical geometry is the role they play in the total system of physical laws, and the extent to which this total system accords with our experience, then the a priori status of those principles is either questionable, or of little interest. This outcome seems to have been unsatisfactory at least to Reichenbach (1957), who argued that coordinative definitions may be motivated by something more than a posteriori convenience. His requirement that a metrical coordinative definition must stipulate the absence of universal forces is, in effect, an argument that what Poincaré would call “empirically equivalent” alternative geometries may be inequivalent on empirically-motivated methodological grounds, and that non-Euclidean geometry is superior to a Euclidean geometry conjoined with the hypothesis of a universal force. Whatever the defects of Reichenbach’s discussion,3 it does attempt to portray the transition from Newton’s physics to Einstein’s as an empirically-motivated change in constitutive physical principles — as a change in the “relativized a priori” foundations of geometry.4 But the account of those empirical motivations is obscured by the broadly conventionalistic setting in which they are presented, according to which objective empirical reasoning about geometry is only possible within a framework established by arbitrary definitions (1957, pp. 36–37). Moreover, Reichenbach’s requirement offers a negative principle for rejecting proposed metrical coordinations, rather than a positive account of the origins of or physical motivations for any particular one. 
Thus, despite its admixture of anticonventionalist elements, Reichenbach’s view reinforces the conventionalist interpretation of general relativity: that the theory reveals the arbitrary element in physical geometry, and that only holistic considerations of simplicity can distinguish among geometrical conventions. It is helpful to view this entire development in a broader historical perspective. Conventionalism aimed to correct the error of Kant’s theory of the synthetic a priori, which lay in supposing that genuine propositions concerning the world of experience could have the apodeictic force of logic. If the postulates of physics and geometry have a certainty beyond that of ordinary empirical generalizations, it is because they aren’t genuine propositions, but definitions. But Kant’s theory also aimed to correct a traditional error, that of supposing that the postulates of geometry have some extralogical content, yet are purely “intellectual” truths. Whatever non-logical content is possessed by such postulates is prescribed by the forms of sensible intuition. Without defending Kant’s intuitionism against later advances in logic and the rigorization of mathematics, we can see its merits relative to earlier views: on the one hand, it recognizes that, at least up to Kant’s time, the use of intuition in mathematical reasoning was pervasive but not acknowledged; on the other hand, it recognizes
that the concepts of mathematics were “productive” in a way that the concepts of traditional metaphysics had never been, precisely because the former are constructed in accord with the forms of spatial and temporal intuition, and not merely “invented” by the arbitrary assignment of meaning to words. Kant’s theory thus provides an explanation — for whatever it’s worth — of why the principles of geometry should seem to force themselves on us with something like the certainty of logic, while yet having some definite empirical content. By contrast, conventionalism understands this certainty as the certainty of analytic truth — the principles are just the rules that are definitive of a given linguistic framework — but can’t address their empirical content except as something fixed by an arbitrary designation, or a coordinative definition in the most literal sense, that identifies some phenomenon as the referent of a concept. For Poincaré’s conventionalism, this question of content doesn’t arise. If the subject matter of geometry is the group of rigid spatial displacements, then we may have to make conventional choices about which homogeneous geometry to use, and precisely which bodies are rigid, but the empirical content of geometrical claims in general is fixed, as is the structure of space up to its measure of curvature. But if the subject matter of physical geometry is the “meetings of the material points of our measuring instruments with other material points,” then practically all of its content is open to arbitrary decision: except for coincidences with my own worldline, all of my judgments about such coincidences will require some theoretical decisions to be made on the grounds of simplicity and convenience. Thus, it is the need for stipulations about the very content of physical geometry that separates the logical positivists’ conventionalism from Poincaré’s. 
This outcome is ironic, not merely because of the positivists’ identification with Poincaré, but, more important, because of the enormous emphasis they placed on Einstein’s revolution as a radical conceptual change, and on new and better definitions of geometrical concepts as characteristic of that change. Carnap, for example, surely thought of these as examples of the sort of “change in the language” that “constitutes a radical alteration, sometimes a revolution,” as distinct from “a mere change in or addition of, a truth value ascribed to an indeterminate statement” within a language (1963, p. 921). This distinction is generally assumed to have been discredited by Quine, but it has a prima facie claim of relevance to the history of 20th-century physics. On what is at least a plausible reading, special relativity was founded on defining the velocity of light as a fundamental invariant, and on defining simultaneity by light-signalling; general relativity was founded on defining the geodesics of spacetime as the paths of falling bodies; at least until we begin to suspect the notion of analyticity in general, it would seem as if the acceptance of such definitions is essential to the acceptance of the theories, so that denying or changing them would amount to creating an alternative conceptual framework. What is the sense of denying, then, that “The trajectories of freely-falling bodies are the geodesics of spacetime” is an analytic truth of the linguistic framework of general relativity? The serious answer to this question has to do with the
difficulty about content. If such defining principles are arbitrary stipulations that can’t be independently motivated — if their physical content can’t be independently specified and judged — then such a presumptive analytic truth — by contrast with “Bodies free of all Newtonian forces follow spacetime geodesics” — amounts in practice to nothing more than the claim that general relativity is, on the whole, a more useful conceptual scheme than Newtonian mechanics. Of course, the logical positivists did attempt to justify Einstein’s conventions on philosophical grounds. Reichenbach’s proscription of universal forces was one such attempt; others took their lead from Einstein’s own arguments, and typically pointed to the “epistemological” inadequacy of the Newtonian definitions, e.g., of simultaneity or absolute motion, and their failure to provide the sort of “verifications” described by Einstein’s accounts of simultaneity and of “spacetime coincidences.” As we have seen, however, such arguments are, at best, destructive critiques of the Newtonian framework rather than positive motivations; at worst, they are confused, since general relativity, properly understood, doesn’t really satisfy such general epistemological strictures either. At the same time, Schlick and Reichenbach, at least, argued that the philosophical insights of Einstein’s theory depended on specific theoretical developments in physics.5 This kind of justification would seem to be incompatible with the first kind: surely Einstein’s theory could not have evolved out of a purely epistemological critique of earlier theories, and be contingent on the fate of particular scientific hypotheses such as “Mach’s principle”. Neither sort of justification can be reconciled with the view that physical geometry is founded on arbitrary stipulations.
2. Conventions, definitions, and conceptual analysis
Obviously the logical positivists left the understanding of physical geometry, and the nature and function of its a priori principles, in a very unsatisfactory state. They had good reasons to believe that Einstein’s revolutionary conceptions of space and time were developed with the help of a philosophical analysis of some kind or another; that the theories were, at the same time, founded on purely empirical principles from electrodynamics and gravitation; and that in both theories, as in physical geometry generally, definitions of fundamental concepts played crucial constitutive roles. Against Quine, they might have argued that all three beliefs reflected Einstein’s characterization of his own scientific practice — an argument that would deserve at least the attention of a professed epistemological naturalist. But their account of the definitions as arbitrary stipulations, and their attendant failure to analyse the origins or motivations of those definitions, makes the philosophical connections among the three ideas difficult to see. By the same token, the intimate relations among them would appear obvious if the origins of the fundamental definitions could be rationally explained. The most effective refutation of conventionalism would
show that we can understand those definitions through their philosophical and physical motivations, instead of treating them as conventions. That understanding begins with recalling that concepts may come to be defined, not only by stipulations about meaning, but by conceptual analysis. This distinction may not appear immediately to be promising, if we think of conceptual analysis just as analysis of “what is contained in” a given concept. That is precisely the sort of analysis that, according to Kant, could never provide the foundation for an empirical science; from a more modern perspective, of course, analytic judgments in Kant’s sense seem to be empirical claims about the typical uses of words. A more promising notion of definition by conceptual analysis is based not on the question, “what do we typically mean by X?”, but rather on the question, “what conception of X is implicit in our established empirical judgments and practices?” And even this may be difficult to see in a constructive role. In the philosophy of space and time, such analyses have typically been seen as destructive, reducing space, time, and motion to “nothing but” their supposed phenomenal basis, as in the “relativism” of (e.g.) Berkeley, Leibniz and Mach. To Einstein and the logical positivists, Einstein’s discussions of simultaneity, rotation, and spatio-temporal measurement were analyses of just this sort, motivated by some form of verificationism: motion is “nothing but” relative motion; measurements are “nothing but” verifications of “spacetime coincidences”. 
To the extent that it is construed in this reductionist manner, such an analysis typically is taken to establish, not an objective foundation for geometrical measurement, but the lack of any such foundation — or, in Reichenbach’s words, “the need for a coordinative definition.” This is why, on the view of the history of theories of space and time that has been common since Einstein, the progress from Newton to special to general relativity consists in a gradual “relativization” of what had been seen as objective or absolute, ending with a theory in which space and time have lost “the last remnant of physical objectivity” (cf. Einstein (1916), p. 117). As we have seen, however, that sort of analysis yields an empiricist metaperspective on spacetime theories rather than a physical motivation for any particular theory. Even if it is the right perspective, it can’t be said to capture or to reconstruct the motivations for the theory of spacetime curvature, whose foundation must in that case be a convention. A constructive conceptual analysis would show that an established set of empirical judgments implicitly contains a constructive principle for physical geometry, one that only needs to be raised to an appropriate level of precision and generality. A clear example was provided by Poincaré himself, in his explication of “the notion of space.” The essential idea originated with Helmholtz’s analysis of what we mean by “spatial relations”: he considered how, among all the changes that we can observe, certain changes can be identified as spatial displacements or changes of spatial relation. This is in some sense a demand for a definition, but it is not to be answered by an arbitrary stipulation, or by a verificationistic reduction of spatial determinations to some more elementary empirical basis. Rather, the
analysis recognizes a distinguished type of phenomenon that we judge to be characteristically spatial, and demands a precise formulation of the principle that is latent in those judgments. The answer is that spatial displacements are those that can be effected and cancelled by the motion of the observer; in other words, they are defined by the manner in which they can be done, undone, and combined. And it is these actions, sufficiently idealized, that invite interpretation as the characteristic operations of a group. It was by this analysis that Poincaré arrived at his group-theoretic conception of space. The empiricist motivation of this analysis, and of the resulting definition, is evident; it is of a piece with another famous analysis by Helmholtz, concerning the question whether we can “imagine” a non-Euclidean space:
By the much misused expression “to imagine,” or “to be able to think of how something happens,” I understand that one could depict the series of sense-impressions which one would have if such a thing happened in an individual case. I do not see how one could understand anything else by it without abandoning the whole sense of the expression. (1884, p. 8)
Both are classic examples of the analysis of “the empirical content” of a notion previously muddied by intuitive or metaphysical associations, and both were recalled by the logical positivists as landmarks in the development of an empiricist view of geometry. Yet they have to be distinguished from ordinary empirical arguments. It is in some sense an empirical fact that the group of displacements can be distinguished; but that the rigid displacements have the structure of a group is not an empirical claim in the ordinary sense. Rather, we would not recognize as spatial displacements any changes that did not conform to that structure. At the same time, this principle is not an arbitrary designation of the phenomenal referent of an abstract mathematical concept. It is, rather, an argument that the mathematical concept captures precisely and formally what is contained, vaguely and informally, in our pre-systematic notion of a spatial change and our pre-systematic judgments of the spatial relations of things. In other words, it is not an empirical claim because it is an interpretation of our empirical judgments; it is not conventional because the interpretation arises, not from a stipulation, but from a conceptual analysis. From the foregoing we can see why such analyses need not be empirically empty, but can have the most far-reaching implications for empirical science. Helmholtz’s definition of “to imagine” does not merely propose an interpretation of the word on which we might meaningfully claim to imagine a curved space; rather, it uncovers the interpretation that is implicit in our claim that we can imagine Euclidean space, and reveals that claim to be of much wider application than we had previously appreciated. The Helmholtz-Poincaré definition of spatial displacement, similarly, does not merely invent a general conception that can be applied to any space of constant curvature.
Rather, it reveals that the conception underlying our knowledge of Euclidean space is too general to single out Euclidean space from the other spaces of constant curvature. Where Kant had held that our notion of space is bound up with Euclid’s axioms, since
these state the constructive procedures on which our entire conception of space is based, Helmholtz and Poincaré showed that those constructive procedures are more general than that, and that “our notion of space” — at least, the notion implicit in our empirical judgments about space — is correspondingly general. And thus both analyses helped to make the difference between the discovery of non-Euclidean geometry, and the discovery that “space” might be non-Euclidean. Moreover, conceptual analyses of just this sort, as we will see, have been essential to the most revolutionary developments in the theory of physical geometry. At the same time we can see, from a different perspective, why a conceptual analysis like Poincaré’s would lead to a form of conventionalism, and we can see the limits of that form of conventionalism. The concept of space that emerges from Poincaré’s analysis is not defined by the arbitrary association of certain observed displacements with geometrical concepts, in the manner of a coordinative definition, and so it is clearly not a matter of convention — although Poincaré acknowledges the possibility of exchanging it for some more convenient concept; the concepts arrived at by analysis are not assumed to be permanent. But it follows from Poincaré’s analysis that the geometry of space, so defined, is indeterminate: the geometries compatible with the definition, those of constant curvature, form an equivalence class of mutually inter-translatable structures. Therefore the means of distinguishing among them must come from outside of geometry, i.e. from considerations that are not implicit in the notion of space, but that involve physical hypotheses that take the form of coordinative definitions: that light travels in a straight line, for example, can be an empirical claim only if light rays can be compared with straight lines. For Poincaré, this is the beginning of a regress that can only end in a stipulation.
Unlike the definition of a spatial displacement, such a coordinative definition requires a choice among several equivalent alternatives, some of which may turn out to be more convenient than others, but none of which has a special claim to represent “what is implicit in our notion of straight line.” From this we could conclude that conventionalism is, in an important sense, incidental to Poincaré’s analysis of geometry. The analysis reveals the foundation of geometry in a “disguised” or implicit definition. But having this foundation does not make geometry an uninterpreted structure. For the definition itself constitutes an interpretation of a specific type of phenomenon as instantiating a specific mathematical structure. For the positivists, the structure determined by a set of implicit definitions requires a convention to fix its empirical content, but in Poincaré’s analysis of geometry, that content is already expressed in the definition of spatial displacement; convention plays a role only because that content, as it turns out, admits a class of equivalent geometrical realizations. It is instructive to compare this analysis with that of Riemann, in particular with Riemann’s emphasis on the approximate character of the principle of free mobility. According to Riemann, if the physical principle underlying
homogeneous geometry is only an approximation, that geometry itself is only an approximation to an actual geometry that may well be inhomogeneous at very large or small scales. On small scales, especially, the inexact principle of the rigid body must yield to a more exact, and more fundamental, principle of the interactions that constitute rigid bodies in the first place. It follows that, if a degree of global spatial structure is implicit in the principle of the rigid body, implicit in the deeper principle may be a local structure that varies over space; it also means that the degree of arbitrariness inherent in the former — and therefore the occasion for conventionalism — may not exist in the latter. For both Riemann and Poincaré, then, analysis of the assumptions involved in measurement leads to a definition of spatial relations that associates them with physical processes, and in neither case is the definition therefore arbitrary. The arbitrariness arises from Poincaré’s assumption that the concept of space is explicated exhaustively by the group of rigid motions, and that the latter provides the only basis for a truly synthetic geometry; on these assumptions, no investigation of the sort proposed by Riemann, into measure-relations “in the small,” could possibly yield a legitimate constructive procedure for geometry. From Riemann’s point of view, however, this must appear naive: rather than standing before physics as (in this restricted sense) an a priori framework, Poincaré’s conception of space ties geometry to a particularly simplistic physical principle, just because of the latter’s privileged role in the genesis of our geometrical ideas, instead of recognizing that subtler physical principles might yield subtler principles of measurement and correspondingly more complex geometries. And, from the same point of view, even Helmholtz’s attitude would seem comparatively sophisticated.
Helmholtz also believed that free mobility was both the original and the only possible basis for geometry, but forestalled conventionalism by insisting that mechanics could decide among the geometries of constant curvature; he thus recognized at least one part of Riemann’s view, that the notion of rigid body is not privileged over the rest of physics. For Poincaré, its privileged status is what defines our notion of space. This last remark is especially important. The only plausible defence of Poincaré’s narrow conception is that the physical principles invoked by Riemann and Helmholtz are not proper to the theory of space, but involve extrinsic factors; in particular, as dynamical principles, they essentially involve time; therefore the principle of free mobility is privileged over them, as far as geometry is concerned, precisely insofar as it is a purely spatial principle. But if this argument excuses Poincaré, it also reveals the deeper insight behind the empiricism of Riemann and Helmholtz: that the geometry of space is not independent of the principles that connect space with time — in modern language, that spacetime is more fundamental than space. Indeed, for an understanding of Poincaré’s conventionalism in historical perspective, the separation of space from spacetime is perhaps more important than the distinction between homogeneous and inhomogeneous geometry. In particular, that an analysis of geometrical postulates as “disguised definitions” should automatically lead to conventionalism is an aspect of space as opposed to spacetime: Poincaré’s hierarchy of sciences is possible just to the extent that geometry is defined to be spatial geometry, and that the concept of space is completely explicated independently of any dynamical principles. In that case the latter are open to conventional choice, and certainly cannot play any constitutive role for the spatial geometry against which they are framed. For the geometry of spacetime, however, those principles are precisely the constitutive principles. The truth obscured behind an aforementioned remark of Poincaré’s, that inhomogeneous Riemannian geometries are not properly synthetic, is that a purely spatial principle such as that of free mobility is unlikely to provide a constructive basis for an inhomogeneous geometry; the appropriate principle would likely be a dynamical, i.e. a spatio-temporal, principle, like the microphysical principles of causal connection envisaged by Riemann, or general relativity’s identification of gravitational free-fall with inertial motion. The familiar historical example of inhomogeneous spatial geometry is Einstein’s prediction of spatial curvature near the sun, as corroborated by the bending of starlight; but the prediction follows from the theory of non-uniform spacetime geometry, in which light-propagation plays a crucial constitutive role.
3. Conceptual analysis and the foundations of spacetime theories
Poincaré’s view of geometry, in sum, starts from a conceptual analysis that leads to a constitutive principle, a principle that is a kind of definition — insofar as it is a principle of interpretation — without therefore being an arbitrary stipulation; it ends with arbitrary stipulations, however, because the analysis is restricted to the constitutive principles of spatial geometry. In the development of constitutive principles for spacetime geometry, we see the fundamental role played by such conceptual analysis, and the comparative irrelevance of arbitrary stipulations. One clear illustration is the emergence of the Newtonian conception of inertia, beginning with the work of Galileo. As we saw, that force is proportional to acceleration, rather than to velocity or some other quantity, and inertia is thus resistance to acceleration rather than to change of position, were for Poincaré obvious examples of mere definitions. Galileo did not attempt to argue that these are factual claims arrived at by induction, but recognized them as interpretations of facts already known to the Aristotelians. So the question arises, what kind of non-circular argument could Galileo possibly provide for a definition, other than that it leads to a generally simpler system of physics? This question is especially pressing in view of the problem of “incommensurability,” for, in the absence of a plausible extension of his physical principles to all the phenomena embraced by Aristotle’s physics — an extension that was not available before Newton — this pragmatic argument from global simplicity is not one that Galileo was in a position to make.
The answer lies in the dialectical process described in the Dialogue Concerning the Two Chief World-Systems (1632), by which Galileo’s spokesman, Salviati, elicits assent to his conception from the Aristotelian Simplicio. Of course the conversion of Simplicio, and the dialogue as a whole, are highly contrived. But the general principle behind Galileo’s argument is more compelling: that his conception of inertia is already in use, implicitly, in familiar and well-established empirical judgments. The empirical facts are, apparently, that motion does not persist and that bodies come to rest on the earth when forces cease to move them. But it is equally apparent that in familiar cases of relative motion, we implicitly assume that motion does persist, and implicitly associate force with change of motion — for example, in the case of a horserider who throws an object directly to instead of in front of another rider, or of a shooter who follows a moving target with the gun-barrel instead of “leading” the target. And while Aristotle’s conception of motion may serve as an interpretation of the first set of facts, and may not directly conflict with the second, Galileo’s conception is implicit in the assumption — tacitly but successfully employed by anyone conducting experiments on a moving ship — that both sets are phenomena of essentially the same kind. To the challenge of incommensurability, then, Galileo could answer that just this conceptual analysis measures his conception against Aristotle’s and shows Galileo’s to be superior.6 Galileo’s analysis of motion falls short of establishing a constructive principle for spacetime geometry, because of the well-known fact that it remains ambiguous about the natural state of motion for bodies: either uniform motion in a straight line, or uniform circular motion (e.g., parallel to the surface of the earth) may be indistinguishable from rest. 
And this is in a sense appropriate, since the analysis attempts only to draw out the concepts implicit in the dynamics of motion near the earth’s surface; the precise Newtonian concept of inertia arises from the extension of Galilean dynamics to the entire planetary system, as first envisaged by Descartes and his followers. It is also well known that the spacetime structure proposed by Newton, “absolute space,” is not the structure implicit in his conception of inertia, but something stronger. What Newton’s laws enable us to construct is not absolute space, but an equivalence class of inertial frames; absolute space, however, makes just that distinction between uniform motion and rest that the equivalence of inertial frames denies. But two points about absolute space require emphasis. First, it is, in fact, a spacetime structure; as Newton defines it, at least, it implies the connection of space through time in such a way that states of motion are defined, albeit more states of motion than the dynamics can distinguish. Therefore Poincaré was wrong to think that what he calls “the relativity of space” implies the impossibility of absolute space (1913, pp. 84–85). For the former follows from his understanding of space through the group of rigid motions, which, again, is independent of any dynamical principle; it is perfectly compatible with a theory that connects homogeneous space through time in the manner of absolute space, as well as with the correct theory of Newtonian spacetime.7 Second, a
constructive principle for absolute space is easy to imagine, and was in fact imagined by Poincaré himself when he noted the possibility of defining force by velocity instead of acceleration; in that case rest and motion in absolute space would be as well defined as acceleration is in the Newtonian case, and one could attach a dynamical meaning to the claim that the world has the structure of absolute space, which for Poincaré was only a convention (1913, pp. 109–111). We have here another example of something that is a matter of convention from Poincaré’s view, only because the constructive basis of geometry is seen in exclusively spatial terms. This understanding of Newtonian spacetime goes against the familiar view of the positivists, on which not only absolute space, but also absolute time and absolute rotation, are outstanding examples of empirically ill-defined notions. Newton’s arguments in support of these notions, especially the “water-bucket” argument for absolute rotation, seemed to them to be illegitimate inferences from observation to metaphysical conclusions. It is now obvious, however, that Newton was not trying to infer the existence of “absolute rotation” from observations, but was defining absolute rotation as a theoretical quantity by exhibiting the phenomena that enable us to measure it.8 To this extent Newton’s proposal has the essential characteristics of a coordinative definition precisely in Reichenbach’s sense. To be fair, then, the positivists ought to have conceded Newton’s freedom to define rotation as he saw fit, provided that he could state — as he undoubtedly did — empirical criteria for the application of the concept. But Newton’s own defence of the definition is not merely that it has an empirical application as part of a useful conceptual framework.
Newton provides, in addition, a conceptual analysis similar to Galileo’s, with a similarly dialectical emphasis: he argues that this conception of rotation is implicit in the dynamical reasoning of his contemporaries, whatever their official pronouncements about the relativity of motion; in particular, it is already in use in their dynamical theory of celestial vortices.9 Indeed, the dynamical distinctions that Newton defines among states of motion — that is, the distinctions of absolute rotation and acceleration from uniform motion — are implicit in the 17th-century understanding of causal interaction: a body acts on another by changing its state of motion; non-uniform motion thus requires a causal explanation that uniform motion does not. On such grounds even Leibniz held that kinematically equivalent motions could be dynamically distinct. But if this implicit causal distinction is taken seriously, the relativist approach to the “system of the world” is untenable, and the issue between Copernicus and Ptolemy cannot be a matter of hypothesis or convention. For the concepts of inertia and force provide, for any system of bodies, procedures for constructing a frame of reference — an inertial frame — relative to which their states of motion correspond to their causal influences on one another, and these are the motions that Newton felt justified in calling the “true” motions. This process changes the fundamental question of cosmology: the question “which body is at rest?” is no longer appropriate, and is replaced
by empirical questions about the relative masses of the bodies and the location of their common centre of mass. Since the sun contains most of the mass of our system, Newton shows, it will never be far from the centre of mass, and so the heliocentric theory is a better approximation than the geocentric. If the mass were more evenly distributed, however, the difference between the two would be correspondingly less interesting; it could even be said to be a matter of convention for a system of two nearly equal masses. These aspects of Newton’s theory are difficult to appreciate from the logical positivists’ perspective. Reichenbach, for example, regarded Newton’s choice of the Copernican system as a coordinative definition of a rest-frame, motivated by the need to accommodate his theory of gravity in the simplest possible way. This made it difficult to recognize the constitutive principles of the Newtonian spatio-temporal framework, and their origin in a conceptual analysis of dynamics. On the contrary, the positivists regarded the framework as an unnecessary metaphysical addition to the dynamical theory. At the same time, they regarded Einstein’s theories as products of philosophical analysis, but, again, they understood this as “epistemological analysis” of the most reductive sort. Therefore they obscured the essential philosophical continuity between Newton’s and Einstein’s work, and the essential similarity of the conceptual analyses involved. As we have already seen, misunderstandings of this sort were encouraged by Einstein’s own remarks, particularly about general relativity as a reduction of geometry to coincidences. Even in the case of special relativity, there is some apparent encouragement for the positivists’ view of a formal structure connected to experience by stipulation. 
We know that Einstein (1905) derived the Lorentz contraction from the “relativity principle” in conjunction with the constancy of the velocity of light, and we know that the apparent contradiction between these two premises stems from the hidden assumption of absolute simultaneity; we can resolve the contradiction, then, by granting the relativity of simultaneity. From here we see that the Lorentz contraction, instead of being an “ad hoc” adjustment of Maxwell’s theory to the failure to detect the earth’s motion relative to the ether, follows logically and naturally from Einstein’s premises. But this reasoning only reveals the existence of two equivalent interpretations of the same facts: either the Lorentz contraction is genuine and explains the apparent invariance of the velocity of light, or the invariance of the velocity of light is genuine and explains the apparent Lorentz contraction. What is needed, in addition to the formal reasoning, is some justification for Einstein’s starting-point, which his discussion of simultaneity is obviously meant to supply. If the difference between the two interpretations hangs on the definition of simultaneity, however, then just to that extent it would appear to be a matter of convention. Einstein himself begins by asserting that a “common time” for different observers “cannot be defined at all unless we establish by definition that the ‘time’ required by light to travel from A to B equals the ‘time’ it requires to
travel from B to A” (1905, p. 40). Later discussions seem to portray this assumption of the isotropy of light-propagation as an arbitrary stipulation. In his popular exposition of his work (1917), he writes, “Only one requirement is to be set for the definition of simultaneity: that in every real case it provide an empirical decision about whether the concept to be defined applies or not”; that light takes the same amount of time to travel in both directions “is neither a supposition nor a hypothesis, but a stipulation that I can make according to my own free discretion, in order to achieve a definition of simultaneity” (1917, p. 15). In his Princeton lectures (1922), he says that “It is immaterial what kind of processes one chooses for such a definition of time,” except that it is “advantageous . . . to choose only those processes concerning which we know something certain” (1922, pp. 28–29). Remarks like these suggest that special relativity, as a theoretical framework, is connected with reality only by the choice of light-propagation as the standard of simultaneity. The best that one could say of the framework is that it is based on a “practical” procedure for determining which events are simultaneous; in the Newtonian framework, instantaneous causal propagation provides absolute simultaneity with a theoretical basis, but not a practical procedure. On a closer look at Einstein’s stipulation, however, we can discern the process of conceptual analysis that provides its physical and philosophical motivation. One aspect of the analysis has already been mentioned, that is, the analysis of the contradiction between the relativity principle and the light postulate, and the resulting recognition that their incompatibility depends on the assumption of absolute simultaneity; observers in relative motion can agree on the velocity of light only if they disagree on which events are simultaneous. 
A similar analysis shows that the concept of simultaneity is bound up with our measurements of spatial and temporal intervals, so that observers who disagree on which events are simultaneous have no common measure of length and time. These are familiar aspects of special relativity. But while they illuminate the distinction between Einstein’s framework and Lorentz’s and the assumptions on which each is founded, they don’t by themselves argue for either. The second, in particular, had already been articulated within the Newtonian framework, with no intention of questioning the framework, but merely in order to acknowledge its fundamental assumptions.10 The decisive analysis is the one that exhibits Einstein’s definition of simultaneity, not merely as a free stipulation that is logically unexceptionable, but as an account of the physical content of our empirical judgments of simultaneity. Thus it is not freely chosen, because it has a number of requirements to satisfy. Nor is the definition an operationalistic reduction of those judgments to practical procedures, because it makes essential use of theoretical principles. Einstein begins by proposing two practical procedures, each of which supplies an “operational definition” of simultaneity: to define “time” by “the position of the small hand of my watch,” and to coordinate the time of every event with a watch at a fixed location, by the time at which a light-signal from
each event reaches the watch. The first obviously fails to meet the requirement of defining simultaneity for distant events; the second meets that requirement, but “has the disadvantage that it is not independent of the standpoint of the observer” (1905, p. 39). From the rejection of these possibilities we see that Einstein is not only trying to coordinate the concept with a physical process of propagation. He is also trying to capture what is contained in the abstract notion of “absolute simultaneity” — not necessarily absolute simultaneity in Newton’s sense, but, at least, a criterion of simultaneity that does not depend on the standpoint of the observer, and that makes simultaneity a symmetric and transitive relation. Therefore the required coordination is not an arbitrary choice, for it has two independent motivations. On the one hand, it is “the most natural definition of simultaneity” (1917, p. 18); it is in fact the definition that human beings ordinarily use, insofar as we consider events to be simultaneous when we see them at the same time, without stopping to wonder whether this criterion would give the same results for observers in relative motion. On the other hand, once we raise the problem of relative motion, we need a criterion derived from an invariant process of propagation, and the invariance of the velocity of light uniquely meets this need. It is this unexpected accord between the “natural” definition of simultaneity and the empirically established invariance of the laws of electrodynamics, rather than the need for a stipulation or an operational definition, that Einstein’s conceptual analysis reveals. The need for an abstract notion of “absolute simultaneity” is an instructive point of comparison between Einstein and Newton. Our intuitive sense of simultaneity, based on seeing events at the same time, neglects both the motion of the earth and the time of propagation for light, which for most practical purposes is immeasurably small. 
From Newton and Einstein, respectively, we have two ways of abstracting from this criterion of simultaneity to arrive at one that has a theoretical basis in the laws of physics, and that is independent of the reference frame of the earth. Newton’s approach neglects as a matter of principle the time of propagation: the abstraction consists precisely in leaving out altogether the method of signalling, and assuming that which events are simultaneous is a matter of fact that does not depend on the standpoint of any observer. And this is no more than what is implicit in the principle that force, mass, and acceleration are independent of the motion of the observer, and the principle that gravitational attraction depends only on the masses and distances of the interacting bodies without reference to time. The latter provides (in principle) an instantiation of absolute simultaneity, rather than a practical means of determining it, but the phenomena addressed by Newton present no reason to suspect that this difference might be important. In any practical situation, it would seem intuitively obvious that the light-signalling method would provide, at least, a retrospective account of which past events were absolutely simultaneous — assuming that, in cases of relative motion, the Newtonian addition of velocities would apply to light signals as well.
Conventionalism and Modern Physics: a Re-Assessment
Einstein’s abstraction, by contrast, consists in giving precedence to the intuitive method of determining simultaneity, and asserting its independence of the motion of the observer — an entirely contingent assertion that was warranted by the state of electrodynamic theory and experiment in 1905, but that would have made little sense before then. It turns out, however, that this “absolute” criterion of simultaneity does not give the same results for observers in relative motion, but results that vary from frame to frame strictly according to the degree of relative motion. We can see from this that Einstein’s analysis of simultaneity was not, any more than Newton’s was, an epistemological reduction of the concept to purely phenomenal means of verification. Rather, each was an abstraction from the familiar concept, made possible by the contemporary state of development of theoretical physics. In other words, neither conceptual analysis is merely an analysis of “what we mean by simultaneity”; both are analyses of the relationship between what we ordinarily mean by simultaneity, and the meaning that is implicit in established theoretical principles. The foregoing highlights the difference between the positivists’ philosophical reconstructions of Einstein’s analysis of simultaneity, and its true philosophical content. Einstein’s analysis enabled Minkowski (1908) to formulate the space-time geometry that is implicit in special relativity, and, in particular, to see that the Lorentz transformations, rather than the Galilean, constitute the symmetry-group of spacetime. As Helmholtz and Poincaré had understood the notion of space through the possibility of certain spatial displacements, Minkowski recognized that the laws of Newtonian mechanics and special relativity enable us to understand a spatio-temporal structure through the possibility of certain “spatio-temporal displacements”: the coordinate transformations that preserve the dynamical invariants of each theory.
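The contrast between the two groups of transformations can be made concrete for a single boost. The following is a sketch in standard textbook notation, not taken from the text: the relative velocity v along the x-axis, the primed coordinates, and the factor γ are illustrative assumptions, with c the invariant velocity.

```latex
% Lorentz boost with relative velocity v along x (c = invariant velocity)
\begin{align*}
x' &= \gamma\,(x - v t), &
t' &= \gamma\left(t - \frac{v x}{c^{2}}\right), &
\gamma &= \frac{1}{\sqrt{1 - v^{2}/c^{2}}}.
\end{align*}
% Letting c go to infinity at fixed v: gamma -> 1 and v x / c^2 -> 0,
% which yields the Galilean boost
\begin{align*}
x' &= x - v t, &
t' &= t.
\end{align*}
```

In the Galilean case t' = t for every v, so simultaneity is the same in all frames; in the Lorentz case the term vx/c² makes t' depend on position, which is the frame-dependence of simultaneity discussed above.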
The symmetry group “G∞” of Newtonian mechanics defines one sort of spatiotemporal structure, while the symmetry group “Gc” of electrodynamics, as Einstein had shown, defines a different structure, and G∞ arises from Gc in the limit as the parameter c (the invariant velocity) goes to infinity. From the conventionalist point of view, to accept either of these as the structure of spacetime is to make an arbitrary stipulation. But this situation is not quite analogous to the one confronted by Poincaré. As we have seen, explicating the concept of space through the group of rigid motions identifies the physical principle that is constitutive of space, but leaves the precise geometry indeterminate. Minkowski’s conceptual analysis of spacetime, however, identifies spatiotemporal displacements as symmetries of the laws of physics. Therefore when we identify the constitutive principles of spacetime, or the fundamental physical laws, we are already determining the geometry of spacetime, or at least placing it beyond the reach of conventional choice.11 Minkowski makes it clear that his picture of spacetime is not founded on a stipulation, nor is it advanced as a hypothetical explanation of electrodynamic phenomena. Rather, it simply is the structure implicit in our understanding of the laws of electrodynamics: “Now the impulse and true motive for assuming the group Gc came from the fact that the differential equation
for the propagation of light in empty space possesses that group Gc” (1908, p. 81). That the electrodynamics of moving bodies possesses this structure is not decisive, of course, since, as Minkowski points out, the same structure characterizes Lorentz’s theory. But the decisive arguments had already been given by Einstein, who had recognized that “the time of the [moving] electron is just as good as that of the [electron at rest]” (1908, p. 82). In other words, that this structure expresses the symmetries of electrodynamics is a mathematical fact (one already noticed by Poincaré, who nonetheless held to the Lorentzian framework), but that this structure also expresses the fundamental symmetries of spacetime emerges from Einstein’s analysis of time. The cases of Newtonian mechanics and special relativity reveal, in sum, the manner in which the laws of physics serve as the constitutive principles of spacetime geometry, and the kind of conceptual analysis from which those principles have emerged. For general relativity, however, Einstein explicitly offered the kind of reductive epistemological analysis that we have already discussed, in order to eliminate not only the privileged status that the previous theories granted to inertial frames, but the physical objectivity of space and time in general. In all of this, a crucial role was played by the reduction of spatiotemporal measurement to the determination of coincidences. As we have seen, this conceptual analysis yields a completely general mathematical framework for spacetime geometry that appears to have no physical content.
But it is easy to see why it would not have appeared so to Einstein and the logical positivists: in addition to the principle of general covariance, which in itself functions as a kind of meta-principle, they assumed the (generally covariant) Einstein field equation, which does impose more definite constraints on spacetime than merely capturing “the objective spacetime coincidences”, and which constitutes a physical relation that is unchanged by the arbitrary change of coordinates. Thus the accounts of general covariance and of point-coincidences suggest a radical “geometrical relativity of space,” while the field equation saves the theory from physical vacuousness. In that case, however, the motivation for the field equation becomes a serious philosophical question. Einstein’s explicit philosophical starting-points — Mach’s principle, and the identification of general covariance with the “general relativity” of motion and the equivalence of all frames of reference — motivate, at most, a framework in which any Riemannian geometry is physically possible, and the homogeneous geometries of Newton and Minkowski appear to be relatively naive physical idealizations. This framework may perhaps raise the expectation of spacetime curvature, and it unquestionably played an important psychological and heuristic role in Einstein’s thought. It is a great leap, however, from such a general expectation to the theory of spacetime curvature as a physical quantity that depends on physical conditions. As Friedman (2001) points out, a crucial link was Einstein’s analysis of geometrical measurement on a rotating disc (cf. Einstein (1916), pp. 115–117), which provided his first glimpse of non-Euclidean geometry as a way of modelling a physical field, and the first step toward a
theory of non-Euclidean spacetime. But more than this is required to provide the basis for a constructive theory of spacetime geometry, in which curvature plays the role of the gravitational field. To the positivists, what was required was a stipulation — again, a stipulation that the spacetime geodesics are the paths of falling bodies. And we have already seen why this would have seemed plausible: not only because of the reduction of the objective content of geometry to coincidences, but because the identification of free-fall trajectories as geodesics seems to be a clear case of a coordinative definition. But we are now in a position to understand the origins of this definition in a conceptual analysis. The analysis starts from the empirical equivalence of inertial and gravitational mass, and the consequent indistinguishability of inertial motion from free fall.12 In Einstein’s well-known example, a frame of reference at rest in a homogeneous gravitational field, with gravitational acceleration g, is observationally indistinguishable from a frame with uniform acceleration –g; for the same reason, a freely-falling frame is indistinguishable from an inertial frame. It is also well known that in Einstein’s initial analysis, this indistinguishability was taken to indicate the physical equivalence not only of freely-falling and inertial frames, but of any frames whatsoever, and of all states of motion. To derive from this apparently destructive analysis a constructive basis for spacetime geometry, we have to see that it defines, in spite of the apparent arbitrariness, an objective physical quantity. From there we would see why the interpretation of gravitational free-fall as a privileged state of motion, and the trajectories of falling bodies as the constructive basis for spacetime geometry, is not a convention, but the outcome of an analysis of what is implicit in our knowledge of gravitational fields. 
For it follows from the equivalence principle that no actual measurement of gravitational acceleration is ever a measurement of deviation from a flat-spacetime geodesic — that is, the measured quantity is never absolute acceleration in Newton’s sense, but the relative acceleration of free-fall trajectories. If, after the example of Minkowski’s analysis of special relativity, we now ask what structure is exhibited by these trajectories, we naturally arrive at a spacetime whose curvature varies with the distribution of mass. This last inference may sound like a drastic oversimplification, but it is in fact a paraphrase of Einstein’s actual procedure in moving from “the general theory of relativity” as articulated in the first three sections of Einstein (1916) — in which spacetime is assumed to be locally Minkowskian, but otherwise open to arbitrary choices of reference-frame — to the theory of spacetime curvature as the objective expression of gravitational phenomena. For the first link between the generally covariant formalism and the physics of gravitation is the “equation of the geodetic line” (pp. 131–32): mathematically, it is independent of the choice of coordinates, and its physical correlate is the privileged state of motion of a particle, namely gravitational free-fall. From this perspective, the Newtonian account of free fall as “forced” deviation from geodesic motion turns out to depend precisely on the arbitrary choice of a coordinate system.
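For reference, the equation of the geodetic line mentioned here is usually written as follows. This is a sketch in modern textbook notation rather than a quotation from Einstein (1916); the symbols x^μ, τ, Γ and the Newtonian potential Φ are assumptions of the illustration.

```latex
% Geodesic equation: form-invariant under arbitrary coordinate changes
\[
\frac{d^{2}x^{\mu}}{d\tau^{2}}
  + \Gamma^{\mu}_{\alpha\beta}\,
    \frac{dx^{\alpha}}{d\tau}\,\frac{dx^{\beta}}{d\tau} = 0 .
\]
% Newtonian free fall in a potential \Phi, for comparison
\[
\frac{d^{2}x^{i}}{dt^{2}} = -\,\frac{\partial \Phi}{\partial x^{i}} .
\]
% Reading the second equation as a case of the first (with t as parameter)
% amounts to setting \Gamma^{i}_{tt} = \partial\Phi/\partial x^{i}.
```

On this reading the gravitational "force" term is absorbed into the connection coefficients, whose values depend on the coordinates chosen; what does not depend on that choice is whether a given trajectory satisfies the equation.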
For when we measure the acceleration of a falling body relative to a supposed inertial frame, such as the centre of mass frame of some system, we have no way to determine whether that frame itself is in free fall or inertial motion, since, by the equivalence principle, the system will behave in the same way in either case. What we actually measure, again, is merely the acceleration of the free-fall trajectories within the frame relative to the free-fall trajectory of the frame itself. Therefore to interpret the former as measures of “the gravitational field” is to make an arbitrary stipulation that the centre of mass follows a geodesic — a stipulation that manifestly amounts to a mere choice of coordinates; if the geodesic motions are to be objectively identified, and not merely stipulated, the free-fall trajectories are the only possible ones. But these trajectories have relative accelerations, and the relative acceleration of geodesics is a defining characteristic of curved spacetime. One could make the same point by arguing from the Newtonian field equation (the Poisson equation), which relates gravitational acceleration to the distribution of mass. In principle we could learn the absolute magnitude of the gravitational potential by applying this equation; in fact, however, what we actually measure is only the relative acceleration of free-fall trajectories, or the gravitational tidal field, which is independent of the free-fall motion of the entire system — a fact already exploited by Newton’s analysis of the gravitating system of Jupiter and its moons, whose interactions are (practically) independent of the system’s free-fall toward the Sun. It follows that the gravitational potential itself depends on the arbitrary designation of some freely-falling frame as an inertial frame.
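The coordinate-dependence of the potential, and the independence of the tidal field, can be sketched in standard notation. The symbols below (the potential Φ, the mass density ρ, a time-dependent frame displacement b(t), and the small separation ξ between neighbouring free-fall trajectories) are assumptions of this illustration, not the author's notation.

```latex
% Newtonian field equation and equation of motion
\[
\nabla^{2}\Phi = 4\pi G\rho, \qquad
\ddot{x}^{i} = -\,\partial_{i}\Phi .
\]
% Pass to a frame falling with arbitrary acceleration \ddot{b}(t):
\[
x'^{\,i} = x^{i} - b^{i}(t), \qquad
\Phi' = \Phi + \ddot{b}_{j}(t)\,x^{j}
\quad\Longrightarrow\quad
\ddot{x}'^{\,i} = -\,\partial_{i}\Phi' .
\]
% The added term is linear in x, so second derivatives are unchanged;
% to first order in \xi the relative acceleration is fixed by them alone:
\[
\partial_{i}\partial_{j}\Phi' = \partial_{i}\partial_{j}\Phi, \qquad
\ddot{\xi}^{\,i} = -\,(\partial_{i}\partial_{j}\Phi)\,\xi^{j} .
\]
% Relativistic counterpart (geodesic deviation; sign conventions vary):
\[
\frac{D^{2}\xi^{\mu}}{d\tau^{2}}
  = -\,R^{\mu}{}_{\alpha\nu\beta}\,u^{\alpha}\,\xi^{\nu}\,u^{\beta} .
\]
```

The potential and its gradient change with the choice of falling frame, while the second derivatives, which govern the relative acceleration of neighbouring trajectories, do not. In the relativistic equation the same role is played by the curvature tensor.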
By analogy to the argument about geodesic motion, if we seek to replace the coordinate-dependent “absolute” acceleration, in the field equation, with an objectively measurable quantity, we require a structure that simply represents the tidal accelerations themselves, without the arbitrary assumption of an inertial frame relative to which their true magnitudes are known. The identification of free fall as geodesic motion enables us to identify the required structure as the curvature of spacetime. The foregoing helps to clarify Einstein’s use of the principle of general covariance: hidden behind its destructive use to eliminate all objective spatiotemporal distinctions, we find a constructive conceptual analysis, showing that in our empirical knowledge of the gravitational field, there is an implicit distinction between objective physical quantities and coordinate-dependent quantities. But this distinction is not reflected in the spatio-temporal framework in which we ordinarily understand the field. In particular, the Newtonian definition of geodesic motion is embodied in Newton’s laws, but is not really in use in the analysis of any gravitating system; any such analysis implicitly uses the definition of privileged trajectory, and of privileged frame of reference, that Einstein derives from the equivalence principle. In some sense this was acknowledged by Newton himself in Corollary VI to the laws of motion, which states that a uniformly accelerated system of bodies will be indistinguishable from one in uniform non-rotational motion; this enabled him not only to treat
Jupiter and its moons as an isolated system, but also to treat the solar system itself as isolated, and to neglect any parallel acceleration of the whole by some other unknown forces, since by such forces “no change would happen in the situation of the planets to one another, nor any sensible effect follow” (1729, p. 558). In hindsight, this reasoning leads to a geometrical reformulation of Newtonian gravity, since it shows that the reasons to identify the gravitational field with the (curved) affine structure of spacetime are independent of the transition from Galilean to Lorentz invariance. To compare Newton’s and Einstein’s points of view, we should note that, analogously to their definitions of simultaneity, their respective definitions of geodesic arise from two ways of abstracting from the empirical conception of gravitational force. The empirical problem is to decompose the acceleration of any body into components, one for each of the action-reaction pairs of which the body is a member; this is Newton’s precise and general form of Galileo’s analysis of inertial and projectile motion (which was too closely bound to the reference frame of the earth). In actual cases, however, this decomposition fails to identify the inertial component of the body’s motion; for example, it identifies not the inertial component of a planet’s motion, but the component that is uniform with respect to the centre of mass of the solar system, which may itself be freely falling. Of course, by Newton’s third law, if the solar system is falling, then it belongs to some larger interacting system; if that system is falling, it must belong to some still larger system; and so on. Thus Newton’s conception of an inertial frame abstracts from this infinite regress, conceptually separating the process of decomposition from all finite physical systems in which it might conceivably be carried out.
This amounts to supposing that, in principle, all the contents of the universe might be included in one interacting system, and an acceleration of the centre of mass of the system would then be excluded by the laws of motion. Einstein’s abstraction, instead, identifies the actual process of decomposition as the definitive one: analogously to his method of determining simultaneity, it is this procedure that is the same for every actual physical system. And by this means he identifies the “local” inertial component as definitively inertial — that is, the free-fall trajectories as the genuinely inertial trajectories. It would seem, then, that Einstein’s conceptual analysis of gravity and inertia is one that Newton might already have undertaken, given the right mathematical tools. Yet on a closer consideration, we can see that the analysis is essentially contingent on the subsequent evolution of physics, and in particular on the state of electrodynamics in Einstein’s time. From Newton’s perspective, the indistinguishability of inertial motion and free fall does not necessarily undermine the global determination of spacetime geometry, because his theory of gravitation has no particular implications for the behaviour of light: to the extent that the mass of light is unknown, it is unclear whether optical phenomena ought to be subject to Corollary VI. This is why for Einstein’s theory, it is crucially important to have shown that no phenomena, mechanical or electro-
dynamical, behave differently in a freely-falling frame and in a Lorentz frame, and therefore there are no phenomena that provide a measure of the relative acceleration of the one frame and the other.13 And this is the deeper significance of the celebrated light-deflection observations, which supported Einstein’s inclusion of light among the phenomena governed by the equivalence principle. Newton might hold out the possibility that some optical, or other non-gravitational, effect could provide a constructive basis for flat spacetime geometry, distinct from the implicit geometry of free-fall trajectories (equivalently, for an inertial frame as distinct from a freely-falling frame); in this context, one might even be able to make sense of the conventionalist claim, that there is a free choice between two adequate constructive bases for spacetime geometry. Given Einstein’s extended principle of equivalence, however, any physical procedure for identifying an inertial trajectory (or a Lorentz frame) must fail to distinguish it from a free-fall trajectory (or a freely-falling frame). Moreover, we can see from this argument why the status of Einstein’s analysis remains contingent on the future development of physics: it leaves open the possibility that, at higher levels of precision or in novel experimental contexts (for example, those connected with quantum effects in gravitational fields), violations of the Einstein equivalence principle could weaken the grounds for identifying gravity with spacetime geometry.
4. Conclusion
In all of these historical cases, we find that the constitutive principles of spacetime geometry are definitions of a sort; more precisely, they are interpretive claims rather than empirical claims, for they propose that certain characteristic physical phenomena be interpreted through certain geometrical structures. Yet these definitions are in no sense mere conventions. Instead, each arises from a conceptual analysis of procedures of spatiotemporal measurement; in each case the definition is not chosen from among equivalent alternatives, but discovered to be implicit in current empirical principles at a critical moment in the history of physics. The logical positivists, as we have seen, had difficulty reconciling the interpretive aspect of Einstein’s principles with their constructive physical content. In Einstein’s conceptual analyses, these aspects are not only compatible, but inseparable. Moreover, the analyses justify the positivists’ sense that relativity represented a philosophical advance in our understanding of space and time, and not merely a set of new physical hypotheses. But they do so without appealing to narrow and unrealistic epistemological restrictions. Probably no general methodological rule can be given for deciding when enlarged empirical knowledge should lead to a re-evaluation of fundamental concepts; the disappointed expectation of such a rule is perhaps a major psychological motivation for the view that conceptual change is rationally inexplicable. But the kind of change introduced by Galileo, Newton, and Einstein arises from a critical conceptual engagement with an existing framework, an engagement that is philosophically comprehensible, but not as an instance of any methodological maxim. The great conceptual changes in spacetime theory thus vindicate, in a narrow sense, a central philosophical theme of logical positivism: that physical geometry constitutes a framework that makes ordinary empirical arguments and measurements possible, and therefore that arguments for the framework itself must be of a fundamentally different kind. To this narrow extent, moreover, the history of spacetime theory vindicates the positivists’ neo-Kantian association of space and time with the general problem of a priori knowledge. Their account of the a priori aspect of physical geometry improved upon Kant’s, at least, by recognizing the connection between geometrical postulates and physical hypotheses, and the interpretive character of the postulates. The positivists had learned, in other words, that the empirical content of the postulates derives from the physics of motion, while in form the postulates are more like analytic or meaning-constitutive principles than synthetic principles in Kant’s sense. And this reconciled their brand of apriorism with the historical contingency and mutability of geometry in a way that was impossible for Kant’s. Einstein’s theories, in particular, seemed to exemplify the idea that empirical geometry has the status of an a priori framework: first, because they seemed to arise from analysis rather than from ordinary empirical inference, and second, because they present spacetime structure as a background against which the forces of nature are defined and investigated. (Even in the case of general relativity, in which spacetime is no longer prior to gravity, a form of Poincaré’s hierarchy survives insofar as all non-gravitational interactions are defined with respect to the local Minkowski metric.)
The positivists were not completely misguided, therefore, in thinking that they had captured the Kantian idea of space and time as “conditions of the possibility” of ordinary empirical reasoning, while incorporating the insights of Helmholtz and Poincaré into the empirical origins of geometry. Like Kant in the 18th-century context, they were in a position to transcend metaphysical disputes about space and time — such as whether they are “substantival” or “relational” — by showing how spatio-temporal structures are presupposed by our usual reasoning about substances and their relations. By embracing conventionalism, however, the positivists went beyond acknowledging the distinction between a spatio-temporal framework and the kind of empirical reasoning that is possible within it; they claimed to remove the defining principles of the framework from theoretical reasoning altogether. This result would not arise for Kant, because he identified one particular framework as the sufficient and necessary condition of scientific reasoning. But by the end of the 19th century, the multiplicity of possible frameworks was an obvious fact. For the positivists, this fact had obvious implications for the Kantian idea of objective knowledge, as judgment conforming to the conditions of the possibility of experience. For it implied not only the “relativized a priori” — i.e., the contingency and mutability of the framework-constituting principles — but also the relativity of
objective knowledge itself, as something defined only with respect to a set of arbitrary stipulations. Carnap’s distinction between internal and external questions merely expressed this relativity in its starkest form: for the comparison among frameworks one had, at best, a neutral descriptive language rather than a theoretical or critical standpoint, so that the adoption of any framework was necessarily a matter of pragmatic decision. In the dispute about abstract entities in the foundations of mathematics (cf. Carnap (1956)), Carnap’s distinction may appear to play a modest clarificatory role; in the dispute between competing theories of spacetime geometry, it seems to deny that there is any serious issue concerning the structure of the physical world. By appreciating the role of conceptual analysis in conceptual change, we arrive at a subtler distinction than Carnap’s: between questions that take a particular framework for granted, and questions about the conceptual structure of the framework itself — in particular, questions about the relation between the explicit principles of the framework, on the one hand, and the concepts that are implicit in our empirical knowledge, on the other. Questions of the first kind may be internal, but questions of the second kind are not really external in Carnap’s sense. On Carnap’s view, for example, the question whether falling bodies follow spacetime geodesics is either an internal question about how geodesics are defined within a given spacetime theory, in which case it is answered by internal logical analysis, or a question about the expediency of adopting a framework that defines geodesics as the paths of falling bodies. But neither question could have motivated Einstein’s analysis of gravitation. 
Both the internal and the external questions take the competing frameworks as given, whereas the framework of curved spacetime is precisely the product of Einstein’s analysis; the given material for the analysis is only the known behaviour of falling bodies as understood within the flat-spacetime framework. Carnap’s distinction, in other words, does not comprehend the possibility of a conceptual analysis that discovers, within a given framework, the principle on which a radically new framework can be constructed. The failure to comprehend this possibility epitomizes the failure of conventionalism as a critique of the synthetic a priori. The conventionalists supposed that objective reasoning is either logical analysis of the structure of a framework — including the identification of its arbitrary stipulations — or empirical reasoning within the constraints of a framework. Conventionalism thus did not go beyond or even reject Kant’s notion of the synthetic a priori, but merely denied that any of our knowledge answers to that notion, and classified framework-constitutive principles as analytic. But the essential problem for Kant’s notion, in light of the developments in physical geometry over the last two centuries, was not the discovery that alternative frameworks could be arbitrarily defined and freely adopted. Rather, it was the realization — implicit already in Galileo and Newton, and clearly articulated by Helmholtz, Riemann, and Einstein — that the constitutive principles of physical geometry are not quite synthetic in Kant’s sense, and yet they are founded in our
empirical knowledge of physics; the revolutionary changes in these principles were in some sense changes of definition, and yet they express a deepening understanding of physical space and time. These aspects of the framework-constitutive principles help to explain Quine’s objections to calling them “true by convention”. Comprehending all of these aspects requires a subtler view of conceptual analysis than conventionalism allows, and a role for analysis beyond the logical reconstruction of existing theories. Above all, it requires an appreciation of the continuing interaction between conceptual analysis and the growth of empirical knowledge, and the decisive part that such interaction has played in the evolution of physics.
Notes 1. See, for example, Putnam (1974), Glymour (1977), and Norton (1995). 2. See Friedman, (1999c). For the sake of convenience, I follow Friedman in writing “the positivists” to refer, primarily, to Schlick, Reichenbach, and Carnap. 3. Reichenbach’s discussion of universal forces was endorsed by Carnap (cf. 1966, p. 171), but subsequently has been widely criticized; see, e.g., Torretti (1983) for an especially useful analysis. Still it should be recognized that, as an effort to judge possible coordinative definitions on methodological grounds — by proposing to eliminate hypothetical and undetectable “forces” that might be introduced in order to save a particular geometry — Reichenbach’s discussion anticipates some celebrated later attacks on conventionalism, such as Glymour (1977). 4. Cf. Friedman (1999b). According to Friedman, Reichenbach’s earlier position on the a priori constitutive principles of physics, in his 1920 work The Theory of Relativity and A Priori Knowledge (1965), was an insightful one that was largely obscured by his attempt to assimilate it to the conventionalism of Schlick in the 1927 work, The Philosophy of Space and Time (1957). Admitting the justice of this criticism, I suggest that Reichenbach recast his constitutive principles as conventions because he realized that such principles have a definitional aspect, even if they also appear to be suggested by the facts: “It is again a matter of fact that our world admits of a simple definition of congruence because of the factual relations holding for the behaviour of rigid rods; but this fact does not deprive the simple definition of its definitional character.” (1957, p. 17.) But he did not arrive at a characterization of those principles that would do justice both to their definitional character and to his earlier philosophical concerns. 
On the one hand, he came to share Schick’s view that these definitions are arbitrary; on the other hand, he continued to criticize Poincar´e for “overlook[ing] the possibility of making objective statements about real space” (1957, p. 36 n. 3). 5. See, for example, Schlick (1917), section VI, and Reichenbach (1957), section 36. 6. This illustrates Torretti’s conclusion that incommensurability is not a serious difficulty in a case where the new conceptual framework arises from conceptual criticism of the old, as does Galileo’s in relation to Aristotle (1989, section 2.5). It may also illustrate what Stein means by the “dialectic” of science, as a criticism of Carnap’s view that conceptual frameworks can be compared only on general pragmatic grounds (1992, pp. 291–292). 7. It is not generally acknowledged, in the literature on the “absolute-relational” controversy, that the familiar “indiscernibility” arguments against absolute space (as first presented by Leibniz) are, like Poincar´e’s, arguments from the homogeneity and isotropy of space; therefore, they are as irrelevant as Poincar´e’s arguments to the spatio-temporal structure identified by Newton as “absolute space.” The confusion arises from the failure to separate the question whether the structure of space allows for a distinguished position, and the question whether the structure of spacetime allows for a distinguished velocity. 8. This is documented in detail by Stein (1967).
9. Newton refers directly to the Cartesians’ use of the centrifugal forces in vortices in the causal explanation of planetary motion, and points out its implicit accord with his definition: “Thus, even in the system of those who hold that our heavens revolve below the heavens of the fixed stars and carry the planets around with them, the individual parts of the heavens, and the planets that are relatively at rest in the heavens to which they belong, are truly in motion. For they change their positions relative to one another (which is not the case with things that are truly at rest), and as they are carried around together with the heavens, they participate in the motions of the heavens and, being parts of revolving wholes, endeavour to recede from the axes of those wholes” (1726, p. 413). In his unpublished manuscript, “De gravitatione et aequipondio fluidorum,” he explicitly notes the discrepancy between Descartes’s relativistic definition of “motion in the philosophical sense,” and his use of “motion in the vulgar sense” for actual philosophical (i.e. physical) reasoning: “And since the whirling of the comet around the Sun in his philosophical sense does not cause a tendency to recede from the center, which a gyration in the vulgar sense can do, surely motion in the vulgar sense should be acknowledged, rather than the philosophical” (Hall and Hall, p. 125). This interpretation of Newton is documented at greater length in DiSalle (2002).
10. James Thomson articulated the connection between the assumption of absolute simultaneity and the measurement of length and time, in introducing the notion of (what we now call) an inertial frame in Newtonian mechanics (1884). See also Torretti (1983, pp. 52–53).
11. The qualification is required for the case of general relativity, in which the laws fix not the geometry itself, but the correspondence between the geometry and the distribution of matter.
12. For useful studies of Einstein’s use of the equivalence principle, see Torretti (1983, chapter 5.2), and Norton (1985).
13. Cf. Einstein: “But this view of ours [i.e. of the equivalence of a system K at rest in a homogeneous gravitational field, and a system K’ that is uniformly accelerating] will not have any deeper significance unless the systems K and K’ are equivalent with respect to all physical processes, that is, unless the laws of nature with respect to K are in entire agreement with those with respect to K’.” (1911, p. 101.)
References
Carnap, R. (1956), “Empiricism, Semantics, and Ontology” in: Meaning and Necessity, Supplement A, University of Chicago Press, pp. 205–221.
Carnap, R. (1963), “W. V. Quine on Logical Truth” in: The Philosophy of Rudolf Carnap, ed. by P. A. Schilpp, Open Court, La Salle, Illinois, pp. 915–922.
Carnap, R. (1966 [1995]), An Introduction to the Philosophy of Science, Dover Publications, New York (reprint).
DiSalle, R. (2002), “Newton’s Philosophical Analysis of Space and Time” in: The Cambridge Companion to Newton, edited by I. B. Cohen and G. Smith, Cambridge University Press, pp. 33–56.
Einstein, A. (1905), “On the Electrodynamics of Moving Bodies” in: Einstein, et al. (1952), pp. 35–65.
Einstein, A. (1911), “On the Influence of Gravitation on the Propagation of Light” in: Einstein, et al. (1952), pp. 97–108.
Einstein, A. (1916), “The Foundation of the General Theory of Relativity” in: Einstein, et al. (1952), pp. 109–164.
Einstein, A. (1917), Über die spezielle und die allgemeine Relativitätstheorie (Gemeinverständlich), second edition, Vieweg und Sohn, Braunschweig.
Einstein, A. (1922), The Meaning of Relativity, Princeton University Press.
Einstein, A., H. A. Lorentz, H. Minkowski, and H. Weyl (1952), The Principle of Relativity, edited by W. Perrett and G. B. Jeffery, Dover Books, New York.
Friedman, M. (1999a), Reconsidering Logical Positivism, Cambridge University Press.
Friedman, M. (1999b), “Geometry, Convention, and the Relativized A Priori: Reichenbach, Schlick, and Carnap” in: Friedman (1999a), pp. 59–70.
Friedman, M. (1999c), “Poincaré’s Conventionalism and the Logical Positivists” in: Friedman (1999a), pp. 71–86.
Friedman, M. (2002), “Geometry as a Branch of Physics: Background and Context for Einstein’s ‘Geometry and Experience’” in: Reading Natural Philosophy: Essays in the History and Philosophy of Science and Mathematics to Honor Howard Stein on his 70th Birthday, edited by D. Malament, Open Court Press, Chicago, pp. 193–229.
Galileo (1632 [1967]), Dialogue Concerning the Two Chief World Systems-Ptolemaic and Copernican, translated by S. Drake, University of California Press, Berkeley.
Glymour, C. (1977), “The Epistemology of Geometry” in: Nous 11, 227–251.
Hall, A. R. and M. B. Hall (eds.) (1962), Unpublished Scientific Papers of Isaac Newton, Cambridge University Press.
Helmholtz, H. (1884), “Ueber den Ursprung und die Bedeutung der geometrischen Axiome” in: Vorträge und Reden II, Vieweg und Sohn, Braunschweig, 1–31.
Majer, U. and H.-J. Schmidt (eds.) (1994), Semantical Aspects of Spacetime Theories, BI Wissenschaftsverlag, Mannheim.
Minkowski, H. (1908), “Space and Time” in: Einstein, et al. (1952), pp. 75–91.
Newton, I. (1726 [1999]), The Principia: Mathematical Principles of Natural Philosophy, translated by I. Bernard Cohen and A. Whitman, University of California Press, Berkeley.
Newton, I. (1729 [1962]), “The System of the World” in: Sir Isaac Newton’s Mathematical Principles of Natural Philosophy and his System of the World, edited by F. Cajori, translated by A. Motte, 2 vols., University of California Press, Berkeley.
Norton, J. (1985), “What Was Einstein’s Principle of Equivalence?” in: Studies in History and Philosophy of Science 16, pp. 203–246.
Norton, J. (1994), “Why Geometry is Not Conventional: the Verdict of Covariance Principles” in: U. Majer and H.-J. Schmidt (1994), pp. 159–168.
Poincaré, H. (1912), “L’Espace et le Temps” in: Dernières Pensées, Flammarion, Paris 1913, pp. 97–109.
Poincaré, H. (1913), The Foundations of Science: Science and Hypothesis; The Value of Science; Science and Method, translated by G. B. Halsted, PA: The Science Press, Lancaster.
Putnam, H. (1974), “The Refutation of Conventionalism” in: Nous 8, 25–40.
Reichenbach, H. (1957), The Philosophy of Space and Time, translated by M. Reichenbach, Dover Publications, New York; originally published as Philosophie der Raum-Zeit-Lehre, Berlin 1927.
Reichenbach, H. (1965), The Theory of Relativity and A Priori Knowledge, translated by M. Reichenbach, University of California Press, Berkeley; originally published as Relativitätstheorie und Erkenntnis Apriori, Berlin 1920.
Riemann, B. (1867), “Über die Hypothesen, die der Geometrie zu Grunde liegen” in: Abhandlungen der königlichen Gesellschaft der Wissenschaften zu Göttingen 13, 133–152; translated as “On the Hypotheses Which Lie at the Foundations of Geometry” in: A Sourcebook in Mathematics, ed. by D. E. Smith, McGraw Hill, New York 1929, pp. 411–425.
Schlick, M. (1917), Raum und Zeit in der gegenwärtigen Physik. Zur Einführung in das Verständnis der Relativitäts- und Gravitationstheorie, Berlin.
Stein, H. (1977), “Some Philosophical Prehistory of General Relativity” in: Foundations of Space-Time Theories, Minnesota Studies in Philosophy of Science 8, ed. by Earman, Glymour and Stachel, University of Minnesota Press, Minneapolis, pp. 3–49.
Stein, H. (1992), “Was Carnap Entirely Wrong, After All?” in: Synthese 93, 275–295.
Thomson, J. (1884), “On the Law of Inertia; the Principle of Chronometry; and the Principle of Absolute Clinural Rest, and of Absolute Rotation” in: Proceedings of the Royal Society of Edinburgh 12, 568–578.
Torretti, R. (1983), Relativity and Geometry, Pergamon Press, Oxford.
Torretti, R. (1989), Creative Understanding, University of Chicago Press.
INTUITION AND THE AXIOMATIC METHOD IN HILBERT’S FOUNDATION OF PHYSICS
HILBERT’S IDEA OF A RECURSIVE EPISTEMOLOGY IN HIS THIRD HAMBURG LECTURE
Ulrich Majer Universität Göttingen, Germany
Tilman Sauer California Institute of Technology, Pasadena, U.S.A.
In July 1923, two years after his lectures on the “New Foundation of Mathematics” (Hilbert (1922)) and six months before the republication of his two notes “The Foundations of Physics” in merged and revised form (Hilbert (1924)), Hilbert presented a trio of lectures in Hamburg. Notes for these lectures are extant in the Hilbert Papers in Göttingen and will be published in Volume 5 of David Hilbert’s Foundational Lectures (forthcoming from Springer) under the title “Fundamental Questions of Modern Physics”; the third lecture was presented again some months later in Zürich. This third lecture is of particular interest for philosophers since, as its title indicates, it deals in an unusually explicit manner with epistemological questions associated with modern mathematical physics.1 Moreover, Hilbert claims that he has found a definitive answer to a fundamental philosophical question:
The last lecture . . . deals with a general philosophical question in such a way that a definitive answer will emerge. . . . Although they are philosophically general assertions, I wish it to be the case that they are just as definite and just as certain as if they were pure mathematical assertions.2
This is a strong claim and one might suspect that Hilbert — already in his sixties — had fallen victim to a certain kind of hubris, typical perhaps for mathematicians who have “left reality behind”. In contrast to this suspicion, it is one of the aims of this paper to show that Hilbert tackled a serious philosophical question, a question inherited from Kant, and that he had good reasons to hope
213 E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 213–233. © 2006 Springer. Printed in the Netherlands.
that he could answer it in a definitive way by means of the “axiomatic method” and by some new considerations. The question at stake is the classical epistemological question: what do we know a priori and what part of our knowledge comes from experience. This question, which came into focus with Kant’s transcendental philosophy, can be answered in fundamentally different ways, and the debate over which of the different answers is the ‘right’ or the most appropriate one became one of the main obstacles and sources of frustration of twentieth century philosophy. However, before we enter the debate and present Hilbert’s answer as a reasonable option, we have to make some remarks concerning our further procedure. At first glance, the third lecture looks as if Hilbert was defending a naive view of empiricism. Yet, upon closer inspection it becomes clear that, on the contrary, Hilbert favours a moderate kind of ‘apriorism,’ in accordance with his new ‘finite point of view [finiter Standpunkt]’ regarding the foundations of mathematics. In Hilbert’s ‘finitism’ intuition plays an indispensable role for a firm foundation of mathematics. At the same time, Hilbert seems to defend a kind of radical empiricism with respect to geometry and all more comprehensive physical theories including an all-embracing field theory based on what he called “world equations” (see below). The two “-isms” taken together appear to be rather incoherent or even inconsistent. This would be the end of Hilbert’s philosophical ambitions, and his claim that he is going to answer a philosophical question “definitively” could be dismissed as what it appeared to be in the first place, a rather arrogant pretension of a mathematician with little knowledge of serious philosophy. 
However, a closer analysis of the three lectures, and of earlier lectures on “Nature and Mathematical Cognition”3, shows that Hilbert knew precisely what he was talking about, not only with respect to physics, but also with regard to philosophy. The main philosophical dogma that Hilbert is combating is not Kant’s theory of the apriority of mathematical knowledge based on pure intuition, at least not in the first place. What he is really arguing against is Poincaré’s geometrical conventionalism, including Einstein’s appreciation of it. This raises the delicate question as to Hilbert’s own epistemological position with respect to mathematics and geometry as well as with respect to the natural sciences in general, and as to the role that the highly controversial notion of “intuition” played in it. In answering these questions, we will follow a strategy that is already entailed by Hilbert’s discussion of the matter. The strategy follows a distinction made with respect to the concept of experience or, respectively, with respect to the notion of the a priori. Roughly, we will distinguish three kinds or levels of experience. First, we have a kind of “pre-scientific” experience, which we need in order to survive. Following Hilbert, we will call this type of experience “the experience of everyday life [die Erfahrung des täglichen Lebens]”. The second kind of experience is the conscious or “scientific” experience in the broadest sense of the term. It began some thousands of years ago in the
Near East with reflection on man’s position in the world and his relation to the gods. It resulted, in ancient Greece, in a kind of speculative or theoretical knowledge about man and his place in nature, in particular about geometry, astronomy and the social and political order of human societies. The development of knowledge based on “scientific experience” is an ongoing process and has meanwhile led to a huge corpus of knowledge, theoretical as well as practical, particularly in the domain of mathematics and the natural sciences.4 There is, however, according to Hilbert, a third kind or level of experience that is the result of an emancipation from our anthropomorphic point of view and the aspiration to a totally objective form of knowledge. The latter goal cannot be achieved in a single stroke, but only by a stepwise “emancipation” from the human forms of cognition. The best example to illustrate what is meant by such an emancipation is the realisation that the so-called secondary qualities of our senses like colour, tone, heat etc. do not exist in the objects themselves which cause these sensations, but only in our conscious perceptions. They are, as Helmholtz expressed it in concordance with his teacher Johannes Müller, only a “function” of the “specific energies” of our senses by which we contact the world outside our mind.5 As this example shows, the emancipation from the anthropomorphic point of view requires a sharp distinction between the merely ‘subjective’ and the truly ‘objective’ moments in our scientific, mathematical description of the world. This is an extremely difficult task, because we are in most cases completely unaware of the subjective aspects and erroneously take them for objective properties of things, as the example of secondary qualities like colours shows. The search for a totally objective form of knowledge requires, as we will see, a never-ending recursive investigation.
In the following, we will characterize Hilbert’s epistemological position and explain why his answer to the question of the frontier between knowledge by experience and knowledge a priori seems to us well-founded. Our explanation will make use of the three notions of “experience”, as outlined above, and of their corresponding negations, which are tantamount to three distinct concepts of the a priori. We will explain what this means in more detail below. For now, it is sufficient to say that this distinction is motivated by the idea that the portion of ‘a priori’ knowledge shrinks during the cultural evolution of man, and, in particular, with the development of science. This leads to an epistemological point of view that one of us (UM) has called “recursive epistemology”, a term which does not occur in Hilbert’s writings. In explaining this point of view, we will now follow to a certain extent Hilbert’s own argumentation.
1. Hilbert’s basic attitude towards the “objectivity” and “self-sufficiency” of a physical theory
Hilbert begins his considerations in the first talk with a distinction between two areas of knowledge. This distinction may easily be overlooked but will later turn out to be very important.
If we want to achieve an overview of the domain of those sciences which rest on the contemplation of inanimate nature, we have to take into account all facts, results and concept formations that belong to the domains of mathematics, physics, astronomy, chemistry, technology, and all neighbouring sciences. This complex of knowledge is huge in size and extension and finely and widely ramified; and yet it is marked off sharply from the other domains of human knowledge, namely the biological sciences and even more from those, in which the human being as such is the focus of interest.6
In other words, Hilbert distinguishes two different domains of natural sciences: the domain of ‘inanimate’ nature, which is the proper domain of physics in the widest sense, and the domain of living beings including “human beings as such”, which is the domain of biology and the social and cultural sciences. One may object that this distinction is very problematic, if only because the laws of physics are universally valid and living beings — humans included — are no exception to these laws. But this argument leaves open the possibility that we need special, and (presumably) totally different types of laws than those of present physics in order to understand the phenomena of life in general and of man in particular. In fact, Hilbert has a particular reason for his rigorous distinction of the two domains of knowledge. It has to do with the special status of the ‘kinetic theory of gases’ as a statistical theory of apparently ‘irreversible’ processes in nature. We will come to this point later, when we discuss the question of whether in Hilbert’s view irreversibility is an ‘objective,’ or only an apparent phenomenon, conditioned unavoidably by our anthropomorphic point of view. The first and second of Hilbert’s three Hamburg lectures are both centred around two new meta-scientific concepts. They were introduced by Hilbert in order to explain and clarify the special epistemological significance that the system of “world-equations” has in the whole edifice of physics, i.e., in our knowledge of inanimate nature. The first of these two meta-scientific concepts is the notion of “objectivity”, the second that of “accessorial” laws, concepts and principles. Both notions are indispensable for a proper understanding of Hilbert’s epistemology and its recursive nature. However, before discussing these notions, we have to mention two presuppositions taken for granted by Hilbert. 
First, for Hilbert, modern physics is unthinkable without the concept formations, ideas, and methods of mathematics. Specifically, the concept of number has a central position in mathematics, and therefore numbers play a decisive role in physics too. But numbers as such form no part of inanimate nature, they are, so to speak, alien to nature.7 Hence, for the mathematical description of nature by numbers, a “mediation” is required between numbers and the objects and processes to be described. This mediation is achieved by the idea of a “coordinate system”. Let us now return to the notion of objectivity and ask: What did Hilbert understand by “objective”? According to Hilbert’s first Hamburg lecture, a system of space-time coordinates can achieve this mediation only if a certain principle, the principle of objectivity, is satisfied:
A sentence about nature, expressed in coordinates, is a proposition about the objects in nature only if it [the sentence] has a content which is independent of the coordinates. For example, “The x-coordinate of Hamburg is 100 km.” is a sentence which expresses no proposition about reality. The coordinates, which were necessary [for the representation of the objects in space and time], have to be, as it were, eliminated again.8
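Hilbert’s Hamburg example can be illustrated by a minimal worked sketch. The symbols below (two Cartesian coordinate systems K and K′ on a plane, a rotation R, a shift a, and two points P and Q) are assumptions introduced only for this illustration and are not Hilbert’s notation.

```latex
% Illustrative sketch only: K, K', R, a, P, Q are assumed symbols, not Hilbert's.
% Change of coordinates from K to K' by a rotation R and a shift a:
\[
  \mathbf{x}' = R\,\mathbf{x} + \mathbf{a}, \qquad R^{\mathsf T} R = I .
\]
% The value of a single coordinate of a point P depends on the system chosen:
\[
  x'_P \neq x_P \quad \text{in general},
\]
% whereas the distance between two points P and Q does not:
\[
  \lvert \mathbf{x}'_P - \mathbf{x}'_Q \rvert^{2}
  = (\mathbf{x}_P - \mathbf{x}_Q)^{\mathsf T} R^{\mathsf T} R\,(\mathbf{x}_P - \mathbf{x}_Q)
  = \lvert \mathbf{x}_P - \mathbf{x}_Q \rvert^{2} .
\]
```

On this reading, a sentence of the first kind (“the x-coordinate of Hamburg is 100 km”) says nothing until a coordinate system is fixed, while a sentence about the distance between two cities has the same content in every system of the family considered.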
This emancipation from the coordinate system, Hilbert observes, can be achieved in three different ways, which correspond to the three forms of singular, particular and general judgements. First, by presenting a concrete object which fixes the coordinate system. Second, when we have an assertion stating the existence of a coordinate system in which all the relations formulated between the objects under consideration are valid. Third, if the proposition in question is valid in every coordinate system. Hilbert then goes on to discuss which of the three admissible ways of emancipation from a coordinate system is the most satisfactory. Not unexpectedly, he concludes that it is the third one, and this is the form which is realised in his “world-equations”. Let us recall at this point what those world equations were. In his first communication on the foundations of physics from 1915,9 Hilbert had presented what he called “fundamental equations of physics”, the basic characteristics of which were the following. Hilbert assumed from the outset a four-dimensional real manifold that was equipped with a real, symmetric tensor field and a real four-vector field. The tensor field was interpreted as a metric field in the sense of Einstein’s generalized theory of relativity, the vector field was interpreted as the electromagnetic four-potential. Specifically, Hilbert then formulated two fundamental axioms of physics. The first axiom of the theory postulated the existence of an invariant, i.e., scalar function, a so-called “world-function”, that would depend on the components of both the metric tensor field and of the electromagnetic four-potential. The second axiom postulated field equations for those components as solutions to the variational problem of finding the extremum for the invariant space-time integral over that world-function. 
As in standard general relativity, variation of the integral of the world-function with respect to the components of the tensor field yields the gravitational field equations, while a variation with respect to the components of the vector field yields electromagnetic field equations. Global or topological questions were not addressed, but Hilbert did discuss the implications of the postulate of general diffeomorphism covariance with respect to the independence of the field equations. As a consequence of Noether’s theorem, proved in generality only later in 1918, four of the fourteen field equations, obtained by variations of the action integral with respect to the tensor and vector field components respectively, are not independent of the other ten. This fact was interpreted by Hilbert as the mathematical expression of the assertion that the electromagnetic field equations are, in fact, a consequence of the gravitational ones, or, more generally, as the mathematical expression of an inner unity of gravitation and electromagnetism.10
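The structure just described can be summarised schematically. The following is a sketch in modern notation, loosely modelled on Hilbert’s 1915 note; the symbols H (world function), g_{μν} (metric), q_σ (four-potential), the dependence on derivatives, and the form of the volume element are assumptions of this illustration rather than quotations from the lectures.

```latex
% Schematic sketch; notation and conventions are assumptions of this example.
% World function: an invariant built from the metric, the four-potential
% and their derivatives.
\[
  H = H\bigl(g_{\mu\nu},\ \partial_\lambda g_{\mu\nu},\ \partial_\lambda\partial_\kappa g_{\mu\nu},\
             q_\sigma,\ \partial_\lambda q_\sigma\bigr)
\]
% Variational principle for the invariant space-time integral:
\[
  \delta \int H \sqrt{g}\; d^{4}x = 0, \qquad g = \lvert \det(g_{\mu\nu}) \rvert .
\]
% Resulting field equations: ten gravitational, four electromagnetic.
\[
  \frac{\delta(\sqrt{g}\,H)}{\delta g^{\mu\nu}} = 0 \quad (10\ \text{equations}),
  \qquad
  \frac{\delta(\sqrt{g}\,H)}{\delta q_\sigma} = 0 \quad (4\ \text{equations}).
\]
% General covariance leaves four arbitrary functions in the solutions,
% hence four identities among the fourteen equations:
\[
  14\ \text{equations} - 4\ \text{identities} = 10\ \text{independent equations}.
\]
```

The counting in the last line is what underlies the statement that four of the fourteen equations are not independent of the other ten.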
The two axioms actually do not yet fix the precise form of the “world equations”, but rather define a whole class of field equations. In order to obtain a specific set of field equations, a definite form of the “world function” has to be specified. In his 1915 communication, Hilbert had assumed, in this respect, that the world function was a sum of a gravitational and an electromagnetic part, and he had further specified the gravitational part to be given by the Riemann curvature scalar, and the electromagnetic part to be given as a linear combination of invariant terms that only depend on the vector potential and its first derivatives. With these specifications, the gravitational field equations take the form of the field equations of Einstein’s General Theory of Relativity, with a matter term that would only contain electromagnetic contributions. The precise form of this matter term as well as the precise form of the electromagnetic field equations would depend on the precise form of the electromagnetic part of the “world function”, but Hilbert did not definitively put forward just one. If, as in standard General Relativity, the electromagnetic part were proportional to just the square of the electromagnetic field, given by the derivatives of the potential, the resulting matter term and field equations would be the classical Maxwellian equations. But Hilbert explicitly allowed contributions to the electromagnetic part of the “world function” that would explicitly depend on the components of the potential itself. Those contributions would yield non-linear modifications of Maxwell’s equations and, following Mie’s ideas on an electromagnetic theory of matter, Hilbert was indeed hoping that such non-linearities could be used in order to explain the structure of matter.
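How a dependence on the potential itself modifies Maxwell’s equations can be seen in a short worked sketch. The notation (K for the curvature scalar, L for the electromagnetic part, F_{μν} for the field, q for the invariant square of the potential, a constant α ≠ 0 and a function f) and the sign conventions are assumptions made for this illustration; the function f is left generic and is not meant to reproduce any particular choice of Mie or Hilbert.

```latex
% Worked sketch; K, L, F, q, alpha, f and all sign conventions are assumptions.
\[
  H = K + L, \qquad
  F_{\mu\nu} = \partial_\mu q_\nu - \partial_\nu q_\mu, \qquad
  q = g^{\mu\nu} q_\mu q_\nu .
\]
% Electromagnetic part proportional to the square of the field only:
\[
  L = \alpha\, F_{\mu\nu}F^{\mu\nu}
  \;\Longrightarrow\;
  \nabla_\mu F^{\mu\nu} = 0
  \quad \text{(source-free Maxwell equations)} .
\]
% Adding a term that depends on the potential itself. Varying q_nu gives
%   delta(F_{mu nu} F^{mu nu}) = 4 F^{mu nu} \partial_mu (delta q_nu),
%   delta f(q)                 = 2 f'(q) q^nu (delta q_nu),
% and, after an integration by parts in the action integral:
\[
  L = \alpha\, F_{\mu\nu}F^{\mu\nu} + f(q)
  \;\Longrightarrow\;
  2\alpha\, \nabla_\mu F^{\mu\nu} = f'(q)\, q^{\nu} .
\]
```

The right-hand side acts as a current built from the field itself, and the equation is non-linear in the potential whenever f′(q) is not constant; this is the kind of modification that, on Mie’s programme, was hoped to account for matter.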
However, he had not found a specific form of the electromagnetic part of the “world function” that would explicitly allow the writing down of non-linear field equations whose exact solutions were known and could be interpreted in a sensible way.11 The specific form of the electromagnetic part that Hilbert did discuss in his 1915 note was the one that Mie had discussed in his work for the Lorentz-invariant, i.e., special relativistic, case. But Mie had already found that the explicit solutions of the resulting equations were unacceptable for reasons of physics.
The world equations of Hilbert’s 1923 Hamburg lectures are just the ones that he had advanced eight years earlier in his 1915 note, except for the fact that he now gives a different explicit form of the electromagnetic part of the “world function.” He still does not give an explicit solution to the resulting field equation but rather confines himself to an aside on a recent article by Einstein on unified field theory. Picking up Eddington’s approach towards a unified field theory based on the affine connection, Einstein had proposed field equations for Eddington’s theory and, in a follow-up note, had reintroduced the metric field and the vector potential as dynamical variables of the resulting theory. The latter fact induced Hilbert to comment on Einstein’s work as a confirmation of his own original approach. The explicit form of the “world function” given in the Hamburg lectures has the usual Maxwell term quadratic in the fields but it also adds a term proportional to the square of the vector potential. As we have said, Hilbert did not discuss explicit solutions of the corresponding field equations and their physical interpretation. But the task of obtaining explicit field equations from the “variational Ansatz” and of finding explicit integrals for those field equations was at the core of Hilbert’s research program to find a unified field theory. The hope was that a “world function” could be found whose corresponding physical fields might be interpreted as particle-like configurations of the field. Einstein, too, responding to contemporary approaches towards a unified field theory of gravitation and electromagnetism, began to work along those lines and, in fact, made the requirement of the existence of a stationary, spatially spherically symmetric, non-singular solution of any field equation a touchstone of the physical viability of any such theory. Both for Einstein and for Hilbert, and for many other contemporaries, therefore, the existence of “world equations” implied an unfulfilled research program. In his Hamburg lectures, the fact that Hilbert could not give an explicit solution to his “world equations” is reflected by a cautious disclaimer that he would reserve the right to supplement or modify the field equations in any way that may turn out to be necessary.
The important point in our context concerns the fundamental significance of the “world equations”. As long as the viability of the Mie-Hilbert research program had not been refuted, the “world equations” contained a theory of matter, if only implicitly. In this sense, it overcomes the duality of matter and motion inherent in classical physics à la Newton and Lorentz. And it is this characteristic of Hilbert’s all-embracing field theory that is at the basis of his further epistemological considerations.
In the second talk, Hilbert introduces, for the first time as far as we know, the notion of “accessorial” laws, concepts and principles.
This notion is not easy to grasp. Why does Hilbert introduce this notion at all? What does it mean? And what is it good for? We hope we can answer these questions properly. Here is Hilbert’s first introduction of the term:
. . . I will call everything that has to be added to the world-equations in order to understand the events in inanimate nature accessorial for short.12
This does not sound very clear and promising, and yet it is a very important notion, as we will see in a moment. If we ask whether the “world equations” stand in need of a supplementation in order to determine the potentials as functions of place and time, the answer is yes: we need initial conditions and constraints in order to obtain a determinate solution of the world equations. This means that these equations only allow us to predict future events if we know enough about the present state of affairs. In this respect, the world equations resemble Newton’s equations of motion. But there is an important difference, one which Hilbert wishes to capture with his notion of “accessorial”. This difference becomes visible if we ask: Do we need accessorial laws of nature? In the case of the “world equations” the answer is that we do not, but in Newton’s case the answer is that we do. The reason for this difference is related to the circumstance that the world equations permit propositions about the present state of nature, without the support of accessorial laws, whereas Newton’s equations do not; they are in need of accessorial laws. This explication will, we hope, become clearer if we consider some of the examples which Hilbert discusses.
The first example suggesting that accessorial laws are necessary is that of thermodynamics. The symmetry properties of the “world equations” imply that each process in nature can be reversed. In fact, reversibility is a necessary consequence of the invariance of the world equations under general coordinate transformations. But the law of reversibility “brings us in conflict with our everyday perceptions and with the ordinary disposition in our thinking”13. Furthermore, we know from thermodynamics that irreversible processes play an important role in nature. Consequently, one could expect that we may run here into an accessorial law. Hence, we have to ask, according to Hilbert, whether the doctrine of increasing entropy and the assumption of a privileged direction of time are necessarily connected. Hilbert’s answer is, of course, no. In the context of this paper, we cannot explain how he justifies this answer.14 For the present purposes, it will suffice to explain the quintessence of Hilbert’s reasoning, i.e., why he thinks that irreversibility is no “objective” property of nature in the sense already defined, but only the effect of our anthropomorphic point of view. According to Hilbert, we must be aware of the following: in the explanation of irreversible processes by means of statistical mechanics, the asymmetry regarding past and future is generated by nothing other than a special selection of initial states and conditions. Consequently, irreversibility is no objective property of inanimate nature but only the result of our anthropocentric point of view.
To put it differently, the imagined privileged direction of time is just that, solely an idea of our minds, and not an objective property of lifeless nature. For this very reason, thermodynamics does not imply any accessorial laws that we should add to the world equations. The second example is the assumption that all matter is made out of molecules and atoms, and the latter in turn out of electrons and protons. This assumption is called the “principle of atomism”. Is this principle an accessorial law? At first glance it seems so, and if we took Newton’s equations as fundamental instead of the “world equations” this would certainly be the case. But with the “world equations” things are different; here the answer is negative, and we must understand why. According to Hilbert, the “world equations” alone are sufficient to deduce not only the existence, but also the structure and properties, of matter, at least in principle. However, with this assertion Hilbert is not on the same safe ground as in the case before, and this is presumably the reason why he adds the remark that the world equations perhaps have to be modified in order to reach the goal of an all-embracing theory. Be that as it may. What Hilbert has in mind will become clear from the following two quotations, which are given here in reverse order:
At the beginning of today’s lecture, I pointed to the circumstance that the world equations, because they are differential equations, can’t determine as such definitive solutions, and for this reason it seems as if the world equations can only serve to predict the future from the present that is known to us, whereas we get nothing about the present state of things from them. We have seen that the contrary is the case.15
and
The thought (supported by me here) that all physical being and happening is ruled by a world law, one which has the form of a generally invariant system of partial differential equations, has a predecessor in the former theoretical ideal of Mechanism. But the difference from earlier, the change with respect to the former point of view is very significant. . . . Whereas previously the multiplicity of substances and physical constants were accepted as something final, and the so-called general properties of bodies were simply taken as basic forces each on a par, we assume today that there are only two different constituents of matter, electrons and positive nuclei, from whose composition and relative motion we can explain the chemical and physical properties of substances, and we also regard their existence as a consequence of the basic equations.16
The circumstance that the “world equations” do not need to be supplemented by accessorial laws in order to develop a theory of matter does not mean that we do not need accessorial ideas and principles at all. On the contrary, demands such as stability and periodicity of the solutions of the world equations are essential for a proper application of these equations to nature. However, Hilbert emphasises that these accessorial demands do not have the mathematical character of new equations, but that they are of a more general nature; they are related to our thinking about, and our attitudes towards, nature. For example, the assumption that we can apply the principle of probability to nature is such an accessorial idea of our mind, which has no immediate reference in inanimate nature. The fact that we need such accessorial ideas and principles expressing our fundamental attitudes towards nature is one reason why Hilbert is so concerned to distinguish carefully between physics on the one hand, and the biological sciences on the other, including anthropology and the cultural sciences.17 This observation brings us back to the main question: What is Hilbert’s epistemology, if he has a consistent approach at all to the question of how we come to know?
2. Hilbert’s basic attitude towards the epistemological question regarding the sources of our knowledge
We shall begin our considerations of Hilbert’s epistemology with a somewhat irritating quotation from the opening remarks of his third lecture: he repeats the main point of the second lecture that the world equations do not require supplementation by accessorial laws because they are, so to speak, autarchic or self-sufficient, although for their application to the world we need, as we have seen, some accessorial ideas and principles such as stability and the principle of probability. Hilbert then continues:
Here we face a decision about an important philosophical problem, namely the old question as to the share that our thinking, on the one hand, and our experience, on the other, have in our knowledge. This old question is a legitimate one, because to answer it means in the end to settle the question of the nature of our scientific knowledge, and in what sense the knowledge that we gather in the pursuit of science is truth.18
Obviously, Hilbert here recapitulates Kant’s epistemological question regarding the different sources of knowledge and makes it his own. He certainly does not side with the most extreme form of apriorism, according to which all knowledge about the world can be deduced from mere conceptual thinking, a form to which Weyl’s “unified field theory” came dangerously close, as Hilbert had pointed out already in 1919. Although the articulation of the question looks rather naive, its division of knowledge into two complementary parts points in the right direction. One part of our knowledge is a priori in the sense that it is a pure product of our thinking. The other part is empirical; i.e., it has its origin, not exclusively in our thinking, but also in nature. This rather crude distinction will be refined later on, but first let us ask why Hilbert is so confident that he can answer the old question in a “definitive” way, even though philosophers (up to the present day) have not been able to agree on the right answer.
Hilbert gives two reasons for his confidence. First, the time around 1920 is ripe, so to speak, for an answer to be given to this question in a much more reflective and profound way than had been possible in the centuries before. Hilbert quite correctly points out that science has made enormous progress since 1900, and has deepened our understanding, not only of nature, but also of the scientific enterprise itself, and of its historical development.
In contrast to the older philosophers, we have the advantage of having witnessed a large number of such [great] discoveries, and that we came to know the new points of view effected in this way while they emerged. Among these new discoveries were many that greatly changed or totally removed the old deeply rooted views and ideas. Let us just mention the new concept of time and the decomposition of the chemical elements. Prejudices, which nobody before had dared to touch upon.19
In addition to this historical argument, Hilbert gives a second, more systematic reason for his confidence that he would be able to find a definitive answer to the above philosophical question. It is the development of the famous (but frequently misunderstood) “axiomatic method”.
But yet a second circumstance is to our benefit in determining the answer to the old philosophical question. There exists a general method for the theoretical treatment of scientific questions, a method which is very developed today and which is often unconsciously applied by researchers. This is the axiomatic method, and it most certainly helps in formulating the question at issue more precisely.20
Since it is very important that we avoid any possible misunderstanding in this delicate matter, we would like to point out, and also correct, what we perceive as the two most widespread and dangerous misconceptions with respect to Hilbert’s axiomatic method.21 Frequently people tend to think that to “axiomatize” a theory is a licence to choose axioms arbitrarily. Nothing could be more misleading than this view. It fails to appreciate that the axiomatic “presentation” of a certain field of facts is not the same as the application of the axiomatic method to a theory that is already presented in axiomatic form. The axiomatic method is a means of investigating the logical structure of a theory, the dependence and independence of its sentences, and its deductive completeness and consistency. The axiomatic presentation of a field of facts is a symbolic representation of things and their relations; it has to be faithful, not only to the facts, but also to the logical order and the relational structure of a given field of facts. The axiomatic method, on the other hand, not only allows, but actually requires, the deliberate and systematic variation of the axioms. The only limit in this respect is the requirement that the resulting systems of axioms remain consistent.
A second misconception is the common superstition that the axiomatic method only makes sense with respect to mathematics.22 Although the axiomatic method as a means of logical analysis has its origin in geometry, it is by no means restricted to mathematics. On the contrary, its real value lies in its application to physics, because it is in this domain, more than in mathematics, that we pile up assumption on assumption, without asking whether the assumptions are all necessary and whether the results are still consistent. This was stressed most explicitly for the first time by Heinrich Hertz with respect to the concept of “force” in classical mechanics.
Consequently, Hertz tried to reduce the danger of inconsistency by eliminating the problematic concept of force from mechanics altogether.23 Hilbert, who was a great admirer of Hertz, follows him in this critical attitude. However, in contrast to Hertz, who cured the logical muddle by a specific technical adjustment, Hilbert offers a very general method that allows him to localise the logical dependencies and independencies as well as possible inconsistencies in an axiomatic presentation of a theory by means of the axiomatic method, i.e., by means of a systematic variation of the axioms of the theory in question. Geometry itself offers the best example for this method. But instead of dwelling on Hilbert’s treatment of geometry, let us proceed and discuss Hilbert’s epistemological position in three consecutive steps by asking how Hilbert gave an answer to the old philosophical question as to which part of our total knowledge comes from experience, and which part is in fact a priori, relative to the experience that we have accumulated at a certain point in time.24
3. Hilbert’s first, somewhat rough answer
In order to understand Hilbert’s answer precisely one must first introduce a twofold distinction regarding the epistemological status of laws, which Hilbert made explicit only towards the end of his talk, although he made use of it implicitly before. Roughly speaking, it runs like this:
The question regarding the empirical character of the laws of nature and the [question] regarding their exact validity are two questions of a totally different kind and are independent of each other; the constructions serving to answer them have to be strictly distinguished.25
This distinction is very important, because it opens up the possibility that we may be able to recognise a certain law of nature a priori, relative to the experience that we possess at a certain time, although we are not able to validate the law exactly at that point, because we do not possess the experimental skills and means to do so. And this is precisely what happened in the case of geometry. But it happened not only in geometry, as Hilbert stresses, but also in other fields of science.26 In accordance with this distinction, Hilbert introduces another, namely that between the setting up of a theory, on the one hand, and the task of establishing its relation to the world, on the other. The former is characterised as a mental construction of a “framework of concepts”, as Hilbert calls it, while the latter is the application of such a framework to the real world according to the different criteria of objectivity in experiments. If these two actions are not sufficiently distinguished, confusion inevitably arises regarding the different sources of human knowledge. In order to understand Hilbert’s first answer, let us now make the following two assumptions: 1) We will assume that we have already constructed an all-embracing physical theory, i.e., a framework of concepts that covers the whole complex of physical facts known at the time. 2) The world-equations are the axioms of this theory; they are self-sufficient in the sense that they do not stand in need of supplementation by “accessorial laws”. If we now ask how this theory is related to the world, we get the following rather dialectical answer:
If these world equations, and with them the framework of concepts were complete, and if we knew that it fits in its totality with reality, then in fact only thinking, i.e., conceptual deduction, would be required in order to acquire all physical knowledge.27
This would mean, to make it crystal clear, that there would be only a single source of knowledge, namely logical thinking and conceptual deduction. But, of course, Hilbert does not share this view. On the contrary, he argued already in 1919 that such a position would lead to a number of paradoxes, such as total determinism, complete reversibility of life-processes, and, what is particularly absurd, the impossibility of minds. Unfortunately, space does not allow us to go into these topics. Instead we have to ask: what is Hilbert’s own position? How do we get to know the “world equations”? Where do they come from? Here is his first, still somewhat rough, answer:
. . . I claim that it is precisely the world laws that can be obtained in no other way than from experience. It may be that in the construction of the framework (of physical concepts) many different speculative points of view play a part; [but] whether the proposed axioms and the logical framework erected from them is valid, experience alone is able to decide.28
This quotation seems to leave no space for differing interpretations: we may come to know the world equations somehow a priori, but the question of whether they are valid or not can only be answered by experience. However, already at the end of the same passage Hilbert begins to reflect on this answer by considering two alternatives to this kind of “logical empiricism”.29 Either one maintains that we have a third source of knowledge besides logic and experience, or one denies that the “world equations” alone contain our knowledge about nature, and asserts instead, as Poincaré did, that the world equations lead to knowledge about nature only in conjunction with a kind of stipulation regarding the meaning of the theoretical terms. We shall present Hilbert’s discussion of the alternatives in the same order. Let us begin with the first alternative: is there a third source of knowledge?
4. Hilbert’s second, more sophisticated answer
It should be clear from Hilbert (1922) that he was not sympathetic to Frege’s “logicism”, i.e., to the idea that mathematics is nothing but a branch of logic. In opposition to Frege, he insisted that mathematics is in need of “non-logical discrete objects, being intuitively present as immediate events before all thinking” (Hilbert (1922), p. 163). Whatever that may mean, and Parsons is right when he says that this may turn out to be a difficult question, one point should be beyond doubt. Hilbert believes that mathematics requires a “third” source of knowledge, besides logic and experience, which has to do with our ability to represent things. This is corroborated by the following passage from the third Hamburg lecture, which comes shortly after the claim that the world equations are in fact empirical:
Philosophers have indeed claimed (and Kant is the classical proponent of this point of view) that, in addition to logic and experience, we have a certain apriori knowledge of reality. Now, I admit that certain apriori insights are necessary for building up the theoretical frameworks, and that such insights are always at the basis of the generation of our knowledge. I believe that, in the end, mathematical knowledge rests on a kind of intuitive insight, and that even for building up the theory of numbers a certain apriori intuitive view is necessary. With this, the most general, basic idea of Kantian epistemology retains its significance, namely, the philosophical problem of characterising that intuitive view and thus investigating the conditions of possibility of all conceptual knowledge and, at the same time, of every experience. It is my opinion that this has been achieved in essentials in my investigations of the principles of mathematics. The apriori is here no less and no more than a basic disposition, or the expression for certain indispensable preconditions of thought and experience.30
This sounds familiar. There is indeed a close resemblance between Hilbert’s and Kant’s philosophy of mathematics. But there is also a sharp difference, one which can hardly be overestimated: whereas for Kant the term “mathematics” encompasses both arithmetic and geometry, this is not the case in Hilbert’s view. He restricts the term “mathematics” to arithmetic, including analysis, but excluding geometry. This raises a serious problem that we shall discuss in a moment. Before we do so, let us emphasise that the exclusion of geometry from mathematics is by no means accidental, but the result of a long and conscious development. In the course of this development, it became little by little clearer that space and time as the “forms of our intuition” are not the same structures as space and time in the objective sense of physics. Hilbert was aware of the difficulties in this development, as becomes clear from the following passage:
At the time of Kant, one could well think that the representations of space and time which one had were just as generally and immediately applicable to reality as our ideas of number, sequence and quantity, which we use permanently in the mathematical and physical theories in the familiar ways. Then, indeed, the doctrine of space and time, in particular geometry, would be, like arithmetic, something which precedes all natural knowledge. This Kantian view was abandoned, particularly by Riemann and Helmholtz, even before the development of physics necessitated it, and quite rightly, since geometry is nothing but that part of the whole conceptual framework of physics which represents the possible position relations among rigid bodies in the world of real objects. That there exist rigid bodies at all, and what these position relations are, is purely a fact of experience.31
This, in a nutshell, is Hilbert’s view of the evolution of geometry from Kant to Poincaré. Kant’s view of space and time as forms of human intuition was justified in his time and consequently Kant could argue that Euclidean geometry is an epistemological presupposition of experience. However, this view had to be abandoned for many reasons, the most important of which was without doubt the recognition of the logical independence of the axiom of parallels from the remaining axioms of Euclidean geometry. This opened up, once and for all, the logical possibility of non-Euclidean geometries, and, consequently, the question arose as to which of the various geometries is the valid one, in the sense that it fits most precisely with the data. The history of how this question was answered is well known. Therefore, let us proceed directly to the conclusion, to what is quintessential, so to speak, in what Hilbert draws from this history:
Hence we see: in the Kantian a priori theory, there is still contained an anthropomorphic residue of which it has to be cleaned, leaving, after its removal, only that a priori view which is also required for pure mathematical cognition.32
So much for Kant. Let us now turn to Hilbert’s third and most sophisticated answer to the epistemological question.
5. The refutation of conventionalism and Hilbert’s plea for a recursive epistemology
One option for saving Kantian “apriorism” in a reduced form is Poincaré’s “conventionalism” with respect to geometry. According to Poincaré’s view, geometrical sentences in themselves do not express facts, but they do so only in connection with certain stipulations that fix the meaning of the geometrical terms. Instead, geometrical sentences are regarded as a system of “agreements” by which the facts may be represented, similar to a system of measuring units, like meters or yards, by which distance can be represented numerically. Hilbert, of course, does not agree. On the contrary, and rightly it seems to us, he argues that the whole idea is hopelessly confused. The central point of his criticism is this. If Poincaré assumes that we need stipulations in order to fix the “meaning” of the geometrical terms above and beyond the axioms of geometry which suffice for a representation of the transformations of rigid bodies, he introduces a “foreign body [Fremdkörper]”, which works like an idle wheel.33
Poincaré’s conventionalism with respect to geometry rests on two assumptions. First, we need rigid bodies in order to measure distances. Second, according to a theorem of Lie, “free mobility” of rigid bodies exists precisely in spaces of constant curvature. Hence, by means of rigid bodies alone we cannot discriminate among the spaces of constant curvature. Therefore, it is assumed, the corresponding geometries of these spaces are all empirically equivalent.
In order to test this opinion Hilbert proposes the following thought-experiment. Suppose that we know from experience that rigid bodies exist, and that they can move freely, i.e., that all the facts expressed by the axioms of congruence are empirically valid, but that the sum of the angles in a triangle, constructed by means of rigid bodies, is smaller than a right angle, in other words that the rigid bodies form a hyperbolic geometry. According to Poincaré, we can still represent the relations between the rigid bodies as functions of three “parameters” such that the whole resulting system of points, lines and angles forms a Euclidean geometry.
With respect to this thought experiment and Poincaré’s principal answer to it, Hilbert makes the following remark:
it is a mathematically correct procedure, if one uses for the representation of the non-Euclidean relations of rigid bodies three parameters, . . . but it is neither appropriate, nor simple, nor conventional to impose a metric on these parameters, not even if it is the simplest one, namely the Euclidean. The Euclidean metric is after all an extant scheme of concepts, and even if one regards it as simple and intuitive, it is in any case more complicated than no geometry and no metric at all.34
Poincaré’s procedure amounts, according to Hilbert, to the introduction of a second, superfluous frame of concepts into the first one, which is sufficient for a representation of the real world. The second frame has no relation to the real world; it appears to be, so to speak, “only a dead, closed off, alien element”35. Consequently, Hilbert sides with Hertz, and his demand for “simplicity”: no superfluous elements in a physical theory, particularly not in geometry, which means that a theory must not contain elements that can be eliminated without impeding the task of the theory. So much for Hilbert’s refutation of Poincaré’s conventionalism.
But what is Hilbert’s own position with respect to the development of physics? How shall we judge the history of physics? Was it an irrational development, driven by historical accidents, or can we recognise a certain rationality in its development? Which attitude should be taken towards the development of science? To answer this question properly, it is very important to note that the “thought experiment” just discussed is very different from the real situation in which Poincaré put forward his proposal; this is stressed by Hilbert himself. Hilbert contradicts Poincaré’s view once more by expressing his opinion that
If it had turned out, for example, that our experience with respect to the motion of rigid bodies had been as I described it, that is, that out of rigid rods triangles could be constructed in which the sum of the angles is < 1 right-angle, then nobody would have doubted that part of the total framework of concepts which corresponds to the Bolyai-Lobachevsky structure, and in any case the Euclidean scheme would have been excluded right from the beginning as mathematically impossible.36
Hilbert then continues:
But as the things in our experience in fact are, where only a small deviation from the Euclidean scheme comes into play, the mathematically appropriate and generally prescribed method consists precisely in this: to take the Euclidean geometry as a basis and then to treat its variation. Conventions cannot be claimed to be a reason for this, just as little as the theory of perturbation is a convention in astronomy: on the contrary the method of perturbation theory was forced on us exclusively by the problem itself . . . as the appropriate mathematical method.37
Although in this quote Hilbert does not address the history of physics directly, it is easy to project its systematic message onto a time-scale, and this is exactly what Hilbert did in other lectures, such as the courses “Space and Time” and “Basic Ideas in the Theory of Relativity” from 1918 and 1921 respectively.38 These lectures deal primarily with the different experiments and their conflicting interpretations in the nineteenth century, which led finally, after many unsuccessful attempts to interpret them consistently, first to the Special and then to the General Theory of Relativity. The quintessence of these lectures is exactly the same as is contained in the quotation above: as long as our measuring devices are not precise enough to measure the extremely small deviations from Euclidean geometry, we take Euclidean geometry as the basis for our theoretical and experimental research and study its variations by means of the axiomatic method. In this way, we may be able to improve the precision of our measuring devices to such a degree that the extremely small deviations from Euclidean geometry become observable in particularly designed experiments.
This procedure seems very close to Dingler’s method of “exhaustion”, but there is a fundamental difference. Whereas Dingler is not prepared to suspend Euclidean geometry as the conceptual framework, but instead tries to hold on to it under any circumstances, Hilbert takes the opposite point of view. He is not only prepared to suspend Euclidean geometry under well-defined experimental circumstances, but he also wishes to understand the limits of validity for Euclidean geometry. This is made clear in the last section of his lecture:
The opinion, advocated by us, rejects absolute apriorism and conventionalism; but nevertheless it in no way escapes the question of the precise validity of the laws of nature. I would like instead to answer this question in the affirmative, and indeed in the following sense. The individual laws of nature are constituent parts of the total framework, set up axiomatically from the world equations. The world-equations are the precipitate of a long, in part very strenuous experimental inquiry and experience, often held up by following erroneous paths. In this way we come to the idea that, by continued elaboration and completion of the world-equations, we approach asymptotically a definitivum.39
What is the significance of all these considerations with respect to the question of the relation of the a priori and the empirical in our knowledge? Our principal answer, in agreement with Hilbert, is the following: this relation is not fixed once and for all, but is itself a function of time. In the beginning, the a priori part was relatively large, because scientists had to presuppose, besides explicit assumptions or hypotheses, a good number of tacit assumptions, which they had not yet investigated explicitly and consciously. As time proceeds, this part diminishes, because in the process of logical investigations and experimental testing we reveal more and more of these tacit assumptions, in particular those anthropomorphic presuppositions of which we were not aware previously, but on which our theories were built. This inquiry is a recursive process, in which we proceed in the direction of an objective (i.e., less subjective) knowledge of nature on the basis of what we have achieved so far. It is a slow and presumably unending process.
Notes
1. “Grundsätzliche Fragen der modernen Physik” is the title of the lectures as announced in the “Verzeichnis der Vorlesungen. Sommersemester 1923”, Hamburg 1923, p. 41. The lectures were held in Hamburg on 26, 27 and 28 July, 1923. The third of the three lectures was given again, together with short summaries of the first two lectures, in a lecture held at the Physikalische Gesellschaft in Zürich, under the title “Erkenntnistheoretische Grundfragen der Physik” on 27 October, 1923; see Neue Zürcher Zeitung, Nr. 1473, Erstes Morgenblatt from 27 October, 1923. The manuscript Cod. Ms. Hilbert 596 in the Hilbert Nachlaß at the Niedersächsische Staats- und Universitätsbibliothek (henceforth: SUB) contains the notes for both the Hamburg and the Zürich lectures. The Hamburg lectures will be referred to as Hilbert (*1923).
2. “Der letzte Vortrag soll . . . eine allgemeine philosophische Frage in der Weise behandeln, dass eine definitive Antwort zu Stande kommt. . . . Denn obwohl es philosophisch allgemeine Behauptungen sind, will ich doch, dass sie ebenso bestimmt und ebenso sicher sind wie als wären es rein mathematische Behauptungen.” (Hilbert (*1923), p. 1 of Lecture I)
3. Hilbert gave a course of lectures in Göttingen in 1919 under the title “Natur und mathematisches Erkennen”; these are preserved in the library of the Mathematical Institute of the University of Göttingen, and were published in 1992 (Hilbert (*1919)). Hilbert also gave a talk in Copenhagen in 1921 with the same title; this is preserved in the Hilbert Nachlaß (Hilbert (*1921b)).
4. Unfortunately, the development in the social and political sciences is very far from being as successful as is the case with the exact sciences. For example, not only do events like the Holocaust and the Second World War pose a formidable explanatory challenge for the social sciences, but there is also the further question of whether these sciences could help in suggesting means of preventing such catastrophes.
5. It was Johannes Müller, Helmholtz’s teacher in sense-physiology, who clarified this issue beyond any possible doubt. Müller’s point is a very simple, almost logical one: because different objects like massive bodies, waves, drugs, etc. lead to the same secondary qualities in our sensations if they stimulate the same sense organ, and to different qualities if they stimulate different sense organs, the secondary qualities of our sensations cannot be a function, in the mathematical sense of the term, of the objects which cause the sensations, but are only a function of the organ stimulated and its specific “energy”.
6. “Wenn wir eine Überschau halten wollen über den Bereich derjenigen Wissenschaften, die auf der Betrachtung der leblosen Natur beruhen, so haben wir an alle Tatsachen, Ergebnisse, Begriffsbildungen zu denken, welche den Gebieten der Mathematik, Physik, Astronomie, Chemie der Technik und deren Nachbarwissenschaften angehören. Dieser Wissenskomplex ist fein und weit verzweigt und gewaltig an Umfang und Ausdehnung; und doch grenzt er sich scharf ab gegenüber den übrigen Gebieten des menschlichen Wissens, nämlich gegenüber den biologischen Wissenschaften und denjenigen, die vom Menschen als solchem handeln.” (Hilbert (*1921b), pp. 1–2) The opening passages of both the 1919 lecture course (Hilbert (*1919)) and of the first Hamburg lecture (in Hilbert (*1923)) are virtually identical to this. The same passage was also used elsewhere in Hilbert’s unpublished writings from around this time.
7. As Hilbert says, “an sich etwas der Natur Fremdes” (Hilbert (*1923), pp. 3–4 of Lecture I).
8. “Ein in Koordinaten ausgedrückter Satz über die Natur ist nur dann eine Aussage über die Gegenstände in der Natur, wenn er von den Koordinaten unabhängig einen Inhalt hat: “Die x-Coordinate von Hamburg beträgt 100 km” ist z. B. ein Satz, der keine Aussage über die Wirklichkeit enthält. Die Coordinaten, die notwendig waren, müssen gewissermassen wieder entfernt werden.” (Hilbert (*1923), pp. 3–4 of Lecture I)
9. Hilbert (1915). For a detailed historical discussion of this paper, on which the following is based, see Sauer (1999).
10. From a modern point of view, Hilbert’s assertion requires some further specification, e.g., the nonsingularity of the electromagnetic field.
11. From a modern point of view, terms explicitly depending on the electromagnetic potential are not taken into account because they would violate gauge invariance of the theory.
12. “. . . ich möchte Alles, was noch zu den Weltgleichungen hinzugefügt werden muss, um die Geschehnisse in der leblosen Natur zu verstehen, kurz accessorisch nennen” (Hilbert (*1923), pp. 1–2 of Lecture II).
13. The full passage is: “Wir können aber nicht leugnen, dass uns dieses Gesetzt mit unseren alltäglichen Wahrnehmungen und mit der gewöhnlichen Einstellung unseres Denkens im Konflikt bringt” (Hilbert (*1923), p. 27 of Lecture I).
14. See instead Majer (2001b).
15. “Zu Beginn meines heutigen Vortrages habe ich auf den Umstand hingewiesen, dass die Weltgleichungen, weil sie Differentialgleichungen sind, an sich keinerlei bestimmte Lösungen festlegen können und dass es daher scheint, als könnten die Weltgleichungen nur dazu dienen, aus bekannter Gegenwart die Zukunft vorauszusagen, während wir für den gegenwärtigen Zustand der Dinge aus ihnen gar nichts erfahren.” (Hilbert (*1923), p. 25 of Lecture II)
16. “Der (hier von mir vertretene) Gedanke, dass das gesammte physikalische Sein und Geschehen durch ein Weltgesetz beherrscht wird, das die Form eines allgemein invarianten Systems von partiellen Differentialgleichungen besitzt, hat einen Vorläufer in dem früheren theoretischen Ideal der Mechanistik. Aber der Unterschied mit früher, die Wandlung gegenüber dem damaligen Standpunkt ist sehr bedeutend. . . . Während früher die Vielheit der Stoffe und der physikalischen Konstanten als etwas Letztes hingenommen und die sogenannten allgemeinen Eigenschaften der Körper einfach jede für sich als Grundkräfte nebeneinander gestellt wurden, gehen wir heute davon aus, dass es nur zweierlei Bausteine der Materie gibt, die Elektronen und positive Kerne, aus deren Zusammensetzung und Relativbewegungen sich die chemischen und physikalischen Eigenschaften der Stoffe erklären lassen und denken uns auch deren Vorhandensein als Folge der Grundgleichungen.” (Hilbert (*1923), pp. 23–24 of Lecture II)
17. As Hilbert says: “Es ist aber zu beachten, dass diese accessorischen Forderungen – auch die Annahme der Anwendbarkeit der Wahrscheinlichkeitsprinzipien kann dazu gerechnet werden – dass diese accessorischen Prinzipien nicht den mathematischen Charakter neuer Gleichungen haben, sondern allgemeiner Art sind, mit unserem Denken überhaupt und unserer Einstellung gegenüber der Natur zusammenhängen.” (Hilbert (*1923), pp. 26–28 of Lecture II)
18. “Wir stehen da vor einer Entscheidung über ein wichtiges philosophisches Problem, nämlich vor der alten Frage nach dem Anteil, den das Denken einerseits und die Erfahrung andererseits an unserer Erkenntnis haben. Diese alte Frage ist berechtigt; denn sie beantworten, heisst im Grunde feststellen, welcher Art unsere naturwissenschaftliche Erkenntnis überhaupt ist und in welchem Sinne all’ das Wissen, das wir in dem naturwissenschaftlichen Betriebe sammeln, Wahrheit ist.” (Hilbert (*1923), pp. 3–4 of Lecture III)
19. “Wir haben also den älteren Philosophen gegenüber den Vorteil, eine grosse Anzahl solcher Entdeckungen miterlebt und die dadurch bewirkten Neueinstellungen während ihrer Entstehung kennen gelernt zu haben. Dabei waren unter den Neuentdeckungen viele solche, die alte festgewurzelte Auffassungen und Vorstellungen abänderten oder ganz beseitigten. Denken wir nur an den neuen Zeitbegriff und die Zerfällung der chemischen Elemente. Vorurteile, an denen zu rühren früher überhaupt Niemand eingefallen wäre.” (Hilbert (*1923), p. 5 of Lecture III)
20. “Aber noch ein zweiter Umstand kommt der Entscheidung über jene alte philosophische Frage zu Gute. Es gibt eine allgemeine Methode für die theoretische Behandlung naturwissenschaftlicher Fragen, die heute sehr ausgebildet ist, auch oft unbewusst von den Forschern angewandt wird, das ist die axiomatische Methode und diese verhilft auf alle Fälle dazu, die betreffende Fragestellung zu praecisiren.” (Hilbert (*1923), p. 9 of Lecture III)
21. For a more detailed analysis and correction of the misapprehensions, see Majer (2001a).
22. By the word “mathematics” we mean in this section, as is usually understood, arithmetic as well as geometry. This use deviates from Hilbert’s, who includes in mathematics arithmetic and analysis but not geometry, which latter he regards (with Kronecker) as belonging to the natural sciences. We shall return to Hilbert’s use after this section.
23. See the Introduction to Hertz’s book Prinzipien der Mechanik in neuem Zusammenhange dargestellt (1894). For an analysis of this Introduction, see Majer (1998).
24. The meaning of the term “experience” can be taken individually or collectively; only the latter is relevant here.
25. “Die Frage nach dem empirischen Charakter der Naturgesetze und die nach ihrer genauen Gültigkeit sind aber 2 völlig verschieden geartete und von einander unabhängige Fragen; die für ihre Beantwortung dienenden Konstruktionen müssen streng auseinander gehalten werden.” (Hilbert (*1923), p. 40 of Lecture III)
26. An example is mentioned in the following quote: “Sometimes, an idea has its origin in pure thinking, as for example the idea of atomism, whereas the existence of atoms was proved only two thousand years later by experimental physics. Sometimes, experience leads the way and forces the intellect to adopt a speculative point of view” (Hilbert (*1923), pp. 21–22 of Lecture III). The German passage reads: “Bisweilen hatte eine Idee ihren ersten Ursprung im reinen Denken, wie z. B. die Idee der Atomistik, während die Existenz der Atome erst 2 Jahrtausende später durch die Experimentalphysik bewiesen wurde. Bisweilen geht die Erfahrung voran und zwingt dem Geist den spekulativen Gesichtspunkte auf.”
27. “Wenn nun diese Weltgleichungen und damit das Fachwerk vollständig vorläge, und wir wüssten, dass es auf die Wirklichkeit in ihrer Gesamtheit passt und davon bedarf es tatsächlich nur des Denkens und der begrifflichen Deduktion, um alles physikalische Wissen zu gewinnen; . . . ” (Hilbert (*1923), pp. 19–21 of Lecture III)
28. “. . . behaupte ich, dass gerade die Weltgesetze auf keine andere Weise zu gewinnen sind, als aus der Erfahrung. Mögen bei der Konstruktion des Fachwerkes der physikalischen [Begriffe] mannigfache spekulative Gesichtspunkte mitwirken: ob die aufgestellten Axiome und das aus ihnen aufgebaute logische Fachwerk stimmt, das zu entscheiden, ist allein die Erfahrung im Stande.” (Hilbert (*1923), p. 21 of Lecture III)
29. By “logical empiricism” we do not mean precisely the same as what is known by this expression in the philosophical literature. We wish only to express the dogma that there are just two sources of knowledge, logic and experience.
30. “Es haben in der Tat Philosophen – und Kant ist der klassische Vertreter dieses Standpunktes – behauptet, dass wir ausser der Logik und der Erfahrung noch a priori gewisse Erkenntnisse über die Wirklichkeit haben. Nun gebe ich zu, dass schon zum Aufbau der theoretischen Fachwerke gewisse apriorische Einsichten nötig sind und dass stets dem Zustandekommen unserer Erkenntnisse solche zu Grunde liegen. Ich glaube, dass die mathematische Erkenntnis letzten Endes auf einer Art anschaulicher Einsicht beruht und dass wir sogar zum Aufbau der Zahlentheorie eine gewisse anschauliche Einstellung a priori nötig haben. Damit behält also der allgemeinste Grundgedanke der Kantschen Erkenntnistheorie seine Bedeutung: nämlich das philosophische Problem, jene anschauliche Einstellung a priori festzustellen und damit die Bedingungen der Möglichkeit jeder begrifflichen Erkenntnis und zugleich jeder Erfahrung zu untersuchen. Ich meine, dass dies im Wesentlichen in meinen Untersuchungen über die Prinzipien der Mathematik geschehen ist. Das Apriori ist dabei Nichts mehr und Nichts weniger als eine Grundeinstellung oder der Ausdruck für gewisse unerlässliche Vorbedingungen des Denkens und Erfahrens.” (Hilbert (*1923), pp. 23–24 of Lecture III)
31. “Zur Zeit Kants konnte man denken, dass die Raum- und Zeit-Vorstellungen, die man hatte, ebenso allgemein und unmittelbar auf die Wirklichkeit anwendbar sind wie unsere Vorstellungen von Anzahl, Reihenfolge und Grösse, die wir in den mathematischen und physikalischen Theorien beständig in der uns geläufigen Weise verwenden. Und dann würde in der Tat die Lehre von Raum und Zeit, insbesondere also die Geometrie etwas sein, das ebenso wie die Arithmetik aller Naturerkenntnis vorausgeht. Dieser Standpunkt Kants wurde bereits, ehe die Entwicklung der Physik dazu zwang, insbesondere von Riemann und Helmholtz verlassen – mit vollem Recht; denn Geometrie ist nichts Anderes, als derjenige Teil des gesammten physikalischen Begriffsfachwerkes, der die möglichen Lagenbeziehungen der starren Körper gegeneinander in der Welt der wirklichen Dinge abbildet. Dass es bewegliche starre Körper überhaupt gibt und welches die Lagenbeziehungen sind ist lediglich Erfahrungstatsache.” (Hilbert (*1923), pp. 25–26 of Lecture III)
32. “Wir sehen also: in der Kantschen a priori Theorie sind noch anthropomorphe Schlacken enthalten, von denen sie befreit werden muss und nach deren Entfernung nur diejenige apriorische Einstellung übrig bleibt, die auch zur rein mathematischen Erkenntnis nötig ist.” (Hilbert (*1923), p. 33 of Lecture III)
33. There are three types of geometry which contain the axioms of congruence and are, consequently, sufficient for a representation of the rotations and translations of rigid bodies: hyperbolic, elliptic and Euclidean geometry. For a more detailed analysis of Poincaré’s conventionalism, see Majer (1996).
34. “Nun ist es gewiss ein sachgemässes mathematisches Verfahren, wenn wir zur Darstellung der nicht-Euklidischen dreidimensionalen Bewegungen und Lagenbeziehungen unserer starren Körper 3 Parameter verwenden; aber es ist weder sachgemäss, noch einfach, noch konventionell, diesen Parameter eine Metrik aufzuerlegen und sei es auch die einfachste, nämlich die Euklidische. Die Euklidische Metrik ist immerhin ein ausgedehntes Fachwerksgerüst und wenn man dasselbe noch so als einfach und anschaulich veranschlagen will; auf jeden Fall ist es komplizierter als überhaupt keinerlei Geometrie und keinerlei Metrik.” (Hilbert (*1923), pp. 36–37 of Lecture III)
35. “. . . nur als toter abgekapselter Fremdkörper erscheint” (Hilbert (*1923), p. 38 of Lecture III).
36. “Wenn nämlich z. B. unsere Erfahrung hinsichtlich der Bewegung der starren Körper so ausfallen würde, wie ich vorhin annahm, indem sich in den aus den Stäben konstruirten Dreiecken sich die Winkelsumme < 1 Rechter herausstellte, so wäre in dem Gesammtfachwerk der Begriffe als bezügliches Teilstück das Bolyai-Lobatschefskysche Gerüst für Niemand zweifelhaft und jedenfalls das Euklidische Schema von vornherein ausgeschlossen und mathematisch unmöglich.” (Hilbert (*1923), p. 41 of Lecture III)
37. “Wie aber die Dinge in unserer Erfahrung tatsächlich liegen, wo nur eine geringe Abweichung von dem Euklidischen Schema ins Spiel kommt, besteht die mathematisch sachgemässe und allgemein vorgeschriebene Methode gerade darin, die Euklidische Geometrie zu Grunde zu legen und ihre Variation zu behandeln. Es kann hierfür als Grund so wenig die Konvention geltend gemacht werden wie etwa die Störungstheorie in der Astronomie eine Konvention ist: vielmehr ist uns die Methode der Störungstheorie allein durch die Sache selbst . . . aufgezwungen worden – als die sachgemässe mathematische Methode.” (Hilbert (*1923), pp. 41–42 of Lecture III)
38. See Hilbert (*1918) and Hilbert (*1921a).
39. “Die von uns vertretene Meinung verwirft den unbedingten Apriorismus und den Konventionalismus; aber sie entzieht sich trotzdem keineswegs der vorhin aufgeworfenen Frage nach der genauen Gültigkeit der Naturgesetze. Ich möchte diese Frage vielmehr bejahen und zwar in folgendem Sinne. Die einzelnen Naturgesetze sind Bestandteile des Gesamtfachwerkes, das sich aus den Weltgleichungen axiomatisch aufbaut. Und die Weltgleichungen sind der Niederschlag einer langen zum Teil sehr mühsamen und oft durch Irrwege aufgehaltenen experimentellen Forschung und Erfahrung. Wir gelangen dabei zu der Vorstellung, dass wir uns durch fortgesetzte Ausgestaltung und Vervollständigung der Weltgleichungen asymptotisch einem Definitivum nähern.” (Hilbert (*1923), pp. 42–43 of Lecture III)
References
Items marked with an ‘*’ were originally unpublished.
Hilbert, D. (1915), “Die Grundlagen der Physik. (Erste Mitteilung.)” in: Nachrichten von der königlichen Gesellschaft der Wissenschaften zu Göttingen, mathematisch-physikalische Klasse, 1915, 395–407.
Hilbert, D. (1917), “Die Grundlagen der Physik. (Zweite Mitteilung.)” in: Nachrichten von der königlichen Gesellschaft der Wissenschaften zu Göttingen, mathematisch-physikalische Klasse, 1917, 53–76.
Hilbert, D. (*1918), “Raum und Zeit”, Cod. Ms. Hilbert 561, Niedersächsische Staats- und Universitätsbibliothek (henceforth: SUB), Handschriftenabteilung.
Hilbert, D. (*1919), Natur und mathematisches Erkennen, ed. by D. Rowe, Birkhäuser, Basel, 1992.
Hilbert, D. (*1921a), “Grundgedanken der Relativitätstheorie”, Cod. Ms. Hilbert 564, SUB, Handschriftenabteilung.
Hilbert, D. (*1921b), “Natur und mathematisches Erkennen”, Cod. Ms. Hilbert 589, SUB, Handschriftenabteilung.
Hilbert, D. (1922), “Neubegründung der Mathematik. Erste Mitteilung.” in: Abhandlungen aus dem mathematischen Seminar der Hamburgischen Universität 1, 157–177; reprinted in Hilbert (1935), 157–178.
Hilbert, D. (*1923), “Grundsätzliche Fragen der Modernen Physik”, Cod. Ms. Hilbert 596, SUB, Handschriftenabteilung.
Hilbert, D. (1924), “Die Grundlagen der Physik” in: Mathematische Annalen 92, 1–32; revised and expanded version of Hilbert (1915) and Hilbert (1917). Reprinted in Hilbert (1935), 258–289.
Hilbert, D. (1935), Gesammelte Abhandlungen, Dritter Band, Julius Springer, Berlin.
Majer, U. (1996), “Hilbert’s Criticism of Poincaré’s Conventionalism” in: Henri Poincaré: Science and Philosophy, G. Heinzmann et al. (ed.), Berlin, 1996.
Majer, U. (1998), “Heinrich Hertz’s Picture-Conception of Theories” in: Heinrich Hertz: Classical Physicist, Modern Philosopher, D. Baird et al. (ed.), Kluwer Academic Publishers, 1998.
Majer, U. (2001a), “The Axiomatic Method and the Foundations of Science: Historical Roots of Mathematical Physics in Göttingen” in: John von Neumann and the Foundations of Quantum Physics, M. Rédei, M. Stöltzner (ed.), Kluwer Academic Publishers, 2001.
Majer, U. (2001b), “Lassen sich phänomenologische Gesetze ‚im Prinzip‘ auf mikro-physikalische Theorien reduzieren?” in: Phänomenales Bewußtsein – Rückkehr zur Identitätstheorie?, M. Pauen, A. Stephan (ed.), Mentis, Paderborn, 2001.
Sauer, T. (1999), “The Relativity of Discovery. Hilbert’s First Note on the Foundations of Physics”, Archive for the History of the Exact Sciences 53, 529–575.
SOFT AXIOMATISATION: JOHN VON NEUMANN ON METHOD AND VON NEUMANN’S METHOD IN THE PHYSICAL SCIENCES
Miklós Rédei
Loránd Eötvös University, Budapest, Hungary
Michael Stöltzner
Universität Bielefeld, Germany
One can discern two typical attitudes towards von Neumann’s achievements in the physical sciences: the appreciative and the ambivalent. Members of the appreciative camp view and evaluate von Neumann’s work as the typical representative of the successful application of the axiomatic method in the physical sciences. Members of the ambivalent group acknowledge von Neumann’s results as great intellectual achievements but view his work in physics as an example of useless and pointless striving for mathematical exactness for exactness’ own sake. To quote a representative of the appreciative camp, the mathematician Paul Halmos relates: The ‘axiomatic method’ is sometimes mentioned as the secret of von Neumann’s success. In his hands it was not pedantry but perception; he got to the root of the matter by concentrating on the basic properties (axioms) from which all else follows. The method, at the same time, revealed to him the steps to follow to get from the foundations to the application. (Halmos (1973), p. 394)
In a similar vein, Arthur Wightman, one of the most prominent representatives of modern mathematical physics, writes: I do not know whether Hilbert regarded von Neumann’s book as the fulfillment of the axiomatic method applied to quantum mechanics, but, viewed from afar, that is the way it looks to me. In fact, in my opinion, it is the most important axiomatization of a physical theory up to this time. (Wightman (1976), p. 157)
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 235–249. © 2006 Springer. Printed in the Netherlands.
It is more difficult to document the presence of ambivalence towards von Neumann’s work by quoting explicit statements; evidence in favor of the case is
more circumstantial (absence in physics curricula of von Neumann’s achievements, paying lip service to von Neumann’s results without actually analysing precisely the content and significance of his results, historians of physics not paying attention to his role in the development of 20th century physics, etc.). Yet we think this attitude is even more widespread than the appreciative one. This ambivalent attitude is an expression of physicists’ general understanding of what sort of a science physics is and how a physicist is supposed to proceed within his profession. Richard P. Feynman represents an influential case in point; his views were typical of a whole generation of quantum field theorists who found themselves amidst a set of spectacularly successful rules of computation which, however, involved blatantly inconsistent mathematical objects. The mathematical rigor of great precision is not very useful in physics. But one should not criticize the mathematicians on this score . . . They are doing their own job. If you want something else, then you work it out for yourself. Some day, when physics is complete and we know all the laws, we may be able to start with some axioms . . . so that everything can be deduced. But while we do not know all the laws, we can use some to make guesses at theorems which extend beyond the proof. (Feynman (1965), p. 56, 49)
A similar attitude also prevails in some statements of John S. Bell, whose famous inequalities supplanted von Neumann’s no-hidden-variable theorem. Rather than just criticizing von Neumann’s concept of hidden variable as inadequate, Bell considers the axiomatic introduction of this concept as suggesting a kind of a priori reasoning or a “law of thought” (Bell (1987), p. 32; cf. Stöltzner (2002b)). Surprisingly, even a mathematician such as Hermann Weyl, who was equally interested in the foundations and the applications of mathematics, shared Feynman’s evaluation of the role of axiomatisation in physics. About Hilbert’s program of axiomatisation of physics, Weyl writes: The maze of experimental facts which the physicist has to take into account is too manifold, their expansion too fast, and their aspect and relative weight too changeable for the axiomatic method to find a firm enough foothold, except in the thoroughly consolidated parts of our physical knowledge. Men like Einstein or Niels Bohr grope their way in the dark toward their conceptions of general relativity or atomic structure by another type of experience and imagination than those of the mathematician, although mathematics is an essential ingredient. Thus Hilbert’s vast plans in physics never matured. (Weyl (1944), p. 653)
Even Wolfgang Pauli, generally known for his insistence on precision in physics, is reported to have had doubts about von Neumann’s work: Walter Thirring, another prominent, active mathematical physicist, recalls a conversation he had with Pauli about von Neumann’s significance.
For a long time his [von Neumann’s] importance for physics was underrated, Pauli once told me that he had said to v. Neumann: “If a mathematical proof is what matters in physics you would be a great physicist”. I disagree with this statement, I think he had the right vision of what will become important in physics. (Thirring (2001), p. 5)
Thirring lists four overlapping areas in physics in which he thinks von Neumann both had the right intuition and contributed in a decisive way: operators in B(H), infinite tensor products of von Neumann algebras, quantum statistical mechanics and quantum logic. Thirring’s remark also hints at an aspect of von Neumann’s work that seems to have escaped the attention even of the appreciative camp, to wit, that von Neumann never saw his achievements in mathematical physics as purely mathematical ones, because he did not assume a neat separation between mathematics and the sciences. Our main claim in this paper is that both the appreciative and the ambivalent camps misread von Neumann’s views and intentions in important respects. A closer look at what von Neumann actually said about the scientific method and at how he actually acted as a working scientist, especially in quantum physics, does not entirely confirm the picture painted by either camp of the role of axiomatisation and of the place of mathematical rigor in von Neumann’s thinking and work. By painting a more detailed and, we hope, more faithful picture of von Neumann’s views and practice we also hope to show a remarkable continuity in his attitude and method, extending from 1926, when he first encountered the idea of the axiomatic method in physics in the Hilbert school in Göttingen, through his remarks in the fifties about method in the sciences.
1. Von Neumann’s opportunistic soft axiomatisation
One can distinguish two different notions of “axiomatising” and “axiomatic theory” in von Neumann’s works: (i) axiomatising and axiomatic theory in the strict sense of formal systems or languages (call this “formal axiomatics”); (ii) axiomatising and axiomatic theory in the less formal sense in which it occurs in physics (call this “soft axiomatics”). Formal axiomatics is what von Neumann does in his work on axiomatic set theory (which was the topic of his PhD dissertation in 1926). This formal type of axiomatisation is a standard one, more or less as it is understood today. However, even in connection with formal axiomatics von Neumann takes a very sensible, only moderately formalist position, making clear that there is some intuitively given content or meaning behind the primitive concepts and the axioms in terms of which axiomatic set theory is formulated:
We begin with describing the system to be axiomatized and with giving the axioms. This will be followed by a brief clarification of the meaning of the symbols and axioms. . . . It goes without saying that in axiomatic investigations like ours, expressions such as ‘meaning of a symbol’ or ‘meaning of an axiom’ should not be taken literally: these symbols and axioms do not have a meaning at all (in principle at least), they only represent (in more or less complete manner) certain concepts of the untenable ‘naive set theory’. Speaking of ‘meaning’ we always intend the meaning of the concepts taken from ‘naive set theory’. (von Neumann (1962b), p. 344, our translation)1
As opposed to formal axiomatics, soft axiomatics is a less well-defined, more intuitive, yet structured conception. Its explicit formulation can be found already in the 1926 joint paper by Hilbert, Nordheim and von Neumann on the foundations of quantum mechanics. This paper contains a relatively lengthy passage on the axiomatic method in physics. The main idea is that a physical theory consists of three sharply distinguishable parts: (i) physical axioms, (ii) analytic machinery (also called “formalism”), (iii) physical interpretation. The physical axioms are supposed to be semi-formal requirements (postulates) formulated for certain physical quantities and relations among them. The basis of these postulates is our experience and observations; thus the basis of the axioms in physics is empirical. (This is not necessarily the case in formal axiomatics: von Neumann points out that the fifth postulate in Euclid’s geometry is non-empirical.) The analytic machinery is a mathematical structure containing quantities that stand in the same relations to one another as the physical quantities do. Ideally, the physical axioms should be strong and rich enough to determine the analytic machinery completely. The physical interpretation then connects the elements of the analytic machinery and the physical axioms. Here is the idea in the authors’ words, specified for the case of quantum mechanics, where probability density for the distribution of values of physical quantities is taken as the basic concept: The way leading to this theory is the following: one formulates certain physical requirements concerning these probabilities, requirements that are plausible on the basis of our experiences and developments and which entail certain relations between these probabilities. Then one searches for a simple analytic machinery in which quantities appear that satisfy exactly these relations.
This analytic machinery and the quantities occurring in it receive a physical interpretation on the basis of the physical requirements. The aim is to formulate the physical requirements in a way that is complete enough to determine the analytic machinery unambiguously. This way is then the way of axiomatising, as this had been carried out in geometry, for instance. The relations between geometric shapes such as point, line, plane are described by axioms, and then it is shown
that these relations are satisfied by an analytic machinery, namely, linear equations. Thereby one can deduce geometric theorems from properties of the linear equations. (Hilbert, Nordheim, von Neumann (1926), p. 105, our translation)2
Hilbert, Nordheim and von Neumann see clearly, however, that not even soft axiomatics is practiced in actual science. They point out that what happens is that one typically conjectures the analytic machinery first and without having formulated the physical axioms. It is only after the analytic, mathematical part is fixed that one gets an insight into what the physical axioms should be. In their words: In physics the axiomatic procedure alluded to above is not followed closely, however; here and as a rule the way to set up a new theory is the following. One typically conjectures the analytic machinery before one has set up a complete system of axioms, and then one gets to setting up the basic physical relations only through the interpretation of the formalism. It is difficult to understand such a theory if these two things, the formalism and its physical interpretation, are not kept sharply apart. This separation should be performed here as clearly as possible although, corresponding to the current status of the theory, we do not want yet to establish a complete axiomatics. What however is uniquely determined, is the analytic machinery which — as a purely mathematical entity — cannot be altered. What can be modified — and is likely to be modified in the future — is the physical interpretation, which contains a certain freedom and arbitrariness. (Hilbert, Nordheim, von Neumann (1926), p. 106, our translation)3
So, to the extent that axiomatics is a method practiced in science (physics), it is only this soft axiomatics, and, as Hilbert, Nordheim and von Neumann point out, even axiomatisations of this kind are typically practiced in a very opportunistic manner with many concessions to the given science’s state of formalisation. It seems fair to say then that, according to the Hilbert-Nordheim-Neumann paper, axiomatisation in physics is of an opportunistic soft kind, so soft a notion indeed that one may wonder whether such a method should bear the name “axiomatisation” at all and not be called simply “model building”. We think von Neumann would agree with this “model building” terminology; in fact, as we shall quote him shortly, he says explicitly that the aim of the sciences is building models. One might think that the notion of soft axiomatics as formulated in the Hilbert-Nordheim-Neumann paper is mainly Hilbert’s idea; after all, the paper was based on Hilbert’s 1926 lectures on the foundations of quantum mechanics. To some extent, this is indeed the case, and passages similar to the one above, in which Hilbert cites geometry (in particular his own Foundations of Geometry, Hilbert (1899)) as the model for the axiomatisation of science, abound throughout the years following 1900, when he formulated the Sixth Problem in his famous Paris Lecture. But there are two conflicting tendencies in Hilbert’s manifold applications of the axiomatic method. There are cases, such as continuum mechanics, where Hilbert attributes only a preliminary status to the system of axioms, because the conceptual framework provided by physics is still far from being a definitive one (see Majer (2001), p. 26). But when it comes to general relativity, or in Hilbert’s earlier axiomatizations
of mechanics, a strongly reductionist attitude prevails. In these cases Hilbert believed that the conceptual framework was close to its final stage and that the axiomatic method had already succeeded in finding the deepest structural level. Axiomatising in this second sense came very close to formal axiomatics because what remained to be done was the definition of appropriate number fields to reduce the consistency of the physical theory to the consistency of arithmetic. And only in this last step did Gödel’s incompleteness theorems come to bear and put an insurmountable limit to the import of axiomatisation. Soft axiomatics is hardly affected by the results that restrict formal axiomatics, because the axiomatic method taken in the soft sense deals with still preliminary conceptual frameworks and its main objective is to deepen the foundations (Tieferlegung, in Hilbert’s terms; see Stöltzner (2002a)). On the basis of von Neumann’s remarks on method in Sections 4 and 5 it can be argued (Stöltzner (2001)) that von Neumann shifted the balance in Hilbert’s axiomatic method between pragmatism and foundationalism strongly towards the first tendency. Soft axiomatics so conceived is no longer driven exclusively by the search for universality and is thus less susceptible to results such as Gödel’s, which put a limit in principle to formal axiomatics. Deepening the foundations then becomes an effective tool for critical analysis rather than a reductionist project.
2. Soft axiomatisation in quantum mechanics
We claim that von Neumann followed the method of opportunistic soft axiomatics in his work on quantum mechanics. To prove the claim in full would require a detailed and lengthy historical analysis of what von Neumann does in his 1927 papers (von Neumann (1927a, b, c)) and in his book (von Neumann (1932)), which cannot be done here (see Rédei (1996, 1998, 1999, 2001) for some results in this direction). Let us just briefly recall (without presenting the actual wording) the key elements in von Neumann’s argumentation. In doing so we will treat the 1927 papers and von Neumann’s book on a par, although there are revealing and significant differences between the book and the papers, even in chapters of the book that are largely verbatim identical with the papers. There are only two explicitly formulated physical axioms, both concerning the nature of the expectation value of physical quantities in a statistical ensemble:
A. Expectation value assignments $a \to E(a)$ are linear: $E(\alpha a + \beta b + \ldots) = E(\alpha a) + E(\beta b) + \ldots$
B. Expectation value assignments are positive: $E(a) \geq 0$ if $a$ can take on only non-negative values.
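What postulates A and B demand can be made concrete with a small numerical sketch. It assumes a finite ensemble in which every member carries a recorded value for each of two quantities, and it reads the expectation value as the plain ensemble average. The function name `expectation`, the sample values and the coefficients are hypothetical, and the assumption that both quantities have values on the same members is a simplification of the sketch, not something the postulates themselves state.

```python
from fractions import Fraction


def expectation(values):
    """Ensemble average: the mean of the recorded values, as an exact fraction."""
    return sum(Fraction(v) for v in values) / len(values)


# Hypothetical ensemble of five systems; a_vals[i] and b_vals[i] are the values
# recorded for two quantities a and b on the i-th member (illustrative numbers only).
a_vals = [1, 3, 0, 2, 4]
b_vals = [-2, 5, 1, 0, -4]
alpha, beta = Fraction(2), Fraction(-3)

# Postulate A: the expectation of the combined quantity alpha*a + beta*b
# equals the sum of the expectations of alpha*a and of beta*b.
combined = [alpha * a + beta * b for a, b in zip(a_vals, b_vals)]
lhs = expectation(combined)
rhs = expectation([alpha * a for a in a_vals]) + expectation([beta * b for b in b_vals])
linear_ok = lhs == rhs

# Postulate B: a quantity taking only non-negative values
# has a non-negative expectation.
positive_ok = all(a >= 0 for a in a_vals) and expectation(a_vals) >= 0

print(linear_ok, positive_ok)
```

Exact fractions are used instead of floating-point numbers so that the equality in postulate A can be checked without rounding tolerance.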
These two postulates are informal and are based on empirical observations exactly in the sense in which the Hilbert-Nordheim-von Neumann paper talks about physical axioms: the physical quantities $a, b$ are left completely unspecified, and the two postulates spell out a basic, empirically observable feature of expectation value assignments in a probability theory interpreted in terms of relative frequencies, which von Neumann takes in a rather intuitive sense, without detailing the concept. The analytic machinery is the set of all selfadjoint operators on a Hilbert space; the third (C) and fourth (D) “postulates” (von Neumann does not even call them “postulates”) specify the physical interpretation, the bridge between the physical quantities and the operators:
C. If the operators $S, T, \ldots$ represent the physical quantities $a, b, \ldots$ then the operator $\alpha S + \beta T + \ldots$ represents the physical quantity $\alpha a + \beta b + \ldots$
D. If the operator $S$ represents the physical quantity $a$ then the operator $f(S)$ represents the physical quantity $f(a)$.
The opportunistic aspect of this soft axiomatisation manifests itself in the fact that postulates A and B do not imply that the physical quantities must be represented by the set of all linear operators on a Hilbert space. One has to stipulate, and von Neumann indeed does so, that the physical quantities are represented by the formal machinery of linear operators on a Hilbert space. Hence one knows what properties the physical quantities possess only by inspecting the structural properties of the Hilbert space operators. From A + B + C + D von Neumann deduces that every expectation value assignment is of the form
$E(a) = \mathrm{Tr}(US)$   (1)
with some statistical operator $U$ (i.e., a positive, linear operator). This formula is the heart of the whole theory; it contains all probability statements. Specifically, according to von Neumann’s interpretation, formula (1) yields the probabilities of quantum events:
$p(P^S(d)) = \mathrm{Tr}(U P^S(d))$   (2)
where $P^S(d)$ is a spectral projection of some observable $S$ with spectral measure $P^S$, the projection $P^S(d)$ representing the event that the observable $S$ takes its value in the set $d$ of real numbers. This is, in a nutshell, the skeleton of the core of von Neumann’s approach to quantum mechanics in the years 1926–1932. Von Neumann realized, however, that his interpretation of the trace formula was beset with deep conceptual problems: in order to be able to interpret the probabilities $p(X)$ as relative frequencies (in von Mises’ sense) the probability assignment $X \mapsto p(X)$ needs
to satisfy the following “subadditivity” property:
$p(X) + p(Y) = p(X \wedge Y) + p(X \vee Y)$ for all projections $X, Y$   (3)
where $\wedge$ and $\vee$ are the standard lattice operations between Hilbert space projections. But the subadditivity property is violated by every $p$ defined by a non-trivial statistical operator $U \neq I$; on the other hand, the “probabilities” given by the identity operator as statistical operator are not finite, hence they cannot be interpreted as relative frequencies at all. Von Neumann was struggling with this problem already in his second 1927 paper on the foundations of quantum mechanics and also in his book, and this conceptual problem was the main reason, we claim, why he lost his belief in the Hilbert space formalism by about 1935 (see Rédei (1996) for more details). This is clear in his 1936 work with G. Birkhoff on quantum logic, where Birkhoff and von Neumann suggest the theory of type II$_1$ factor von Neumann algebras as the proper framework of quantum mechanics. It is worth pointing out that von Neumann’s preference for the theory of II$_1$ factors as the proper mathematical framework of quantum theory was not based on any mathematical imprecision in the Hilbert space formalism, nor was it motivated by any discovery of a new physical fact or phenomenon: it was motivated exclusively by informal, conceptual-philosophical difficulties. This we take as another indication that what drove von Neumann’s research in physics was not his desire to have mathematically impeccable theories but the desire to create theories that, besides being mathematically precise, are also conceptually sound. What better additional proof of this claim could one imagine than the fact that von Neumann realized that even taking the theory of II$_1$ factor von Neumann algebras as the proper mathematical framework for quantum theory does not solve the problem of how to interpret quantum probability, and in 1936 he finally abandoned the relative frequency view of quantum probabilities altogether:
This view, the so-called ‘frequency theory of probability’ has been very brilliantly upheld and expounded by R. von Mises. This view, however, is not acceptable to us, at least not in the present ‘logical’ context. (von Neumann (1937), p. 196)
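The role of the trace formula (2) and of the subadditivity condition (3) discussed above can be checked by hand in a toy case. The sketch below is only an illustration under stated assumptions: it works in a two-dimensional real Hilbert space, uses two non-commuting rank-one projections whose meet and join were worked out by hand, and all names (`prob`, `subadditivity_gap`, `U_pure`, `U_tracial`) are hypothetical. In two dimensions the normalised trace is a finite state, whereas the remark in the text about non-finite values concerns the infinite-dimensional case, where the trace of the identity diverges.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

def prob(U, P):
    """Formula (2): p(P) = Tr(U P) for statistical operator U, projection P."""
    return trace(matmul(U, P))

# Two non-commuting projections on a two-dimensional space.
X = [[1.0, 0.0], [0.0, 0.0]]   # projects onto e1
Y = [[0.5, 0.5], [0.5, 0.5]]   # projects onto (e1 + e2)/sqrt(2)

# Their ranges are two distinct lines, so (worked out by hand, not computed)
# the meet is the zero projection and the join is the identity.
MEET = [[0.0, 0.0], [0.0, 0.0]]
JOIN = [[1.0, 0.0], [0.0, 1.0]]

def subadditivity_gap(U):
    """Left side minus right side of condition (3) for the pair X, Y."""
    return prob(U, X) + prob(U, Y) - (prob(U, MEET) + prob(U, JOIN))

U_pure = [[1.0, 0.0], [0.0, 0.0]]      # a state that is not a multiple of I
U_tracial = [[0.5, 0.0], [0.0, 0.5]]   # normalised trace, I/2

gap_pure = subadditivity_gap(U_pure)        # non-zero: condition (3) fails
gap_tracial = subadditivity_gap(U_tracial)  # zero: condition (3) holds

print(gap_pure, gap_tracial)
```

The only state in this toy example that satisfies (3) for the chosen pair is the one proportional to the identity, which is the finite-dimensional counterpart of the tension described in the text.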
(See Rédei (1999, 2001) for further details of von Neumann’s post-1932 views on quantum mechanics and quantum probability.) Further evidence that von Neumann was not at all dogmatic in his insistence on mathematical rigour in physics is provided in the Preface of his 1932 book:
The object of this book is to present the new quantum mechanics in a unified representation which, so far as it is possible and useful, is mathematically rigorous. . . . The method of Dirac . . . in no way satisfies the requirements of mathematical rigor – not even if these are reduced in a natural and proper fashion to the extent common elsewhere in theoretical physics. (von Neumann (1932), pp. vii–ix, our emphasis)
It is true that, in the late 1940s, Laurent Schwartz succeeded in giving a rigorous mathematical basis to Dirac’s distributions. Von Neumann’s unwarranted skepticism that such a theory was at all feasible does not fault his methodology,
but rather confirms it. For opportunistic soft axiomatics derives a large part of its support from the fact that an axiom system can be improved at a later stage and that the deep mathematical significance of certain concepts may gradually come to light. In 1932, the choice was simply between adopting a mathematically well-entrenched framework, operators in Hilbert space, and adopting one that had no mathematical basis at all. In the absence of the Hilbert space concept, von Neumann would most probably not have objected to Dirac’s pragmatic research strategy.
3.
Von Neumann on method in the physical sciences
Compare the following quotation:
To begin, we must emphasize a statement which I am sure you have heard before, but which must be repeated again and again. It is that the sciences do not try to explain, they hardly ever try to interpret, they mainly make models. By a model is meant a mathematical construct which, with the addition of some verbal interpretations, describes observed phenomena. The justification of such a mathematical construct is solely and precisely that it is expected to work – that is correctly to describe phenomena from a reasonably wide area. I will further limit myself to saying a few things about procedure and method which will illustrate the general character of method in science. Not only for the sake of argument but also because I really believe it, I shall defend the thesis that the method in question is primarily opportunistic – also that outside of the sciences, few people appreciate how utterly opportunistic it is. (von Neumann (1961a), p. 492)
In his 1955 paper on “Method in the Physical Sciences”, von Neumann goes on to discuss various pragmatic criteria of theory preference which are valid both for the empirical sciences and mathematics. “Simplicity is largely a matter of historical background . . . and it is very much a function of what is explained by it” (ibid., p. 492), to wit, how heterogeneous the material covered by the explanation is. Accordingly, simplicity and unificatory power have to be constantly weighed against each other. Von Neumann attributes surprisingly little importance to whether a fact can be predicted in advance or just explained after the fact has been observed. A theory’s capability of dealing with heterogeneity ranks higher; in particular “confirmations in areas which were not in the mind of anyone who invented the theory” (ibid., p. 493). He emphasizes that both these criteria are “clearly to a great extent of an aesthetical nature” (ibid., p. 493), which brings them rather close to the mathematical criteria of success. Mathematics proper possesses a genuine criterion of success. In “The Mathematician” of 1947 von Neumann writes: “One expects a mathematical theorem or a mathematical theory not only to describe and to classify in a simple and elegant way . . . One also expects ‘elegance’ in its ‘architectural’, structural makeup” (von Neumann (1962a), p. 9); e.g., a surprising twist in the argument which immediately makes a point very easy, or some general principle which explains why difficulties crop up and which reduces the apparent arbitrariness. “These criteria are clearly those of creative art” (ibid., p. 9), so that
. . . the subject begins to live a particular life of its own and is better compared to a creative one, governed by almost entirely aesthetical motivations, than to anything else and in particular, to an empirical science. . . . As a mathematical discipline travels far from its empirical source . . . it is beset with very grave dangers. It becomes more and more purely aestheticizing, more and more purely l’art pour l’art. (ibid., p. 9)
The field is then in danger of developing along the line of least resistance and will “separate into a multitude of insignificant branches” (ibid., p. 9). “[W]henever this stage is reached, the only remedy seems . . . to be a rejuvenating return to the source: the reinjection of more or less directly empirical ideas” (ibid., p. 9). While the aesthetic criteria of success in mathematics and theoretical physics are quite similar, von Neumann locates major differences regarding their aims and their actual modus procedendi. Even without signs of degeneration, mathematics is more finely subdivided into subdisciplines because often the selection of problems itself is aesthetically oriented. Theoretical physics, on the contrary, is typically highly focused on resolving an internal difficulty or on solving a problem posed by experimental results. Once a break-through is reached, “the predictive and unifying achievements usually come afterward” (ibid., p. 8). From this diagnosis von Neumann concludes that
. . . the problems of theoretical physics are objectively given; and, while the criteria which govern the exploitation of a success are . . . mainly aesthetical, yet the portion of the problem, and that which I called above the original ‘breakthrough’, are hard, objective facts. (ibid., p. 8)
Thus instead of furnishing an absolute foundation, mathematics plays a central role in the sciences and in society (perhaps still as central as Hilbert had believed) by being aesthetically oriented and opportunistic.
I feel that one of the most important contributions of mathematics to our thinking is, that it has demonstrated an enormous flexibility in the formation of concepts, a degree of flexibility to which it is very difficult to arrive in a non-mathematical mode. (von Neumann (1961b), p. 482)
In the two examples von Neumann discusses in the 1955 paper, namely classical mechanics and quantum mechanics, further aspects of von Neumann’s opportunistic soft axiomatics come to the fore; they show the pragmatic virtues of a mathematization not driven by foundationalism. In connection with classical mechanics von Neumann argues that there are two mathematically equivalent formalisms of the theory: one can either set up a second-order differential equation which locally describes the dynamical evolution, or one can apply the Principle of Least Action over a finite time interval, i.e. globally. Von Neumann identifies the first formalism with a “causal”, and the second one with a “teleological” mode of description. Since the two are strictly equivalent mathematically, von Neumann declares the problem of whether processes are causal or teleological a pseudo-problem:
Newton’s description is causal and d’Alembert’s description is teleological. . . . All the difference between the two is a purely mathematical transformation. . . .
Thus whether one chooses to say that classical mechanics is causal or teleological is purely a matter of literary inclination at the moment of talking. This is very important, since it proves, that if one has really technically penetrated a subject, things that previously seemed in complete contrast, might be purely mathematical transformations of each other. Things which appear to represent deep differences of principle and of interpretation, in this way may turn out not to affect any significant statements and any predictions. They mean nothing to the content of the theory. (ibid., p. 496)
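The equivalence of the local (differential equation) and global (least action) descriptions mentioned above can be made concrete in a simple case. The following is a minimal sketch, assuming a particle in uniform gravity, a case chosen here purely for illustration and not taken from von Neumann's paper. The "causal" path is the closed-form solution of x'' = -g from initial data; the "teleological" path is the one with fixed endpoints that makes a discretised action stationary. All function names and numerical values are hypothetical.

```python
def newton_path(x0, v0, g, times):
    """Local description: closed-form solution of x''(t) = -g from initial data."""
    return [x0 + v0 * t - 0.5 * g * t * t for t in times]

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def least_action_path(x_start, x_end, g, times):
    """Global description: path with fixed endpoints that makes the discretised
    action, the sum of (kinetic - potential) * dt over the steps, stationary.
    Stationarity at each interior point gives x[i-1] - 2*x[i] + x[i+1] = -g*dt**2
    (the mass cancels)."""
    dt = times[1] - times[0]
    n_interior = len(times) - 2
    rhs = [-g * dt * dt] * n_interior
    rhs[0] -= x_start      # known boundary value moved to the right-hand side
    rhs[-1] -= x_end
    interior = solve_tridiagonal([1.0] * n_interior, [-2.0] * n_interior,
                                 [1.0] * n_interior, rhs)
    return [x_start] + interior + [x_end]

times = [0.1 * k for k in range(21)]          # 0.0, 0.1, ..., 2.0
causal = newton_path(0.0, 5.0, 9.81, times)   # initial position and velocity given
teleological = least_action_path(causal[0], causal[-1], 9.81, times)  # endpoints given

max_gap = max(abs(p - q) for p, q in zip(causal, teleological))
print(max_gap)
```

The two computations take different inputs (initial data versus both endpoints) yet produce the same path up to rounding, which is the sense in which the choice between them does not affect any prediction.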
Von Neumann also had a rather opportunistic attitude concerning the philosophical issues in quantum theory. He writes:
[W]hile there appears to be a serious philosophical controversy between the interpretations of Schrödinger and Heisenberg, it is quite likely that the controversy will be settled in quite an unphilosophical way. The decision is likely to be opportunistic in the end. The theory that lends itself better to formalistic extension towards valid new theories will overcome the other, no matter what our preference up to that point might have been. It must be emphasized that this is not a question of accepting the correct theory and rejecting the false one. It is a matter of accepting that theory which shows greater formal adaptability for a correct extension. This is a formalistic, esthetic criterion, with a highly opportunistic flavor. (ibid., p. 498)
4.
Von Neumann on mathematical rigour
In his writings about pure mathematics and mathematical rigour von Neumann also takes a very down-to-earth, relaxed attitude, taking the position that mathematical rigour is changeable and has changed several times in the course of the history of mathematics. Nevertheless, in order to be of pragmatic value, an appropriate standard of rigour had to be maintained.
Whatever philosophical or epistemological preferences anyone may have in this respect, the mathematical fraternity’s actual experiences with its subject give little support to the assumption of the existence of an a priori concept of mathematical rigor. I have told the story of this controversy [about the foundations of mathematics] in such detail, because I think that it constitutes the best caution against taking the immovable rigor of mathematics too much for granted. This happened in our lifetime, and I know myself how humiliatingly easily my own views regarding the absolute mathematical truth changed during this episode, and how they changed three times in succession! (von Neumann (1962a), p. 6)
[I]t is not necessarily true that the mathematical method is something absolute, which was revealed from on high, or which somehow, after we got hold of it, was evidently right and has stayed evidently right ever since. To be more precise, maybe it was evidently right after it was revealed, but it certainly didn’t stay evidently right ever since. There have been very serious fluctuations in the professional opinion of mathematicians on what mathematical rigor is. To mention one minor thing: In my own experience, which extends over only some thirty years, it has fluctuated so considerably, that my personal and sincere conviction as to what mathematical rigor is, has changed at least twice. And this is in a short time of the life of one individual! (von Neumann (1961b), p. 480)
The last change in the concept of rigour was in a certain respect the most radical one. After Gödel there no longer existed any absolute foundation for mathematics in the purely algorithmic sense which Hilbert had in mind, and Gödel’s proof was valid under all current definitions of mathematical rigour including the intuitionist one. Von Neumann was one of the first to admit defeat. Hilbert’s program of formal axiomatics was, though not based on wrong intentions, simply unfeasible. But soft axiomatics was not threatened by this wreckage because it had never attempted to create an absolute foundation. And accordingly von Neumann simply went on to provide an axiomatic foundation of quantum mechanics. While such a strategy might have been provisionally acceptable for Hilbert (who never really commented in depth on Gödel’s results), von Neumann’s conclusions about rigour are more radical, and he combines mathematics and the sciences in a manner that was very far from Hilbert’s repeated talk about a non-Leibnizian pre-established harmony between mathematics and physics. In von Neumann’s hands, the ontological problem became a pragmatic one. In his 1947 “The Mathematician” he writes:
The main hope for justification of classical mathematics – in the sense of Hilbert or of Brouwer and Weyl – being gone, most mathematicians decided to use that system anyway. After all, classical mathematics . . . stood on at least as sound a foundation as, for example, the existence of the electron. Hence, if one is willing to accept the sciences, one might as well accept the classical system of mathematics. (von Neumann (1962a), p. 6)
It would be wrong to infer from the above quotations that von Neumann subscribed to the view that rigour can be no more than a sociological criterion. To von Neumann’s mind — although any particular set of basic propositions can be doubted — mathematics “establishes certain standards of objectivity, certain standards of truth . . . rather independently of everything else” (von Neumann (1961b), p. 478). The source of this objectivity does not contradict the historical fact that many non-rigorous arguments were accepted — either with a certain sense of guilt or due to bona fide disagreements as to whether a particular proof was really a proof. Already fluctuations in the style of proofs can come close to differences in rigour. This yields surprising consequences for mathematical ontology. “The variability of the concept of rigor shows that something else besides mathematical abstraction must enter into the makeup of mathematics” (von Neumann (1962a), p. 4). The “something else”, as indicated already in the quotation comparing the degree of mathematical certainty with that of an electron, is empirical science: “The most vitally characteristic fact about mathematics is . . . its quite peculiar relationship . . . to any science which interprets experience on a higher than purely descriptive level” (von Neumann (1962a), p. 1). This relationship has two sides: On the one side, [i]n modern empirical sciences it has become a major criterion of success whether they have become accessible to the mathematical method or to the near-mathematical methods of physics. Indeed, throughout the natural sciences an unbroken
chain of pseudomorphoses, all of them pressing toward mathematics, and almost identified with the idea of scientific progress, has become more and more evident. (von Neumann (1962a), p. 2)
On the other side, “[s]ome of the best inspirations of modern mathematics (I believe, the best ones) clearly originated in the natural sciences” (von Neumann (1962a), p. 2). Von Neumann quotes the examples of geometry and analysis as two branches of mathematics that have empirical origins and show that history was richer than a linear increase of abstractness and rigour. In his view an adequate appraisal of former inexact results of these disciplines is not reached by merely stressing that they satisfied the standards of the day and that the gaps were filled in later years. Thus in the end, soft axiomatics might appear as a fourth change in the concept of rigour — at least for the domain of mathematical physics.
5.
Summary
Mathematical precision and rigour without conceptual clarity were, for von Neumann, neither possible nor desirable, either in the physical sciences or in mathematics. It seems justified to say that what drove von Neumann in his research, especially in physics, was the desire to achieve conceptual clarity and formulate conceptually consistent theories. Von Neumann’s work on quantum mechanics, and especially his abandoning the Hilbert space formalism, corroborates this interpretation to a large extent. In arriving at acceptable theories von Neumann was relying on the method of an opportunistically interpreted soft axiomatics, a method of axiomatisation which was not affected by Gödel’s results. Von Neumann himself, when speaking of the method in physics, emphasized that the aim of theoretical physics is to create mathematical models. His success in creating powerful mathematical models in physics was due to his unparalleled skill and talent in combining algebraic-combinatorial techniques with analysis.
Acknowledgements: Miklós Rédei’s work was supported by OTKA (contract numbers: T 035234, T 032771 and TS 040899). Michael Stöltzner’s work was supported by the Volkswagen Foundation.
Notes
1. “Wir beginnen mit der Beschreibung des zu axiomatisierendes Systems und mit der Angabe der Axiome. Eine kurze Erläuterung des Sinnes der einzelnen Symbole und Axiome lassen wir nachfolgen. . . . Es ist übrigens selbstverständlich, daß man bei axiomatischen Untersuchungen, wie die unsere, die Ausdrucksweise ‘Sinn eines Symbols’ oder ‘Sinn eines Axioms’ nicht wörtlich nehmen darf: diese Symbole und Axiome haben (in Prinzip wenigstens) keinerlei Sinn, sie vertreten nur (in mehr oder minder vollständiger Weise) gewisse Begriffsbildungen der unhaltbar gewordenen ‘naiven Mengenlehre’. Wenn wir von ‘Sinn’ sprechen, so ist damit also stets der Sinn der ersetzten Begriffe der ‘naiven Mengenlehre’ gemeint.”
2. “Der Weg, der nun zu dieser Theorie führt, ist folgender: Man stellt gewisse physikalische Forderungen an diese Wahrscheinlichkeiten, die durch unsere bisherigen Erfahrungen und Entwicklungen nahe gelegt sind, und deren Erfüllung gewisse Relationen zwischen den Wahrscheinlichkeiten erfordern. Dann sucht man zweitens einen einfachen analytischen Apparat, in dem Größen auftreten, die genau dieselben Relationen erfüllen. Dieser analytische Apparat, und damit die in ihm auftretenden Rechengrößen, erfahren nun auf Grund der physikalischen Forderungen eine physikalische Interpretation. Das Ziel ist dabei, die physikalischen Forderungen so vollständig zu formulieren, dass der analytische Apparat gerade eindeutig festgelegt wird. Dieser Weg ist also der einer Axiomatisierung, wie sie z. B. in der Geometrie durchgeführt worden ist. Durch die Axiome werden die Relationen zwischen den geometrischen Gebilden, wie Punkt, Gerade, Ebene, beschrieben, und dann gezeigt, dass diese Relationen gerade ebenso bei einem analytischen Apparat, nämlich den linearen Gleichungen erfüllt sind. Dadurch kann man wieder umgekehrt aus den Eigenschaften der linearen Gleichungen geometrische Sätze gewinnen.”
3. “Das oben angedeutete Verfahren der Axiomatisierung wird nun in der Physik gewöhnlich nicht genau so befolgt, sondern der Weg zur Aufstellung einer neuen Theorie ist, wie in der Regel, so auch hier, folgender. Man mutmaßt meistens den analytischen Apparat, bevor man noch das vollständige Axiomensystem aufgestellt hat, und kommt dann erst durch die Interpretation des Formalismus zur Aufstellung der physikalischen Grundrelationen. Es ist schwer, eine solche Theorie zu verstehen, wenn man diese beiden Dinge, der Formalismus und seine physikalische Interpretation, nicht scharf genug auseinanderhält. Diese Scheidung soll hier möglichst deutlich durchgeführt werden, wenn wir auch, dem jetzigen Zustand der Theorie entsprechend, noch nicht eine vollständige Axiomatik begründen wollen. Das, was jedenfalls eindeutig festliegt, ist der analytische Apparat, der – rein mathematisch – auch keiner Abänderung fähig ist. Was dagegen modifiziert werden kann, und voraussichtlich auch noch werden wird, ist die physikalische Interpretation, bei der eine gewisse Freiheit und Willkür besteht.”
References
Bell, J.S. (1987), Speakable and Unspeakable in Quantum Mechanics, Cambridge University Press.
Feynman, R. (1965), The Character of Physical Law, MIT Press, Cambridge, Mass.
Halmos, P. (1973), “The Legend of John von Neumann” in: American Mathematical Monthly 50, 382–394.
Hilbert, D. (1899), Grundlagen der Geometrie, Teubner, Leipzig (English translation: Foundations of Geometry, Open Court, La Salle, IL, 1971).
Hilbert, D., Nordheim, L. and J. von Neumann (1926), “Über die Grundlagen der Quantenmechanik” in: von Neumann (1962/I), 104–133.
Majer, U. (2001), “The Axiomatic Method and the Foundations of Science: Historical Roots of Mathematical Physics in Göttingen (1900–1930)” in: Rédei, Stöltzner (2001), 11–33.
Rédei, M. (1996), “Why John von Neumann Did Not Like the Hilbert Space Formalism of Quantum Mechanics (and What He Liked Instead)” in: Studies in History and Philosophy of Modern Physics 27, 493–510.
Rédei, M. (1998), Quantum Logic in Algebraic Approach, Kluwer Academic Publishers, Dordrecht.
Rédei, M. (1999), “‘Unsolved Problems in Mathematics’: J. von Neumann’s Address to the International Congress of Mathematicians, Amsterdam, September 2–9, 1954” in: The Mathematical Intelligencer 21, 7–12.
Rédei, M. (2001), “John von Neumann’s Concept of Quantum Logic and Quantum Probability” in: Rédei, Stöltzner (2001), 153–172.
Rédei, M. and M. Stöltzner (eds.) (2001), John von Neumann and the Foundations of Quantum Physics (Vienna Circle Institute Yearbook 8), Kluwer, Dordrecht.
Stöltzner, M. (2001), “Opportunistic Axiomatics: John von Neumann on the Methodology of Mathematical Physics” in: Rédei, Stöltzner (2001), 35–62.
Stöltzner, M. (2002a), “How Metaphysical is ‘Deepening the Foundations’? Hahn and Frank on Hilbert’s Axiomatic Method” in: M. Heidelberger and F. Stadler (eds.) (2002), History of Philosophy of Science. New Trends and Perspectives, Kluwer, Dordrecht, 245–262.
Stöltzner, M. (2002b), “Bell, Bohm, and von Neumann: Some Philosophical Inequalities Concerning No-go Theorems and the Axiomatic Method” in: T. Placek and J. Butterfield (eds.) (2002), Modality, Probability, and Bell’s Theorem, Kluwer, Dordrecht, 37–58.
Thirring, W. (2001), “J. v. Neumann’s Influence in Mathematical Physics” in: Rédei, Stöltzner (2001), 5–10.
von Neumann, J. (1927a), “Mathematische Begründung der Quantenmechanik”, Göttinger Nachrichten, 1–57; in: von Neumann (1962/I), 151–207.
von Neumann, J. (1927b), “Wahrscheinlichkeitstheoretischer Aufbau der Quantenmechanik”, Göttinger Nachrichten, 245–272; in: von Neumann (1962/I), 208–235.
von Neumann, J. (1927c), “Thermodynamik quantenmechanischer Gesamtheiten”, Göttinger Nachrichten, 245–272; in: von Neumann (1962/I), 236–254.
von Neumann, J. (1932), Mathematische Grundlagen der Quantenmechanik, Springer Verlag, Heidelberg (English translation by R. T. Beyer, Princeton University Press, 1955).
von Neumann, J. (1937), “Quantum Logics (Strict- and Probability Logics)”, unfinished manuscript, John von Neumann Archive, Library of Congress, Washington, D.C., reviewed by A. H. Taub in: von Neumann (1962/IV), 195–197.
von Neumann, J. (1961/VI), Collected Works Vol. VI. Theory of Games, Astrophysics, Hydrodynamics and Meteorology, A.H. Taub (ed.), Pergamon Press, Oxford.
von Neumann, J. (1961a), “Method in the Physical Sciences” in: von Neumann (1961/VI), 491–498.
von Neumann, J. (1961b), “The Role of Mathematics in the Sciences and in Society” in: von Neumann (1961/VI), 477–490.
von Neumann, J. (1962/I), Collected Works Vol. I. Logic, Theory of Sets and Quantum Mechanics, A.H. Taub (ed.), Pergamon Press, Oxford.
von Neumann, J. (1962a), “The Mathematician” in: von Neumann (1962/I), 1–9.
von Neumann, J. (1962b), “Die Axiomatisierung der Mengenlehre” in: von Neumann (1962/I), 339–422.
von Neumann, J. (1962/IV), Collected Works Vol. IV. Continuous Geometry and Other Topics, A.H. Taub (ed.), Pergamon Press, Oxford.
Weyl, H. (1944), “David Hilbert and his Mathematical Work” in: Bulletin of the American Mathematical Society 50, 612–654.
Wightman, A.S. (1976), “Hilbert’s Sixth Problem” in: Proceedings of Symposia in Pure Mathematics 28, AMS, 147–240.
THE INTUITIVENESS AND TRUTH OF MODERN PHYSICS
Peter Mittelstaedt
Universität zu Köln, Germany
1.
Introduction
Since its emergence, twentieth century physics has had the reputation of being unintuitive, abstract and difficult to understand, while the earlier classical physics, by contrast, is regarded as intuitive and comprehensible. The history of physics in the twentieth century demonstrates how this widespread assessment resulted in a clear rejection of the new theories, the defamation of individual scientists and even political persecution. Einstein’s Special Theory of Relativity1 of 1905 was especially affected by this rejection, probably also because it was the first of the theories of modern physics to be apparently in clear contradiction to “common sense”. The General Theory of Relativity of 1916, in many respects much more radical, provoked much less public agitation and rejection, a fact that was probably due also to its complex mathematical form. The continued interest in the cosmological consequences of this theory, such as the big bang, the age of the universe, cosmic expansion etc., is due to the significance of these results for our world view and not to the alleged unintuitiveness of the underlying theory. It was not until Quantum Mechanics was discovered in 1925 that there was again considerable public irritation due to the numerous philosophically explosive theses contained in this theory. In the following reflections, we shall consider whether there is a factual justification for the above-mentioned assessment of classical and modern physics. Initially, this impression seems to be confirmed by the fact that modern physics, i.e. the Theory of Relativity and Quantum Mechanics, deals with previously unknown phenomena that do not occur in our ordinary experience. The difficulties with modern physics, however, could also be due to the fact that the intuitiveness of its phenomena is not easily recognized, and not at first glance.
First, we shall give a more precise definition of the concept of intuitiveness and then discuss the question whether in this sense classical physics is intuitive and modern physics unintuitive. The first impression seems to support these two theses. The peculiar relation between classical and modern physics shows, however, that, upon closer inspection, these two theses cannot be defended as they stand, but must be substantially modified in order to do justice to the actual conditions in classical and modern physics. The objective of the present study is systematic rather than historical. We are not interested in knowing how the three great theories of modern physics (the Special Theory of Relativity, the General Theory of Relativity, and Quantum Mechanics) actually came about and what intentions their discoverers pursued. Neither do we want to focus on the reception of these theories. Instead, we want to discuss the purely theoretical question, how these three theories could have been discovered, if, step by step, scientists had freed themselves of metaphysically motivated hypotheses contained in Classical Mechanics. The answer to this question yields surprising results regarding the intuitiveness and truth of modern physics.
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 251–266. © 2006 Springer. Printed in the Netherlands.
2.
The world of ordinary experience and classical physics
The world as we know it through ordinary experience determines what we call intuitive or comprehensible. We do not want to begin with the meaning of the word “intuitive” and its history. Rather, this concept will find its bearing in ordinary experience (OE), which, as prescientific experience, precedes all scientific cognition. This vague sense of intuitive and comprehensible is not only used in popular science but also corresponds largely to the usage of the terms in the literature of modern physics. This does not rule out the possibility that individual physicists interested in fundamental questions have understood these concepts in a deeper sense and in line with the philosophical tradition.2 Such an interpretation, however, seems initially to have been limited to Quantum Mechanics and will therefore be omitted for present purposes. At this point, it is already useful to introduce an important distinction. The concept of intuitiveness that has its bearing in ordinary experience (OE) can be understood in two different ways.3 First, it can be taken to mean the immediately and directly discernible intuitiveness, i.e. the agreement of a new experience with already known and familiar elements of ordinary experience. In this case, we also speak of direct intuitiveness. “Intuitive”, however, can also refer to a familiarity or resemblance with ordinary experience, which is revealed to the observer only after prolonged observation and via a number of logical steps. In this case, we speak of indirect intuitiveness. Both types of intuitiveness are found in physics.4 Hence, there exists a kind of reductionism of intuitiveness. By means of a more or less long series of steps, phenomena that at first glance are by no means directly intuitive, comprehensible and familiar, can nevertheless be traced back to directly intuitive phenomena. These are the indirectly intuitive processes and
experiential structures. Obviously, the indirectness of intuitiveness can differ by degree with respect to various phenomena and different observers. There are experiences, particularly in modern physics, the indirect intuitiveness of which has only been revealed gradually over the span of several years, while simple, although initially irritating classical-mechanical processes — such as the Coriolis forces in an accelerated frame of reference — can be traced back very quickly to directly intuitive structures. Between different observers there can also be gradual differences. A trained and experienced natural scientist will be able to discern the indirect intuitiveness of a process more quickly and in fewer steps than a lay person in the same area. Classical physics sees itself as the scientific theory of the world of ordinary experience. In the formulation of classical physics, especially of Classical Mechanics (CM), certain basic experiences are extrapolated beyond immediate experience and elevated to general principles, which then form the basis of the corresponding classical-physical theory. These principles, which are not justified by experience alone, can probably be attributed to the influence of modern metaphysics in the rise of classical physics in the 17th and 18th centuries.5 This influence concerns the assumption of the existence of an absolute space, the assumption that Euclidean geometry applies to this space, the existence of an absolute and universal time as well as the assumption of an unbroken causality. Furthermore, there is the assumption that matter consists of individual and permanent substances, the properties of which are “thoroughly determined”6 and simultaneously decidable. Ordinary experience (OE) does not justify these ontological hypotheses. But it is also not precise enough to contradict them. 
Hence, since these presuppositions enter explicitly into Classical Mechanics, it is to be expected that Classical Mechanics will come into conflict with physical experience as that experience becomes more and more precise. In order to be able to respond adequately to such contradictions, we shall now investigate in more detail where these ontological hypotheses enter as premises into the structure of Classical Mechanics. Once that is known, the hypotheses could possibly be weakened or abandoned.
3. Modern physics of the twentieth century
The modern physics of the twentieth century consists of three essential components that differ from Classical Mechanics (CM) in various ways: the Special Theory of Relativity, the General Theory of Relativity and Quantum Theory. These theories are held to be unintuitive, abstract, counterintuitive and incomprehensible, not only in the popular literature, but in general. In particular, this concerns the following claims of the three modern physical theories:
a) Special Theory of Relativity (SR)
   - There is no absolute simultaneity independent of the observer
   - There is a time dilation of moving clocks
   - There is a clock paradox
   - There is a length contraction of moving bodies
   - There is a maximum velocity of moving observers
b) General Theory of Relativity (GR)
   - Space S3 does not have a Euclidean but rather a Riemannian metric
   - Four-dimensional spacetime S4 has a Riemannian metric
   - The rate of a clock depends on the location of the clock
   - There is in general no cosmic time binding for all observers
   - There is a clock paradox even for observers on geodesic trajectories
   - There is in general no definite temporal order of events
c) Quantum Mechanics (QM)
   - There is no preserved substance
   - An object does not have all properties either positively or negatively
   - There are no distinguishable and reidentifiable individuals
   - There is in general no causality between successive events
   - There is in general no objectification of properties of an object
   - The whole of classical logic is not valid for statements of Quantum Mechanics
All of these claims of the modern physical theories stand in contradiction to the corresponding propositions of Classical Mechanics. Classical Mechanics plays a central and very special role in this connection, since it is not the theory of a special domain of objects, but rather an “empty” theory, which at bottom deals with nothing but classical physics itself. Classical Mechanics formulates the general laws and structures that apply to any possible classical-physical theory of concrete objects. Hence, the differences and contradictions of the theories of modern physics in relation to this classical theory are of very general significance. These are not contradictions, however, in relation to prephysical or pretheoretical ordinary experience (OE). That is, they are not instances of unintuitiveness in the sense introduced earlier, but at best processes and structures that are only indirectly intuitive.
For closer inspection reveals that the supposed unintuitive aspects of modern physics all derive from contradictions between modern
physics and the above-mentioned hypothetical structures and laws, which, although they are the subject of Classical Mechanics, substantially overstep the bounds of ordinary experience and are not justified by it. These connections will be discussed in more detail in the following section.
4. The reconstruction of modern physics on the basis of Classical Mechanics
In order to see in detail how classical physics, and in particular Classical Mechanics, contains hypothetical assumptions that go beyond ordinary experience and stand in contradiction to the theories of modern physics, one must construct Classical Mechanics with the same conceptual and mathematical means as are used in the theories of modern physics. This has been attempted by a number of authors,7 and, in at least one instance, with the same objective.8 Formulated in this way, Classical Mechanics can then serve as a basis for a reconstruction of the three theories of modern physics through a reduction of the respective problematic ontological hypotheses.
4.1 Reconstruction of the Special Theory of Relativity on the basis of Classical Mechanics
The unintuitive elements of the Special Theory of Relativity listed in (3.a) refer to a confrontation of this theory with Classical Mechanics (CM) but not with the more qualitative ordinary experience (OE). The fundamental hypothesis of Classical Mechanics that is at stake in this case is the assumption of an absolute and universal time. The presupposition of this hypothesis H(T) in the construction of Mechanics leads directly to Galilean-Newtonian spacetime and Galileo-invariant Newtonian mechanics. But it can hardly be claimed that the hypothesis H(T) suggests itself through a high degree of intuitiveness and intelligibility. Rather, it is an assumption that substantially oversteps the experiential scope of Classical Mechanics and that can neither be proved nor disproved within this framework. If, in the construction of the Mechanics, we leave out the hypothesis H(T), then we obtain Minkowskian spacetime and the special-relativistic Lorentz-invariant mechanics.9 To be sure, the discoveries of the Special Theory of Relativity then stand in contradiction to Classical Mechanics, but not to ordinary experience. The fact that the compatibility of (SR) with (OE) is not immediately recognized is due to the long path just sketched leading to this result. In the sense of the terminology introduced above, therefore, the Special Theory of Relativity is indirectly intuitive, but by no means unintuitive. The construction of Mechanics and of the Special Theory of Relativity occurs in several steps, which will be only briefly mentioned here. Details can be found in the literature. The first step is the introduction of space and geometry. The hypothesis H(E), that finitely extended standards of measurement in space are freely movable, leads via the first Helmholtz theorem to the result that the geometry measured with these standards is elliptical, hyperbolical or
Euclidean. This makes it possible to establish a Euclidean local space E3 with the help of additional conventional postulates. For the introduction of time, we consider an ensemble $\Gamma(k_1, k_2, \ldots)$ of bodies $k_i$ freely thrust into space as well as a frame of reference equipped with measuring rods, which can be visualized as a material base of an observer (e.g. a spacecraft). If the trajectories of the test bodies $k_i$ are Euclidean straight lines from the perspective of this frame of reference, then the frame of reference is called an inertial frame $I$. While topological time can be tied to an arbitrary process, metric time is subject to the requirement that the test bodies $k_i$ not only move in a straight line but also at a constant velocity. This requirement can be met, since empirically the bodies $k_i$ move uniformly relative to one another, which can be determined without knowledge of a metric time. The combination of space and time in a spacetime occurs through the definition of the synchronicity of spatially separated clocks and through a determination of the transformations between inertial frames. The internal transformations $I \to I$ consist of the transformations of the Euclidean group and the one-parameter group of time translations. On the basis of the above instructions, the transformations $I \to I'(v)$ from $I$ to another inertial frame $I'(v)$ moving at velocity $v$ have (in one space coordinate) the form

$$x' = k(v)\,(x - vt), \qquad t' = \mu(v)\,t + \nu(v)\,x$$

with three arbitrary functions $k(v)$, $\mu(v)$ and $\nu(v)$. At this point, the decision is made between Classical Mechanics and the Special Theory of Relativity. Thus, if one presupposes the hypothesis H(T) of the existence of a universal time, one will in a few steps arrive at the Galileo transformations

$$x' = x - vt, \qquad t' = t$$

of Classical Mechanics. If the hypothesis H(T) is completely abandoned, on the other hand, one obtains, following a lengthy deduction, the Lorentz transformations that are decisive for the Special Theory of Relativity:

$$x' = \frac{x - vt}{(1 - v^2/c^2)^{1/2}}, \qquad t' = \frac{t - vx/c^2}{(1 - v^2/c^2)^{1/2}}$$

where the constant $c$ enters into the transformation through an empirically satisfiable convention regarding the distant synchronization in the inertial frames $I$ and $I'(v)$. It corresponds numerically to the speed of light in a vacuum. Hence, the spacetime of the Special Theory of Relativity is the pseudo-Euclidean Minkowski space M4. These connections are summarized schematically in Fig. 1.
Space:
   Free mobility of measuring rods, H(E) ⇒ Euclidean character of space
Time:
   a) Inertial frames I, ensemble Γ(k1, k2, ...) of free bodies ki that move on Euclidean straight lines
   b) Topological time is tied to an arbitrary process ⇒ temporal order of events
   c) Metric time: since the ki are relatively uniform ⇒ convention: the ki move uniformly
Space and Time:
   a) Convention of the simultaneity of distant points
   b) Transformations I → I: space translations T3, rotations O3, time translations T1
   The relativity of the I frames follows from Γ, that is, I → I′(v):
      $x' = k(v)\,(x - vt)$, $t' = \mu(v)\,t + \nu(v)\,x$
   ↓ With hypothesis H(T):
      Galileo transformations: $x' = x - vt$, $t' = t$
   ↓ Without H(T):
      Lorentz transformations: $x' = (x - vt)/\sqrt{1 - v^2/c^2}$, $t' = (t - vx/c^2)/\sqrt{1 - v^2/c^2}$

Fig. 1 Schema of the reconstruction of (SR) on the basis of (CM)
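The decision point described above, between the Galileo and the Lorentz transformations, can be illustrated numerically. The following is only a minimal sketch: the function names `galileo`, `lorentz` and `interval` are hypothetical and do not appear in the text, and the numerical value of c is the standard defined value. The sketch treats both transformations as special cases of the general form with the functions k(v), µ(v) and ν(v), and shows that only the Lorentz transformation preserves the quantity c²t² − x².

```python
import math

C = 299_792_458.0  # speed of light in vacuum in m/s


def galileo(x, t, v):
    """Galileo transformation: k(v) = 1, mu(v) = 1, nu(v) = 0."""
    return x - v * t, t


def lorentz(x, t, v, c=C):
    """Lorentz transformation: k(v) = mu(v) = gamma, nu(v) = -gamma * v / c**2."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


def interval(x, t, c=C):
    """Squared interval c^2 t^2 - x^2 between the event (x, t) and the origin."""
    return (c * t) ** 2 - x**2


# Units with c = 1 and a frame moving at 60 % of that speed.
x, t, v, c = 1.0, 2.0, 0.6, 1.0
x_new, t_new = lorentz(x, t, v, c)

# The Lorentz transformation leaves the interval unchanged ...
print(interval(x, t, c), interval(x_new, t_new, c))
# ... whereas the Galileo transformation keeps t fixed and changes the interval.
print(interval(*galileo(x, t, v), c))

# At everyday velocities (here 30 m/s) the two transformations agree closely.
print(galileo(1000.0, 10.0, 30.0), lorentz(1000.0, 10.0, 30.0))
```

The last line is meant to make plausible why the hypothesis H(T) cannot be tested within the range of ordinary experience: at such velocities the two sets of transformations differ only far below everyday measuring precision.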
4.2 Reconstruction of the General Theory of Relativity on the basis of Classical Mechanics
The General Theory of Relativity leads to a Riemannian metric of the three-dimensional local space R3, and in general even to a pseudo-Riemannian metric of the spacetime R4. It is difficult to discuss the alleged unintuitiveness of this result, since the geometry or metric of space cannot be perceived as such. It is useful, therefore, to focus on observable preconditions from which these metric structures are derived. Helmholtz’s theorems10 are very helpful in this regard:
1) If finitely extended measuring rods are freely mobile in space, then the geometry measured with these rods is elliptical, hyperbolical or Euclidean.
2) If the free mobility is guaranteed only for infinitesimally extended measuring rods, then the geometry measured with these rods is Riemannian.
Ordinary experience leads to an intuition of space that is certainly not Euclidean in character. Even as the form of intuition as discussed by Kant, space is three-dimensional, topological and presumably also metrizable, but no more than that. The Euclidean character of space is an additional assumption, which is introduced by Classical Mechanics and which presupposes the hypothesis H(E) regarding the free mobility of finitely extended measuring rods. Within the domain of objects of Classical Mechanics, this hypothesis can certainly not be verified with the requisite precision. It is an empirically unverifiable additional assumption. We obtain the spatial metric of the General Theory of Relativity, not by abandoning the empirically unjustified hypothesis H(E), as in the case of hypothesis H(T), but rather by weakening it to the assumption H(R) that only infinitesimally extended measuring rods are freely mobile, which yields the Riemannian character of the metric according to Helmholtz’s second theorem. This weakening is certainly compatible with the spatial intuition of ordinary experience, but not with the spatial geometry of Classical Mechanics.
Hence, here too, it is the spatial intuition of Classical Mechanics extended by H(E) that is unintuitive, not the spatial geometry of the General Theory of Relativity obtained through weakening H(E) to H(R). As a consequence of the Riemannian structure of local space, any additional constructions and definitions referring to time and the connection of space and time can only be carried out locally. A constitutive ensemble Γ of bodies freely thrust into space can only be used locally and presently for the constitution of inertial frames, which are therefore called local-momentary inertial frames. With these local-momentary inertial frames, it is nevertheless possible to define — locally and momentarily — a topological and a metrical time. The local inertial frames also allow for the local execution of the constructions required for the formation of spacetime, whereby a Minkowski space M4 is
constructed locally and momentarily at every point of the spacetime. These pseudo-Euclidean M4 spaces are tangent spaces of the finite spacetime R4, which thus reveals itself as a pseudo-Riemannian space R4. These relations are summarized schematically in Fig. 2.

Space:
   Measuring rods are freely mobile only infinitesimally: weakening of H(E) to H(R)
   ⇒ Riemannian space R3
   ⇒ any additional constructions are possible only locally
Time:
   a) Inertial frames I, ensemble Γ(k1, k2, ...) of local-momentary bodies ki that move locally on straight lines
   b) Topological time is tied to a local process ⇒ temporal order of local events
   c) Metric time: since the ki are relatively uniform ⇒ convention: the ki move uniformly
Space and Time:
   a) No Helmholtz theorem for the spacetime R4
   b) No universal time, i.e. no hypothesis H(T)
   c) The same construction as for (SR) leads only locally-momentarily to a pseudo-Euclidean M4 (Minkowski space)
   ⇒ The spacetime R4 has M4 tangent spaces

Fig. 2 Schema of the reconstruction of (GR) on the basis of (CM)

The measurement of spacetime as described by means of measuring rods and clocks is not only very laborious, but is also unsatisfactory for methodological reasons, since clocks and measuring rods would enter the semantics of the theory as primitive entities. It is possible to avoid both drawbacks, however, if light beams and particle paths are used for measurement. The paths of particles as well as those of light are geodesics of the Riemannian metric and are thus themselves objects of the theory. An explicit execution of this program can be found in Wheeler and Marzke11 (light clocks) and in the axiomatic system of Ehlers, Pirani and Schild,12 which has been expanded by the radar geometry of Schröter and Schelb.13 We shall not pursue this topic further, since it is of no consequence for our problem. The General Theory of Relativity is not only a theory of the pseudo-Riemannian spacetime R4, but also a theory of gravitation.
The influence of the Riemannian guiding field on the movement of test objects and light beams is locally indistinguishable from the influence of a gravitational field. Hence
spacetime is also influenced by the material source of a gravitational field. The General Theory of Relativity describes this influence by means of the Einsteinian field equations, in which the Einstein tensor of spacetime is related proportionally to the energy-momentum tensor of matter. The factor of proportionality is the gravitational constant κ. We did not pursue this issue further, since the question of the intuitiveness of (GR) refers exclusively to the pseudo-Riemannian spacetime structure, and not to its connection to matter. This has the consequence, however, that our reconstruction of the General Theory of Relativity from Classical Mechanics made possible through the weakening of metaphysical premises only refers to the geometric component of (GR), i.e. to the pseudo-Riemannian spacetime, but not to its gravitational component, i.e. the Einsteinian field equations. Both in the Special and in the General Theory of Relativity, it is obvious that processes and structures which are directly intuitive in the sense of ordinary experience (e.g. the nonexistence of universal time and the fact that finite measuring rods are not freely mobile) can lead to surprising and prima facie counterintuitive conclusions. The dilation of time and the Riemannian spatial metric are of this type. They are not unintuitive, however. Since via a long chain of logical steps it is always possible to see that these results can be traced back to very simple and intuitively obvious premises, these and other results of the Special and of the General Theory of Relativity are not unintuitive, but rather indirectly intuitive in the sense of the terminology introduced above.
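For orientation, the field equations mentioned above can be written out in the proportional form the text describes. This is a sketch in standard textbook notation that does not appear in the original: the symbols $G_{\mu\nu}$, $R_{\mu\nu}$, $R$, $g_{\mu\nu}$ and $T_{\mu\nu}$ are assumptions of the sketch, a possible cosmological term is omitted, and sign conventions vary between authors.

```latex
% Einstein tensor proportional to the energy-momentum tensor,
% with the gravitational constant \kappa as factor of proportionality.
G_{\mu\nu} = \kappa \, T_{\mu\nu},
\qquad
G_{\mu\nu} \equiv R_{\mu\nu} - \tfrac{1}{2} R \, g_{\mu\nu}
```

In one widely used convention κ = 8πG/c⁴, with G the Newtonian constant of gravitation. The left-hand side contains only the geometry of the pseudo-Riemannian spacetime, which is the component the reconstruction above refers to; the coupling to matter on the right-hand side lies outside it.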
4.3 Reconstruction of Quantum Mechanics on the basis of Classical Mechanics
Quantum Mechanics (QM) leads to the propositions listed in (3c), of which we quote the following two for special emphasis: a) An object does not have all properties either positively or negatively b) There are no distinguishable and reidentifiable individuals These results contradict the results of Classical Mechanics. But they do not directly contradict ordinary experience, which would neither be able to prove nor to refute propositions of such generality and rigor. Hence, the results (a) and (b) are not incompatible with ordinary experience and are thus not unintuitive. Classical Mechanics, on the other hand, is based on the following hypothetical assumptions H(a) and H(b), which stand in direct contradiction to the propositions (a) and (b):
H(a): Objects are “thoroughly determined”, i.e. they have all properties, positive or negative (corresponding to the points in the phase space)
H(b): There are individual, distinguishable and reidentifiable objects, which are the bearers of properties (corresponding to the mass points)

We can obtain Classical Mechanics on the basis of these hypotheses H(a) and H(b), if we construct the mechanics operationally in the same way as (QM) in the quantum-logical approach. One will not be able to claim, however, that the hypotheses H(a) and H(b) are directly or indirectly intuitive in the sense of ordinary experience. Rather, they clearly overstep the bounds of ordinary experience and are also not verifiable within this framework. If we weaken the hypotheses H(a) and H(b), however, in agreement with ordinary experience to

Hq(a): At any given time, an object only has some properties in the sense that these are objectively present or not; and
Hq(b): There are no strictly distinguishable and strictly reidentifiable objects,

then, within the framework of a logical-operational approach, we obtain quantum logic, and, with a few additional assumptions, which in turn are weaker versions of classical assumptions, we obtain Quantum Mechanics in Hilbert space. Let us briefly mention the main steps of this approach. On the basis of the weakened premises Hq(a) and Hq(b), it is possible to construct a formal language of quantum-physical propositions. The formulation of a pragmatics (Q-pragmatics) is followed by a semantics of truth (Q-semantics) and a syntax (Q-syntax), in which the operations (connectives) and relations of the language are formulated. This allows for the integration of the totality of composite propositions that are formally true in the sense of the Q-semantics in a quantum logic (Q-logic). Q-pragmatics, Q-semantics, Q-syntax and Q-logic are all weaker versions of the corresponding classical structures, which we shall call C-pragmatics, C-semantics, C-syntax and C-logic.
The Lindenbaum-Tarski algebra of Q-logic is an orthomodular lattice $L_q$, which can be further specified for single systems and which we shall then call $L_q^*$. The lattice $L_q^*$ is atomic and fulfills the covering law. With the addition of Solèr’s law,14 we obtain a lattice corresponding to the lattice $L_H$ of the subspaces of Hilbert space up to isomorphisms. According to a theorem by Piron,15 a Hilbert space is thus determined except for the number field, for which there are still the three possibilities of the real numbers R, the complex numbers C and the quaternions Q. For the complex numbers, we finally obtain the Hilbert space of Quantum Mechanics and hence Quantum Mechanics (QM) itself. The lattices $L_q^*$ or $L_H$ in turn are weakened versions of the classical Boolean propositional lattice $L_B$ or of the lattice $L_\Gamma$ of the subsets of classical phase space. This is summarized schematically in Fig. 3.
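One concrete sense in which a lattice of subspaces is weaker than a Boolean lattice of subsets is that the distributive law, which holds for all subsets, can fail for subspaces. The following is a minimal sketch of that single point, under strong simplifying assumptions: the real plane serves as a toy stand-in for Hilbert space, lines are given by integer direction vectors, and the names `line`, `meet` and `join` are hypothetical helpers that do not come from the text.

```python
from math import gcd

ZERO = "zero"    # the subspace {0}
PLANE = "plane"  # the whole space R^2


def line(a, b):
    """One-dimensional subspace of R^2 spanned by the integer vector (a, b)."""
    if a == 0 and b == 0:
        raise ValueError("a line needs a non-zero direction vector")
    g = gcd(a, b)
    a, b = a // g, b // g
    # Fix the sign so that (a, b) and (-a, -b) name the same line.
    if a < 0 or (a == 0 and b < 0):
        a, b = -a, -b
    return ("line", a, b)


def meet(p, q):
    """Lattice meet: the intersection of two subspaces."""
    if p == q:
        return p
    if p == PLANE:
        return q
    if q == PLANE:
        return p
    return ZERO  # two distinct lines, or anything met with {0}


def join(p, q):
    """Lattice join: the subspace spanned by two subspaces."""
    if p == q:
        return p
    if p == ZERO:
        return q
    if q == ZERO:
        return p
    return PLANE  # two distinct lines, or anything joined with the plane


# Three distinct lines through the origin.
a, b, c = line(1, 0), line(0, 1), line(1, 1)
lhs = meet(a, join(b, c))            # a meet (b join c)
rhs = join(meet(a, b), meet(a, c))   # (a meet b) join (a meet c)
print(lhs, rhs, lhs == rhs)

# For subsets (the Boolean case) the same law always holds.
A, B, S = {1, 2}, {2, 3}, {1, 3}
print(A & (B | S) == (A & B) | (A & S))
```

The sketch does not touch orthomodularity, atomicity, the covering law or Solèr’s law, which the construction in the text relies on; it only shows why the lattice of subspaces cannot be Boolean.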
Classical structure  →  (weakening)  →  Quantum structure
Each column is read from top to bottom; each row relates a classical structure to its weakened quantum counterpart.

   Hypotheses: H(a), H(b)   →   Hq(a), Hq(b)
   C-pragmatics             →   Q-pragmatics
   C-semantics              →   Q-semantics
   C-syntax                 →   Q-syntax
   C-logic                  →   Q-logic
   LB                       →   Lq, L*q
   LΓ                       →   LH
   Γ (phase space)          →   H (Hilbert space)
   CM                       →   QM

Fig. 3 Schema of the reconstruction of (QM) on the basis of (CM)

In comparison to (CM), (QM) therefore depends only on the weaker, less hypothetical premises Hq(a) and Hq(b). These arguments demonstrate that, while (QM) contradicts Classical Mechanics, it does not contradict ordinary experience, since it follows via a long path of deduction from premises that are directly intuitive with respect to ordinary experience. As in the case of (SR) and (GR), in (QM), the direct intuitiveness of the premises can lead to conclusions that are at first glance surprising and counterintuitive. Since here too, however, one can convince oneself of the rationality of these results by tracing their origin from very simple, directly intuitive premises, the propositions of Quantum Mechanics are at least indirectly intuitive.
5. The truth of modern physics
The three fundamental theories of modern physics, the theories (SR), (GR) and (QM), are derived from Classical Mechanics by a process which we have described as a reduction of ontological hypotheses. Classical Mechanics is a theory which is empirically accurate in the realm of objects of ordinary experience, but which is ontologically overladen with additional, metaphysically motivated hypotheses that are not empirically verifiable in ordinary experience. The possibility that these ontological hypotheses might conflict with experiences in a realm of objects beyond that of ordinary experience could not be excluded even at the time of the formulation of Classical Mechanics. In the meantime, these conflicts have emerged. The theories of modern physics remove these conflicts by doing without the problematic ontological hypotheses or by weakening them in a suitable manner. Hence, we can obtain the theories (SR), (GR) and (QM) from Classical Mechanics (CM), if we omit or weaken the relevant hypotheses that are explicitly formulated in the structure of (CM). It is important to note that no new assumptions are added in any of the new theories (SR), (GR) and (QM), although in the case of (GR) this is only true for the geometry of spacetime, but not for the Einsteinian field equations. This leads to intertheoretical relations that are very different from those predominantly discussed in physics and in philosophy of science.16 In particular, Classical Mechanics turns out not to be the approximate special case resulting from the modern theories (SR), (GR) and (QM) through the unrealistic limits $c \to \infty$, $\kappa \to 0$ and $\hbar \to 0$. Rather, the modern physical theories follow from (CM) through the omission of the relevant ontological hypotheses. The domain of application $\Delta_{OE}$ of Classical Mechanics, which is given through ordinary experience, is extended in the theories of modern physics in various ways. (SR) includes processes at high velocities, thus extending the domain of application $\Delta_{OE}$ to $\Delta_{SR}$. (GR) includes phenomena with strong gravitational fields, and (QM) includes processes and objects of microphysics, thus extending the domains of application from $\Delta_{OE}$ to $\Delta_{GR}$ and $\Delta_{QM}$. Hence,

$$\Delta_{OE} \subset \Delta_{SR}, \qquad \Delta_{OE} \subset \Delta_{GR}, \qquad \Delta_{OE} \subset \Delta_{QM}.$$
Thus, the theories of modern physics are “more correct” in their respective domains of application than Classical Mechanics, which is correct in $\Delta_{OE}$, but leads to incorrect results in the extended domains. Since the structures of the modern theories depend on assumptions that are less hypothetical, however, these theories are also “more true” than Classical Mechanics. What this means in particular can be shown as follows. Through the reduction and weakening of hypothetical premises that are not secured empirically, the premises $P_{CM}$ of Classical Mechanics imply the weaker premises $P_{SR}$, $P_{GR}$ and $P_{QM}$ of the modern theories (SR), (GR) and (QM). That is to say,

$$P_{CM} \leq P_{SR}, \qquad P_{CM} \leq P_{GR}, \qquad P_{CM} \leq P_{QM}.$$

The theories $T_{SR}$, $T_{GR}$ and $T_{QM}$ derived from the weakened premises and understood as predicates lead to the true propositions

$$T_{SR}(\Delta_{SR}), \qquad T_{GR}(\Delta_{GR}), \qquad T_{QM}(\Delta_{QM}),$$
where “true” is used in the sense of the correspondence theory of truth. Since the theories are constructed through a weakening of the premises, (SR), (GR) and (QM) are also more “true” than Classical Mechanics, i.e. closer to the truth than (CM). Following Popper,17 we say that a theory B is closer to the truth than a theory A, if fewer false and a greater number of true conclusions can be derived from B than from A.18 Due to the method of weakening the premises, this applies to our theories already theoretically and not only empirically. In this sense, the absolutely “true” theory would result from presuppositions that no longer contain any contingent, falsifiable premises, but only formal components such as mathematics and logic. The three theories of modern physics, therefore, are closer to the truth than Classical Mechanics, but not absolutely true.
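The comparative criterion just stated can be made concrete with a finite toy model. The following is only a sketch under loud assumptions: theories are represented by finite sets of conclusions, the statement labels are hypothetical placeholders, and the function name `closer_to_truth` does not come from the text. The counting test follows the wording used here; it is not Popper’s formal definition, which is stated in terms of the truth content and falsity content of theories.

```python
def closer_to_truth(conclusions_b, conclusions_a, true_statements):
    """Return True if B yields more true and fewer false conclusions than A."""
    true_a = conclusions_a & true_statements
    false_a = conclusions_a - true_statements
    true_b = conclusions_b & true_statements
    false_b = conclusions_b - true_statements
    return len(true_b) > len(true_a) and len(false_b) < len(false_a)


# Hypothetical finite stand-ins: labels starting with "s" are taken to be true,
# labels starting with "f" are taken to be false.
true_statements = {"s1", "s2", "s3", "s4"}
theory_a = {"s1", "s2", "f1", "f2"}  # two true, two false conclusions
theory_b = {"s1", "s2", "s3", "f1"}  # three true, one false conclusion

print(closer_to_truth(theory_b, theory_a, true_statements))
print(closer_to_truth(theory_a, theory_b, true_statements))
```

In this toy case the relation holds in one direction only, which is the asymmetry the comparison between (CM) and the modern theories is meant to express.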
6. Conclusion
The two theses formulated in the Introduction, that classical physics is intuitive and comprehensible, while modern physics is unintuitive, abstract and difficult to understand, do not stand up to the arguments advanced in this paper and must be modified substantially. Classical Mechanics, originally conceived as the physical theory of ordinary experience, turns out to be enriched with many ontological hypotheses, which, although innocuous in the realm of ordinary experience, have proven to be very obstructive for the development of modern physics. These hypotheses are not intuitive in the sense of ordinary experience, which is why Classical Mechanics as a whole cannot be called intuitive either. That people have frequently had the opposite impression is due, on the one hand, to the fact that the phenomena explained in the Mechanics (pendulum, gyroscope, billiard balls) have a high degree of direct intuitiveness, precisely because they hail from the world of ordinary experience. On the other hand, it is due to the fact that the foundations of Classical Mechanics have seldom been investigated thoroughly enough to reveal the metaphysically motivated hypotheses. The notable exceptions are the works of Mach19 and Poincaré.20
The theories of modern physics, which comprise a larger domain of objects compared to ordinary experience, do without the ontological hypotheses that are found in Classical Mechanics but that cannot be verified in ordinary experience. Fundamentally, the theories of modern physics are closer to ordinary experience than Classical Mechanics. Nevertheless, they are not directly intuitive. The theories of modern physics primarily explain phenomena that do not occur in ordinary experience, e.g. the dilation of time, the particle-wave dualism, gravitational red-shift. The intuitiveness of these phenomena is revealed to the observer only after tracing them over a long path of arguments back to the directly intuitive fundamental principles.
The phenomena of modern physics, therefore, are generally only indirectly intuitive. For our purposes, however, it is important to note that they are in no way unintuitive.
Since they result from Classical Mechanics through the reduction of ontological premises, moreover, the three theories of modern physics investigated here are also more true than Classical Mechanics. Each of the theories (SR), (GR) and (QM) makes do without certain metaphysically motivated premises of Classical Mechanics, without substituting for them new premises. Hence, the strong premises of Classical Mechanics imply the corresponding weaker premises of the three theories of modern physics. These theories are therefore closer to the truth than Classical Mechanics, i.e. they yield a greater number of true and fewer false conclusions than Classical Mechanics.
Notes
1. Hentschel (1990), Huber (2000), Könneker (2001).
2. Falkenburg (2004).
3. Huber (2004).
4. Renate Huber (ibid.) speaks of “sensibly intuitive” and “rationally intuitive” (see this volume pp. 293–324).
5. Falkenburg (2004).
6. Kant, Critique of Pure Reason B 600.
7. Mittelstaedt (1970), Sudarshan (1974), Lévy-Leblond (1976), Piron (1976).
8. Mittelstaedt (1995).
9. Mittelstaedt (1995).
10. Laugwitz (1960).
11. Wheeler and Marzke (1964).
12. Ehlers, Pirani and Schild (1972).
13. Schröter and Schelb (1992/94).
14. Solèr (1995).
15. Piron (1976).
16. Scheibe (1997/99).
17. Popper (1963), Appendix; (1972), p. 330 ff.
18. The difficulties that arise when this conceptualization is applied to false theories are discussed in detail in Weingartner (2000), Chapter 9.
19. Mach (1901).
20. Poincaré (1898).
References Ehlers, J., Pirani, F. A., Schild A. and E. Schild (1972), “The Geometry of Free Fall and Light Propagation” in: O’Raiffeartaigh (ed.), General Relativity, Clarendon, Oxford, 63–84. Falkenburg, B. (2004), “Functions of Intuition in Quantum Physics”, this volume pp. 267–292. Hentschel, K. (1990), Interpretationen und Fehlinterpretationen der speziellen und allgemeinen Relativit¨atstheorie durch Zeitgenossen Albert Einsteins, Birkh¨auser, Basel. Huber, R. (2000), Einstein und Poincar´e: die philosophische Beurteilung physikalischer Theorien, Mentis, Paderborn. Huber, R. (2004), “Intuitive Cognition and the Formation of Theories”, this volume pp. 293– 324. Kant, I. (1787), The Critique of Pure Reason, translated by P. Guyer and A. Wood, The Cambridge Edition of the Works of Immanuel Kant, Cambridge 1998. K¨onneker, K. (2001), Aufl¨osung der Natur – Aufl¨osung der Geschichte, J. B. Metzler, Stuttgart. Laugwitz, D. (1960), Differentialgeometrie, B. G. Teubner, Stuttgart. L´evy-Leblond, J. M. (1976), American Journal of Physics 44, 271. Mach, E. (1901), Die Mechanik in ihrer Entwicklung, 4th Edition, F. A. Brockhaus, Leipzig.
Marzke, R. and J. A. Wheeler (1964), “Gravitation as Geometry I: The Geometry of Space-Time and the Geometrodynamical Standard Meter” in: Chiu, H. Y. and W. F. Hoffman (eds.), Gravitation and Relativity, W. A. Benjamin, New York, 40–64.
Mittelstaedt, P. (1970), Klassische Mechanik, 1st ed. (1995), 2nd ed. BI-Wissenschaftsverlag, Mannheim.
Mittelstaedt, P. (1976/89), Der Zeitbegriff in der Physik, BI-Wissenschaftsverlag, Mannheim.
Piron, C. (1976), Foundations of Quantum Physics, W. A. Benjamin, Reading, Massachusetts.
Poincaré, H. (1898), “La Mesure du Temps” in: Revue de Métaphysique et de Morale VI, 1–13.
Popper, K. (1963), Conjectures and Refutations, Routledge & Kegan Paul, London.
Popper, K. (1972), Objective Knowledge: An Evolutionary Approach, Clarendon Press, Oxford.
Scheibe, E. (1997/99), Die Reduktion physikalischer Theorien, Vols. I and II, Springer, Heidelberg.
Schröter, J. and U. Schelb (1992), Reports on Mathematical Physics 31, 5.
Schröter, J. and U. Schelb (1994), “A Space-Time Theory with Rigorous Axiomatics” in: Majer U. and H.-J. Schmidt (eds.), Semantical Aspects of Spacetime Theories, BI-Wissenschaftsverlag, Mannheim.
Solèr, M.-P. (1995), “Characterisation of Hilbert Spaces by Orthomodular Lattices” in: Communications in Algebra 23 (1), 219–243.
Sudarshan, E. C. G. and N. Mukunda (1974), Classical Dynamics: A Modern Perspective, John Wiley & Sons, New York.
Weingartner, P. (2000), Basic Questions on Truth, Kluwer Academic Publishers, Dordrecht.
FUNCTIONS OF INTUITION IN QUANTUM PHYSICS
Brigitte Falkenburg
Universität Dortmund, Germany
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 267–292. © 2006 Springer. Printed in the Netherlands.

According to Kant, intuition is a non-logical mental form of representation, one that serves partly to idealize conceptual contents and partly to render them concrete. In performing this task, intuition fulfills the semantic function of giving sense and significance to formal concepts. With reference to Kant, Bohr and Heisenberg demand that abstract quantum theory be interpreted intuitively by means of complementary classical descriptions of phenomena. This interpretation rests on Bohr’s correspondence principle, which gives sense and significance to abstract quantum theory. Since today’s physics does not have consistent axiomatic foundations, Bohr’s principle is required as a bridge principle, in order to give a physical interpretation of quantum mechanical states with recourse to classical magnitudes. In the case of the Feynman diagrams of quantum field theory, by contrast, intuition merely has the function of making a complex theoretical calculus comprehensible by means of iconic representation.

1. Pure intuition as form of representation

The foundations of Kant’s theory of intuition are not restricted to a Euclidean structure of space-time. They are indeed very general. For Kant, pure intuition is a form of mental representation, according to which our consciousness arranges mental phenomena.1 This form of representation is non-conceptual. It produces singular rather than general representations. These singular representations are either coexistent or successive. In the first case, the intuition is spatial, while in the second, it is temporal. The characteristics, according to which Kant distinguishes space and time as pure forms of intuition from concepts, first appear in the Dissertation of 1770. For the identification of the representation of space or of time with a pure intuition, two characteristics are necessary and sufficient:2 Space and time are
(1) a priori: they are not obtained through abstraction, but are prerequisite to all impressions of sense or sensations; and
(2) singular: they are unlimited individual representations, the limitations of which are not contained under, but within them.
As singular representations a priori, space and time are pure intuitions. According to (1), the representation of space or of time is not an empirical concept, while according to (2), it is also not a singular concept in the Leibnizian sense of a complete concept or in the modern sense of a definite description, but rather a boundless singular representation that contains limited individual representations within it. The remarks about the nature of space and time in the Dissertation run parallel to the Transcendental Aesthetic of the CPR and also correspond to the division of the types of representation in the lectures on logic from 1772 onward.3 All of these passages show that, from 1770, intuition is for Kant an individuating faculty, which, in contrast to the understanding, does not furnish general, but rather individual representations. The representation of space, being of primary interest in this regard, is further characterized in 1770 as follows:4
(3) it is pure intuition, in which the axioms, postulates and problems of geometry can be represented in concreto, as can the three dimensions of space as well as the spatial orientation of right-left-asymmetric bodies such as a screw;
(4) it is not objective and real, but subjective and ideal: a schema of the coordination or correlation of sense perceptions;
(5) it is the formal condition of the possibility of our sense perceptions; and hence it is the absolutely first formal ground of empirical truth.
According to (3), the representation of space is a medium of the representation of geometrical figures and properties in concreto.
According to (4), it is neither an object nor a property nor a relation — in Kant’s words: “neither a substance nor an accident nor a relation”5 —, but rather an epistemic faculty to produce the order of the coexistence of correlated representational contents. A comparison with the later parallel passages makes it clear that (1)–(2) are already sufficient for characterizing the representation of space and time in accordance with (3) as pure intuitions and that (4) and (5) are likewise not independent of (1) and (2).6 Kant’s concept of intuition must be generalized, if it is to become fruitful for modern philosophy of science. As Neo-Kantianism already stressed, Kant’s concept must be detached from a theory of space and time bound to Euclidean geometry and a finitistic arithmetic. Cassirer proposed to generalize the metric
magnitudes that lie at the foundation of physical geometry, while the early Reichenbach advocated the relativization of Kant’s a priori.7 In the empiricist philosophy of science of the 20th century, however, a logicist interpretation prevailed, according to which intuition is a now superfluous surrogate for certain formal achievements of modern logic, in particular for the quantification over infinite domains of individuals, which were not yet available to traditional logic.8 Kant, however, teaches that intuition performs certain epistemic and semantic tasks that cannot be performed by formal logic and the axiomatic method of the exact sciences. In order to bring out this point, one must ask, on the basis of (1)–(2), what specific functions Kant ascribes to intuition as a nonconceptual or non-logical form of mental representation in the exact sciences. In this way, one will obtain a concept of intuition that is sufficiently general to overcome the limitations of Kant’s theory of nature and sufficiently specific to capture certain non-logical aspects of theory formation in the exact sciences. According to (1), pure intuition is a form of representation a priori. This is an epistemic feature, which intuition shares with the pure concepts of the understanding.9 It contains two things. (1.1) Intuition is formal rather than material; thus, it belongs to the area of logic and mathematics. (1.2) The mental faculty, on which it rests, does not consist in abstraction, but rather serves construction. For Kant, the distinguishing features (2) for intuitions and concepts mark the boundary between mathematics and logic. (2.1) The primary feature of a non-logical form of representation is the fact that it is not general, but singular. (2.2) As a singular form of representation that is unlimited, intuition does not contain its limitations under itself, but within itself. 
As a non-logical form of representation, intuition has altogether four characteristic features: it is
(1) a priori: (1.1) formal, (1.2) serves construction;
(2) concrete: (2.1) singular, (2.2) contains its limitations within itself.
According to (2.2), intuition contains a multiplicity or manifold within itself. Thus, it is the basis for quantitative concepts. At the same time, since Kant regards sets as concrete multiplicities, intuition fulfills a similar role in his theory of nature as does the concept of a set in modern mathematics. Caution is called for, however, in this comparison. The relationship between the spatial or temporal intuition and its limitations, i.e. parts of space and periods of time, is, according to Kant, not an abstract relation of membership of elements, but rather a part-whole relation between concrete entities.10 For this reason alone, it is problematic to regard the function of intuition in Kant as an inadequate
surrogate for achievements of modern mathematical logic. Rather, intuition has the semantic function of furnishing a concrete, still unstructured manifold, from which geometrical and arithmetical domains of individuals may be produced through suitable limitations. As a singular a priori form of representation, intuition for Kant has two constructive aspects that are constitutive of the conceptual grasp of objects of cognition. It serves (i) to structure perceptual contents through ordering relations and (ii) to schematize pure concepts of the understanding. The schema of a concept is an instruction on how to generate in intuition the representation of a concrete object to which a concept refers. Through the schematization of concepts, pure intuition thus makes it possible to construct a formal yet concrete domain of individuals, within which logical structures may be modelled formally and interpreted empirically. This is clearly a semantic function. In mathematics, this domain of individuals is obtained through limitations of space and through iteration in time. It consists of geometrical objects as well as of numbers. In physics, the mathematical construction is supplemented by an empirical interpretation. The function of space and time to produce ordering relations of coexistence and succession, on the other hand, seems at first glance to be exclusively of a logical nature. In the Dissertation of 1770, however, Kant makes the semantic aspects of these ordering relations the basis of the argumentation. From his perspective, the relations of coexistence and succession are always already interpreted in spatio-temporal terms. With regard to time, Kant emphasizes that, without the representation of time, one does not know what is earlier and what is later. The direction of the arrow of time is thus regarded as produced through pure intuition, while from a logical point of view, it is mere convention. 
In a similar sense, with regard to space, Kant falls back on his argument of the incongruent counterparts, on the basis of which he demonstrated in 1768 that space and time cannot be reduced to relations of real things.11 The spatial orientation of geometric bodies with a helical sense is for him not a relationally definable property, but rather an absolute property that lies a priori in our spatial intuition of such a body. From a logical point of view, on the other hand, the orientation of a body in three-dimensional Euclidean space is determined only up to the arbitrarily fixed sense of rotation of the coordinate system.12 The distinctions of earlier and later as well as of right-hand and left-hand rotation are intrinsic properties possessed, according to Kant, in concreto by space and time and all spatio-temporal entities.
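The logical point about orientation can be made concrete with a small numerical sketch. It is an illustration only and not part of Kant's or the author's argument: the "body" is represented by three edge vectors, and the helper names `det3`, `mirror_x`, `rotate_z_90` and `orientation` are hypothetical. The sign of the determinant of the three vectors distinguishes a frame from its mirror image, but which of the two receives the positive sign depends on the convention chosen for the coordinate system.

```python
# Illustrative sketch (assumed names, not from the text): orientation of a
# frame in three-dimensional Euclidean space as the sign of a determinant.

def det3(a, b, c):
    """Signed volume spanned by the vectors a, b, c (scalar triple product)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def mirror_x(v):
    """Reflect a vector in the y-z plane (produces the 'counterpart')."""
    return (-v[0], v[1], v[2])

def rotate_z_90(v):
    """Rotate a vector by 90 degrees about the z-axis (a rigid motion)."""
    return (-v[1], v[0], v[2])

def orientation(frame, convention=+1):
    """Sign of the frame's signed volume, relative to a chosen convention.

    convention=+1 and convention=-1 stand for the two possible choices of
    the sense of rotation of the coordinate system.
    """
    sign = 1 if det3(*frame) > 0 else -1
    return convention * sign

body = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
counterpart = tuple(mirror_x(v) for v in body)
rotated = tuple(rotate_z_90(v) for v in body)

# A rigid rotation preserves the sign; a reflection reverses it.
print(orientation(body), orientation(rotated), orientation(counterpart))
# Switching the convention swaps the labels but not the difference.
print(orientation(body, -1), orientation(counterpart, -1))
```

Under either convention the body and its counterpart receive opposite signs. Only the assignment of "right" and "left" to the two signs is conventional, which is the sense in which the orientation is logically fixed only up to the choice of coordinate system.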
2. Functions of intuition
The general concept of intuition provided above is obviously not subject to the limitations of the theory of space and time defended by Kant, a theory that was bound to Euclidean geometry as well as Newtonian physics. Here, intuition is only defined as a non-conceptual form of representation performing
certain non-logical tasks of representation. Within the framework of modern semantic approaches, I now want to work out what this means. For this purpose, I shall first compare Kant’s distinction of intuitions and concepts, or of intuitive and discursive representations, with semiotic distinctions made by Peirce. Subsequently, I shall map the semantic functions ascribed by Kant to intuition onto the modern semantic categories of sense and reference or of intension and extension, as they derive from Frege and Carnap. In his theory of signs, Peirce distinguishes between icon, index and symbol.13 An icon is a sign that represents the signified by virtue of its concrete properties. It serves in the pictorial or figurative representation of objects of a certain type. In an idealizing or typifying manner, an icon depicts certain structural traits of the represented object in a concrete diagram or model. An index is a sign that stands in a causal relation to what is signified. A symbol, by contrast, is a sign that represents the signified according to an abstract idea. It denotes on the basis of an arbitrary assignment to objects of a certain type, which structurally do not have to resemble the sign in any respect. A symbol serves merely to label an abstract class of objects. This distinction between icon, index and symbol corresponds to Kant’s distinction between pure intuition, empirical intuition and concept. A sign that is an icon provides intuitive representation by means of the form of a picture; a sign that is an index refers to an empirical content of intuition; a sign that is a symbol provides logical representation by means of a concept. Iconic and symbolic representation are formal, and this means typifying. An index sign, by contrast, refers to the material of intuition. It has reference in empirical reality. 
According to Kant, pure intuition makes possible a formal yet concrete representation of objects through singular representations that depict certain traits of these objects in an idealizing manner. The formal yet concrete representation of an object in intuition is iconic. Conversely, the understanding performs the subsumption of singular representation under abstract concepts denoting classes of objects. The representation of an object through a concept is symbolic. Thus, pure intuition serves iconic representation and stands in contrast to symbolic representation through concepts. The specifically non-logical function of pure intuition as an iconic form of representation in turn has two aspects. On the one hand, pure intuition serves to typify, on the other hand to concretize. In accordance with the first aspect, Parsons regards intuition as a faculty of “type perception”.14 This definition not only takes up the idealizing character of iconic representation, but also the receptive character that intuition has as a non-discursive cognitive faculty. Following Parsons and my preceding analysis of Kant’s concept of intuition, one could also say: intuition is partly a receptive faculty of the perception of structure and partly a constructive power of the formal generation of structures in concreto. As “type perception”, it serves in idealization. As a constructive power, it fulfills the semantic function of lending concrete meaning to abstract concepts.
A form of representation that performs a typifying, yet figurative representation of objects and thus gives concrete meaning to abstract concepts, can obviously serve in model formation in the exact sciences. Indeed, starting from Kant’s theory of nature, it can be shown that this is a decisive semantic function of intuition in the modern sciences. Kant himself distinguishes three stages in the process of rendering a concept concrete: the schematism of the pure concepts of understanding, the principles of the pure understanding and the foundations of physics in the Metaphysical Principles of Natural Science. At the first stage, he formulates for each category a temporal schema, i.e. a rule for generating concrete instances of the categories in space and time. At the second stage, he provides general laws of nature, according to which mathematics can be applied to natural phenomena, that is, according to the principles that intuitions are extensive magnitudes and perceptual contents intensive magnitudes. At the third stage, we are given the construction of an example in concreto, which shows that the general laws of nature obtained from the schematized categories are not empty. Only at this stage do we have a concrete model of the general theory of nature with a spatio-temporal and empirical interpretation.15 In this process of concretization, according to Kant, intuition has the role of giving sense and significance to the pure concepts and principles of the understanding. It must be noted that Kant uses the terms sense (Sinn) and significance (Bedeutung) quite indiscriminately.16 When speaking of Sinn and Bedeutung, one should, following Frege’s and Carnap’s semantics, distinguish between sense (intension) and significance or better reference (extension).17 Sense, according to Frege, is the way in which something is given, for example, through a certain way of denoting an object, while reference lies in objects, ranges of functions or truth values. 
In Kant, sense and significance may be distinguished in this way at least to an extent. Sense and reference correspond to the traditional distinction between the content of a concept and its extension. The content of a concept consists traditionally of the predicates which define it, while the extension is the class of objects falling under these predicates. Kant’s requirement of filling the abstract concepts of the understanding with sense and significance does not respect this distinction explicitly. However, Kant presumably associates the “sense” of a concept literally with its sensible conceptual content which is given as an empirical intuition, while he explicitly defines its “significance” in terms of the relation to an object.18

Kant’s stages of making concepts concrete lead to the following distinction of sense and significance, or reference.

(1) The sense of formal concepts is constituted by schematizing the categories in time, and by identifying intuitions and perceptual contents with extensive or intensive magnitudes.

(1.1) The schematization of a concept aims at the way in which a conceptual content is given in space and time. Through the associated schema, an abstract concept is interpreted in spatio-temporal terms, e.g., ‘substance’ in the sense of something persisting in space and time. This also brings the above-mentioned non-eliminable intensional aspects of spatial orientation as well as the temporal order of ‘earlier’ and ‘later’ into play. The spatial order and the orientation of coexisting things and the temporal order of successive events are the ways in which appearances, which we want to grasp under concepts, are given to us in space and time.

(1.2) To identify intuitions with extensive magnitudes and perceptual contents with intensive magnitudes, on the other hand, serves to mathematize the appearances. In this way metric concepts are established. Extensive magnitudes, according to Kant, are magnitudes where the representation of the parts precedes the representation of the whole.19 They are quantitative representations constructed successively in space and time, and they make it possible to apply geometry and arithmetic to natural phenomena. Intensive magnitudes in turn have degrees. They can be compared with one another on the basis of experience and they are quantified in analogy to extensive magnitudes.20 Extensive and intensive magnitudes make it possible to measure the phenomena.

(2) The reference of the concepts and principles of the pure understanding is finally guaranteed by the construction of a spatio-temporal and empirical model. Only at this stage of making the transcendental concepts and principles concrete, according to Kant’s theory of nature, can we speak of a concrete domain of individuals, to which the pure concepts of the understanding refer. Here too, one can in principle distinguish two stages: (2.1) the formal modelling of individuals in space and time and (2.2) their empirical interpretation by means of an empirical concept of matter.

Pure intuition thus fulfills three semantic functions, according to Kant. It serves in the spatio-temporal interpretation of formal concepts, the quantification of phenomena and the individuation of objects of cognition. The third function also has an empirical aspect, the empirical interpretation of the individuals constituted in space and time.
From a modern perspective, these functions may be expressed by means of the Frege-Carnapian distinctions between sense (intension) and reference (extension) as follows. The first function aims at the way in which conceptual contents are given in space and time, that is, at Fregean sense. The second function concerns the operational and axiomatic sense of metric concepts. According to the empiricist theory of measurement, through the use of extensive and intensive magnitudes, an axiomatic structure with an operational basis is superimposed on the phenomena, which serves to represent quantitative relations.21 According to Hilbert and Bernays, an axiomatic structure defines metric concepts implicitly (and not, as Frege requires, explicitly).22 To establish an extensive or intensive magnitude operationally by means of a measurement, according to Carnap, is to provide a pragmatic anchoring of the intension of a concept.23 And finally, to individuate conceptual contents through the construction of a spatio-temporal and empirical model aims at reference or extension. Thus, according to Kant, intuition constitutes
(1) sense (intension) through:
(1.1) schematization: the spatio-temporal interpretation of formal concepts,
(1.2) quantification: the metric properties of extensive and intensive magnitudes;
(2) reference (extension) through:
individuation: construction of examples in concreto, i.e. of
(2.1) spatio-temporal models with
(2.2) empirical interpretation.

In the following, I shall demonstrate that formal methods cannot completely shoulder these semantic functions of intuition in modern physics, even when they are supplemented by empirical rules of correspondence. To this day, one comes up against persistent limits of logic and the axiomatic method at the interface of classical physics and quantum physics. These limits are revealed in particular in the physical interpretation of formal concepts, and they are due to the current pluralism of theories.
3. Intuitiveness according to Bohr and Heisenberg
Today’s physics at the large scale and at the small scale cannot be reduced to a unified axiomatic basis. The physical content of fundamental theories therefore rests to a large extent on bridge principles that do not function in the sense of empirical correspondence rules.24 In order to be able to apply the formal concepts of a quantum theory to experimental phenomena, one must interpret them in terms of physical magnitudes — and these generally derive from classical physics, even though classical and quantum concepts are incompatible, or incommensurable in the sense of Thomas S. Kuhn. Bohr and Heisenberg in particular devoted themselves to this problem of giving a physical interpretation to quantum theory. In their endeavor, they fell back on a concept of intuitiveness, which, while it is partly modelled on Kant, also tries to weaken the semantic functions of intuition by contrast with Kant. Their basis was Bohr’s correspondence principle, which works as an indispensable semantic bridge principle in quantum physics. If one refers to classical physics as intuitive and quantum physics as nonintuitive, one usually means the following. We are able to represent the objects of classical mechanics (say, billiard balls or planets) as objects in space and time with well-defined properties, but this is not the case with the intended objects of reference of quantum mechanics, i.e. subatomic constituents of matter such as electrons or protons. “Intuitiveness” means something like “concrete representability of objects in space and time”, i.e. what is meant are the referential aspects of intuition in Kant’s sense. Intuitiveness in this context implies in particular that a classical particle is reidentifiable in terms of its space-time trajectory. Accordingly, “non-intuitiveness” means that the quantum descriptions of systems reveal a referential deficiency, since they are incompatible with the existence of a classical space-time trajectory.
The founders of the Copenhagen interpretation of quantum theory regard the system descriptions of quantum mechanics in yet another respect as non-intuitive. According to Bohr and Heisenberg, quantum theory requires an interpretation in terms of intuitive classical concepts such as ‘position’ and ‘momentum’. Bohr in particular holds to the idea that the language of classical physics is needed, in order to make a quantum theory apply to quantum phenomena. Abstract quantum mechanics in Hilbert space is for Bohr an abstract and symbolic formalism without sense and significance (in a similar way as, for Kant, the pure, non-schematized categories are). This is why Bohr and Heisenberg strove for the “intuitive” interpretation of abstract quantum mechanics. The underlying concept of intuitiveness follows the tradition of Kant, Helmholtz, Hertz and Mach. Bohr attaches intuitiveness to the existence of spatio-temporal pictures,25 while Heisenberg regards it as tied to the qualitative understanding of the experimental consequences of a theory.26 Both agree that the failure of spatio-temporal pictures or intuitive referential hypotheses in the domain of quantum theory must not lead to a renunciation of every intuitive interpretation of quantum physics. According to Heisenberg, the intuitive classical concepts must be operationally reinterpreted in quantum physics, something that applies equally to the concept of a trajectory as it does to the concepts of position and momentum. This corresponds to the common operational interpretation of quantum mechanics, the probabilistic interpretation of Born and von Neumann, which (in the limit of infinitely many measurements) assigns the relative frequency of the results of position measurements to the squared amplitude of the wave function $|\Psi(r)|^2$,27 or the relative frequency of individual measurement results to the quantum mechanical expectation value of a magnitude.28 The quantum mechanical wave function itself remains uninterpreted.
It has no reference. Bohr’s interpretation of complementarity is based on Heisenberg’s operational reinterpretation of the intuitive classical concepts. According to Bohr, in subatomic physics, the classical pictures of a wave or of a particle still have a limited validity at a phenomenological level. They can be used as complementary descriptions of phenomena, which, taken together, provide a “natural generalization” of the classical description.29 From Bohr’s point of view, one must maintain the classical language in such a description, although, due to Heisenberg’s indeterminacy relations, it is not possible to construct objects in a classical sense from subsequent position and momentum measurements.30 Consequently, in the transition to quantum physics, the semantic unity of physics is partially, but not completely lost. It rests, according to Bohr, no longer on the representability of physical objects in intuition, but rather only on the language of classical physics, in which all measurement results of subatomic physics must be expressed. The classical language describes quantum phenomena by means of complementary concepts that mutually exclude and complement each other in the application to quantum phenomena.
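The operational reading of the probabilistic interpretation mentioned above can be illustrated with a toy simulation. The sketch below rests on assumptions that are not taken from the text: the four-cell "position" grid, the amplitudes in `psi` and the function names are invented for the example. It shows only that simulated relative frequencies of position results approach the squared amplitudes as the number of measurements grows, while the complex phases of the amplitudes never appear in the record of results.

```python
# Toy illustration (hypothetical names and numbers): relative frequencies of
# simulated position measurements versus squared amplitudes |psi|^2.
import random
from collections import Counter

def born_probabilities(psi):
    """Normalised squared moduli of a list of complex amplitudes."""
    weights = [abs(a) ** 2 for a in psi]
    total = sum(weights)
    return [w / total for w in weights]

def measured_frequencies(psi, n_measurements, seed=0):
    """Relative frequency of each position cell in n simulated measurements."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    probs = born_probabilities(psi)
    cells = list(range(len(psi)))
    results = rng.choices(cells, weights=probs, k=n_measurements)
    counts = Counter(results)
    return [counts[c] / n_measurements for c in cells]

# Assumed amplitudes on four position cells; the phases differ on purpose.
psi = [1, 1j, -2, 0]

probs = born_probabilities(psi)
for n in (100, 10_000, 100_000):
    freqs = measured_frequencies(psi, n)
    worst = max(abs(f - p) for f, p in zip(freqs, probs))
    print(n, [round(f, 3) for f in freqs], "max deviation:", round(worst, 4))
```

Replacing `psi` by amplitudes with the same moduli but different phases leaves `born_probabilities` unchanged, which is one way of picturing the claim that, on this operational reading, only the measurement statistics and not the wave function itself receive an interpretation.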
Bohr’s complementarity interpretation of quantum mechanics has an extensional and an intensional aspect. The language of classical physics comprises two types of terms: referential expressions such as “particle” and “wave” and concepts of physical magnitudes such as position and momentum. The former constitute reference; they are extensional, insofar as they refer to concrete objects in space and time. The latter constitute sense; they are intensional, insofar as they stand for physical properties. Both types of classical concepts fail when applied to fictitious quantum objects, but they may in an operational interpretation still be applied to quantum phenomena. The referential expressions “particle” and “wave” denote complementary quantum phenomena with corresponding incompatible classical models. Bohr and Heisenberg speak of classical “pictures”. According to Bohr, the magnitudes of position and momentum serve in the spatio-temporal or causal description of subatomic processes. He regards both descriptions likewise as complementary. The spatio-temporal description is given in terms of position and time, the causal description in terms of momentum and energy. At the level of quantum phenomena, an intuitive description of nature can be saved in this manner. In an operational reinterpretation, the classical concepts give intuitive sense and intuitive reference to the formal concepts of quantum theory. Compared to classical physics, however, the semantic functions of intuition are drastically limited in two respects. They can now only be exercised in a complementary way, and they are no longer reconcilable with the reference to unique causes of the phenomena. The functions providing sense and reference, however, as Kant had ascribed to intuition, essentially reappear at the level of quantum phenomena. 
Aside from the spatio-temporal description, Bohr also regards the causal description given in terms of momentum and energy as one of the “intuitive” complementary descriptions of phenomena. To interpret quantum mechanics in terms of intuitive complementary descriptions of the phenomena establishes
(1) sense (intension) through:
(1.1) operational concepts (Heisenberg): operational meaning of physical magnitudes such as position and momentum;
(1.2) complementary schemas (Bohr): either spatio-temporal description of quantum phenomena in terms of position and time, or causal description in terms of momentum and energy;
(2) and reference (extension) through:
individuality (in Bohr’s sense):
(2.1) application of the classical particle or wave picture to
(2.2) complementary quantum phenomena, which Bohr calls “individual”.31

Here, sense and reference are established at the level of quantum phenomena to interpret abstract quantum theory in terms of complementary concepts that stem from classical physics. Through the application of the classical language to quantum phenomena, a partial semantic unity of physics is thus achieved, but it is obtained at the price of referential disunity. This disunity shows itself horizontally at the level of quantum phenomena as well as vertically in the relation between classical and quantum mechanical system descriptions.32 The complementary classical pictures of wave and particle are incompatible for one and the same quantum system under given experimental conditions. The intuitive referential hypotheses of classical physics cannot be applied to the description of quantum systems.

The background of the complementarity interpretation is Bohr’s correspondence principle, which, prior to the quantum mechanics of 1926, in old quantum theory, served in the application of classical concepts to the subatomic domain. It was a central principle of model formation in old quantum theory. For quantum mechanics, Heisenberg in 1930 gave a generalized version. His generalized correspondence principle refers to the inter-theoretical relations holding between the classical theories and the quantum mechanics of 1926:

In its most general version, Bohr’s correspondence principle states that between quantum theory and the classical theory belonging to the respective picture employed, there exists a qualitative analogy that can be carried out in detail. This analogy not only serves as a signpost for finding formal laws, rather, its particular value lies in the fact that it provides at the same time the physical interpretation of the discovered laws.33
According to this, the concepts of classical physics stand in a precise correspondence in terms of mathematical structure and physical content to the concepts of quantum mechanics. On the one hand, this correspondence has a heuristic function in the search for the new theory. With the help of quantization rules, one is able to construct quantum mechanical models and laws from a classical theory, where the statements of the classical and the quantum theory stand in well-defined relations of approximation. On the other hand, the generalized correspondence principle has a semantic function in providing the “physical interpretation” of an abstract quantum theory and its models. The principle states that the abstract quantum theory obtained through the quantization of a classical theory may be interpreted at least partially in terms of the familiar classical concepts. The “qualitative analogy” between quantum mechanical and classical descriptions ensures that quantum mechanics is intuitive in the sense required by Heisenberg, i.e. that its experimental consequences can be understood on an operational basis. Conventional wisdom among many physicists has it that the generalized correspondence principle is merely a heuristic rule for the lead up to a quantum
theory, which one can then throw away like a ladder that has had its day, as soon as one has obtained the formalism of the quantized theory. The analysis of prominent applications of quantum mechanics shows, however, that it is needed to the present day as a bridge principle, in order to obtain concrete physical system descriptions from abstract quantum mechanics.
4. Semantic bridging functions of the correspondence principle
Heisenberg ascribes to the generalized correspondence principle the function of giving a physical interpretation of abstract quantum mechanics. This means in particular that the formalism is interpreted in such a way that it can be applied against the background of classical physics and on the quasiclassical conditions, under which the experiments of atomic, nuclear and particle physics are performed. The principle is a bridge principle connecting quantum mechanics (and more recent quantum theories such as quantum field theory) with the classical theories. Unlike a rule of correspondence in the sense of empiricist philosophy of science, the principle does not serve to assign concepts of observation to theoretical concepts. Rather, it creates quantitative and qualitative relations between two theories. It establishes quantitative relations of approximation between certain statements and models of a classical and a quantized theory. In so doing, at the same time it links the language of classical physics with the language of a quantum theory. In the following, I want to show that this bridge principle does indeed constitute sense and reference — and the latter not only with respect to quantum phenomena, but also with respect to their non-observable causes, i.e. atoms and their constituent parts. Its semantic strength turns out to be greater than one might at first suspect given the usual operational understanding of the Copenhagen interpretation of quantum mechanics. Neither the physical sense of the magnitudes entering into the laws of a quantum theory nor the quasi-classical conditions of application of a quantum theory of subatomic structure can be obtained from abstract quantum theory. In this respect, abstract quantum theory is without sense and reference, just as Bohr believed. Only the generalized correspondence principle makes its physical interpretation possible. In so doing, the principle fulfills several semantic bridging functions. 
In particular, it is constitutive of
(1) sense (intension) through
(1.1) giving a quasi-classical interpretation of the observables: The generalized correspondence principle establishes formal and qualitative analogies between quantum concepts and classical observables.
(1.2) standardizing the metric framework: It enables us to construct standardized scales of physical magnitudes, i.e. for length, time and mass.
(2) reference (extension) through individuation on classical conditions:
(2.1) It supports semi-classical models of bound quantum mechanical systems within an atomic lattice.
(2.2) It enables us to explain scattering experiments for the measurement of subatomic structures.
Starting from Bohr’s original version of the correspondence principle, I now want to analyze these semantic functions more precisely. Here, we shall have to examine to what extent these are semantic achievements of intuition in Kant’s sense.
4.1 Quasi-classical interpretation of the observables
Bohr’s original version of the correspondence principle referred to the frequencies of atomic radiative transitions.34 According to the quantization rule introduced ad hoc through Bohr’s atomic model of 1913, the electrons in the atomic shell are ‘forbidden’ from releasing energy by radiation like classical charges. Otherwise, they would crash into the atomic nucleus within a very short time. According to Bohr’s quantum postulates, atoms only emit radiation, when the electrons jump from a quantized energy level to another level with lower energy. The quantized radiative transitions violate classical radiation theory. The correspondence principle now establishes an analogy between classical and quantized radiation. This analogy refers to the frequency of the light that atoms radiate. It states that for the limiting case of high energy levels, the light frequency in an atomic radiative transition corresponds to the radiative frequency of a classically charged particle orbiting around a charge center. In 1920, Bohr formulated the correspondence principle as follows: . . . there is found to exist a far-reaching correspondence between the various types of possible transitions between the stationary states on the one hand and the various harmonic components of the motion on the other hand. This correspondence is of such a nature, that the present theory of spectra is in a certain sense to be regarded as a rational generalization of the ordinary theory of radiation.35
The correspondence ensures that the quantum theory of radiation can in a certain sense be joined seamlessly to classical electrodynamics. It permits the application of the concepts of classical radiation theory to atomic spectra through analogization. For Bohr to be able to regard the theory of the atomic spectra in this sense as a rational generalization of the classical theory of radiation, the analogy between classical and quantized radiation must have two aspects, a formal aspect and a qualitative aspect. The former concerns formal expressions, which correspond approximately, while the latter concerns the qualitative interpretation of these expressions in terms of the same kind of magnitude.
Formal analogy: The analogy consists in a quantitative relation of approximation between the quantum and the classical law for electromagnetic radiation. It states under which formal conditions (n → ∞, ∆ν → 0) the quantized frequencies of radiation of subatomic physics pass over approximately into the classical frequencies.
Qualitative interpretation: The analogy fills a quantum concept, the formal expression ∆E = hν for the radiation frequencies of an atom, with classical physical sense. It states that the discontinuous radiation frequencies (and light wavelengths) from the line spectrum of an atom represent the same type of physical property as the continuum of the radiation frequencies (or electromagnetic wavelengths) of oscillating classical charges. Only both aspects taken together make it possible to relate the frequencies (and wavelengths) across the boundary between quantum theory and classical electrodynamics to a homogeneous class of physical properties, and conversely allow the non-optical spectrum of electromagnetic microwave and longwave radiation to be regarded as an extension of the optical spectrum of light emitted by atoms.
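The formal analogy can be made concrete with a small numerical sketch. It is an illustration and not part of the original argument: it assumes the Bohr model of hydrogen, compares the frequency of the light emitted in the jump n → n−1 with the revolution frequency of a classical charge on the n-th Bohr orbit, and shows the ratio approaching 1 as n grows. The function names are hypothetical, and taking the n-th rather than the (n−1)-th orbit as the classical comparison is a choice that does not affect the limit.

```python
# Illustrative sketch (an assumption, not from the source text):
# Bohr-model hydrogen in the limit of high energy levels.
RYDBERG_FREQUENCY_HZ = 3.2898419603e15  # Rydberg constant times c, in Hz


def quantum_transition_frequency(n: int) -> float:
    """Frequency of the light emitted in the quantum jump n -> n-1."""
    if n < 2:
        raise ValueError("the upper level must satisfy n >= 2")
    return RYDBERG_FREQUENCY_HZ * (1.0 / (n - 1) ** 2 - 1.0 / n ** 2)


def classical_orbital_frequency(n: int) -> float:
    """Revolution frequency of a classical charge on the n-th Bohr orbit."""
    return 2.0 * RYDBERG_FREQUENCY_HZ / n ** 3


def frequency_ratio(n: int) -> float:
    """Quantum transition frequency divided by the classical orbital frequency."""
    return quantum_transition_frequency(n) / classical_orbital_frequency(n)


if __name__ == "__main__":
    # The ratio tends to 1 for large n (the formal condition n -> infinity).
    for n in (2, 10, 100, 1000):
        print(n, frequency_ratio(n))
```

For low levels the two frequencies differ markedly (the ratio is 3 for n = 2), which is why the analogy is a relation of approximation that holds only in the limiting case.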
Bohr’s correspondence principle is not an empirical rule of correspondence, which assigns the empirical concept of a line spectrum to the formal law of radiation ∆E = hν that associates the electromagnetic wave emitted by an atom with the quantum jumps between the permissible electron orbits.36 It does not tie together the concepts of a theoretical language and an observational language, but rather establishes a quantitative and qualitative relation between classical radiation theory and quantum theory that allows for the continued use of the classical concepts of frequency and wavelength in the atomic model of Bohr and Sommerfeld. Even if the correspondence principle had the status of an internal principle in old quantum theory, it rests on an inter-theoretical relation. Its basis is a bridge principle that has a heuristic function in theory formation on the one hand, while it allows for the physical interpretation of the models of old quantum theory on the other. In 1930, Heisenberg generalizes the correspondence principle in such a way that the formal and the qualitative analogies between classical and quantized radiation frequency may be extended to all physical magnitudes.37 The generalized correspondence principle is thus a semantic principle of continuity that
ensures that the predicates for physical properties such as position, momentum, mass, energy etc. can also be defined in the domain of quantum mechanics, and that one may interpret them in an operational way on the basis of classical measurements. It provides a great number of inter-theoretical relations, by means of which the formal concepts and models of quantum mechanics can be filled with intuitive physical sense. This physical sense in turn has two aspects: one formal and one qualitative. As in Bohr’s original version, the formal sense lies in approximation relations between classical and quantum laws that define formal concepts. The sense, which a formal concept obtains through an implicit definition, is axiomatic. The sense, which a quantum-theoretical concept obtains through the formal correspondence with a classical concept, however, cannot be axiomatic, if both concepts have incompatible axiomatic foundations. In the framework of a formal approach to theory reduction, of course, classical and quantum concepts may be partially joined through a standard method of observable identification that confirms correspondence.38 The qualitative sense of quantum concepts provided by the correspondence principle is determined operationally, according to Bohr and Heisenberg. It lies in the manner, in which the corresponding magnitudes are measured with a classical (or quasi-classical) measurement method, that is, in the manner, in which the intension of a concept is pragmatically anchored, according to Carnap. Bohr and Heisenberg have called this qualitative sense “intuitive”. As an operational meaning of physical magnitudes, it in the end results, according to Kant, from the way in which extensive and intensive magnitudes are generated in pure intuition. It is true that one does not need Kant’s axioms of intuition and anticipations of perception within the framework of physics in order to establish the magnitudes of a quantum theory operationally. 
The qualitative sense of these magnitudes, however, is not exhausted by this operational meaning. It also results from a unified theory of measurement that in turn is based on the generalized correspondence principle.
4.2 Standardization of the metric frame
In order to determine the physical properties of systems of various orders of magnitude, one defines in physics and technology standardized scales of physical magnitudes, which extend beyond the limits of validity of all current physical theories. The operational meaning of the definition of these scales always lies in a chain of measurement methods, the empirical domains of which overlap, but which rest on different, in general incompatible axiomatic theories. The physical meaning of the scales does not exhaust itself in establishing the
magnitudes operationally through particular measurement methods, but rather lies in joining the possible results of all measurement methods into one unified physical class of properties. The dimensions of ‘length’, ‘time’ or ‘mass’, of course, are fundamental for physics on the large as well as on the small scale. In the subatomic domain, the scales extend to quantum systems, even though the associated measurement methods are in general classical or quasi-classical — for example, if one measures the wavelength of the light emitted from an atom by means of an interferometer or the mass of an atomic nucleus by means of a mass spectrograph. The application of such quasi-classical measurement methods in turn rests on the generalized correspondence principle discussed above, for the measured system as well as the measurement process are not classical. Conversely, one measures periods of time in the order of magnitude of seconds, minutes and hours on a subatomic scale, if one uses an atomic clock for a time measurement. Length, duration and mass are extensive magnitudes in Kant’s sense. In physics, it is assumed that their scales can in principle be constructed from 0 to ∞ and that they can be calibrated through arbitrarily chosen units from any part of the scales. Since current physics, however, does not have consistent axiomatic foundations, the construction of standardized scales of physical magnitudes has neither an axiomatic basis nor is it sufficiently grounded in operational meaning. From the strictly operational standpoint supported by Bridgman,39 every new measurement method strictly speaking establishes a new kind of magnitude. 
The assumption that a mass spectrograph and a balance as well as the angular velocities of stars serve to measure values of the same kind of magnitude ‘mass’, or that by means of an atomic clock, a pendulum or the earth’s rotation one can measure the same periods of time with various degrees of precision, obviously rests on the construction of the respective scales in intuition rather than on a secured operational and axiomatic basis. According to Kant, the scales of length and time rest on the axioms of intuition, while that of the measurement of mass is based on the schema of the persistence of substance.
4.3 Individuation on classical conditions
The generalized correspondence principle, however, does not merely provide physical sense, but also physical reference. It establishes the referential assumptions of atomic, nuclear and particle physics. To the extent to which it serves to make quantum theory concrete in a spatio-temporal model, the individuating function of intuition, according to Kant, is also at play. The referential assumptions of subatomic physics supported by the generalized correspondence principle refer to semi-classical atomic models with local, classical potentials. These models describe atoms in the macroscopic atomic lattice. In these models, the charge structure of a quantum mechanical multiparticle system is given a spatial interpretation. Only once this basis is established is it possible to model the scattering processes of charged particles in matter, on the basis of which Born established the probabilistic interpretation of quantum mechanics in 1926.40
Quantum mechanics in the usual operational interpretation refers to the results of measurements under certain experimental conditions. These experimental conditions are laboratory conditions. They single out the coordinate representation of quantum mechanics over the momentum representation. An assumption regarding locality enters into them a priori, which first makes possible the application of the theory to the concrete events in a physical laboratory. Quantum mechanics itself does not single out the coordinate representation, and thus also furnishes no spatially localized component parts of matter. Taken by itself, it says nothing regarding the structure of matter consisting of molecules, atoms, atomic nuclei and elementary particles in macroscopic atomic unions. It is seldom realized that the operational interpretation according to Bohr, Heisenberg and Born-von Neumann cannot be sufficient for the physical interpretation of many experiments of atomic, nuclear and particle physics. This is particularly true of scattering experiments, by means of which the spatial charge distribution within atoms and atomic nuclei is investigated. Their prototype is classical Rutherford scattering: the scattering of α particles by a thin gold foil, during which unexpected backward scattering occurred. Rutherford interpreted this result in 1911 by means of the model of Coulomb scattering of a classical charge by a pointlike charge center.
In high energy physics, this type of experiment has been conducted for decades. In such experiments, a high energy particle beam from a particle accelerator is directed at high scattering energy onto a block of matter with integrated particle detectors (“fixed target experiment”) or is crossed within a tray of particle detectors with a second particle beam (“collider experiment”). In the 1950s, they investigated the charge structure of atomic nuclei down to protons in this manner and measured form factors for non-pointlike charge structures. In 1968, in the course of electron-nucleon scattering — a new edition, so to speak, of Rutherford scattering — unexpected events were again discovered in the case of large scattering angles as well as further experimental indications for pointlike structures within the nuclear building blocks of protons and neutrons. This was interpreted as experimental confirmation of the so-called quark parton model of nucleons.41 In order to be able to interpret the results of scattering experiments in terms of quantum theory, semi-classical system descriptions of atoms are required,
which are localized in macroscopic atomic lattices. The charge structure of an atom, on which a charged particle is scattered, is then described approximately by means of a central-symmetrical potential V(r). Deviations from Coulomb scattering are expressed by means of a form factor F(q), which equals (in the non-relativistic case) the Fourier transform of a quasi-classical charge density ρ(r), and which is interpreted as the measure for the deviation of the charge structure of a scattering center from the point shape. This description is formally compatible with a quantum mechanical atomic model, but it is not derivable from the latter. The description rests on the generalized correspondence principle and is obtained in several steps.
1. Classical representation of a non-classical probability density. Quantum mechanics describes the electron shell of a complex atom through a totally antisymmetrical many-particle wave function in abstract N-particle Hilbert space. According to the probabilistic interpretation, the square |Ψ|² of this wave function in coordinate representation determines the probability density of the electrons in the atom. |Ψ|² is a quantum mechanical probability density referring to the results of position measurements at the atom and normalized with respect to the number of electron charges in the atomic shell. Since there are no non-demolition measurements of the inner-atomic electron position, the correct operational meaning of |Ψ|² is counterfactual. It means the probability density of the possible measurement results if we were to measure the position of an electron by ionization or other processes that get a single electron out of an atom. Only according to the generalized correspondence principle is |Ψ|² identified with a quasi-classical charge density, the effect of which is described approximately through a classical central field, in the usual approximation methods of physics.
In this manner, one arrives at the description of the charge structure of an atom through a central potential V(r). Its deviation from the Coulomb potential is expressed through a classical charge distribution ρ(r). Here, ρ(r) stands in a relation of formal and qualitative correspondence to the quantum mechanical probability density |Ψ|² for position measurements at the electron shell that would localize individual electrons.
2. Operational interpretation of the scattering amplitude. The quantum mechanics of scattering combines the classical potential V(r) with the description of a scattered particle beam through a quantum mechanical wave function φ(r). The solutions to the Schrödinger equation are determined in Born approximation. The diffracted spherical wave is characterized by the quantum mechanical scattering amplitude f(θ), which is a measure of the probability of the scattering of the particles in the direction θ. The square |f(θ)|² is proportional to the characteristic observational magnitude of the quantum mechanics of scattering, the differential cross section dσ/dΩ. The operational meaning
of this magnitude is based on the usual probabilistic interpretation of quantum mechanics, where dσ/dΩ corresponds to the relative frequency of scattering events of a certain type for a large number of scattered particles.
3. Correspondence to classical Rutherford scattering. For scattering at the Coulomb potential, quantum mechanics of scattering yields the formula of the classical Rutherford cross section both in Born approximation and in exact solution. This correspondence establishes the following correspondence interpretation of the results of scattering experiments. Measured cross sections that deviate from Rutherford scattering are interpreted in the sense of a non-pointlike charge structure. Their experimental deviation from Rutherford scattering is described in Born approximation through a form factor F(q). In the quasi-classical atomic model, F(q) is interpreted as the Fourier transform of a charge distribution ρ(r). Within certain limits, this model and its physical interpretation can be generalized to relativistic quantum mechanics.
The assignment of a measured form factor to a non-pointlike charge structure of the scattering center rests on a chain of intertheoretical relations of correspondence. In the case of the form factor of the proton, this chain extends from classical Rutherford scattering to its quantum mechanical analogue and Mott scattering, all the way to Dirac scattering.42 The generalized correspondence principle is employed in the first and the last step. In both steps, a qualitative analogy in Heisenberg’s sense supports the replacement of a quantum concept by a classical concept. Firstly, the squared many-particle wave function in coordinate representation is given the sense of a spatial charge distribution, which is localized in a macroscopic atomic lattice. Secondly, this charge distribution enters the quantum mechanics of scattering, which is given the usual operational interpretation.
Finally, the measured deviations from the calculated cross section are described through a classical form factor, in correspondence to deviations from classical Rutherford scattering. The physical interpretation of a measured form factor in terms of a charge distribution is supported by the correspondence principle as long as the Born approximation is valid, and as long as no spin and exchange effects come into play in the scattering. Both conditions demand that the scattered “probe” particles and the scattering center are separable. If a scattering experiment is performed with identical particles (i.e. proton-proton-scattering), or if the interaction is so strong that higher orders of perturbation theory can no longer be neglected, this separability condition is violated. In this case, the physical interpretation of the measurement results in terms of form factors becomes model-dependent and is no longer unambiguously possible. As long as the separability condition holds, however, the many-particle wave function of a compound system attains reference beyond the usual operational interpretation of
quantum mechanics. It refers to a non-pointlike scattering center which interacts like a classical charge distribution. The physical interpretation of quantum mechanics of scattering by means of the generalized correspondence principle makes it possible to individuate the subatomic constituents of a macroscopic atomic lattice in terms of localized compound systems, even if it is no longer possible to individuate all of their constituent parts. The interpretation of the quantum mechanical multiparticle wave function in the sense of a spatial charge distribution, which is localized in a macroscopic atomic union and is described through a measurable classical form factor, takes place only via the correspondence principle. This bridge principle establishes the semi-classical models needed for applying quantum mechanics to macroscopic atomic lattices. Only in terms of correspondence, the quantum mechanics of many-particle systems attains reference. Here, the correspondence principle makes abstract quantum mechanics intuitive in a sense that goes beyond Heisenberg’s and Bohr’s operational view of the meaning of subatomic magnitudes. The intuitive interpretation of abstract quantum theory by means of the generalized correspondence principle rests on the individuating function of intuition in Kant’s sense. This function still supports the localization of a bound quantum mechanical many-particle system in the macroscopic atomic lattice, even if it is no longer possible to individuate its individual constituent parts. To this extent, the semi-classical atomic model founded on correspondence still serves in a concrete spatio-temporal representation of the constituent parts of matter, and it makes possible iconic representations of the subatomic charge structure. 
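The role of the form factor in this section can be illustrated with a short numerical sketch. It is not taken from the text and rests on stated assumptions: a spherically symmetric, quasi-classical charge density normalized to unit total charge, the non-relativistic Born approximation, and units in which ħ = 1. The exponential density and all function names are hypothetical examples. For this density the Fourier transform is known in closed form, F(q) = (1 + q²/a²)⁻², so the numerical result can be checked against it.

```python
import math


def exponential_charge_density(r: float, a: float = 1.0) -> float:
    """Hypothetical example density rho(r) = a^3/(8*pi) * exp(-a*r).

    It is normalized so that the total charge (integral over all space) is 1.
    """
    return a ** 3 / (8.0 * math.pi) * math.exp(-a * r)


def form_factor(q: float, rho=exponential_charge_density,
                r_max: float = 40.0, steps: int = 20000) -> float:
    """Fourier transform of a spherically symmetric density.

    F(q) = 4*pi * integral_0^r_max rho(r) * sin(q*r)/(q*r) * r^2 dr,
    evaluated with the trapezoidal rule.
    """
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        x = q * r
        kernel = 1.0 if x == 0.0 else math.sin(x) / x  # sin(x)/x -> 1 as x -> 0
        weight = 0.5 if i in (0, steps) else 1.0       # trapezoid end points
        total += weight * rho(r) * kernel * r * r
    return 4.0 * math.pi * total * h


def cross_section_ratio(q: float) -> float:
    """Measured over point-charge (Rutherford) cross section in Born approximation."""
    return form_factor(q) ** 2


if __name__ == "__main__":
    for q in (0.0, 0.5, 1.0, 2.0):
        print(q, form_factor(q), (1.0 + q * q) ** -2)
```

A pointlike scattering center would give F(q) = 1 for all q; the decrease of F(q) with growing momentum transfer is what the correspondence interpretation reads as a spatially extended charge structure.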
The usual representations of the charge structure of atoms or molecules found in writings of popular science or in schoolbooks are based on semiclassical approximation methods of the quantum mechanics of many-particle systems. The charge structure of molecules, as it is depicted in chemistry textbooks, can also be calculated with semi-classical methods. In the experimental analysis and pictorial representation of the charge structure of individual atoms in a crystal by means of an electron microscope, the correspondence of quantum mechanical probability density and classical charge distribution is also presupposed. And thus, the generalized correspondence principle is also at the basis of the metaphorical language usage, according to which one can ‘see’ the spatial structure of individual atoms within a solid state body by means of an electron microscope, or the inner structure of the atomic nucleus as well as its constituent parts by means of a particle accelerator. ‘Seeing’ here means: depicting a structure that is interpreted, in accordance with the correspondence principle as a spatial structure.43
5. Comprehensibility and iconic representation
The semantic bridging functions of the generalized correspondence principle thus have a larger extension than Bohr’s and Heisenberg’s conception of the “intuitiveness” of physical concepts, and they can be traced back to semantic functions of intuition according to Kant. They (and with them the interpretation of quantum phenomena in complementary classical terms) do not extend nearly as far, however, as Bohr’s complementarity philosophy suggests. But even when one tries to make quantum phenomena without classical correspondence intuitive, one still makes use of a concept of intuitiveness that stands in a Kantian tradition. This concept is the comprehensibility of cognition, which is based on examples in concreto and is achieved through, among other things, iconic representation.
Modern popularizations of quantum theory attempt to make genuinely nonclassical concepts and typical quantum phenomena intuitive, e.g., the concept of spin and the double slit experiment or the Einstein-Podolsky-Rosen correlations. They make flexible use of informal concepts, the origin of which in classical physics is evident. This is especially true of the talk of subatomic particles, even though the particle concept of classical physics is completely inadequate, in particular with regard to the referential objects of a quantum field theory. The expression ‘particle’ denotes, for example, the energy or charge of a quantized field, which are localized in discrete quanta through a particle detector. The expression ‘particle track’ denotes the results of repeated position measurements, and the expression ‘virtual particles’ means contributions, which in principle cannot be isolated experimentally, to the perturbational expansion of an abstract scattering amplitude. The question, to what extent the generalized correspondence principle still supports this language, or to what extent these expressions still have physical sense and reference, must be settled in each particular case.
At the basis of the pictorial quasi-classical language of physicists, there is obviously an understanding of intuitiveness that is substantially weakened compared to Kant’s or Bohr’s requirement that the contents of our physical theories be representable in operational concepts and concrete models. But even this weakened understanding of intuition can be rendered more precise with reference to Kant. In the introduction to his lectures on logic, Kant presented the doctrine of the perfections in cognition by means of which he completed the corresponding doctrine of the Leibniz-Wolffian school according to his own epistemological principles. It comprised a canon of cognitive ideals, which he regarded partly as constitutive of cognition in general and partly as regulative principles for the expansion of cognition.44 This canon is systematized in accordance with the table of categories of the CPR into (i) generality, (ii) distinctness, (iii) truth and (iv) certainty of cognition. Aside from the “logical” ideals of cognition of the rationalist tradition, this systematization also includes “aesthetic” ideals based on Kant’s theory of intuition. These concern the subjective comprehensibility of cognition. According to Kant, concepts, judgments or theories are “aesthetically general”, if they are applicable in paradigmatic cases that are generally accessible or popular. They are “aesthetically distinct”, if there are examples of them in concreto, which are given as concrete representations in intuition. They are (only) “aesthetically true”, if they have mere plausibility which in certain cases can also be deceptive. And finally, they become “aesthetically certain” through sense perception, which in physics must be tied to experimental phenomena. These four aesthetic ideals of cognition require that the concepts and propositions of a theory at least partially refer to something that is familiar, intuitive, plausible and empirically confirmed.
Kant knew well that these ideals most often stand in conflict with the logical ideals of cognition, which demand that a theory describe its objects in a logically complete and adequate way. The non-intuitive models of a quantum theory may be logically true, but they lack aesthetic distinctness. A quantum mechanical many-particle system is not representable in concreto in intuition. The presentation of popular examples such as the double slit experiment serves to promote familiarity with the so-called paradoxical traits of quantum theory, but it achieves the aesthetic generality of cognitions that are non-intuitive or without aesthetic distinctness. The pictorial language of physicists, on the other hand, often merely functions as a surrogate for intuitive concepts and models of classical physics that fail in quantum physics.
Pictorial expressions suggest intuitive objects. Relative to the background of classical ideas, they give the semblance of truth, i.e. they feign the reference to quantum objects as concrete objects in space and time. As was shown in the last section, however, such a reference is only possible under quasi-classical conditions and on the basis of the generalized correspondence principle. This point can also be explained with recourse to Peirce’s theory of signs. Pictorial expressions suggest that their referential objects can be represented iconically. They can be grasped, insofar as they produce the semblance of aesthetic distinctness, without anything having to correspond to them in concreto. Yet, to the extent to which the correspondence principle does not support the appropriate referential assumptions, they stand for mere icons in Peirce’s sense that do not denote anything. They are symbols that stand for pictorial representations. They lack the indexical character of signs with reference and do not refer to empirical reality.
The Feynman diagrams of quantum field theory are a good example of making a theory plausible in a way that is comprehensible in the sense of an iconic representation without any corresponding concrete reference. The formalism of a quantum field theory is not only non-intuitive, but also extremely hard to handle. Feynman diagrams are graphs that precisely denote certain parts of it and thus make it much more manageable. They do not stand for complete descriptions of a physical system or state, but rather for formal expressions that belong to the perturbative expansion of quantum field theory. Their intuitiveness suggests that they represent concrete processes in space and time, even though the contributions to an interaction, which they symbolize, cannot be isolated experimentally. Every Feynman diagram stands for a formal contribution to the perturbative expansion of a transition amplitude, the squared modulus of which provides the probability for a scattering process of elementary particles. The perturbation series as a whole is represented by an infinite sum of Feynman diagrams. In each diagram, the graphical representation suggests spatiotemporal events in which particles scatter off one another, are annihilated into the vacuum and are created from the vacuum. In such a diagram, every particle is represented by an open or a closed line. Every line or loop can be translated, according to precise rules, into an algebraic expression that enters into the calculation of the perturbation series. In this manner, the calculation is facilitated immensely, and the intuitiveness of the iconic representation makes the operations of calculation easily graspable. The incoming and outgoing lines represent the particles scattered off one another and the products of the reaction. These lines have reference in the sense of the usual operational interpretation of a quantum theory. The virtual states of the individual diagrams, on the other hand, have no operational content. The individual Feynman diagrams only serve as instruments of calculation. Whoever interprets them literally, i.e. as a representation of concrete scattering processes, goes astray. The iconic representation here indicates an abstract (axiomatic) sense without operational meaning and without reference supported by way of correspondence.
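The claim that the contributions symbolized by individual diagrams cannot be isolated experimentally can be made explicit with a schematic formula. What follows is only a sketch under stated assumptions: the symbols $g$ (a real coupling constant), $\mathcal{M}_n$ (the summed contribution of all diagrams of order $n$) and $P$ (the transition probability) are notation introduced here for illustration and do not appear in the text.

```latex
\mathcal{M} \;=\; \sum_{n=1}^{\infty} g^{n}\,\mathcal{M}_{n},
\qquad
P \;\propto\; |\mathcal{M}|^{2}
\;=\; \sum_{n} g^{2n}\,|\mathcal{M}_{n}|^{2}
\;+\; \sum_{m \neq n} g^{m+n}\,\mathcal{M}_{m}^{*}\,\mathcal{M}_{n}.
```

Because the measurable quantity is the squared modulus of the whole sum, the cross terms with $m \neq n$ mix the contributions of different orders. On this reading, no single term of the expansion corresponds to a separately observable process.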
Notes
1. CPR, A 20/B 34. In the following, we are always dealing with pure intuition, if it is not indicated that empirical intuitions are meant. I do not always differentiate explicitly between intuition in the sense of a representation and in the sense of a cognitive faculty.
2. Cf. § 14, 1 and 2, Akad. 2.389 f., as well as § 15, A. and B., Akad. 2.402.
3. Cf. Akad. 9.091 with Akad. 2.399 and the “metaphysical discussion” of space and time in the CPR.
4. Cf. § 15, C., D. and E., Akad. 2.403 ff.
5. Akad. 2.403.
6. In the two editions of the CPR, these characteristics surface as “conclusions” drawn from the concepts of space and time; B 42 ff./A 26 ff. as well as B 49 ff./A 33 ff. By contrast with the versions of 1770 and 1781, the second edition of the CPR emphasizes as a new characteristic feature formulated independently of (2) that space and time are given infinite magnitudes.
7. Cassirer (1921), Reichenbach (1921).
8. Friedman (1992), following Russell (1903).
9. Nevertheless, Kant does not consider the representations of space and time as innate ideas, but rather as acquired through experience; cf. Kant (1770), § 8, Akad. 2.395.
10. Kant has a concretist concept of a set, in the sense of a multiplicity (= Menge) and frequently equates a multiplicity with the measured number of a magnitude. Cf., for example, his footnote on the mathematical infinite, Akad. 2.388. For this reason, he does not recognize an abstract relation of membership of elements, but rather distinguishes the parts-whole relationship obtaining between the members of a concrete multiplicity from the relationship between concrete things and logical concepts or abstract classes. If both types of relationship are confounded in the science of nature, one gets caught up in the cosmological antinomy, according to Kant. One becomes guilty of confusing the logical concept of a sum total with a concrete object. Cf. Falkenburg (2000), Chapter 5.
11. Kant (1768). On this issue, cf. pp. 157–180 and Falkenburg (2000), chapter 3.
12. Weyl (1949), end of chapter 14.
13. Cf. Peirce, Elements of Logic, §247 ff.
14. Parsons (1992).
15. CPR B 176 ff./A 137 ff. and B 202 ff./A 162 ff.; Akad. 4.478.
16. Cf. B 299/A 240: “Hence it is also requisite for one to make an abstract concept sensible, i.e. display the object that corresponds to it in intuition, since without this the concept would remain (as one says) without sense, i.e. without significance.”
17. Frege (1891); Carnap (1947). Following Peirce and Morris, modern semiotics distinguishes in a similar way between the interpretant and the object of a sign.
18. Cf. B 300/A 241. The fact that in the passage B 299/A 240 cited above Kant nevertheless does not distinguish between sense and significance is presumably due to the goal and result of making concepts concrete, i.e. the generation of objects of cognition. A concept is only rendered completely concrete, when its object is constructed in intuition in such a way that the concept has sense and significance as the concept of a concrete object. It is then no longer possible to keep its sense and its reference apart.
19. B 203 / A 162.
20. B 207 / A 106 ff.
21. Cf. Krantz et al. (1971).
22. Hilbert (1918), Bernays (1922).
23. Carnap (1947), (1966).
24. On this issue, cf. also Cartwright (1983), (1999).
25. Bohr (1926).
26. Heisenberg (1927).
27. Born (1926a, b).
28. Von Neumann (1932).
29. Bohr (1927).
30. Cf., for example, Bohr (1948), p. 315: “These so-called indeterminacy relations explicitly bear out the limitation of causal analysis, but it is important to recognize that no unambiguous interpretation of such relations can be given in words suited to describe a situation in which physical attributes are objectified in a classical way.”
31. ‘Individuality’ for Bohr means the indivisibility of an integral whole that experimentally cannot be analyzed further. Cf. Bohr (1928). Individual quantum phenomena are observable in an experiment and are thus naturally also individuated in space and time. Their non-observable causes, by contrast, are not capable of individuation.
32. Cf. my analysis of the concept of complementarity in Falkenburg (1998), p. 111 ff., an analysis that takes its starting point from Darrigol (1992).
33. Heisenberg (1930), p. 78.
34. Bohr (1913). For the following, cf. also Falkenburg (1997), (1998).
35. Bohr (1920), BCW 3, p. 248 f.
36. Empiricist philosophy of science has interpreted the principle in this sense; cf. Nagel (1961), p. 94 f.
37. Heisenberg (1930); cf. above footnote 33.
38. Scheibe (1999), p. 174 ff.
39. Bridgman (1927).
40. Born (1926b, c).
41. Cf. Cahn and Goldhaber (1989), p. 217 ff., and Riordan (1987).
42. A detailed analysis is given in Falkenburg (1993). The correspondence interpretation of form factors is possible, as long as no spin and exchange effects come into play in the interaction between scattered particles and scattering center. In the relativistic case, the form factor depends on the four-momentum (q, E). The interpretation of the form factor as the Fourier transform of a charge distribution refers then only to a special frame of reference, the so-called Breit frame, in which the energy transfer of the scattering disappears.
43. Cf. Falkenburg (1995), p. 140 ff.
44. For the following, cf. the Jäsche Logic in: Akad. 9, especially p. 36 ff., as well as the parallel passages in the lecture notes in Akad. 24, Volume One.
References
Bernays, P. (1922), “Die Bedeutung Hilberts für die Philosophie der Mathematik” in: Naturwissenschaften 10, 93–99.
Bohr, N. (1913), “On the Constitution of Atoms and Molecules I” in: Philosophical Magazine 26, 1–15.
Bohr, N. (1920), Collected Works 3, edited by L. Rosenfeld, North Holland Publ. Co., Amsterdam 1976.
Bohr, N. (1926), “Atomic Theory and Mechanics” in: Nature 116 (1925), 845–852.
Bohr, N. (1927), “The Quantum Postulate and the Recent Development of Atomic Theory”, Como lecture 1927, modified version in: Nature 121 (1928), 580–590.
Bohr, N. (1948), “On the Notions of Causality and Complementarity” in: Dialectica 2, 312–318.
Born, M. (1926a), “Zur Quantenmechanik der Stoßvorgänge” in: Zeitschrift für Physik 37, 863–867.
Born, M. (1926b), “Quantenmechanik der Stoßvorgänge” in: Zeitschrift für Physik 38, 803.
Bridgman, P. W. (1927), The Logic of Modern Physics, MacMillan, New York.
Cahn, R. N. and G. Goldhaber (1989), The Experimental Foundations of Particle Physics, Cambridge University Press, Cambridge.
Carnap, R. (1947), Meaning and Necessity, University of Chicago Press, Chicago.
Carnap, R. (1966), Einführung in die Philosophie der Naturwissenschaften, Nymphenburger Verl.-Handlung, München; engl.: Philosophical Foundations of Physics, Basic Books, New York, 1969.
Cartwright, N. (1983), How the Laws of Physics Lie, Clarendon Press, Oxford.
Cartwright, N. (1999), The Dappled World, Clarendon Press, Oxford.
Cassirer, E. (1921), Zur Einstein’schen Relativitätstheorie, Bruno Cassirer, Berlin; reprinted in: Zur modernen Physik, Wissenschaftliche Buchgesellschaft, Darmstadt 1957.
Darrigol, O. (1992), From c-Numbers to q-Numbers, University of California Press, Berkeley.
Falkenburg, B. (1993), “The Concept of Spatial Structure in Microphysics” in: Philosophia naturalis 30, 208–228.
Falkenburg, B. (1995), Teilchenmetaphysik. Zur Realitätsauffassung in Wissenschaftsphilosophie und Mikrophysik, 2nd edition, Spektrum Akademischer Verlag, Heidelberg.
Falkenburg, B. (1997), “Incommensurability and Measurement” in: Theoria 30, 467–491.
Falkenburg, B. (1998), “Bohr’s Principles of Unifying Quantum Disunities” in: Philosophia naturalis 35, 25–120.
Falkenburg, B. (2000), Kants Kosmologie. Die wissenschaftliche Revolution der Naturphilosophie im 18. Jahrhundert, Klostermann, Frankfurt am Main.
Frege, G. (1891), Funktion und Begriff: Vortrag, gehalten in der Sitzung vom 9. Januar 1891 der Jenaischen Gesellschaft für Medizin und Naturwissenschaft, Hermann Pohle, Jena.
Friedman, M. (1992), Kant and the Exact Sciences, Harvard University Press, Cambridge, MA.
Heisenberg, W. (1927), “Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik” in: Zeitschrift für Physik 43, 172–198.
Heisenberg, W. (1930), Die physikalischen Prinzipien der Quantentheorie, Hirzel, Leipzig; also: S. Hirzel, Stuttgart 1958.
Hilbert, D. (1918), “Axiomatisches Denken” in: Mathematische Annalen 78, 405–415; reprinted in: D. Hilbert, Gesammelte Abhandlungen III, Springer, Berlin 1935, 146–156.
Kant, I. (1768), “Von dem ersten Grunde des Unterschiedes der Gegenden im Raume” in: W. Weischedel (ed.), Vorkritische Schriften bis 1768. Kants Werke Band 1, Insel, Wiesbaden 1960, 993–1000.
Kant, I. (1770), “De mundi sensibilis atque intelligibilis forma et principiis” in: W. Weischedel (ed.), Schriften zur Metaphysik und Logik. Kants Werke Band 3, Insel, Wiesbaden 1958, 12–107.
Kant, I. (1781), Kritik der reinen Vernunft, Johann Friedrich Hartknoch, Riga 1781 (= edition A), 1787 (= edition B); reprinted as: Kants Werke, Band 2, edited by W. Weischedel, Insel, Wiesbaden, 1956.
Kant, I. (1966), Kant’s gesammelte Schriften, edited by the Deutsche Akademie der Wissenschaften, Vol. XXIV, first volume: “Vorlesungen über Logik”, Walter de Gruyter, Berlin.
Krantz, D. H., R. D. Luce, P. Suppes and A. Tversky (1971), Foundations of Measurement, Vol. 1, Academic Press, New York.
Nagel, E. (1961), The Structure of Science, Routledge & Paul, London.
Parsons, C. (1992), “The Transcendental Aesthetic” in: P. Guyer (ed.), The Cambridge Companion to Kant, Cambridge University Press, 62–100.
Peirce, C. S. (1960), Elements of Logic, Vol. 2 of The Collected Papers of Charles Sanders Peirce, edited by C. Hartshorne and P. Weiss, Harvard University Press, Cambridge.
Reichenbach, H. (1921), Relativitätstheorie und Erkenntnis a priori, Springer, Berlin 1920; reprinted in: Gesammelte Werke, edited by M. Reichenbach and A. Kamlah, Vieweg, Braunschweig 1977 ff., Vol. 3, 191–302.
Riordan, M. (1987), The Hunting of the Quark, Simon & Schuster, New York.
Russell, B. (1903), The Principles of Mathematics, 2nd edition, Allen and Unwin, London 1937.
von Neumann, J. (1932), Mathematische Grundlagen der Quantenmechanik, Springer, Berlin.
Weyl, H. (1949), Philosophy of Mathematics and Natural Science, 3rd and expanded edition, Princeton University Press, Princeton.
INTUITIVE COGNITION AND THE FORMATION OF THEORIES∗
Renate Huber
Universität Dortmund, Germany
∗ Translation by Hans-Jakob Wilhelm.
E. Carson and R. Huber (eds.), Intuition and the Axiomatic Method, 293–324. © 2006 Springer. Printed in the Netherlands.
1. The problem
Ever since humanity began reflecting on the process of cognition, intuitive cognition has been thought to play an important role. This is true of philosophers as well as of natural scientists. But when philosophers and natural scientists speak of intuitive cognition, it is not a given that they are always speaking about the same concept. Indeed, a closer look at the various uses of the concept reveals that there are different conceptual variants at play. The explication of the concept depends in particular on the basic epistemological position of the user and is consequently anything but uniform. Nevertheless, three main uses can be discerned: (α) the phenomenally oriented sensible intuition of the empiricists, (β) the structurally oriented rational intuition of the rationalists, and (γ) intuition in the sense in which most physicists use the term following Kant. Since it remains quite unclear, however, which of the various concepts of intuition is essential to the analysis of the process of cognition, it seems useful to redefine the basic epistemological position and to reformulate the concept of intuition.
Both ancient and early modern philosophy attribute to intuitive cognition a constitutive and hence indispensable function in the cognitive process. This type of cognition was mainly meant to secure the certainty of cognition. It is the great achievement of analytical philosophy of science, however, to have recognized that absolutely certain knowledge of nature is impossible, and that intuitive cognition is not capable of providing the infallibility attributed to it. Although philosophers of science are too one-sided in their critique when they try to deny that intuitive cognition serves any function, they are nevertheless right to point to the decline of Newtonian physics and to call into question important aspects of Kant’s doctrine of intuition and the categories. For Kant, intuition had the function of constituting reality, a claim that was later denied by Helmholtz, Poincaré and others and completely shattered in connection with the general theory of relativity and quantum theory. Nevertheless, a considerable number of twentieth-century physicists still ascribe to intuitive cognition an important role in the construction and interpretation of theories. They are precisely those theorists who have contributed to the formulation of physical theories that have the reputation of being particularly counterintuitive. These physicists include Planck, Einstein and Bohr, although the concept of intuition they employ usually remains unclear. Thus the various positions stand in sharp disagreement: (α) traditional philosophy has too much confidence in intuitive cognition; (β) analytical philosophy of science has too little confidence in it; and (γ) the physicists who developed the modern physical theories are frequently too vague in their statements. A notable exception is Poincaré.
There is still another aspect to the problem: Modern physical theories (theory of relativity, quantum theory, chaos theory) are considered difficult to understand. Their results seem counterintuitive, and for many students the study of physics is therefore laborious rather than enjoyable.
But the problem is more serious than that: Studies at American colleges have shown that many students, in spite of instruction in physics, continue to think in terms of the concepts and principles of a pre-Newtonian physics that is close to ordinary experience.¹ According to their intuition, moving objects should follow quite different trajectories from those prescribed by the laws of Newtonian physics. These misjudgments are surprisingly similar to the mentality of the late Middle Ages. In their efforts to achieve a better understanding of the increasingly abstract theories of nature, students must evidently overcome the same obstacles that early modern philosophers had to conquer. This involves the renunciation of intuitive descriptions of natural phenomena in favor of quantitative laws of an idealized nature, the phenomena of which must first be produced experimentally and made accessible to a mathematical description. The intuitive view of nature often stands in the way of an adequate understanding of the phenomena. It is an issue that raises the question regarding the limits of intuitive cognition. There are five aspects to this question that must be investigated: (α) On what epistemological basis should the process of cognition be analyzed? (β) How is the concept of intuitive cognition to be understood? (γ) What are the functions of this type of cognition? (δ) Where might limits be found? (ε) To what extent have the standards for what counts as an intuitive physical theory changed in the development of natural philosophy? On the one hand, the change of standards should make us understand how the axiomatic method is connected to the structurally oriented rational intuition of the rationalists. On the other hand, this change of standards should make it clear to what extent Newtonian physics is unintuitive for an Aristotelian and how in a completely different sense the theory of relativity and quantum theory appear unintuitive to a Newtonian.
2. Elements of an empirically grounded theory of knowledge
The explication of the concept of intuitive cognition depends on the assumed basic epistemological position; and this is what we first need to specify. The foundation upon which the concept of intuition is to be explicated should be an empirically grounded theory of knowledge. The reasons for this lie in the various failures attributable to traditional epistemologies: (α) their ignorance of relevant issues (phylogenetic, ontogenetic and culturally determined aspects of the cognitive process), (β) their attempt to solve pseudo-problems (absolute knowledge, final justifications) and (γ) their contradictory solutions to fundamental problems (the existence / non-existence of innate ideas). Thus it seems useful to formulate a new concept of intuition on a new epistemological foundation. The result, and this must be emphasized, is not to be understood in terms of a concept explicitly articulated and used in philosophy or physics. Rather, the idea is to tailor a new concept of intuitive cognition on the basis of an empirically grounded theory of knowledge and then to demonstrate its fruitfulness and indispensability with regard to the construction and interpretation of theories. In addition, it must then be shown in what sense the traditional variants of the concept can be seen as special cases of the newly explicated concept of intuition and to what extent they remain one-sided. The basic traits of an empirically grounded theory of knowledge emerge from the constructive collaboration of various empirical disciplines and traditional epistemological approaches. Such a theory of knowledge is always provisional, open to further development and never complete. Consequently, an empirically grounded theory of knowledge is indebted to the empirical disciplines for the findings they supply.
Important insights of traditional theories of knowledge, however, must not be ignored, and well-established existing approaches to developing an empirically grounded theory of knowledge may and must be considered as well. These approaches include:
(i) Evolutionary theory of knowledge² distinguishes between a mesocosmic and a scientific structuring, where the mesocosmic structuring forms the starting point of all scientific theories. The investigation of the characteristic features of this type of structuring will therefore be fundamental to the discussion of the break of modern sciences with mesocosmic concepts and principles. Evolutionary theory of knowledge emphasizes the adaptive aspect of the process of cognition and points to the fact that human beings are adapted only to the mesocosm. Our sense data of this sector of reality are structured in such a way as to allow us to act in a manner that is adequate for our survival. This approach, of course, is incompatible with a reality-constituting function in Kant’s sense.
(ii) The genetic theory of knowledge³ of Piaget and Case revolves around the idea of ontogenetic stages of development. It is characterized by the fact that special structures of reality are constructed depending on the individual stage of development that has already been reached. Hence, the genetic theory of knowledge emphasizes the constructivist aspect of the cognitive process (without, however, assuming a radically constructivist position) and points to the fact that although a child already possesses certain dispositions for structuring, she builds up these structures only through an active engagement with reality. These structures relate particularly to concepts such as “substance,” “space” and “causality,” concepts, that is, which have fallen into disrepute especially in connection with the modern physical theories. The concepts of “time” and “number” are special in this regard: In the genetic theory of knowledge, they appear as derived concepts that are constituted late in ontogenetic development. The genetic theory of knowledge argues for a concept of cognition that emphasizes the operative aspect. According to this view, mathematical cognitions are closely related to transforming actions.
(iii) Gestalt psychology⁴ assumes that every cognitive system structures sense data intuitively. From the multitude of possible structures, one variant is chosen automatically.
It turns out that phenomena are not merely given according to the three dimensions of space, but that the manner in which they are given is additionally subject to special hierarchical structuring laws. Stated the other way around, this means that a cognitive system is not able to modify its intuitive structuring as it pleases if the structure turns out to be faulty. These structuring laws, in particular the laws of “good Gestalt,” are closely related to the ideals of simplicity and perfection, which have shaped the course of the knowledge of nature substantially.
(iv) The basic ideas of neuroscience⁵ are the distinction between different kinds of information processing, described as “sequential” and “parallel,” and the specification of the structural features characteristic of each. Simulations in neural networks,⁶ which are characterized by distributed encoding and parallel information processing as well as by error tolerance and the capacity for learning, allow researchers to learn about the mechanisms of pattern classification, pattern completion and the formation of prototypes of objects and causal processes. These findings support the thesis of the genetic theory of knowledge, according to which the concepts of “time” and “number” are intuitive concepts only in a limited sense.
A powerful empirically grounded theory of knowledge must bring together the adaptive aspect of the evolutionary theory and the constructivist aspect of the genetic theory, test their inner consistency and complement or modify the resulting theory with the findings of Gestalt psychology and neuroscience. These results will contribute to a new explication of intuitive cognition, its functions and its limits.
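The pattern completion and error tolerance attributed above to neural network simulations can be illustrated with a very small toy model. The text does not say which network model is meant at this point; the Hopfield-style associative memory below is an assumption chosen purely for illustration, and all names (`train_hebbian`, `recall`, `STORED`, `NOISY`) are hypothetical. The sketch stores two patterns in a distributed way (every weight carries a trace of every pattern), updates all units in parallel, and restores a stored pattern from a corrupted copy.

```python
# Toy Hopfield-style associative memory (illustrative assumption, not the
# author's model). Units take the values +1 or -1.

def train_hebbian(patterns):
    """Build a symmetric weight matrix with the Hebbian rule, zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_steps=10):
    """Update all units in parallel until the state no longer changes."""
    state = list(state)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            if field > 0:
                new_state.append(1)
            elif field < 0:
                new_state.append(-1)
            else:
                new_state.append(state[i])  # no input: keep the old value
        if new_state == state:
            break
        state = new_state
    return state


# Two stored patterns (chosen to be orthogonal so that they do not interfere).
STORED = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
WEIGHTS = train_hebbian(STORED)

# A corrupted copy of the first pattern: the first unit has been flipped.
NOISY = [-1, 1, 1, 1, -1, -1, -1, -1]
RECOVERED = recall(WEIGHTS, NOISY)

print(RECOVERED == STORED[0])
```

The design choice worth noting is that no single weight stores a pattern by itself; the corrupted input is corrected only by the joint action of all connections, which is one simple sense in which such a system is error tolerant.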
3. The cognitive process
There are three aspects to the analysis of the process of cognition: (α) the specification of the ontological and epistemological postulates on which the explication is based, (β) the explication of knowledge as the result of a structuring process that is possible in several ways but that is subject to characteristic transcendental laws of structuring and (γ) the explication of various aspects of the structuring process and of the transcendental forms of structuring. The cognitive process is to be explicated against the background of the following ontological and epistemological postulates.
(i) The postulate of reality: There exists a reality independently of the existence of cognitive systems capable of perceiving and investigating this reality. This postulate is an existential claim and is to be understood solely in an ontological sense. It says nothing regarding the possibility of knowledge of this reality.
(ii) The postulate of structure: There is a reality, which is structured. This means: The phenomena are highly redundant and structured according to laws of nature. These structures are real and objective, and they are, as neuroscience has shown, a necessary condition for the possibility of knowledge. If reality were entirely chaotic, not only would human cognition be impossible, but there would be no cognitive systems at all. This postulate is directed especially against a radically constructivist conception, according to which the structures are solely a subjective addition of the cognitive subject.
(iii) The postulate of consciousness: As a result of evolution there exist various cognitive systems (cat, ape, human) that have consciousness developed to varying degrees. This postulate excludes solipsism in particular, which only accepts the existence of one’s own consciousness as certain.
(iv) The brain function postulate: All cognitive function is without exception bound to a material substrate. The mind in particular is a function of the brain and is to be understood as an emergent property of complex systems.
(v) The postulate of interaction: The sense organs of a cognitive system and reality enter into a state of interaction. A large portion of the sensory input is filtered out, while a small portion is processed in a sense-specific way, i.e., evaluated, ordered, structured, completed according to instructions from the brain itself, and then interpreted as information about reality and held available for action adequate to survival.
(vi) The postulate of construction: As highly developed cognitive systems, human beings always construct and reconstruct their specific reality in accordance with their respective ontogenetic and cultural stages of development.
(vii) The postulate of coherence: The adaptation of the subjective structures to the objective structures of reality is to be understood in the sense of an inner coherence of the subjective structures, which is experienced as epistemological necessity (evidence) and which occurs when there is a balance between the complementary mechanisms of assimilation and accommodation.
What is cognition? An empirically grounded theory of knowledge regards cognition both as a process and as a result. Every cognitive process is an interpretive act, which is influenced by past experiences, expectations, prejudices, convictions and preferences.
The fundamental structure of cognition is a three-term relation:
S recognizes O as A
The cognitive subject S recognizes the object of cognition O as a structured whole A.⁷ Knowledge arises through the adaptation of the subjective structures to the objective structures of reality in the sense of the postulate of coherence, i.e., the structure always has one component that refers back to the object of cognition and another component that is determined by the cognitive subject and that depends on the cognitive apparatus, the cognitive capacities and the respective stage of cognition. The subjective structures are fundamentally determined by phylogenetic evolution, ontogenetic maturation and cultural development. The individual terms of this relation will now be explained in more detail.
(i) The cognitive subject S: In the course of evolution, the mechanisms of mutation, genetic recombination and selection produced a hierarchy of cognitive systems $S_1, S_2, S_3, \ldots, S_H$ (paramecium, cat, ape, human), which at every higher stage developed new systematic properties (genetic fixation, capacity for learning, agency, language, intellect) for occupying various cognitive niches. In ontogenesis, human beings pass through three different stages of development from infancy to adulthood: (α) the sensorimotor stage $S_{H1}$, (β) the symbolic-linguistic stage $S_{H2}$ and (γ) the operational stage $S_{H3}$, in the course of which cognitive abilities are gradually built up. The differences between cognitive systems are manifest in the architecture of the brain, in particular in an enormous increase in the number of interneurons and in their complex circuitry in humans. The cognitive subject must satisfy certain necessary conditions for cognition to be possible. This includes in particular the plasticity of neural circuitry, i.e. the self-organizing change of synapse weights in the process of learning, which can in principle be simulated in Kohonen networks. The special architecture of neural structures, in particular the circuitry connecting sensorimotor neurons with multiple layers of interneurons, allows the cognitive subject to classify phenomena and to supplement missing information in a meaningful manner.
(ii) The object of cognition O: The postulates of reality and of structure assume the existence of a reality $O_0$ structured according to laws of nature and independent of the existence of cognitive systems, which are able to perceive and investigate this reality.
The postulate of consciousness assumes the existence of various cognitive systems, each of which is able to grasp a certain characteristic sector of reality O1 , O2 , O3 , ... OH and constitute it according to the laws of its cognitive capacities. The respective sectors differ quantitatively (the dimensions of the sector) as well as qualitatively (the structural constitution of the sector). Over the three developmental stages of ontogenesis, human beings construct a series of realities OH1 , OH2 , OH3 in accordance with the cognitive capacities already developed. The mesocosm OMC is the sector that is or was already accessible in a senso-motoric way to prescientific man. As a consequence of human cultural development, reality has been changed through technology into an extended mesocosm OEMC . That is to say, the mesocosm of a Stone Age person differs fundamentally from the mesocosm of Aristotle, and this in turn differs from the mesocosm of a
person of the twenty-first century. In addition, a person trained in natural sciences has another extension of her sector of reality (microcosm ← mesocosm → macrocosm). This extended sector of reality OSci is disclosed mathematically and experimentally and satisfies new types of structural laws. According to the results of neuroscience, the object of cognition must satisfy certain necessary conditions for a subject to be able to gain knowledge of it. These include the structure and redundancy of the phenomena.

(iii) The implementation of structure A: Less developed cognitive systems only have a genetically fixed implementation of structure. Highly developed cognitive systems, on the other hand, have a capacity to learn to an extent characteristic of their species. Through processes of learning, these systems acquire the ability to structure sense perceptions in order to be able to act in a manner that is adequate for their survival. These implementations of structure A1, A2, A3, ... of non-human cognitive systems are intuitive, unambiguous and fixed. Human beings, by contrast, have various possibilities of structuring. Of fundamental importance for this new capacity are the abilities to act and to speak and the unusually long period of ontogenetic maturation (neoteny). Over the three developmental stages of ontogenesis, a series of subjective structures AH1, AH2, AH3 is built up in accordance with the mechanisms of assimilation and accommodation. Each successive stage reconstructs the previous stage at a more abstract level and completes previous attempts at structuring. The possibilities of structuring are subject to certain constraints: (α) Simulations in neural networks demonstrate that the capacity for structuring is fundamentally determined by the architecture of the neural structures. (β) Evolutionary theory of knowledge emphasizes that of the set of all structuring options those that are relevant but inadequate for survival are sorted out and discarded.
(γ) Gestalt psychology and genetic theory of knowledge regard the set of all structuring options as limited by laws of structuring, with Gestalt psychology emphasizing the static aspect related to visual perception and genetic psychology emphasizing the dynamic aspect related to human action.

For humans there are two ways of implementing structure: (α) mesocosmic structuring and (β) scientific structuring. The mesocosm is that sector of reality to which humans have adapted through phylogenetic evolution, ontogenetic maturation and standard cultural development, i.e., the sector that is directly accessible through the senses and the intellect in perception and in action. The limits of the mesocosm cannot be determined with precision, since it is fundamentally an anthropocentric concept that includes latitude in several respects. On the one hand, there is latitude in connection with individual development, and on the other hand, there is cultural latitude, which depends on the transformation of the life-world through science and technology. Definitely excluded from the mesocosm, however, are very small systems, very large systems, systematic properties for which we lack sense organs, highly networked systems with complicated causal structures and systems with complicated temporal development. The totality of human structural implementation thus divides into two classes, which differ fundamentally in their structural features:8

(i) Mesocosmic structuring TMC, TEMC: This type of structuring refers to ordinary experience. To the extent that it finds conceptual expression, it is contained in the ordinary concepts of everyday life. Mesocosmic structuring is adequate for survival, oriented towards phenomena and committed to a naive realism. It is only constitutive of part of the knowledge that human beings are capable of attaining. It has developed in humanity’s dealings with reality and is adapted to the mesocosm. Outside of the mesocosm this structuring is not necessarily adequate. Mesocosmic cognition is based on a largely uncritical use of language, inductive inferences and generalizations as well as simple algorithmic calculations. It is only success oriented, i.e., it merely aims to reach a desired goal and does not aim to acquire true and consistent knowledge.

(ii) Scientific structuring TSci = {T1, T2, T3, ...}: This type of structuring makes use of scientific concepts and principles and mathematical formalisms. It is oriented towards structure, committed to a critical type of realism and appeals to controlling standards (logic, experiments, mathematics). The theories are constructed according to the axiomatic method.
The various theories, however, must not be considered merely as alternative ways of implementing structure. Rather, as we shall see in more detail, they are connected to the progress of knowledge. Scientific structuring strives to achieve consistency and truth. Mesocosmic structuring is a necessary condition for scientific structuring — in two respects. (α) Theory construction: Mesocosmic structuring forms the starting point of scientific knowledge. All theories of natural science begin with conceptions that are firmly rooted in our ordinary understanding of things. (β) Theory interpretation: For the interpretation of its theories, science must again reconnect to mesocosmic structuring. Using controlling standards accepted by the scientific community, scientific structuring subjects parts of mesocosmic structuring to critical examination and modifies them if necessary. On the other hand, it is impossible to put the entire stock of mesocosmic
structuring to the test. Even scientific theories have remnants of mesocosmic structuring. Scientific progress aims to reduce the size of this remaining unexamined stock. In accordance with the genetic theory of knowledge, the general implementation of structure A is to be explained in a more detailed analysis of the cognitive process in terms of three special aspects:

S recognizes O as A
    → S structures O as X
    → S refers to O as Y
    → S transforms O from Z to Z∗
This continuing analysis will demonstrate (α) that every structuring occurs according to certain transcendental forms of structuring, (β) that these transcendental forms of structuring can be derived from empirical investigation and (γ) that the transcendental forms of structuring are subject to change. The special transcendental forms of structuring built up during ontogenetic maturation and standard cultural development not only lead to mesocosmic structuring, but lie at the basis of scientific structuring as well, albeit in structurally modified and extended form. Hence, in contrast to Kant’s theory of knowledge, the transcendental forms of structuring of the genetic theory of knowledge are not derived from systematic considerations, but rather from empirical investigations. Moreover, they are subject to structural change in accordance with the developmental stage achieved at the time. The various transcendental forms of structuring can be summarized in a diagram, where the index Ω refers to the fact that the concretization depends on the respective stage of development:

[Diagram: the transcendental forms of structuring, grouped by type of structuring. Figurative: RΩ, PΩ. Representative: ZΩ, SΩ. Operative: AΩ.]
The individual implementations of structure X, Y and the transformation of Z to Z∗ of the relations and the various underlying transcendental forms of structuring RΩ , PΩ , ZΩ , SΩ and AΩ can be defined more precisely as follows: (i) The figurative structuring X: This type of structuring occurs in direct perception, i.e., it is a structuring of phenomena in three-dimensional space and in accordance with the formation of prototypes of objects and causal
processes. It is already built up during the first stage of ontogenetic maturation (senso-motoric stage) and is enriched with specific structures at the higher stages of development. In various ways, non-human animals are capable of figurative structuring as well. Gestalt psychology has shown that (α) every perception must be understood in terms of a holistically structured perception, (β) missing information is supplemented speculatively, (γ) structuring occurs in accordance with previous experience, (δ) structuring is susceptible to error and hence in need of suitable controlling standards and (ε) spatial structuring is subject to certain hierarchically ordered figurative laws of structuring (“good Gestalt”). The result is then a nonverbal disposition regarding the concepts “space,” “substance” and “causality” that is relevant with respect to expectation and behavior. In scientific structuring this is revealed in the metaphysical assumptions regarding the constitution of nature, which often enter into the construction of theories tacitly. The concept of time is constituted at the second stage (symbolic-linguistic stage) as an intuitive time, referring merely to the succession of motoric states. The abstract concept of time with an ordering structure and a metrics that allows for the comparison of different movements is formed only at the third stage (operational stage). The figurative structuring X refers to the transcendental forms of structuring RΩ and PΩ. The structuring in space and time (the coexisting and successive order of objects and processes) and the knowledge of the structural features of space (dimensionality, topological and metric structures), of which the cognitive subject has mastery in dependence on its developmental stage Ω, are designated RΩ.
The nonverbal prototype formation of objects (substance) and processes (causality) relevant to expectation and behavior, its structuring according to figurative laws of structuring and the knowledge of relations of invariance, of which the cognitive subject has mastery in dependence on its developmental stage Ω, are designated PΩ . (ii) The representative structuring Y: This type of structuring is built up during the second stage (symbolic-linguistic stage) of ontogenetic maturation and is continuously extended qualitatively and quantitatively during the higher stages of development. In mesocosmic structuring, three basic forms of representative structuring must be distinguished: (α) ritual representation (symbolic play), (β) iconic representation (two-dimensional and three-dimensional figures) and (γ) verbal representation (ordinary language). An abstract concept of number is formed during the third stage (operational stage): the arithmetic numerical unit on the basis of
a piece-by-piece correspondence with an ordinal and a cardinal structure, which makes possible the construction of an arithmetic representation. Science systematically develops all of the implementations of representative mesocosmic structuring and thus gains an effective set of instruments for the investigation of nature. The representative structuring Y refers to the transcendental structuring forms ZΩ and SΩ . The basic forms of representative structuring (ritual, verbal), of which the cognitive subject has mastery in dependence on its stage of development Ω, are designated SΩ . The iconic representation and the arithmetic representation, of which the cognitive subject has mastery in dependence on its stage of development Ω, are designated ZΩ . (iii) The operative transformation from Z to Z∗ : This type of structuring is built up mainly during the third stage of ontogenetic maturation (operational stage) in the course of standard cultural development and is enriched with specific structures at the higher stages of development. This structuring describes cognitions inasmuch as they refer to actions. The possibility of repetition, combination, coordination and reversibility (inversion, reciprocity) of state transformations shows that changes of state are also subject to certain laws of structuring in the sense of operative laws of structure (group structures, structures of order). The beginnings of logical and algorithmic structuring that already exist in mesocosmic structuring arise on the basis of the capacities for speech and action. Scientific structuring refines the operative transformations of everyday life and continues them in an experimental setting. It extends the concept of number and builds up an extensive set of mathematical instruments. By means of experiments, logic and mathematics, additional operative structural laws may be discovered. 
In particular, the concept of infinity is important in science for the construction of the mathematical continuum and for complete induction. The type of structuring based on the operative transformation from Z to Z∗ refers to the transcendental forms of structuring AΩ . The basic concepts and axioms of logical relation (principle of identity, principle of excluded contradiction, principle of excluded middle) or of algorithmic procedures (principle of complete induction), of which the cognitive subject has mastery in dependence on its stage of development Ω, are designated AΩ .
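The reference above to "operative laws of structure (group structures)" can be illustrated with a short formal sketch. The reading below is an assumption made for illustration and is not taken from the text: the set G of state transformations, the composition symbol, the neutral transformation e and the pairing of each property with an axiom are all hypothetical labels. On this reading, combination corresponds to closure under composition, coordination to associativity, and reversibility to the existence of inverses.

```latex
% Assumed notation: G = set of state transformations, \circ = "perform one after the other"
\begin{align*}
  &\text{combination (closure):}      && f, g \in G \;\Rightarrow\; g \circ f \in G \\
  &\text{coordination (associativity):} && h \circ (g \circ f) = (h \circ g) \circ f \\
  &\text{neutral transformation:}     && e \circ f = f \circ e = f \\
  &\text{reversibility (inverse):}    && \forall f \in G \;\exists f^{-1} \in G:\; f^{-1} \circ f = f \circ f^{-1} = e
\end{align*}
% Repetition is then the special case of composing a transformation with itself: f \circ f \in G.
```

A transformation from Z to Z∗ followed by its inverse returns the initial state Z, which is one way of stating what reversibility requires.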
4. Intuitive cognition
With reference to traditional theories of knowledge as well as to the outline of an empirically grounded theory of knowledge sketched above, we will now
seek to determine the nature of intuitive cognition. This will be followed by an analysis of its functions and limits. Four points must be noted in this regard: (α) the distinction between parallel and sequential information processing and their respective characteristic structural features, (β) the distinction between a phenomenally oriented sensible intuition and a structurally oriented rational intuition, (γ) the connection between the newly explicated concept of intuition and the various versions of the concept in traditional theories of knowledge and (δ) the functions and limits of intuition for the construction and interpretation of theories.
4.1 Sensible and rational intuition in traditional theories of knowledge . . .
Traditional theories of knowledge distinguish three types of cognition: (i) sensible cognition, (ii) intuitive cognition and (iii) another type of cognition that is defined as a counterpart to intuitive cognition. The concept of intuition and its counterpart are explicated against the background of the respective doctrine of method and vary accordingly. Important examples are: (α) intuitive versus demonstrative cognition (Locke), (β) intuitive versus deductive cognition (Descartes), (γ) intuitive versus symbolic cognition (Leibniz) and (δ) intuitive versus discursive cognition (Kant). They all share the claim that intuitive cognition is in a certain sense responsible for securing the certainty of cognition.

(i) Locke’s empiricist theory of knowledge: Locke appeals to phenomenally oriented sensible-intuitive cognition as a counterpart to demonstrative cognition. According to Locke, the characteristic features of intuitive cognitions are: (α) the immediate, effortless and unquestionable perception of the agreement or lack of agreement between representational contents, (β) the absence of intermediary steps (proofs) and (γ) the reference to what is concretely perceivable through the senses. Examples are qualitative comparisons (white ≠ black), quantitative comparisons (two ≠ three) and the comparison of geometric figures (circle ≠ triangle). The intuitively structured content refers to the mesocosmic transcendental forms of structuring. The distinction between what is intuitively known and what is demonstratively known remains firm and does not change with the advance of knowledge. According to this view, all scientific theories are unintuitive.

(ii) Descartes’ rationalist theory of knowledge: Descartes appeals to structurally oriented rational-intuitive cognition as a counterpart to deductive cognition. According to him, the characteristic features of intuitive cognition are: (α) the unquestionable result of an attentively executed process of abstraction, (β) the methodological use of a controlling standard (qualitative conceptual analysis) and (γ) the relation to a mathematically and philosophically trained intellect. The intuitively structured content refers to special transcendental forms of structuring, which (“cleansed” by the fire of methodical doubt) are inferred from processes of abstraction and considerations of limits. As a result, we obtain clear and distinct concepts and principles that are beyond any reasonable doubt. Examples are the concepts of substance (thinking substance, perfect substance, extended substance), the concepts of movement (movement in the ordinary sense, movement in the proper sense), the idea of numbers and figures and especially the emancipation of the concept of number from its geometric interpretation. The duality of cognition is clearly illustrated in his metaphor of a pearl necklace: intuitive cognition comprehends every individual pearl; deductive cognition merely provides the connections between the individual pearls.

(iii) Leibniz’s rationalist theory of knowledge: Leibniz appeals to structurally oriented rational-intuitive cognition as a counterpart to symbolic cognition. As characteristic features of intuitive cognition, Leibniz mentions: (α) the sudden grasp of simple concepts and original truths, (β) the methodological use of controlling standards (quantitative conceptual analysis, logic, continual transition from the finite to the infinite) and (γ) the relation to a mathematically and philosophically trained intellect.
In addition, Leibniz is concerned with two distinct types of representation, which in turn are based on two distinct types of cognition: intuitive cognition, referring to iconic representation (geometric representation), and symbolic cognition, which refers to arithmetic representation (analytical representation) and makes use of the abstract Cartesian concept of number freed from the constraint of a geometric interpretation. Examples are the triangle, which can be grasped intuitively as well as symbolically, and a polygon with a thousand sides, which can only be grasped symbolically. In addition, Leibniz mentions the relationship between a circle, an ellipse and a parabola. The circle is a special case of the ellipse in the sense that the two focal points of the ellipse coincide in the case of a circle, thus forming its centre. The parabola is a limiting case of the ellipse in the sense that one of the two focal points moves to infinity. In their geometric representation, the three geometric figures appear essentially distinct. Represented analytically, however, their structural “kinship” is easily recognized through a continual movement of the focal points.
(iv) Kant’s theory of knowledge: Kant speaks of two irreducible cognitive powers and draws the very important distinction between the forms of intuition (intuitive) and the categories of thought (discursive). His theory of knowledge places special emphasis on Euclidean geometry and Newtonian physics.9 For Kant, the forms of intuition and the categories of thought are universally valid and necessarily true. As conditions of the possibility of cognition, they are prior to any actual cognition and have a function that is constitutive of reality. As far as geometry is concerned, its essential feature is the construction of geometric figures in space, which helps it enter upon the secure path of a science.

The rationalist epistemologies of Descartes and Leibniz already contain the distinction between a phenomenally oriented sensible intuition and a structurally oriented rational intuition: (α) In the Cartesian metaphor of the pearl necklace, the investigation of conceptual analysis begins with a sensibly perceptible object of cognition (example: a concrete human being) with an appeal to a phenomenally oriented sensible intuition and ends with an object of cognition that can only be grasped by means of the understanding (the abstract substances: perfect substance, thinking substance, extended substance) with an appeal to a structurally oriented rational intuition. The investigation of conceptual analysis is rooted in a phenomenal object that is grasped sensibly-intuitively and leads to a result that can only be grasped through a structurally oriented rational intuition. (β) The Leibnizian talk of intuitive and symbolic cognition marks the distinction between a phenomenally oriented sensible intuition and a structurally oriented rational intuition even more clearly. According to geometric representation, a circle is to be constructed by means of a pair of compasses and a ruler.
In analytic representation, however, a circle is to be expressed in terms of the formula $x^2 + y^2 = r^2$. The advantage of geometric representation lies in its intuitiveness. It only requires a phenomenally oriented sensible intuition. Analytical representation, by contrast, presupposes the capacity of structurally oriented rational intuition. Only a mathematically trained person will immediately recognize that the above formula describes a circle.
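The contrast between geometric and analytic representation can be made concrete with a short worked equation. What follows is a sketch under stated assumptions, not Leibniz's own derivation: it uses the modern vertex form of a conic, and the symbols p (semi-latus rectum), e (eccentricity), a (semi-major axis) and c (half the distance between the focal points) are introduced here for illustration. It shows how the kinship of circle, ellipse and parabola, invisible in the drawn figures, can be read off from a single formula as the focal points move.

```latex
% Conic with a vertex at the origin and its axis along the x-axis.
% Assumed parameters: p > 0 (semi-latus rectum), 0 <= e <= 1 (eccentricity).
\[
  y^2 = 2px - (1 - e^2)\,x^2
\]
% For 0 <= e < 1 the curve is an ellipse; its focal points lie a distance 2c apart:
\[
  a = \frac{p}{1 - e^2}, \qquad 2c = 2ae = \frac{2pe}{1 - e^2}
\]
% e = 0: the focal points coincide (2c = 0) and the curve is a circle of radius p.
\[
  e = 0:\quad y^2 = 2px - x^2 \;\Longleftrightarrow\; (x - p)^2 + y^2 = p^2
\]
% e -> 1: one focal point moves to infinity and the curve becomes a parabola.
\[
  e \to 1^{-}:\quad 2c \to \infty, \qquad y^2 = 2px
\]
```

Shifting the origin to the centre of the circle turns the case e = 0 into the familiar form $x^2 + y^2 = r^2$ with r = p. Reading such a family of equations as "one figure with moving focal points" is an instance of what the text calls structurally oriented rational intuition.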
4.2 . . . and the results of neuroscience
In distinguishing between parallel and sequential information processing, the results of neuroscience and in particular those derived from the simulation of cognitive abilities in neural networks confirm the duality of processing mechanisms in the brain assumed by the traditional theories of knowledge.10
(i) Parallel information processing: Parallel information processing grasps the object of cognition immediately and holistically and integrates it into an overall concept. It reduces the sensory input to its essential characteristics, networks these characteristics and encodes them as a pattern. It is given as immediate insight. This is meant in the sense that insight requires no methodical deduction. It completes missing or insufficient information required to make the structuring meaningful. It enters every item of information into a relational network of other encoded information. Cognitive gaps are closed speculatively; world-views are completed. This always includes the essential core of the object of cognition, which can only come into focus, if the whole is grasped. Parallel information processing remains essentially speculative in its implementation of structure. It is capable of evaluation according to aesthetic (beautiful) and axiological (perfect) criteria. It is trained by way of examples, and its approach is inductive. It encodes prototypes of objects and causal processes, i.e., it extracts similarities in the sensory input. It is primarily not conscious, i.e., it occurs automatically. It is incapable of learning, even when its mistakes are revealed through controlling standards. It does not have a consciousness of time, but only a short temporal window, in which an item of information is held “online”. Typical features of parallel information processing are the implementation of structure in space and the processing of images (geometric figures). In terms of language, it encodes the semantic aspect. (ii) Sequential information processing: Sequential information processing is rooted in parallel information processing, adding a successive element to the latter. Here the attention is focused on the individual. It applies explicit rules and its procedure is deductive. It forms symbols in order to designate objects and causal processes. 
It is conscious, i.e., it requires concentrated attention. It is conscious of time in the sense that — with the help of geometric representations (culturally dependent: linear, circular) — it can visualize dynamic sequences of events. Typical features of sequential information processing are logical deductions and the execution of algorithmic procedures. In light of the structural features specified above, it seems plausible to identify intuitive cognition with parallel information processing and sequential information processing with discursive cognition — in the sense of a counterpart to the concept of intuitive cognition. Thus:
Parallel Information Processing ⇔ Intuitive Cognition
Sequential Information Processing ⇔ Discursive Cognition

The most important structural features of these two processing mechanisms in the brain are compared in the following table:

Sequential Information Processing                 | Parallel Information Processing
individual, incomplete                            | holistic, complementing
isolated, unrelated (without links, disconnected) | associative, related to context (linked, connected)
neutral                                           | evaluative (aesthetic, axiological)
symbol processing (logic)                         | prototype formation (objects, processes)
algorithm (operations with numbers)               | space (images, geometric figures)
governed by rules, deductive, programmed          | governed by example, inductive, trained
conscious, attentive                              | not conscious, automatic
conscious of time (dynamic)                       | timeless (static)
                                                  | creative power
Intuitive cognition structures its object in accordance with the structural forms that have already been built up.
4.3 Sensible and rational intuition in physics
The evolutionary theory of knowledge and the genetic theory of knowledge provide a dynamic view of the transcendental structural forms. According to this view, these structural forms are not static, as Kant asserted, but rather change continually with scientific progress, as illustrated here through a sequence of diagrams. That is to say, their concreteness depends on the stage of development already achieved. The extension of mesocosmic cognition towards scientific cognition has two sides: (α) The experimental-technological extension of reality: By technological means, we are able to construct “prostheses of the senses,” which break through the limitations to average dimensions (microscope, telescope) or open up other areas of reality (electromagnetic phenomena). (β) The logico-mathematical extension of reality: The development of structural sciences (logic, mathematics, system theory) allows for the analysis of new types of structures of reality (functional dependencies, network-like causal structures, non-linear temporal developments). According to Heisenberg, a theory is closed if and only if it exists in axiomatic form and does not admit any minor modification or supplementation. Any modification would affect the conceptual and structural foundations of the theory. The replacement of a closed theory in Heisenberg’s sense with another generally implies the transition from one diagram to another. Such transitions involve a Kuhnian paradigm shift, i.e., they involve a restructuring of the stock
of knowledge on the basis of newly formulated basic concepts and principles. The general theory of relativity substantially modifies the concepts of space and time and links them together. In addition, it interprets matter as a form of energy and gives up the assumption that matter is indestructible and unchangeable. Quantum theory questions the concepts of substance and causality as well as the validity of two-valued logic. Both theories involve counterintuitive consequences that are bound up with quantities that have the characteristic of a limit: the speed of light and the quantum of action.

[Diagram: the progress from mesocosmic structuring to scientific structuring along an experimental-technological axis and a logico-mathematical axis, shown as three successive sets of transcendental forms of structuring: (RMC, ZMC, PMC, SMC, AMC), (RClass, ZClass, PClass, SClass, AClass) and (RRT, ZRT, PRT, SRT, ART).]

The diagram illustrates the graduated progress of knowledge with the simplified example of three implementations of structure (according to the diagram in Section 3): (i) Mesocosmic structuring (index Ω = MC): This concerns transcendental forms of structuring that lie at the basis of ordinary experience: (α) implementations of spatial structure in three-dimensional physical space with topological and metric properties that are relevant with respect to expectation and behavior, (β) implementations of prototype structures as concrete objects with permanent properties that are subject to simple, linear connections of cause and effect, (γ) fuzzy concepts in the sense of fuzzy logic, (δ) implementations of quantitative structure using an arithmetical but mainly phenomenally oriented concept of number (rational
numbers) and (ε) logical deductions limited to if-then relations (modus ponens), simple relations of inclusion of concrete sets and simple transitive relations of concrete series. Algorithmic procedures refer to basic number manipulations (addition, multiplication).

(ii) Structuring in accordance with classical physics (index Ω = Class): This concerns transcendental forms of structuring that lie at the basis of classical physics: (α) implementations of spatial structure in three-dimensional mathematical space with a Euclidean structure, formulated as an axiomatic theory, (β) implementations of prototype structures (abstraction from individual properties) as idealized objects (material points) with permanent properties that are subject to connections of cause and effect in accordance with the strong causality principle, (γ) use of a verbal technical language that replaces the fuzzy concepts of natural language while remaining closely tied to the intuitive concepts of mesocosmic structuring, (δ) implementations of quantitative structure using an extended abstract concept of number (real numbers) and a concept of function that allows for the formalization of dynamic aspects (differential equations) and (ε) axiomatic construction in geometric or analytic formulation on the basis of a two-valued logic. Logic, experiment and mathematics are used as controlling standards.

(iii) Structuring in accordance with the theory of relativity (index Ω = RT): This concerns transcendental forms of structuring that lie at the basis of the special and general theories of relativity: (α) implementations of spatiotemporal structures in a four-dimensional, pseudo-Euclidean or Riemannian space-time structure that is dependent on the distribution of matter, (β) implementations of prototype structures (abstraction from individual properties) as idealized objects (material points) with permanent properties that are subject to connections of cause and effect in accordance with the strong causality principle but limited by a lightcone structure, (γ) use of a verbal technical language that leads away from the intuitive concepts of mesocosmic structuring, (δ) implementations of quantitative structure using a highly extended abstract concept of number (complex numbers) and a concept of function, and (ε) axiomatic construction in analytic formulation on the basis of a two-valued logic.

The progress in knowledge that comes with the construction of new theories is bound up with a dynamic development of the transcendental forms of structuring. This results in a shift in intuitive cognition. Intuitive cognition thus turns into a continuum and the shift signifies a break with the phenomenally oriented sensible intuition of objects and processes and a turn towards a structurally
oriented rational intuition of symbolically communicated abstract structures in scientific structuring.

[Diagram: The Shift of Intuitive Cognition in Theory Construction. The transcendental forms of structuring shift from phenomenally oriented sensible intuition (RMC, ZMC, PMC, SMC, AMC) to structurally oriented rational intuition (RSci, ZSci, PSci, SSci, ASci).]

Axiomatically constructed theories can therefore only be designed and understood with reference to a structurally oriented rational intuition. Two things are immediately clear in this regard: Structurally oriented rational intuition requires special logical-mathematical training and concerns the basic concepts and axioms of a theory. The counterintuitive consequences of a theory remain at the phenomenal level.
4.4 Functions and limits of intuitive cognition
Many physicists ascribe to sensible intuition an indispensable function. Frequently, however, an intuitive insight prevents an adequate understanding of the phenomena. This leads us to the question concerning the functions and limits of intuitive cognition. In what theoretical connections (theory construction, interpretation, justification) does intuitive cognition play a special role? What does it accomplish? Where and to what extent does it fail?
Intuitive cognition has a heuristic function in theory construction and a semantic function in the interpretation of theories. In connection with the justification of theories, however, the function of intuitive cognition is very limited. This is due to the holistic structuring of intuitive cognition and the associated speculative character. Of the entire spectrum of the functions of intuitive cognition, we will select only a few for closer examination: Intuitive cognition is to (α) provide a plausible basis for the construction of theories, (β) secure objective reference and (γ) make the abstract formalism more intuitive.
(i) The intuitive basis of theory construction: Intuitive cognition is indispensable for the purpose of theory construction, since it structures the relevant domain of objects. All theories of natural science start from conceptions that are deeply rooted in what are for the most part phenomenally oriented, intuitive representations of mesocosmic structuring, which are at first accepted as self-evident, without having been tested. There is no empirical science which could construct theories without such starting points. The starting points are based on general assumptions regarding the constitution of nature and include general supra-theoretical elements for the description of nature. This refers to concepts and principles that are already fixed at a stage prior to the formation of the theory itself. The supra-theoretical elements include above all the concepts of “space”, “time”, “substance” and “causality”, but also ideals of simplicity and perfection. By way of trial, physical science must formulate general assumptions regarding nature as the basis of theory formation, in order to systematize the chaotic abundance of particular phenomena. Hence, the question is not whether assumptions must be made, but rather, what assumptions they could be, how they should be formulated and how they can be justified.
The description of nature begins with a visual perception of particular phenomena. In contrast to traditional theories of knowledge, Gestalt psychology teaches that the perception of phenomena is not a composition of individual sense impressions, but rather the analysis of the whole into “formed parts”. Phenomena are perceived instantly and holistically. They are structured unconsciously and untested in accordance with the transcendental forms of structuring and integrated into the established world-view. Missing and inadequate information is supplemented. The result of scientific structuring is a disposition regarding the concepts of “space,” “time,” “substance” and “causality” that is partly formulated in concepts and partly nonverbal and yet relevant with regard to expectation and behavior.
The assumptions remain at least partly unarticulated and unconscious. They nevertheless always suggest certain solutions for the description of nature, while excluding others. They function as conditions of constraint. Frequently they have the status of epistemological necessities for lack of alternatives. To the extent that they are not falsifiable, they remain fundamentally speculative and are hence subject to change as theories change. Such metaphysical elements are nevertheless indispensable, for they guide procedure and provide a plausible basis for the formulation of definitions and axioms. They can neither be justified empirically nor logically. But in many cases it is possible to develop experiments that can decide between classes of theories without already requiring a decision regarding a specific formulation. An important example is the
experiment by Aspect, which tests the validity of Bell’s inequality and hence answers the question regarding the general conditions of locality. (ii) Objective reference: With the use of experimental-technical and logicomathematical methods, the question regarding objective reference becomes increasingly problematic. The genetic theory of knowledge points out that the interplay of implementations of figurative, representative and operative structuring is characteristic of scientific structuring. Consequently, the structural commonalities and differences of these various implementations of structure must be analyzed with regard to the issue of objective reference. They share the fact that they all indicate relational networks. The results of Gestalt psychology demonstrate that implementations of figurative structure reveal deformations. Hence, they constitute the problematic part of objective reference in cognition. As emphasized by the genetic theory of knowledge, on the other hand, operational relations lead to enduring states of equilibrium between assimilation and accommodation and are therefore neither put into doubt by further developments nor refuted by experience. Thus, they constitute the unproblematic part of objective reference in cognition. Thus, the question regarding objective reference is answered in complete agreement with Hertz’s view. Hertz interpreted scientific theories as “pictures” [Bilder]. A Hertzian picture is a simplified image of a sector of reality. Such a picture must satisfy the following basic criterion: The logically necessary consequences of the pictures must be the pictures of the naturally necessary consequences of the objects. The imaging between the object and the picture is not bijective; there can be several pictures for every object. The possibility of constructing pictures is limited by additional conditions. Pictures must fulfill at least three criteria. 
(α) The criterion of reliability: This requirement refers to the logical consistency of the pictures. Logically incompatible pictures must not be used (logical control). (β) The criterion of correctness: This requirement refers to the agreement of the picture’s essential relations with the relations of the objects (experimental control). (γ) The criterion of suitability: This requirement refers to the number of the essential relations that the pictures can reflect (optimization of adequacy). This basic criterion secures objective reference. Since the imaging between object and picture is not bijective, objective reference remains only partial.
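[Figure: Hertz’s basic criterion for pictures. Phenomena lead, by natural necessity, to the consequences of the phenomena. Pictures of the phenomena lead, by logical necessity, to the consequences of the pictures. The consequences of the pictures must coincide with the pictures of the consequences of the phenomena.]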
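Hertz’s basic criterion can be read as a commutation condition. The following is only a sketch under stated assumptions: the symbols O, P, b, N and L are introduced here for illustration and do not occur in the original text. O stands for the states of the objects, P for the pictures, b for the assignment of pictures to objects within one chosen system of pictures, N for the passage to the naturally necessary consequences and L for the passage to the logically necessary consequences.

```latex
% Assumed notation (not from the source):
%   b : O -> P   picture assignment for one chosen system of pictures
%   N : O -> O   naturally necessary consequence of an object state
%   L : P -> P   logically necessary consequence of a picture
\[
  L\bigl(b(x)\bigr) \;=\; b\bigl(N(x)\bigr)
  \qquad \text{for all } x \in O,
  \qquad \text{i.e.} \qquad L \circ b \;=\; b \circ N .
\]
% The criterion does not require b to be bijective, and two different
% assignments b and b' may both satisfy the condition for the same objects.
```

On this reading, the remark that objective reference remains only partial corresponds to the fact that the condition constrains b without determining it uniquely.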
(iii) Intuitiveness: Intuitive cognition is indispensable for the interpretation of theories. In the course of the extension of mesocosmic cognition through experimental-technical methods on the one hand and logicomathematical methods on the other hand, intuitiveness is increasingly lost. It is then imperative to find means of representation that allow for this lost intuitiveness to be at least partially regained. What does intuitiveness or intuitive illustration consist in? A scientific theory TSci is intuitively illustrated if and only if the structuring of the object of cognition O is carried out in accordance with those transcendental forms of structuring which the cognitive subject S has already developed. The concept of the intuitiveness or intuitive illustration of a scientific theory TSci is dependent on the respective cognitive subject S. This explains why Newtonian physics is unintuitive for an Aristotelian in a completely different sense from the sense in which the theory of relativity or quantum theory are unintuitive for a Newtonian. It is because both refer to different transcendental forms of structuring. Two special cases, SSC and SMC , deserve special emphasis: A scientific theory TSci is intuitive for a member of the scientific community SSC if and only if the structuring of the object of cognition O occurred in accordance with those transcendental forms of structuring, which have generally already been developed and understood. Intuitiveness will refer less and less to a phenomenally oriented sensible intuition. Instead, it will shift towards a structurally oriented rational intuition and thus aim at the intuitiveness of definitions, axioms and principles. The theory of relativity and quantum theory are considered unintuitive because they come into conflict with a view of nature that is oriented along the lines of classical physics. 
For an SMC capable of mesocosmic structuring, a scientific theory TSci may be intuitive to varying degrees: directly intuitive theories (intuitive illustration: not necessary), indirectly intuitive theories (intuitive illustration: necessary and possible) and non-intuitive theories (intuitive illustration: necessary, but only partially possible). How does a loss of
intuitiveness come about, and what types of facts require intuitive illustration? Mesocosmic structuring is directly intuitive. It refers primarily to sensibly perceptible phenomena and presupposes an egocentric perspective. Mesocosmic structuring employs a fixed frame of reference and is limited to simple cause and effect connections. It uses ordinary language and remains largely unconscious, although nevertheless relevant with regard to expectation and behavior. All scientific theories TSci require an interpretation of their concepts and structures that provides an intuitive understanding of what the formalism expresses. Physical theories are considered unintuitive, since they use a specialized language with abstract conceptualization, idealizations and transitions from the finite to the infinite. They represent their results in mathematical formulas and use phenomena artificially produced through experiments and technical devices for verifying their statements. What needs to be intuitively illustrated are the dynamic processes (exponential and hyperbolic developments), causal structures (cause and effect chains, feedback effects, the violation of the principle of strong causality) and the God’s eye perspective (break with the egocentric perspective). The space-time structures of the special and general theories of relativity (speed of light as limit, the velocity-addition theorem), the properties of matter (incompatible properties of matter as expressed in the wave-particle dualism and in the tunnel effect) and the causal structures of quantum theory (absence of a cause in radioactive decay) are unintuitive in a special but completely different sense. How can facts be intuitively illustrated? What representational means of intuitive illustration are required?
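The conflict between the velocity-addition theorem and mesocosmic expectation can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the chosen velocities are assumptions made here, and it covers only the special case of two collinear velocities.

```python
# Speed of light in vacuum in m/s (exact by SI definition).
C = 299_792_458.0


def add_classical(u: float, v: float) -> float:
    """Galilean (mesocosmic) addition of two collinear velocities."""
    return u + v


def add_relativistic(u: float, v: float, c: float = C) -> float:
    """Special-relativistic addition of two collinear velocities."""
    return (u + v) / (1.0 + (u * v) / (c * c))


if __name__ == "__main__":
    u = v = 0.8 * C  # illustrative values, not taken from the text
    print("classical:   ", add_classical(u, v) / C, "c")     # exceeds 1
    print("relativistic:", add_relativistic(u, v) / C, "c")  # stays below 1
```

For two velocities of 0.8 c the classical sum exceeds the speed of light, whereas the relativistic result is roughly 0.976 c, which shows in what sense the speed of light functions as a limit.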
The traditional approach to intuitive illustration usually proceeds along two paths: (α) The results are rendered intuitive by reducing them to their elements, i.e., the results are simplified such that only a few properties or structures remain. (β) The results are intuitively illustrated through visualization, i.e., they are spatialized and represented as geometric figures. These methods are well suited for the illustration of indirectly intuitive theories. For non-intuitive theories with their completely different structures, however, they do not yield satisfactory results. Consequently, two types of illustration must be distinguished. (i) The intuitive illustration of indirectly intuitive theories: The intuitive illustration is achieved through a shift in the type of structuring within the same transcendental forms of structuring. The missing intuitive structuring at the phenomenal level is substituted with an intuitive structuring at the level of representation. The most important type of representation
is geometric representation. Its main function is to visualize time and causality. This type of illustration includes changes in scale that are either proportionate or not proportionate, the change from the egocentric to the god’s eye perspective and the choice of an arbitrary frame of reference. Time is illustrated by means of a line, the object by means of a point and causality by means of a vector. Functional dependencies are represented graphically. (ii) The illustration of non-intuitive theories: This type of intuitive illustration is limited and possible only by means of an analogization of the relevant structuring with recourse to simpler transcendental forms of structuring. This presupposes a familiarity with the types of structuring used in a different domain. The theory of relativity and quantum theory are considered non-intuitive because they come into conflict with the transcendental forms of structuring oriented along the lines of classical physics. This leads to problems, because it leaves us only with a partial illustration in the sense of an analogization: (α) A partial intuitive illustration is possible through a reduction of the spatial dimensions (two-dimensional planes, Minkowski diagrams). In terms of a mathematical formalism, the general theory of relativity shows that the space-time structure is a four-dimensional Riemann space. The limitation to a quasi-spatial hyperplane leads to a curved three-dimensional space. Newtonians find the theory of relativity non-intuitive because their transcendental forms of structuring contain a three-dimensional Euclidean space. If a Newtonian wanted to illustrate the new geometric relations of the theory of relativity in a geometric representation, she would have to construct concrete figures in a curved space. Here, we are obviously up against the limits of sensible-intuitive theory interpretation. 
Nevertheless, even in this case sensible intuition retains an important function, if the geometric relations are analyzed on arbitrarily curved two-dimensional planes. (β) A partial illustration is possible by means of “impossible pictures” (wave-particle dualism). In terms of its mathematical formalism, quantum theory shows that microphysical objects feature incompatible properties of matter. The assessment of this incompatibility refers to the transcendental forms of structuring of classical physics. The incompatible aspects can be interpreted either by means of Escher’s “impossible pictures” or in terms of Vollmer’s projection model of cognition.11 In thought, Vollmer projects a “tin can” in two different directions and obtains — depending on the projection — the two incompatible geometric figures: the parallelogram and the ellipse. The analogy consists in the
fact that the microphysical object corresponds to the “tin can” and appears in measurement either as a classical particle or as a classical wave. The intuitive explanation of theory construction raises a number of problems and thus points to the limits of intuitive cognition. The explanations rest on plausible assumptions in intuition. This leads to problems (α) of unequivocalness, (β) of consistency and (γ) of adequacy. (i) The problem of unequivocalness: The problem of unequivocalness has been recognized and formulated with different emphasis in philosophy, in the philosophy of science and in physics. Kuhn formulated the problem very clearly as a disambiguation of structuring. It is possible to interpret the domain of phenomena in several ways, in the sense of a Gestalt switch. There is thus an analogy between the folding figures of Gestalt psychology and the Kuhnian paradigm change in science. In the case of the folding figures, parts of the figures may change the specific role they play in the system as a whole. Analogously, one and the same phenomenon is interpreted differently in different paradigmatic contexts. Alternative implementations of structure are possible through a different choice of basic concepts and principles. Structuring is not unequivocal in the sense that it is always possible to achieve a different structuring by means of other basic concepts and principles. The supplementation of missing information is always possible in several ways. Their arbitrariness can perhaps be reduced through plausible arguments; yet it still remains speculative. (ii) The problem of consistency: Sensible intuition provides the impetus and guidance to the formation of concepts. The basic concepts are initially formed intuitively with reference to concrete spatiotemporal phenomena but must later be refined gradually through mathematical methods. The result is a strict definition that allows for strict inferences. 
The question of how basic mathematical and physical concepts can be developed can no longer be answered merely with reference to construction. The investigations into this issue must become more subtle, since intuition can lead to contradictions (this is shown in 5.1 with the example of the concept of a continuum). (iii) The problem of adequacy: Every theory can contain hidden assumptions and tacit presuppositions, the adequacy of which must first be examined. Euclidean geometry, which conducts its proofs by means of the construction of figures, presupposes that figures in space preserve their shape when moved and that their congruence can be determined. The
relations in generally curved spaces show that these unquestioned assumptions apply only to spaces with a constant measure of curvature. Only an analytically formulated geometry is able to uncover this fact, since it defines intrinsic quantities that are suitable for describing spatial relations through numerical relations. Hilbert calls the investigation into the tacit presuppositions of a theory the regressive task of mathematics. To the extent that progress in knowledge refers to a deepening of the foundations of a theory in Hilbert’s sense, this signifies the idea of making the concepts and axioms of a scientific theory more precise and structurally richer. This always includes transitions from the finite to the infinite, the possibility of which cannot be demonstrated either with empirical or with logical force. Such a transition only becomes intuitive through the reference to an intuitive cognition that refers to abstract structure (as shown in 5.2 with the example of complete induction). Axiomatic theories based on abstract calculi, which define their semantics via model-theoretic approaches, have the advantage that they do not require any explicit definitions of basic concepts and hence have to rely much less on sensible-intuitive cognition. They have the disadvantage, however, that they can only be formulated at the end of a long theoretical development.
5. An example from mathematics: Poincaré’s concept of intuition
In the first chapter of his book The Value of Science, entitled “Intuition and Logic in Mathematics,” Poincaré analyzes the concept, the functions and the limits of intuition in mathematics. Poincaré begins by discussing two fundamentally distinct methodological approaches: (α) the method of the analytic mathematicians, which is guided by logic and proceeds step by step and (β) the method of the geometers, which makes use of intuition and “at one fell swoop makes great conquests that are, however, not always reliable”. In terms of neuroscience, this corresponds to the distinction between sequential and parallel information processing. Poincaré, however, rightly emphasizes the fact that even the analytical method does not rest on logic alone, but requires intuition as well. From his perspective, intuition is the tool of invention.
We believe that in our reasonings we no longer appeal to intuition; the philosophers will tell us this is an illusion. Pure logic could never lead us to anything but tautologies; it could create nothing new; not from it alone can any science issue. In one sense these philosophers are right; to make arithmetic, as to make geometry, or to make any science, something else than pure logic is necessary. To designate this something else we have no word other than intuition. But how many different ideas are hidden under this same word?12
Poincaré speaks of different types of intuition and gives examples from logic, geometry and arithmetic: The basic concepts are formed in intuition. The axioms are considered intuitively clear, and the methods are justified with reference to intuition. His most important concepts are: (α) the intuition through the senses and the imagination (in other passages, Poincaré speaks of intuition of perception or sensible intuition), (β) the “intuition of the pure numbers” and (γ) complete induction, which is based on the abstract concept of a number and which includes the transition from the finite to the infinite.13
We have then many kinds of intuition; first, the appeal to the senses and the imagination; next, generalization by induction, copied, so to speak, from the procedures of the experimental sciences; finally, we have the intuition of pure number, . . . which is able to create the real mathematical reasoning. . . . Now in the analysis . . . there can be nothing but syllogisms or appeals to this intuition of pure number, the only intuition which can not deceive us.14
The first concept refers to phenomenally oriented sensible intuition pertaining to concrete geometric figures. The second concept refers to structurally oriented rational intuition aimed at an abstract concept of number. This concept of number is removed from all of its relations to spatiotemporal phenomena and hence can only be grasped as an ideal construct. The third concept also refers to structurally oriented rational intuition inasmuch as complete induction is based on the abstract concept of number. In addition, this version also contains a methodological aspect. Poincaré clearly points to the indispensable role of intuition, to its functions as well as to its limits, in connection with theory construction on the one hand and theory interpretation on the other. Here it is above all the “unity of a proof,” the holistic grasp of the facts that makes it possible “to see the goal already from a distance” and “to choose the path”. The intuitive insight into the unity of a proof enables the researcher to choose from among the innumerable possible logical inferences those that will yield the entire proof. One who only knows the individual steps in a mathematical proof but not the unity of the proof, writes Poincaré, resembles a naturalist who observes an elephant only under a microscope. The intuitive insight into the unity of a proof thus initially provides the guideline for the discovery of the progression of the proof but later also for the understanding of this progression itself. Intuition therefore fulfills a dual function: On the one hand it is required for invention and on the other hand for understanding.
The logician cuts up, so to speak, each demonstration into a very great number of elementary operations; when we have examined these operations one after the other and ascertained that each is correct, are we to think we have grasped the real meaning of the demonstration? . . .
Evidently not; we shall not possess the entire reality; that I know not what which
makes the unity of the demonstration will completely elude us.15 This shows us that logic is not enough; that the science of demonstration is not all science and that intuition must retain its role as complement . . . .16 We need a faculty which makes us see the end from afar, and intuition is this faculty. It is necessary for the explorer for choosing his route; it is not less so to the one following his trail who wants to know why he chose it.17
A further basic function of intuition consists in the fact that it provides the initial guidance to the formation of concepts. Thus, the basic concepts are at first formed sensibly-intuitively but must later be refined gradually through mathematical methods. The result is a strict definition that allows for strict inferences. The reason for the lack of certainty of sensible intuition lies in the fact that it is based on the senses or on the power of imagination and thus can deliver “only a crude image” and “not a precise idea”. It was not slow in being noticed that rigor could not be introduced in the reasoning unless first made to enter into the definitions. For the most part the objects treated of by mathematicians were long ill defined; they were supposed to be known because represented by means of the senses or the imagination; but one had only a crude image of them and not a precise idea on which reasoning could take hold.18 Intuition can not give us rigor, . . . .19
The question of how it is possible to form basic mathematical concepts can no longer be answered merely with reference to geometric construction. The investigations into this issue must become more subtle, because sensible intuition may lead to contradictions, thus revealing its limits.
5.1 Example: The concept of a continuum
The example of the concept of a continuum may illustrate the functions and limits of intuition. The first attempt at the formation of this concept closely follows sensible intuition. Poincaré considers a series of perceptions produced, for example, by the different weights A, B and C. If the differences in weight between A and B and between B and C are sufficiently small, they are possibly no longer perceptible, even though the weight A remains distinguishable from the weight C. Thus, to the extent that we only consider perception, these facts would be expressed by the following formulas: A = B, B = C, A < C. According to the laws of logic, this procedure leads into contradiction, thus revealing sensible intuition to be in need of correction. Even if crude sense perception is refined by means of measuring devices, the contradiction will at some point reappear.
But here is an intolerable disagreement with the law of contradiction, and the necessity of banishing this disagreement has compelled us to invent the mathematical continuum. We are therefore forced to conclude that this notion has
been created entirely by the mind, but it is experiment that has provided the opportunity. . . . Although we may use the most delicate methods, the rough results of our experiments will always present the characters of the physical continuum with the contradiction which is inherent in it. We only escape from it by incessantly intercalating new terms between the terms already distinguished, and this operation must be pursued indefinitely.20
Sensible intuition can only provide a partial view of reality, and that is the view relating to the finite. Sensible intuition therefore encounters its limits where it gets caught up in contradiction. Mechanisms of control and correction are required in order to resolve these contradictions. To sum up, the mind has the faculty of creating symbols, and it is thus that it has constructed the mathematical continuum, which is only a particular system of symbols. The only limit to its power is the necessity of avoiding all contradiction; but the mind only makes use of it when experiment gives a reason for it. In the case with which we are concerned, the reason is given by the idea of the physical continuum, drawn from the rough data of the senses. But this idea leads to a series of contradictions from each of which in turn we must be freed.21
The example of the concept of a continuum shows three things: (α) Concept formation requires phenomenally oriented sensible intuition. (β) Contradictions may appear that require a replacement of the intuitive concept formations with strict definitions through mathematical methods. (γ) These definitions are based on the transition from the finite to the infinite, which is established with reference to the structurally oriented rational intuition.
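The contradiction of the physical continuum can be reproduced in a toy model in which two weights count as perceptually equal whenever their difference lies below a threshold of discrimination. This is a minimal sketch under assumptions made here: the threshold model, the function names and the numerical values are illustrative and are not taken from the text.

```python
from itertools import product


def indistinguishable(a: float, b: float, jnd: float) -> bool:
    """Perceptual 'equality': the difference lies below the threshold jnd."""
    return abs(a - b) < jnd


def violations(weights, jnd):
    """Return all triples (a, b, c) with a = b and b = c for perception,
    although a and c are distinguishable."""
    return [
        (a, b, c)
        for a, b, c in product(weights, repeat=3)
        if indistinguishable(a, b, jnd)
        and indistinguishable(b, c, jnd)
        and not indistinguishable(a, c, jnd)
    ]


if __name__ == "__main__":
    coarse = [10.0, 11.0, 12.0]  # weights A, B, C in grams (illustrative)
    print(violations(coarse, jnd=1.5))   # non-empty: A = B, B = C, yet A differs from C
    print(violations(coarse, jnd=0.15))  # empty: a finer instrument separates A, B, C
    fine = [10.0, 10.1, 10.2]
    print(violations(fine, jnd=0.15))    # non-empty again on a finer scale
```

The third call illustrates the point that refining the instrument only displaces the contradiction: as long as the threshold is finite, sufficiently close weights again violate transitivity.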
5.2 Example: Complete induction
Poincaré stresses that complete induction is based on the intuition of pure numbers. Structurally oriented rational intuition hence provides the transition from the finite to the infinite.
We first show that a theorem is true for n = 1; we then show that if it is true for n − 1 it is true for n, and we conclude that it is true for all integers. . . . The essential characteristic of reasoning by recurrence is that it contains, condensed, so to speak, in a single formula, an infinite number of syllogisms. . . . But however far we went we should never reach the general theorem applicable to all numbers, which alone is the object of science. To reach it we should require an infinite number of syllogisms, and we should have to cross an abyss which the patience of the analyst, restricted to the resources of formal logic, will never succeed in crossing. . . . to prove even the smallest theorem he must use reasoning by recurrence, for that is the only instrument which enables us to pass from the finite to the infinite. . . . This rule, inaccessible to analytical proof and to experiment, is the exact type of the a priori synthetic intuition.22
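The rule described in the quotation can be written out schematically and applied to a standard example. The following is a sketch only: the predicate symbol P and the choice of the summation formula are illustrative assumptions made here and are not taken from the quoted passage.

```latex
% Schema of complete induction, in the form used in the quotation
% (base case n = 1, step from n - 1 to n):
\[
  \Bigl[\, P(1) \;\wedge\; \forall n \ge 2 \,\bigl( P(n-1) \Rightarrow P(n) \bigr) \Bigr]
  \;\Longrightarrow\; \forall n \ge 1 \; P(n).
\]
% Worked instance, with P(n) the statement  1 + 2 + ... + n = n(n+1)/2.
% Base case:
\[
  1 \;=\; \frac{1 \cdot 2}{2}.
\]
% Inductive step, assuming P(n-1):
\[
  1 + 2 + \dots + n
  \;=\; \frac{(n-1)\,n}{2} + n
  \;=\; \frac{n\,(n-1+2)}{2}
  \;=\; \frac{n\,(n+1)}{2}.
\]
```

Each particular case P(1), P(2), P(3), . . . follows by finitely many syllogisms; only the schema itself licenses the conclusion for all n, which is the transition from the finite to the infinite at issue here.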
Like Kant before him, Poincaré raises the question concerning the possibility of mathematics. Mathematics has two characteristic features: (α) It is distinguished by the universality and necessity of its propositions. (β) It is distinguished by an expansion of knowledge. With regard to these two features, mathematics cannot be classified, according to Kant and Poincaré, either as analytic a priori or as synthetic a posteriori. Poincaré agrees with Kant to the extent that he also attributes great importance to construction, which has helped mathematics enter upon the secure path of a science. In contrast to Kant, however, he believes that although this feature is necessary, it is not sufficient.
Mathematicians therefore proceed “by construction,” they “construct” more complicated combinations. When they analyse these combinations, these aggregates, so to speak, into their primitive elements, they see the relations of the elements and deduce the relations of the aggregates themselves. . . . Great importance has been rightly attached to this process of “construction,” and some claim to see in it the necessary and sufficient condition of the progress of the exact sciences. Necessary, no doubt, but not sufficient! . . . We can only ascend by mathematical induction, for from it alone can we learn something new. Without the aid of this induction, which in certain respects differs from, but is as fruitful as, physical induction, construction would be powerless to create science.23
Thus there are two features that help mathematics enter upon the secure path of a science: (α) construction and (β) complete induction. As far as constructions in Euclidean geometry are concerned, phenomenally oriented sensible intuition is sufficient. When it concerns mathematics in general, however, science also requires structurally oriented rational intuition.
Notes
1. McCloskey (1983).
2. Vollmer (1988), Engels (1989).
3. Piaget (1970), Case (1985), Fetz (1988).
4. Metzger (1975), Metzger (1986).
5. Kandel / Schwartz / Jessell (1995), Churchland (1996), Hedrich (1998), Spitzer (2000).
6. Ritter / Martinetz / Schulten (1990), Dorffner (1991).
7. The distinction between a cognitive subject and an object of cognition is only intended as a heuristic device for the analysis of the cognitive process and is not supposed to establish a dualist world view.
8. The analogy of purchasing an automobile may illustrate the point. Automobiles must be equipped with a minimal set of features merely in order to be roadworthy. In addition, there are various luxury features corresponding to the individual preferences of drivers. Mesocosmic structuring corresponds to the standard equipment, allowing human beings to act so as to ensure their survival. Scientific structuring would then be luxury equipment.
9. Falkenburg shows that Kant’s forms of intuition and categories of thought do not force the acceptance of Euclidean geometry and Newtonian physics (Falkenburg (2002), this volume pp. 267–292).
10. Dorffner (1991). Sensible cognition corresponds to pure sensory input.
11. Vollmer (1988).
12. Poincaré (1958), p. 19.
13. Folina (1994).
14. Poincaré (1958), p. 20.
15. Poincaré (1958), p. 21f.
16. Poincaré (1958), p. 21.
17. Poincaré (1958), p. 22.
18. Poincaré (1958), p. 18.
19. Poincaré (1958), p. 17.
20. Poincaré (1952), p. 22f.
21. Poincaré (1952), p. 27.
22. Poincaré (1952), pp. 9–13.
23. Poincaré (1952), p. 15f.
References
Case, R. (1985), Intellectual Development, Academic Press, New York.
Churchland, P. M. (1996), The Engine of Reason, The Seat of the Soul. A Philosophical Journey into the Brain, MIT Press, Cambridge.
Dorffner, G. (1991), Konnektionismus. Von neuronalen Netzwerken zu einer “natürlichen” KI, Teubner, Stuttgart.
Engels, E. M. (1989), Erkenntnis als Anpassung?, Suhrkamp, Frankfurt.
Falkenburg, B. (2002), “Functions of Intuition in Quantum Physics”, this volume, 267–292.
Fetz, R. L. (1988), Struktur und Genese. Jean Piagets Transformation der Philosophie, Haupt, Bern.
Folina, J. (1994), “Logic and Intuition in Poincaré’s Philosophy of Mathematics” in: Greffe, J. L., G. Heinzmann, K. Lorenz (1994), 417–434.
Greffe, J. L., G. Heinzmann, K. Lorenz (1994) (eds.), Henri Poincaré. Science et Philosophie. Science and Philosophy. Wissenschaft und Philosophie, Akademie, Berlin.
Hedrich, R. (1998), Erkenntnis und Gehirn. Realität und phänomenale Welten innerhalb einer naturalistisch-synthetischen Erkenntnistheorie, Mentis, Paderborn.
Kandel, E. R., J. H. Schwartz, T. M. Jessell (1995), Essentials of Neural Science and Behavior, Appleton & Lange, Norwalk, Connecticut.
McCloskey, M. (1983), “Intuitive Physics” in: Scientific American 248, 122–130.
Metzger, W. (1975), Gesetze des Sehens, Kramer, Frankfurt.
Metzger, W. (1986), Gestalt-Psychologie, Kramer, Frankfurt.
Piaget, J. (1970), Genetic Epistemology, Columbia University Press, New York.
Poincaré, H. (1952), Science and Hypothesis, Dover Publications, New York.
Poincaré, H. (1958), The Value of Science, unabridged and unaltered republication of the first English translation, Dover Publications, New York.
Ritter, H., T. Martinetz, K. Schulten (1990), Neuronale Netze: Eine Einführung in die Neuroinformatik selbstorganisierender Netzwerke, Addison-Wesley, Bonn.
Spitzer, M. (2000), Geist im Netz. Modelle für Lernen, Denken und Handeln, Spektrum Akademischer Verlag, Heidelberg.
Vollmer, G. (1988), Was können wir wissen?, Vol. I and II, Hirzel, Stuttgart.