Representation and Productive Ambiguity in Mathematics and the Sciences
Emily R. Grosholz
Great Clarendon Street, Oxford ox2 6dp Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries Published in the United States by Oxford University Press Inc., New York Emily R. Grosholz The moral rights of the authors have been asserted Database right Oxford University Press (maker) First published 2007 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer British Library Cataloguing in Publication Data Data available Library of Congress Cataloging in Publication Data Data available Typeset by Laserwords Private Limited, Chennai, India Printed in Great Britain on acid-free paper by Biddles Ltd., King’s Lynn, Norfolk ISBN 978–0–19–929973–7 10 9 8 7 6 5 4 3 2 1
This book is dedicated to Donald Gillies and Grazia Ietto-Gillies, and to Carlo Cellucci and Mirella Capozzi, with affection and respect.
Contents

Preface
Acknowledgments

Part I: Introductory Chapters
1. Productive Ambiguity: Galileo contra Carnap
   1.1. Galileo’s Demonstration of Projectile Motion
   1.2. Carnap on Language and Thought
   1.3. From the Syntactic to the Semantic to the Pragmatic Approach
   1.4. A Pragmatic Account of Berzelian Formulas
2. Analysis and Experience
   2.1. Analysis
   2.2. Mathematical Experience

Part II: Chemistry and Geometry
3. Bioorganic Chemistry and Biology
   3.1. What Lies Between Representing and Intervening
   3.2. The Reduction of a Biological Item to a Chemical Item
   3.3. Formulating the Problem
   3.4. Constructing the Antibody Mimic
   3.5. Testing the Antibody Mimic
   3.6. Conclusions
4. Genetics and Molecular Biology
   4.1. Objections to Hempel’s Model of Theory Reduction
   4.2. The Transposition of Genes: McClintock and Fedoroff
   4.3. McClintock’s Studies of Maize
   4.4. J. D. Watson’s Textbook
   4.5. Fedoroff’s Translation of McClintock
   4.6. The Future of Molecular Biology
5. Chemistry, Quantum Mechanics, and Group Theory
   5.1. Symbols, Icons, and Iconicity
   5.2. Representation Theory
   5.3. Molecules, Symmetry, and Groups
   5.4. Symmetry Groups, Representations, and Character Tables
   5.5. The Benzene Ring and Carbocyclic Systems
   5.6. Measuring Delocalization Energy in the Benzene Molecule

Part III: Geometry and Seventeenth Century Mechanics
6. Descartes’s Geometry
   6.1. Locke’s Criticism of Syllogistic
   6.2. Descartes’ Geometry as the Exemplar of Cartesian Method
   6.3. Diagrams as Procedures
   6.4. Generalization to the Construction of a Locus
   6.5. Generalization to Higher Algebraic Curves
7. Newton’s Principia
   7.1. Philip Kitcher on History
   7.2. Jean Cavaillès on History
   7.3. Book I, Propositions I and VI in Newton’s Principia
   7.4. Book I, Proposition XI in Newton’s Principia
8. Leibniz on Transcendental Curves
   8.1. The Principle of Continuity
   8.2. Studies for the Infinitesimal Calculus
   8.3. The Principle of Perfection
   8.4. The Isochrone and the Tractrix
   8.5. The Catenary or La Chainette

Part IV: Geometry and Twentieth Century Topology
9. Geometry, Algebra, and Topology
   9.1. Vuillemin on the Relation of Mathematics and Philosophy
   9.2. Euclid’s Elements and Descartes’ Geometry Revisited
   9.3. Kant’s Transcendental Aesthetic: Extrinsic and Intrinsic Intuition
   9.4. The First Pages of Singer and Thorpe
   9.5. De Rham’s Theorem
   9.6. Nancy Cartwright on the Abstract and Concrete
10. Logic and Topology
   10.1. Penelope Maddy on Set Theory
   10.2. A Brief Reconsideration of Arithmetic
   10.3. The Application of Logic to General Topology
   10.4. Logical Hierarchies and the Borel Hierarchy
   10.5. Model Theory and Topological Logics
   10.6. Coda

List of Illustrations
Glossary
Bibliography
   Books
   Articles
Index
Preface

My philosophical account of representation and productive ambiguity in this book revisits a series of demonstrations that exhibit the special kind of cogency we find in mathematics, as well as some scientific demonstrations in which empirical evidence may also intervene: Galileo’s proof that the trajectory of projectile motion is parabolic, the construction of an antibody mimic, a description at the molecular level of the maize Spm transposon, the formal construction of molecular orbitals for benzene, Descartes’ solution for Pappus’s problem, Newton’s proof of the inverse square law, Leibniz’s construction of several transcendental curves, a proof of De Rham’s theorem, Stone’s Representation Theorem and Gödel’s completeness and incompleteness theorems. In so doing, it treats a number of highly reductive methods: the geometrization of mechanics, mapping of nucleotide sequences, computations of LCAO-MO theory, the algebraization of geometry, the infinitesimal calculus, the use of group theory in algebraic topology and chemistry, and the deployment of logic in number theory and topology. I do not refer to the individual psychology of the mathematicians and scientists whose work I study, nor to the peculiar political and cultural pressures that shape the social institutions where mathematicians and scientists must labor. My intention in this book is to do full justice to the special rigor of mathematical and scientific reasoning, but I insist that such rationality is more inclusive and multifarious than philosophers of mathematics have admitted during the past century. Demonstration is not only a matter of logic; it is also the deployment of a variety of modes of representation in the service of problem solving (certain problems and not others, by certain means and not others) and rational persuasion.
Someone must persuade other people that a problem about intelligible objects has been solved, given the representations at hand and standards of proof appropriate to them. Rational persuasion is thus historically located, and context dependent. To judge how a demonstration works and why it is persuasive, we must take into account its pragmatic as well as its semantic and syntactic dimensions; but the goal is still truth, and
the intelligible objects still present their stubborn indifference to the whims of mathematicians and the policies of kings. And my case studies show that reductive methods are successful at problem-solving not because they eliminate modes of representation, but because they multiply and juxtapose them; and this often creates what I call productive ambiguity. My favorite part of mathematics is geometry. One of my aims in writing this book is to defend the rights, rationality, and irreducibility of geometry within mathematics, as well as the primacy or canonicity of certain geometrical objects, like the Euclidean line, right triangle and circle. Over the years, my defense of geometry has led me to other themes, which again return me to geometry. Inspired by Leibniz, I define ‘analysis’ as the search for conditions of intelligibility and as the essential enterprise of mathematics; discovery and justification are then two aspects of one rational way of proceeding, and the source of the mathematician’s formal experience. I defend the importance of iconic, as well as symbolic and indexical, signs in mathematical representation, and the indispensability of pragmatic, as well as syntactic and semantic, considerations to the study of mathematical reasoning. I re-examine what is meant by experience in mathematical reasoning, and criticize the notion of ‘intuition’ as we find it in Plato, Descartes, Kant, and Brouwer. The philosophical account so generated has led me to a close examination of mathematical notation and diagrams, and the way results are presented on the page in mathematical (and biological, chemical, and mechanical) texts. When two or more traditions combine in the service of problem-solving, we find juxtaposed and superimposed notations and diagrams, as well as notations and diagrams that are subtly altered from their earlier uses, surrounded by prose in natural language that tries to explain how to use them in combination. 
Viewed this way, the texts often yield striking examples of language and notation that are irreducibly ambiguous and productive because they are ambiguous. Mathematical reality (as ‘given’ and independent of human knowing) is determinate but infinite. As determinate, it lends itself admirably to discourse; as infinite, it does not. Our notations are paper tools—to borrow Ursula Klein’s useful term—that represent the infinitary in finitary terms by eliciting periodicity and (more generally) repetition from mathematical things; by assigning limits or constraints; by articulating continua; and simply by negating, by excluding. They also suggest, and are suggested by, paradigmatic items and problems that focus mathematical activity on one
thing rather than another. Mathematical notation is selective, and it makes things ‘compact.’ It transforms the infinite so that it becomes tractable and visible; to reverse Donne’s famous line, it makes one Everywhere a little room. This way of talking about notation should remind the reader of the emphasis on models and/or ‘nomological machines’ in the writings of philosophers like Ian Hacking, Nancy Cartwright, Kenneth Schaffner, and Bas van Fraassen; but my study of mathematical texts always pushes me beyond the semantic into the pragmatic, because we can’t know what and how the notation and diagrams represent unless we understand the problem-solving context. Notation and diagrams represent mathematical items, but their reference and meaning must be carefully explained; the simple appeal to metalanguage and object language is insufficient, as the attempt to express mathematics in any single idiom is ill-judged and unrealistic. Moreover, meaning often goes beyond the intent of mathematician or textbook author, and may well generate novelties: a notation is never merely material, and never wholly circumscribed by its initial use, but takes on a life of its own as a formal reality. Thus in ways unintended by their makers, Arabic numerals make certain concepts essential to modern number theory visible; the recreation of Cossist notation by Vieta and Descartes as part of analytic geometry begins the study of polynomials; and the idiom of modern logic recast by Frege and Russell generates the study of recursive functions and (in conjunction with nineteenth century analysis) sets. Mathematical items are not to be confused with notation or diagrams; notation and diagrams never exhaust or replace the mathematical items they are designed to represent, even when those items are idealized versions of notation, that is, even when they are precipitated by notation.
However, mathematics cannot be carried on without good notation and diagrams, and the study of mathematical rationality cannot dispense with the study of representations, including the question of when and why they are good. And even a good notation or diagram has its limitations; no representation is ever perfectly expressive, for if it were it would not be a representation but the thing itself. Some of these insights into the role of representations can be argued with special clarity by first looking at organic chemistry (where chemistry reduces biology) and molecular biology (where molecular biology reduces genetics), and then at the role geometry plays in chemistry, in particular
the use of symmetry groups in the study of molecules (where quantum mechanics reduces chemistry). The philosopher of mathematics might object that chemistry is impertinent, since it has a ‘real’ subject matter that is studied in the laboratory. But the situation is not so simple. The purified sludges and vapors that chemists capture in their test tubes are studied as a conduit to a subject matter (molecules) that is strongly inaccessible: 10²³ orders of magnitude below the world we live in, where stable structures never exist in isolation and never stop oscillating and transforming. We can no more set an oxygen molecule on the table in order to examine it than we can a right triangle. So in Part II, I introduce an excursion into chemistry, where the existence of the things studied is problematic but not now contested; and I show that the same arguments may be transferred to the intelligible things of mathematics.
Acknowledgments

I would like to thank the following people, journals, and publishing houses for their kindness in granting permission to reproduce the following items, as indicated:

Wiley-VCH Verlag GmbH & Co KG for permission to reproduce Diagrams 1–4 and Figures 1–4 in A. Hamilton, Y. Hamuro, M. C. Calama, and H. S. Park, ‘A Calixarene with Four Peptide Loops: An Antibody Mimic for Recognition of Protein Surfaces,’ Angewandte Chemie, International English Edition 36 (23) (Dec. 1997), 2680–3. Professor A. Hamilton for permission to reproduce the images just cited.

Cold Spring Harbor Laboratory Press for permission to reproduce Figure 1 on p. 1196 of N. V. Fedoroff and D. D. Brown, ‘The Nucleotide Sequence of the Repeating Unit in the Oocyte 5S Ribosomal DNA of Xenopus laevis,’ Cold Spring Harbor Symposia on Quantitative Biology, 42 (1977), 1195–200.

Elsevier, Global Rights, for permission to reproduce Figure 4 on p. 704 and Figure 11 on p. 712 in N. V. Fedoroff and D. D. Brown, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-Rich Spacer,’ Cell, 13 (1978), 701–16.

John Wiley & Sons, Inc., Global Rights Department, for permission to reproduce Figure 1 on p. 294 and Figure 2 on p. 296 in N. Fedoroff, M. Schläppi, and R. Raina, ‘Epigenetic Regulation of the Maize Spm Transposon,’ BioEssays, 17(4) (1994), 291–7. Professor N. Fedoroff for permission to reproduce the images just cited.

Cold Spring Harbor Laboratory Press for permission to reproduce Figure 8 on p. 21 of B. McClintock, ‘Chromosome Organization and Genic Expression,’ Cold Spring Harbor Symposia on Quantitative Biology (1951 / 1952), 13–47.

The Genetics Society of America for permission to reproduce Table 4 on p. 591 and Table 5 on p. 594, in B. McClintock, ‘Induction of instability at selected loci in maize,’ Genetics, 38 (1953), 579–99.
Elsevier, Global Rights, for permission to reproduce Plate 1, facing p. 446, in F. Sanger and A. R. Coulson, ‘A Rapid Method for Determining Sequences in DNA by Primed Synthesis with DNA Polymerase,’ Journal of Molecular Biology, 94 (1975), 441–8.

Elsevier, Global Rights, for permission to reproduce Figure 8.20 on p. 184, and Figure 11.27 on p. 289 in F. Brescia, J. Arents, H. Meislich, and A. Turk, Fundamentals of Chemistry: A Modern Introduction (New York: Academic Press, 1966).

American Physical Society for permission to reproduce Figure 8.20 just cited, as it originally appeared in Physical Review, 37 (1931), 1416, in H. E. White, ‘Pictorial Representations of the Electron Cloud for Hydrogen-like Atoms.’

John Wiley & Sons, Inc., Global Rights Department, for permission to reproduce Figure 3.2 on p. 25, the six schemata on p. 158, and all of pp. 146 and 147 in F. Albert Cotton, Chemical Applications of Group Theory, 3rd edn (New York: John Wiley & Sons, 1990). Professor F. A. Cotton for permission to reproduce the figures and pages just cited.

Springer Science and Business Media for permission to reproduce the diagram on p. 299 of ‘On the Representation of Curves in Descartes’ Géométrie,’ by H. J. M. Bos in Archive for History of Exact Sciences, 24 (1981), 295–338.

The Georg Olms Verlag AG for permission to reproduce Figures 119, 120, 121, and 139 from the folding pages of plates at the end of G. W. Leibniz, Mathematische Schriften, ed. C. I. Gerhardt, Vol. 5 (Hildesheim: Georg Olms, 1962).

Springer Science and Business Media for permission to reproduce Figure 1.1 on p. 3, Figure 1.2 on p. 6, Figure 5.18 on p. 152, and Figure 6.4 on p. 159, as well as pp. 162–4, from I. M. Singer and J. A. Thorpe, Lecture Notes on Elementary Topology and Geometry (New York: Springer, 1967). Professor I. M. Singer for permission to reproduce the figures and pages just cited.
My thanks go to Sandra Stelts, Curator of Rare Books and Manuscripts, Pattee Library and Paterno Library, The Pennsylvania State University, for allowing me to photograph images from R. Descartes, Geometria (Amsterdam, 1683), and I. Newton, Philosophiae naturalis principia mathematica,
(London 1714); and to the staff of Interlibrary Loan in the same libraries, and to the University of Chicago Libraries, for helping me to borrow and photograph images from Vol. 8 of G. Galilei, Opere, Edizione Nazionale (1890–1909). I would also like to thank Karine Chemla, Director, and the other members of REHSEIS (Equipe Recherches Epistémologiques et Historiques sur les Sciences Exactes et les Institutions Scientifiques), Université Paris 7 Denis Diderot et Centre National de la Recherche Scientifique, for providing me with a stimulating and encouraging work environment during my last sabbatical year, 2004–05. I am happy to be a corresponding member of this research group. I owe a debt of gratitude as well to Elhanan Yakira at Hebrew University and Roald Hoffmann at Cornell University for on-going conversations that have left a deep imprint on this book. I would like to thank four successive heads of the Department of Philosophy (Mitchell Aboulafia, Nancy Tuana, John Christman, and Shannon Sullivan) and the Dean of the College of the Liberal Arts, Susan Welch, at the Pennsylvania State University, for their support of my sabbatical year in Paris and the research links that have followed from it. And I would like to thank as well the National Endowment for the Humanities for granting me a fellowship during 2004 that allowed me to spend the whole year in Paris, as well as the American Council of Learned Societies for granting me a fellowship in 1997 that allowed me to spend a whole year at the University of Cambridge, where this book first took shape. The exuberance of my year in Paris is attested by the following forthcoming or very recent publications, which have not made their way into the footnotes, but which should be acknowledged here, with thanks to the various editors; because they constitute the last stages of my thinking progress up to the completion of this book, they overlap with various pages in it.
‘Constructive Ambiguity in Mathematical Reasoning,’ Mathematical Reasoning and Heuristics, eds. D. Gillies and C. Cellucci (London: King’s College Publications, 2005), 1–23. ‘Productive Ambiguity in Leibniz’s Infinitesimal Calculus,’ Leibniz and Infinitesimals, ed. U. Goldenbaum (Princeton: Princeton University Press), forthcoming.
‘Leibniz on Mathematics and Representation: Knowledge through the Integration of Irreducible Diversity,’ The Young Leibniz, ed. M. Kulstad, Studia Leibnitiana (Sonderheft, forthcoming).
‘How to Say Truth Things in Algebraic Topology,’ Demonstrative and Nondemonstrative Reasoning in Mathematics and Natural Science, ed. P. Pecere (Cassino: Edizioni dell’ Università degli Studi di Cassino, 2006, 27–54).
‘Locke and Leibniz on Form and Experience,’ Locke and Leibniz, eds. M. de Gaudemar and P. Hamou, (Editions CNRS, forthcoming).
‘Form and Experience: Leibniz and Hume on the Correction of Knowledge,’ Leibniz: What Kind of Rationalist?, ed. M. Dascal, (Berlin: Springer Academic Publications, forthcoming).
‘Iconic and Symbolic Modes of Representation in Descartes’ Geometry,’ Festschrift for Henk Bos, eds. M. Panza and S. Maronne (London: King’s College Publications, forthcoming).

The dedication of this book testifies to twenty years of friendship, and an ever-increasing admiration as I watch my four friends grow into their respective projects, flourish as individuals and couples, and keep the many aspects of their lives in balance. I recommend Carlo Cellucci’s and Donald Gillies’ work in the footnotes of this book, but I would also like to recommend Mirella Capozzi’s Kant e la logica and Grazia Ietto-Gillies’ Transnational Corporations and International Production: Concepts, Theories, and Effects, each in its own field a magisterial work.

Finally, I would like to thank my husband, Robert R. Edwards, for his inspiration as a scholar, his love as a husband, and his support as a friend; and our children—Ben, Robbie, William, and Mary-Frances—for the warmth, laughter, and spirit of unpredictability they scatter around the house every day.
PART I
Introductory Chapters
1 Productive Ambiguity: Galileo contra Carnap

Argument that employs controlled and highly structured ambiguity can play a central role in mathematical discovery and justification. The exposition of projectile motion given on the Fourth Day, following preliminaries on the Third Day, of Galileo’s Discourses and Mathematical Demonstrations Concerning Two New Sciences¹ is a good illustration of my central claim. I propose to begin this book in medias res, with a case study, because Galileo’s proofs are canonical, paragons of mathematical and scientific demonstration that set the stage for the scientific revolution. If productive ambiguity is central to these proofs, we can expect it to be both commonplace and rational in many other settings. In the rest of the book, I will show that this is indeed the case; and I will draw out the philosophical consequences of this insight for epistemology, including a theory of mathematical knowledge.

¹ Galileo Galilei, Dialogues Concerning Two New Sciences, tr. Henry Crew and Alfonso de Salvio, (New York: Dover, 1914/1954); hereafter referred to as Discorsi.

In the analysis of free fall on the Third Day, the use of proportions is polyvalent because Galileo asks us to read their terms both as finite and as infinitesimal. When we read them as finite, they allow for the application of Euclidean results and also exhibit patterns among whole numbers; and their configurations stand iconically for geometrical figures. When we read them as infinitesimal, they allow for the elaboration of the beginnings of a dynamical theory of motion, leading to the work of Torricelli and Newton; and their configurations stand symbolically for dynamical, temporal processes. In the exposition of projectile motion, the curve of the semi-parabola, read iconically, stands for a temporal, dynamical process that we ‘see’ whenever a projectile leaves a trail behind it; read symbolically, it stands for an infinite-sided polygon that articulates the
rational relations among an infinite array of instances of uniform motion that compose the accelerated, curvilinear motion of the projectile. And the rationality of that reduction is justified by results involving proportions and the similarity of geometric figures.² Galileo’s use of ambiguous modes of representation is typical of reasoning in mathematics, even though the pattern has not been sufficiently noted and studied by philosophers of mathematics. Under the influence of the Vienna School, Anglophone philosophers often write as if the language of mathematics and science is, or ought to be, univocal and transparent; the second section of this chapter examines this thesis, which I contest, in the writings of Carnap. The terms of an ideal language, Carnap argues, should refer one-to-one to all and only those things that exist, and its predicates and relations should follow suit. Thus, its locutions should not refer to more than one state of affairs at a time, and should not add anything to the situation: there should be no linguistic ‘artifacts.’ In the third section, I locate my own position in the context of a general tendency in Anglophone philosophy of science and mathematics to move from a syntactic approach to a semantic and indeed pragmatic approach, which studies the use of language in terms of its representational role in an historical context of problem-solving. The ‘pragmatic’ philosophers find that problem-solving typically requires the juxtaposition of a variety of modes of representation; I emphasize that in such contexts a single mode of representation, used iconically for one purpose and symbolically for another, may be called upon to mean more than one thing. The resultant polysemy generates not confusion but insight. 
In the fourth section, and as a bridge to the second chapter, where I propose an account of analysis and mathematical experience that emerges from my reflection on Leibniz, Hume, Peirce, and the work of various contemporary philosophers and historians, I discuss a case study from the history of chemistry developed by Ursula Klein. This episode prefigures my own chemical and mathematical case studies, where I argue that the role of representation must be understood in its semantic and pragmatic as well as syntactic dimensions. Keeping all three dimensions of mathematical reasoning in view together, I can better address the issues of reduction, explanation, representation, and rationality.

² The philosophical use of the polar terms ‘icon’ and ‘symbol’ is due to C. S. Peirce, who distinguished the former as similar to their objects, and the latter as linked to their objects only by convention. In recent essays, I have made use of the distinction, while insisting on the iconic dimensions of symbols and the symbolic dimensions of icons. Peirce also uses the term ‘indexical’ which I note but make less use of. See C. S. Peirce, ‘On the Algebra of Logic: A Contribution to the Philosophy of Notation,’ The American Journal of Mathematics, 7 (2) (1885), 180–202; repr. in Collected Papers of Charles Sanders Peirce, eds. C. Hartshorne and P. Weiss (Cambridge: Harvard University Press, 1931–), Vol. 3, pars. 359–403.
1.1. Galileo’s Demonstration of Projectile Motion

Mathematics often requires the combination of different modes of representation in the same argument: equations, diagrams, matrices, tables, proportions, schemata, natural language. Arguments in mathematics do many things. They defend definitions, constitute problems, explain problem solutions, deploy and exhibit procedures or methods, and formally or informally present proofs. When modes of representation are combined in mathematical arguments, they may be juxtaposed or superimposed, or carefully segregated to exhibit certain features of the situation. Some arguments—and I claim this to be true of Galileo’s reasoning examined in this section—may require that one and the same representation be used ambiguously in order for the mathematician to exhibit a novel organization and exploration of things, and for the reader to follow the reasoning.

Galileo’s treatment of free fall and projectile motion occurs in the Third Day and Fourth Day of his Discourses and Mathematical Demonstrations Concerning Two New Sciences (referred to hereafter as the Discorsi). The Third Day of the Discorsi is entitled ‘Change of Position,’ and its first section is ‘Uniform Motion.’ Galileo defines uniform motion—straight line motion at a constant speed—as ‘one in which distances traversed by the moving particle during any equal intervals of time, are themselves equal,’ and adds that the equal intervals must be thought of as being arbitrarily chosen; he thus includes the possibility that they may be chosen to be arbitrarily small. The first diagram he offers, which accompanies Theorem I, Proposition I of ‘Uniform Motion,’ consists of two horizontal lines, the line IK representing time and the line GH representing distance that is re-conceptualized to mean displacement, since we are instructed to suppose that a moving particle is traversing it.³ (See Figure 1.1.)
The two lines therefore have a different status, since no particle traverses the time-line. Both lines are, however, measured off in intervals: the left-hand half of line IK is measured in intervals of length DE and the right-hand half in intervals of length EF, while the left-hand half of line GH is measured in intervals of length AB and the right-hand half in intervals of length BC.

³ Galileo, Discorsi, 153–6.
Figure 1.1. Galileo, Discorsi, Third Day, Uniform Motion, Theorem I, Proposition I
Theorem I, Proposition I states, ‘If a moving particle, carried uniformly at a constant speed, traverses two distances the time-intervals required are to each other in the ratio of these distances.’⁴ This theorem asserts that a (non-continuous, that is, without a shared middle term between the two ratios) proportionality AB : BC :: DE : EF holds between any two displacement intervals and any two corresponding time-intervals in uniform motion. Galileo has designed the diagrams and the reasoning to allow for a direct application of the Euclidean/Eudoxian axiom, which states that proportions between non-continuous ratios, Q : R :: S : T, can be formed if and only if for all positive integers m, n, when nQ is less than, equal to, or greater than mR, then correspondingly nS is less than, equal to, or greater than mT. Its intent is to allow for the comparison of ratios when Q and R are one kind of thing, and S and T are another kind of thing, while holding to the precept that ratios themselves may only compare things of the same kind. In Greek mathematics, ratios cannot hold between lines and numbers, between finite and infinitesimal magnitudes, or between curved lines and straight lines.

⁴ Galileo, Discorsi, 155–6.

The Euclidean tradition treats ratios as relations, different from the things related, and proportions as assertions of similitude (not equality) between ratios. There is, however, a second, medieval tradition of handling ratios and proportions that originates with Theon, a commentator on Ptolemy’s Almagest, and is transmitted by Jordanus Nemorarius, Campanus, and Roger Bacon. It associates each ratio with a ‘denomination,’ that is, a number which gives its size, and in general treats the terms occurring in ratios as well as the ratios themselves uniformly as numbers. Thus ratios are just quotients and the distinction between ratio and term is abolished insofar as they are all numbers.⁵ The proportion Q : R :: S : T becomes Q/R = S/T, the equation of two numbers, and so automatically Q × T = R × S. The first tradition governing proportions is invoked here and proves just what Galileo requires; indeed, the second tradition would be unhelpful because in the assertion Q × T = R × S, the product [(time-interval) × (displacement interval)] is physically pointless. The really interesting product is [(time-interval) × (mean velocity during that time-interval)], as will appear below.
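As a modern gloss on the two traditions just contrasted, the following sketch writes the Eudoxian criterion in symbols and indicates why Theorem I, Proposition I satisfies it. The three-case comparison follows the usual reading of Euclid, Elements V, Definition 5. The constant speed v, and the step that multiplies a time by a speed, are assumptions of the gloss and are closer to the second tradition than to Galileo's own usage. The block assumes the amsmath and amssymb packages.

```latex
% Modern gloss, not Galileo's notation. Requires amsmath and amssymb.
% Q, R: magnitudes of one kind; S, T: magnitudes of another kind;
% m, n: arbitrary positive integers. Only like is compared with like.
\[
Q : R :: S : T
\quad\Longleftrightarrow\quad
\text{for all } m, n \in \mathbb{Z}_{>0}:\;
\begin{cases}
nQ < mR \;\Rightarrow\; nS < mT,\\
nQ = mR \;\Rightarrow\; nS = mT,\\
nQ > mR \;\Rightarrow\; nS > mT.
\end{cases}
\]

% Theorem I, Proposition I, with displacements AB, BC and times DE, EF.
% Assumption (hypothetical symbol): a constant speed v > 0 such that
% AB = v DE and BC = v EF.
\[
n\,AB \lesseqgtr m\,BC
\;\Longleftrightarrow\;
n\,v\,DE \lesseqgtr m\,v\,EF
\;\Longleftrightarrow\;
n\,DE \lesseqgtr m\,EF ,
\qquad\text{hence}\qquad
AB : BC :: DE : EF .
\]

% Second tradition: ratios read as quotients, so cross-multiplication is allowed.
\[
\frac{AB}{BC} = \frac{DE}{EF}
\;\Longrightarrow\;
AB \times EF = BC \times DE
\quad (\text{a displacement} \times \text{a time}).
\]
```

Only the last display multiplies magnitudes of different kinds, which is the product the text describes as physically pointless; the first two displays compare multiples of displacements only with displacements and multiples of times only with times.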
In the sequel, Theorem II, Proposition II, and Theorem III, Proposition III, Galileo examines cases involving two particles in uniform motion, and concludes in Theorem IV, Proposition IV: ‘If two particles are carried with uniform motion, but each with a different speed, the distances covered by them during unequal intervals of time bear to each other the compound ratio of the speeds and time-intervals.’⁶ In other words, a precise relationship can be established between any two cases of uniform motion; Galileo formulates it as a proportion: D1 : D2 :: [S1 : S2 compounded with T1 : T2]. The problem is that ‘compounding’ or finding a product of ratios can only be carried out with continuous ratios, according to the first tradition of handling proportions: to compound the ratios A : B and B : C is to rewrite their combination as A : C. However, S1 : S2 and T1 : T2 are not continuous, and Galileo is not willing to treat S1 : S2 and T1 : T2 as fractions that could simply be multiplied and thus compounded according to the second tradition. Galileo solves the problem by finding the middle term I between D1 and D2, which must satisfy the proportions D1 : I :: S1 : S2 and I : D2 :: T1 : T2: it is the distance the second particle would traverse in the time-interval allotted to the first particle. Since we can always find such an I, we can always bring S1 : S2 and T1 : T2, and D1 and D2 into rational relation.

⁵ See Edith Sylla, ‘Compounding Ratios,’ Transformation and Tradition in the Sciences, ed. E. Mendelsohn (Cambridge: Cambridge University Press, 1984), 11–43.
⁶ Galileo, Discorsi, 157–8.

The accompanying diagram (see Figure 1.2) is just a collection of line segments, one each for the speed, time, and distance traversed of body E (resp. A, C, and G) and the speed, time, and distance traversed of body F (resp. B, D and L), as well as the seventh line segment, I, which links the two sets of proportions. Galileo goes to the trouble of showing that two separate cases of uniform motion can be rationally linked in this manner because he is going to reduce the uniformly accelerated motion of free fall to a series of cases of uniform motion which then must be brought into rational relation. This is a nice instance of problem reduction, leading a problem about a more complex thing (uniformly accelerated motion) back to a problem about a simpler thing (uniform motion).
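The role of the middle term I can be checked in modern terms, which Galileo deliberately avoids. The sketch below assumes that each distance may be written as a product of speed and time (D_i = S_i T_i); that product notation belongs to the second tradition and is introduced here purely for illustration.

```latex
% Assumption (not Galileo's notation): uniform motion written as D_i = S_i T_i
% The middle term: distance covered by the second particle in the first particle's time
\[
I = S_2 T_1
\]
% The two proportions that I must satisfy
\[
D_1 : I = S_1 T_1 : S_2 T_1 = S_1 : S_2,
\qquad
I : D_2 = S_2 T_1 : S_2 T_2 = T_1 : T_2 .
\]
% D_1 : I and I : D_2 are continuous (they share I), so they can be compounded
\[
D_1 : D_2
  = (D_1 : I)\ \text{compounded with}\ (I : D_2)
  = (S_1 : S_2)\ \text{compounded with}\ (T_1 : T_2).
\]
```

On this reading I does the work that multiplication of fractions would do in the second tradition, while every ratio actually written down still compares magnitudes of a single kind.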
However, we will see that the reduction only works if the intervals in question may be made ‘as small as one wishes,’ which of course leads to a highly non-Euclidean employment of the theory of proportions as well as highly non-Euclidean geometric diagrams. Galileo’s treatment of the proportions and diagrams later on becomes carefully ambiguous; and therein lies the innovation.

Figure 1.2. Galileo, Discorsi, Third Day, Uniform Motion, Theorem IV, Proposition IV

Theorem I, Proposition I in the section ‘Naturally Accelerated Motion’ states: ‘The time in which any space is traversed by a body starting from rest and uniformly accelerated is equal to the time in which that same space would be traversed by the same body moving at a uniform speed whose value is the mean of the highest speed and the speed just before acceleration began.’⁷ The accompanying figure has two components, a vertical line CD on the right representing space traversed (again, not just distance but displacement), and a two-dimensional figure AGIEFB on the left, in which AB represents time (see Figure 1.3). The two-dimensional figure reproduces Oresme’s diagram that applies the important theorem reached by the logicians at Merton College, Oxford, concerning the mean value of a ‘uniformly difform form’ to uniformly accelerated motion. However, Galileo rotates the diagram by 90° because he is going to apply it even more specifically to the case of free fall, and wants to emphasize its pertinence to the vertical trajectory CD. Koyré points out that the genius of this set of figures is that AB represents not the distance traversed but time, for Galileo (like Oresme) had wrested geometry from the geometer’s preoccupation with extension and put it in the service of the temporal processes of mechanics.⁸ The left-hand figure represents a process like integration with respect to time: the parallels of the triangle AEB perpendicular to AB stand for velocities, and the area of the triangle as a whole, taken to be a summation of instantaneous velocities, therefore represents distance traversed. Distance is then represented in two different ways, as the line segment CD and as the area of the triangle AEB; because the second representation is a two-dimensional figure, it can exhibit the way that uniformly increasing velocity and time are related in the determination of a distance.

⁷ Galileo, Discorsi, 173–4.
⁸ A. Koyré, ‘La loi de la chute des corps,’ Études galiléennes (Paris: Hermann, 1939), 11–46. See also my ‘Descartes and Galileo: The Quantification of Time and Force,’ Mathématiques et philosophie de l’antiquité à l’âge classique: Hommage à Jules Vuillemin, ed. Roshdi Rashed (Paris: Éditions du Centre National de la Recherche Scientifique, 1991), 197–215.

Figure 1.3. Galileo, Discorsi, Third Day, Naturally Accelerated Motion, Theorem I, Proposition I

A two-fold representation of distance also occurs in the analysis of free fall given immediately afterwards in Theorem II, Proposition II, but here the right-hand line gains articulation and the left-hand two-dimensional figure loses some: this theorem is about distances, or rather, displacements (see Figure 1.4). The theorem states: ‘The spaces described by a body falling from rest with uniformly accelerated motion are to each other as the squares of the time-intervals employed in traversing these distances.’⁹ The right-hand figure, the line HI, stands for the spatial trajectory of the falling body, but it is articulated into a sort of ruler, where the intervals representing distances traversed during equal stretches of time, HL, LM, MN, etc., are indicated in terms of unit intervals (by a shorter cross-bar) and in terms of intervals whose lengths form the sequence of odd numbers, 1, 3, 5, 7 ... (by a longer cross-bar). The unit intervals are intended to be counted as well as measured. In the left-hand figure, AB represents time (divided into equal intervals AD, DE, EF, etc.) with perpendicular instantaneous velocities raised upon it (EP, for example, represents the greatest velocity attained by the falling body in the time interval AE), generating a series of areas which are also a series of similar triangles. Galileo then considers two cases of uniform motion, and brings them into rational relation, which proves the theorem. He instructs us to draw the line AC at any angle whatsoever to AB and then, given any two equal time intervals AD and DE, to draw parallel lines DO and EP intersecting AC at O and P. He uses the result just proved to show that the distance traversed by a particle falling from rest with uniformly accelerated motion during the time interval AD (resp. AE) is the same as the distance traversed by a particle moving with speed 1/2 DO (resp. 1/2 EP) during the time interval AD (resp. AE). Thus we know that the ratio D1 : D2 is the same as the ratio (distance traversed during AD at speed 1/2 DO) : (distance traversed during AE at speed 1/2 EP).
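The appeal to Theorem I of ‘Naturally Accelerated Motion’ in this step can be paraphrased in modern notation, offered only as a gloss. The product notation and the symbols d_{AD}, d_{AE} are assumptions of this sketch; Galileo works with the segments DO and EP and the intervals AD and AE, not with equations.

```latex
% Gloss in modern notation (assumed): motion from rest, uniformly accelerated.
% Theorem I: the distance fallen equals the distance covered at half the final speed,
% which is also the area of the triangle raised on the time axis.
\[
d_{AD} = \tfrac{1}{2}\,DO \cdot AD ,
\qquad
d_{AE} = \tfrac{1}{2}\,EP \cdot AE .
\]
% Hence the ratio of the two falls is a ratio between two cases of uniform motion
\[
D_1 : D_2
  = \left(\tfrac{1}{2}\,DO \cdot AD\right) : \left(\tfrac{1}{2}\,EP \cdot AE\right).
\]
```

What remains is to evaluate the right-hand ratio without multiplying unlike magnitudes, which is the task Theorem IV, Proposition IV on uniform motion was designed for.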
But what is the latter ratio? How can we bring these two cases of uniform motion into rational relation? The answer is given in Theorem IV, Proposition IV from the section on uniform motion: ‘the spaces traversed by two particles in uniform motion bear to one another a ratio which is equal to the product of the ratio of the velocities by the ratio of the times.’¹⁰ And in this case, because ADO is similar to AEP, we know that the ratio of AD : AE is equal to the ratio of 1/2 DO : 1/2 EP, which is just the same as the ratio DO : EP; so [V1 : V2 compounded with T1 : T2] is just [T1 : T2 compounded with T1 : T2]. Since the ratios only involve the single parameter time, Galileo doesn’t mind treating them as numbers and calls the product [T1 : T2]². Thus D1 : D2 is equal to [T1 : T2]². By the same token, D1 : D2 is equal to [V1 : V2]², the square of the ratio of the final velocities.

⁹ Galileo, Discorsi, 174–5.
¹⁰ Galileo, Discorsi, 157–8.

Figure 1.4. Galileo, Discorsi, Third Day, Naturally Accelerated Motion, Theorem II, Proposition II

Galileo gains his insight here by combining numerical patterns with geometry in the service of mechanics, as this summary from the immediately following Corollary I indicates:

‘Hence it is clear that if we take any equal intervals of time whatever, counting from the beginning of the motion, such as AD, DE, EF, FG, in which the spaces HL, LM, MN, NI are traversed, these spaces will bear to one another the same ratio as the series of odd numbers 1, 3, 5, 7; for this is the ratio of the differences of the squares of the lines [which represent time] ... While, therefore, during equal intervals of time the velocities increase as the natural numbers, the increments in the distances traversed during these equal time-intervals are to one another as the odd numbers beginning with unity.’¹¹
Since 1 + 3 = 2², 1 + 3 + 5 = 3², 1 + 3 + 5 + 7 = 4², and so forth, these sums representing distances will be proportional to the square of the time intervals; and since the time elapsed is proportional to the final velocity, as the similar triangles in the diagram to the left make clear, the distance fallen will be proportional to the square of the final velocity. Galileo is now using at least four modes of representation to express his argument: proportions, geometrical figures, numbers, and natural language. He also employs a systematic ambiguity to carry his argument further. By adding Corollary I to Theorem II, Proposition II, he insists on the pertinence of the number theoretical facts just discussed to the analysis of free fall. The reader is thus forced to read the intervals depicted (AD, DE, EF ... , and then HL, the three intervals of LM, the five intervals of MN ... ) sometimes as units, sometimes as infinitesimals. There is only one set of diagrams, but the set must be read in two ways. Reading the intervals as finite allows both for the application of Euclidean results, and for the pertinence of the arithmetical pattern just noted. Reading the intervals as infinitesimal allows for the analysis of accelerated motion. The accompanying text in natural language guides and exploits this double meaning. Note that in Corollary I, Galileo does not compare the interval-terms directly, but is careful to refer to them in ratios. Even if infinitesimal intervals (instants and points, to use Galileo’s vocabulary) are mathematically suspect, as they surely were in the early seventeenth century, the geometry of the diagrams supports the rationality of holding that ratios between them are ‘like’ the ratios between their finite counterparts. That is, AD : DE :: AO : OP no matter what size the configuration is; or, to use the other diagram, HL : LM :: 1 : 3 no matter what the size of the configuration.
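The arithmetical pattern behind Corollary I can be stated compactly. The formulas below are a modern gloss: the index n, counting equal time-intervals from rest, is an assumption introduced for illustration, and the last line simply restates the two proportionalities in the notation used above.

```latex
% Sum of the first n odd numbers, and the odd numbers as differences of squares
\[
\sum_{k=1}^{n} (2k - 1) = n^2 ,
\qquad
n^2 - (n-1)^2 = 2n - 1 .
\]
% Instances matching the ruler HI: HL = 1, HM = 1 + 3, HN = 1 + 3 + 5
\[
1 = 1^2, \qquad 1 + 3 = 2^2, \qquad 1 + 3 + 5 = 3^2, \qquad 1 + 3 + 5 + 7 = 4^2 .
\]
% Distances as squares of times, and (since times are as final velocities) of velocities
\[
D_1 : D_2 = [T_1 : T_2]^2 = [V_1 : V_2]^2 .
\]
```

The identity holds for a counted sequence of unit intervals, which is why it depends on reading the intervals as finite; nothing in it refers to the size of the unit.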
Theorem IV, Proposition IV from the section on uniform motion, which is so carefully Euclidean in its reasoning, is here put to highly non-Euclidean use because of its juxtaposition with the systematically ambiguous diagram. When the time intervals AD and AE are read as finite, the application of the theorem that brings disparate cases of uniform motion into relation is direct; when AD and AE are read as infinitesimal (because we may take ‘any equal intervals of time whatsoever’) the application of the theorem is non-Euclidean because Euclid does not allow infinitesimal terms. But it is this application that allows cases of uniform motion to be brought into rational relation with uniformly accelerated motion in a way that Newton can employ when, a generation later, he uses geometry to represent the dynamical processes of the solar system. Read as finite, the triangles are the iconic representations of geometrical figures; read as infinitesimal, the triangles are the symbolic representation of a dynamical process, free fall. What lies before us in Figure 1.4 is a diagram that must be read in two different ways, as both icon and symbol, and natural language that explains the ambiguous configuration. The same point can be made apropos the left-hand diagram that accompanies Theorem I, Proposition I (Figure 1.3) and the diagram borrowed from the Oxford Calculators that adumbrates Corollary I in the section ‘Naturally Accelerated Motion’.

¹¹ Galileo, Discorsi, 175–6.

The centrally important diagram of projectile motion from Theorem I, Proposition I of the Fourth Day (see Figure 1.5) also enjoys an ambiguity rich in consequences. The diagram of projectile motion must be compared to the diagram accompanying Theorem I, Proposition I from the section ‘Uniform Motion’ (see Figure 1.1), the diagram accompanying Theorem II, Proposition II from the section ‘Naturally Accelerated Motion’ of the Third Day (see Figure 1.4), and the diagram of a parabola borrowed from Apollonius just preceding.
Significantly, the diagram of projectile motion refers to all of them and conflates and superimposes certain of their elements in instructive ways.¹² The first thing to note about Figure 1.5 is that the line abcde conflates the two lines in Figure 1.1, IK representing time and GH representing distance understood as displacement in uniform motion.¹³ The intervening theorems have taught us that it is precisely because line GH represents displacement in uniform motion that it can be merely a line; non-uniform motion requires either a two-dimensional figure or a ruler-with-commentary for its representation. However, in the case of uniform motion, a line suffices and moreover can also serve to represent time, since (as the theorem accompanying Figure 1.1 states) in such motion the intervals of time elapsed are proportional to the intervals of distance traversed. The line bogln is the line HLMNI from Figure 1.4 divided in just the same proportions. The genius of the diagram is the perpendicular juxtaposition of line bogln with abcde, which represents the insight that projectile motion is ‘compounded of two other motions, namely, one uniform and one naturally accelerated.’¹⁴

¹² Galileo, Discorsi, 244–57.
¹³ Galileo, Discorsi, 248–50.
¹⁴ Galileo, Discorsi, 244.

Figure 1.5. Galileo, Discorsi, Fourth Day, The Motion of Projectiles, Theorem I, Proposition I

The proof of Theorem I, Proposition I in the section ‘The Motion of Projectiles’ shows that the rest of the diagram stems from the superposition of Apollonius’ construction of the parabola as the pathway of the moving body: ‘A projectile which is carried by a uniform horizontal motion compounded with a naturally accelerated vertical motion describes a path which is a semi-parabola.’¹⁵ In order for the reasoning in the proof of Theorem I, Proposition I of the Fourth Day to proceed, the line abcde must mean both time and distance; it must represent time symbolically in order for the results achieved in the Third Day to be applied, and it must represent distance qua displacement in order for the diagram to make sense as the icon of a trajectory, the movement of a body across a plane in space. The line segments HL, HM, HN from Figure 1.4, re-labeled ci, df, eh in Figure 1.5, are drawn to represent the vertical displacement at equal intervals of time/displacement at c, d, and e; b is the point taken to represent the beginning of the projectile motion, cb is chosen as the ‘unit’, and cb = dc = de and so on. Galileo’s exposition of the diagram claims that no matter how cb is chosen (‘if we take equal time-intervals of any size whatsoever’) the curve described is always the same, and it is the semi-parabola, as the results from Apollonius that precede the proof in Theorem I, Proposition I have made clear. Thus the best way to understand projectile motion is as ‘uniform horizontal motion compounded with a naturally accelerated vertical motion,’ which produces a parabolic downward trajectory. Reading cb as a finite interval allows for the application of results of Euclid and Apollonius; reading cb as an infinitesimal allows the diagram to stand as an analysis of accelerated motion. The great diagram that presents projectile motion thus succeeds because of Galileo’s inspired handling of controlled ambiguity.
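The compounding of the two motions can be restated in modern coordinates, again only as a gloss. The symbols x, y, t, v and g (horizontal and vertical displacement measured from b, elapsed time, horizontal speed, and a constant downward acceleration) are assumptions of this sketch and do not appear in the Discorsi.

```latex
% Assumed modern notation: origin at b, x along abcde, y downward along bogln
\[
x = v\,t , \qquad y = \tfrac{1}{2}\,g\,t^2
\quad\Longrightarrow\quad
y = \frac{g}{2v^2}\,x^2 .
\]
% Read against the labelled segments of Figure 1.5, at equal time-intervals:
\[
bc : bd : be = 1 : 2 : 3 ,
\qquad
ci : df : eh = 1 : 4 : 9 .
\]
```

Because x is proportional to t, the horizontal line can stand for time and for displacement at once, and the relation between y and x keeps the same form however the unit cb is chosen.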
1.2. Carnap on Language and Thought

The reductionist program set out in Rudolf Carnap’s The Logical Structure [Aufbau] of the World can only conclude that Galileo’s text should be rewritten.¹⁶ Though the program has long been regarded as inconclusive and ultimately unsuccessful by most Anglophone philosophers of science and mathematics, many of its presuppositions nonetheless remain unnoticed in our discourse and thinking. I now return briefly to Carnap’s classic work, in order to uncover some of these assumptions and to question them more closely. Then I will give a brief sketch of what became of Carnap and Hempel’s view of scientific and mathematical knowledge in the work of their academic children (Bas van Fraassen, Nancy Cartwright, Margaret Morrison, Ian Hacking) and grandchildren (Robin Hendry, Ursula Klein, myself), a development that will bring us back to the project of this essay in an unexpected way.

¹⁵ Galileo, Discorsi, 245.
¹⁶ Rudolf Carnap, The Logical Structure of the World and Pseudoproblems in Philosophy, tr. Rolf A. George (Berkeley: University of California Press, 1967/69); Der Logische Aufbau der Welt (Hamburg: Meiner, 1928).

Carnap begins the book with a description of his ideal, a ‘constructional system’ that begins with certain fundamental objects/concepts and constructs all other objects/concepts from them. Trying to avoid the antinomy between rationalism and empiricism, he argues that to every concept there belongs one and only one object: ‘the object and its concept are one and the same.’¹⁷ An object/concept is said to be reducible to one or more other objects/concepts if all statements about it can be transformed into statements about the other objects/concepts via a constructional definition. This is a ‘rule of translation which gives a general indication how any propositional function in which a occurs may be transformed into a coextensive propositional function in which a no longer occurs, but only b and c.’¹⁸ Then we say that a is logically reducible to b and c. The example that Carnap gives to illustrate his point is pertinent to our discussion:

‘EXAMPLE. The reducibility of fractions to natural numbers is easily understood, and a given statement about certain fractions can easily be transformed into a statement about natural numbers. On the other hand, the construction, for example, of the fraction 2/7, i.e., the indication of a rule through which all statements about 2/7 can be transformed into statements about 2 and 7, is much more complicated. Whitehead and Russell have solved this problem for all mathematical concepts [Princ. Math.]; thus they have produced a “constructional system” of the mathematical concepts.’¹⁹

¹⁷ Carnap, The Logical Structure of the World, 10.
¹⁸ Carnap, The Logical Structure of the World, 61.
¹⁹ Carnap, The Logical Structure of the World, 61.

Earlier in the book, he tells us, ‘all real numbers, even the irrationals, can be reduced to fractions. Finally, all entities of arithmetic and analysis are reducible to natural numbers.’²⁰ And in the Preface to the second edition, he adds that Frege, Russell, and Whitehead had shown, ‘through the definition of numbers and numerical functions on the basis of purely logical concepts, the entire conceptual structure of mathematics [ ... ] to be part of logic.’²¹ This highly controversial last statement was written, significantly, in 1961; the last section of my book revisits the controversy. The alleged success of Russell and Whitehead at reducing all of mathematics to logic inspires Carnap: ‘Logistics (symbolic logic) has been advanced by Russell and Whitehead to a point where it provides a theory of relations which allows almost all problems of the pure theory of ordering to be treated without great difficulty.’ His book is thus designed ‘to apply the theory of relations to the task of analyzing reality ... in order to formulate the logical requirements which must be fulfilled by a constructional system of concepts, to bring into clearer focus the basis of the system, and to demonstrate by actually producing such a system (though part of it is only an outline) that it can be constructed on the indicated basis and within the indicated logical framework.’²²
He adds that if this project is successful, it will show ‘that there is only one domain of objects and therefore only one science.’²³ My aim in this chapter is not to evaluate Carnap’s overall project, but to note its reductionism (and its optimism about reductionism, even in the face of much evidence to the contrary), and to focus especially on its theory of language.

²⁰ Carnap, The Logical Structure of the World, 6.
²¹ Carnap, The Logical Structure of the World, vii.
²² Carnap, The Logical Structure of the World, 7–8.
²³ Carnap, The Logical Structure of the World, 9.

Even philosophers of science and mathematics who reject the constructionist Aufbau of the world have accepted many of Carnap’s claims about language. For example, he writes that a constructional definition must be ‘pure,’ that is, free of unnoticed conceptual elements; and it must be ‘formally accurate,’ that is, ‘it must be neither ambiguous nor empty ... it must not designate more than one, but it must designate at least one, object.’ He notes that in natural language this requirement is difficult to fulfill, but by contrast ‘this requirement is easily and almost automatically fulfilled when we apply an appropriate symbolism, for example, when we apply the logistic forms for the introduction of classes or relation extensions and for definite descriptions of individuals. It is a fact of logistics that these forms guarantee unequivocalness and logical existence, for they have been created with these desired properties in view.’²⁴

In sum, the ideal language for philosophy of science and mathematics, the language in which science and mathematics are to be re-constructed in order to exhibit the real structure of the world, is ‘the symbolic language of logistics.’²⁵ It does the best job of demonstrating that all objects are reducible to the basic objects: ‘It is obvious that the value of a constructional system stands or falls with the purity of this reduction, just as the value of an axiomatic exposition of a theory depends upon the purity of the derivation of theorems from axioms. We can best insure the purity of this reduction through the application of an appropriate symbolism.’²⁶ The symbolic language of logistics is allegedly an ideal mode of representation that makes all content explicit; it stands in isomorphic relation to the objects it describes, and that one–one correspondence insures that its definitions are ‘neither ambiguous nor empty.’ It is, obviously, symbolic and not iconic; in the ideal limit, it will replace, not merely supplement, natural language. And its successful use in the Aufbau of the world will show that there is only one kind of thing: Carnap’s choice of basic object is the sense datum, but he also believes that in the end mathematics has no subject matter, having been reduced to the pure formalism of logic.
1.3. From the Syntactic to the Semantic to the Pragmatic Approach

Galileo’s account of projectile motion in the Discorsi thus appears to fall woefully short of Carnap’s ideal. It involves icons and natural language as well as symbols; and many of the modes of representation that it employs refer ambiguously. Should we commit it to the flames? Carnap’s judgement would probably not be so severe; he might suggest we review it as a curiosity, admirable in its time but philosophically inert for us. This is one reason why he was not interested in the history of mathematics. By contrast, I find in my historical case study evidence that is philosophically pertinent in its own right, and that weighs strongly against many of the assumptions made by logical positivists such as Carnap, and in favor of a very different philosophical view of mathematics and indeed of language and logic.

²⁴ Carnap, The Logical Structure of the World and Pseudoproblems in Philosophy, 154.
²⁵ Carnap, The Logical Structure of the World and Pseudoproblems in Philosophy, 153.
²⁶ Carnap, The Logical Structure of the World and Pseudoproblems in Philosophy, 154.

One way to make clear the nature of my quarrel with Carnap and mid-twentieth-century logical positivism is to sketch a philosophical development that links us, which I understand to be driven by problems articulated but unsolved by his program. It is not unfair to characterize Carnap’s project in the Aufbau as essentially syntactical, for it reduces content to the sense datum (which really has no content) and tries to build everything else into the form. As Robin Hendry sums up: ‘The logical positivists bequeathed to philosophy of science a characterization of theories as linguistic structures whose content was to be identified in terms of the notion of logical consequence, a notion intimately related to structural features of their formulations in canonical formal languages.’²⁷ If we recall that logical positivists like Carnap and Hempel gave an account of the relations between theory and evidence, between explanation and the explanandum, and between the reducing and reduced theory in terms of deductive (only sometimes inductive) relations among sets of sentences in a formal language, the approach appears unrelentingly syntactic. The philosophical offspring of these mid-twentieth-century philosophers concluded that the study of formal languages would not in itself provide a deeper epistemological account of how scientists and mathematicians represent reality. I would put it this way: logical inference can be formalized, though we would do well to keep in mind that formalization is a kind of representation, and that formalization qua representation tends to precipitate novel ideal items like sets, well-formed formulas, and recursive functions, which must then be studied.
But the things of mathematics and science must be represented in order to be studied, and we cannot understand the problems or procedures they occur in and give rise to without looking closely at how they are represented. Representation is a much broader notion than formalization; and formalization suits inference, which is indifferent (up to a point) to the things it treats.

²⁷ Robin F. Hendry, ‘Mathematics, Representation, and Molecular Structure,’ in Tools and Modes of Representation in the Laboratory Sciences, ed. U. Klein (Dordrecht: Kluwer, 2001), 221–36.

Thus, the semantic philosophers turned their attention to ‘models,’ and a new school of Anglophone philosophy of science that characterized its approach as semantic began to dominate philosophical debates. Some of these philosophers thought of models in the sense used by logicians, as a structure that satisfies some set of sentences in a meta-language, where what is meant by ‘structure’ is a set of sentences in an object-language; Bas van Fraassen, inter alia, had a rather more conservative view of what constitutes a model. But others urged a broader and richer account. As Robin Hendry observes, ‘This logical notion is quite different, however, from the sense of “model” that is at work when we speak of (for instance) the molecular model that we can construct from one of Linus Pauling’s kits: here the important relation is not satisfaction but representation.’²⁸ Once the notion of model is investigated along these lines, it becomes clear that different models bring out different aspects of the real systems they model with different degrees of precision and explanatory power. Philosophers who pursue a more broadly semantical philosophy of science, like Nancy Cartwright, Margaret Morrison, Kenneth Schaffner, and Ian Hacking, tend to be interested in the activity of contemporary scientists, not only as they justify their results in journals but also as they discover them in the laboratory and the field.

All the same, as we see in recent works by van Fraassen and Schaffner especially, the adequacy of a scientific theory is characterized in terms of a relation of isomorphism between theory and model.²⁹ That is, a theory is said to be empirically adequate when a model of the appearances (the quantitative results of experiment) is isomorphic to a (mathematical) submodel of one of the theory’s models.³⁰ It is often assumed that the best way to think of representation in mathematics is in terms of isomorphism between structures, and this habit can then be simply and appropriately transferred to science.

²⁸ Hendry, ‘Mathematics, Representation, and Molecular Structure,’ 225.
²⁹ See, for example, Laws and Symmetry by Bas van Fraassen (New York: Oxford University Press, 1990), and Discovery and Explanation in Biology and Medicine by Kenneth Schaffner (Chicago: University of Chicago Press, 1994).
³⁰ See, for example, Schaffner, Discovery and Explanation, ch. 3.

But the philosophical children of the semanticians (and grandchildren of the logical positivists) have concluded otherwise, finding the semantic view of representation problematic not only for philosophy of science but also for philosophy of mathematics. They propose instead a view that encompasses pragmatic as well as syntactic and semantic considerations, focusing on the successful posing and solution of problems in a context of use that is, in the end, historical. Thus the program of Carnap (which, like the program of Kant, eschews history) has been transformed into a program that finds it cannot escape history after all. Ursula Klein concedes that the semantic notion of isomorphism might capture the notion of ‘representation of,’ or denotation, but must be supplemented by ‘representation as’ or meaning. ‘A representation A of an entity B is not merely a denotation of it, but also creatively describes and classifies it as such-and-such. Representation ... is not a matter of passive reporting ... Rather, representation involves organization, invention, and other kinds of activity.’³¹
When isomorphism is the central term in the analysis of scientific language, the philosopher assumes that the objects so related are already available and organized in a definitive way. But representation itself may have a role in constituting and organizing the things represented. Klein argues, for example, that the use of Berzelian formulas (the familiar formulas like H₂O used by chemists) was crucial to the swift advance of organic chemistry in the mid-nineteenth century, not because it was ‘more isomorphic’ to experimental patterns recorded in the laboratory than the notations preceding it, but because it enjoyed a useful, meaningful iconicity, ambiguity, and algebraicity.³² I discuss her case study in more detail at the end of this chapter.

Hendry reminds us that the notion of isomorphism in fact does not even account very well for denotation in science. One notorious difficulty with first-order predicate logic is that, in most non-trivial cases, a first-order theory describes correctly a whole range of models that satisfy it, and cannot by itself pick out the intended model. Hendry argues that the context of natural language in which symbolic language is used makes its reference a determinate, not merely stipulative, relation. ‘The particular historical and material context of a language within which a theoretical discourse is pursued is what endows it with reference, and reference can be passed on to other media (like equations) which become entwined in that discourse.’³³ Whereas isomorphism is reflexive and symmetric, representation is irreflexive and asymmetric due to its intentionality. Moreover, Hendry observes, supposing we find a way to single out the intended model, there are always uninteresting ways of constructing a theory that will stand in the relation of isomorphism to it: the question is then, how to articulate and explain the selection of an interesting theory.

³¹ Klein, Tools and Modes of Representation, p. viii.
³² U. Klein, Experiments, Models, Paper Tools: Cultures of Organic Chemistry in the Nineteenth Century (Stanford, CA: Stanford University Press, 2003), ch. 1. I wrote a review essay about this book in Studies in History and Philosophy of Science, 36 (2005), 411–17.
³³ Hendry, ‘Mathematics, Representation and Molecular Structure,’ 227.

Klein and Hendry both argue that interesting modes of representation contribute to the advance of scientific knowledge, that is, to success in posing and solving problems. And when we look at the details of their case studies, the representations in those interesting theories turn out to be iconic as well as symbolic, often ambiguous, embedded in natural language, and partially constitutive of what they stand for. Thus the semantic reliance on the notion of isomorphism appears misplaced.

My own work in the philosophy of mathematics over the years has led me to the same conclusion about mathematics, an insight missed even by Hendry and Klein, who sometimes write as if the semantic approach might work well for mathematics but not for the empirical sciences. Van Fraassen and Schaffner emphatically assume that the relation between theory and thing in mathematics is successfully captured by isomorphism between symbolic meta-language and symbolic object-language, and for the same kind of reason Carnap assumes the syntactic reduction of mathematics to logic is successful. They all suppose that their epistemological account (syntactic or semantic) works well for mathematics and, since mathematics is the language of science, it can be transferred without much difficulty to science itself. My purpose in this book, by contrast, is to move towards an epistemology that works properly for mathematics by taking into account the pragmatic as well as the syntactic and semantic features of representation in mathematics.
Focusing on the pragmatic dimension of mathematical language allows us to see the philosophical interest of useful ambiguity in mathematics, as well as the limits of formalization. Carnap wanted to rewrite science (and mathematics) in the language of logic, in order to exhibit the logical structure of reality. Thus for him the role of language is to purify and correct. It should render every inferential step explicit; stand in isomorphic relation with the objects/concepts it refers to; and either reduce content to form or show that there is only one kind of thing. Recall that Carnap’s preferred ‘thing’ is the sense datum, which does not in itself have any content. (The preferred ‘thing’ of mathematical
introductory chapters
structuralists is either the point or the empty set, for the same reason, that each lacks content.) Carnap is an enthusiastic champion of the Russellian project of reducing geometry and analysis to arithmetic and arithmetic to logic. He is also interested in the reductionist project of reducing the things of biology and chemistry to the electron (and, he reluctantly adds, the proton). Like that other reductionist champion of conceptual purity, Descartes, Carnap downplays the difficulty of arriving at only one kind of thing. Descartes was left with the cogito, the line segment, the particle in inertial motion, and the mechanism; Carnap is left with the sense datum, the proposition, and the electron (and—darn it!—the proton). But the ideal remains. By contrast, I (along with my pragmatist cohort) regard the role of representation to be the successful discovery and solution of problems about problematic things, heterogeneous things of many different kinds. My exposition of Galileo’s analysis of projectile motion in the Discorsi is designed to show the fruitful employment of consortia of modes of representation at work in his argument, as well as their inescapable ambiguity; in a later chapter, I will show how his procedures lend themselves to further generalization in Newton’s use of Galileo’s results in the Principia. The logicist account of mathematical language makes this useful ambiguity impossible to see, because it tries to eliminate modes of representation that are ‘different’—that is, only one mode of representation at a time is countenanced—and it insists that all referring be univocal. To provide a more adequate approach, I now turn to the notion of analysis and its allied understanding of mathematical experience, illustrated with a chemical example developed by Ursula Klein.
1.4. A Pragmatic Account of Berzelian Formulas
Mathematics invites us to study intelligible but problematic things that are hard to investigate because they cannot be directly perceived and because they involve the infinite. We search for the conditions of intelligibility of these things (which are themselves conditions of the intelligibility of other things) as we search for the conditions of solvability of the problems that characterize them as well as the reach of procedures we use in the solutions. I discuss this process in the next chapter, under the general term
analysis. Our search depends on representations that make the invisible visible and the infinite finite; I argue throughout this book that we study the things of mathematics by a kind of triangulation in the nautical sense, where we combine different modes of representation in order to improve our access to them. Formalization in terms of predicate logic is one kind of representation among many, useful for some purposes but limited in important ways. Different representations reveal different aspects of intelligible things, the problems in which they occur, and the procedures that successfully solve problems. When distinct representations are juxtaposed and superimposed, the result is often a productive ambiguity that expresses and generates new knowledge. Mathematical experience emerges from traditions of representation and problem-solving, as it explores the ‘combinatorial spaces’ (to use Jean Cavaillès’s phrase) produced in polyvalent mathematical discourse. It cannot be summed up by the formalism of predicate logic or abstract algebra or any single mode of representation, nor by the ‘intuition’ variously invoked by Descartes, Kant, or Brouwer in an attempt to escape the traps of formalism. Formalism, structuralism, and intuitionism are all intolerant of ambiguity, as a consequence of exaggerated epistemological ambitions. Some mathematical representations are iconic, that is, they picture and resemble what they represent; some are symbolic and represent by convention, without much resemblance; and some are indexical, representing for the sake of organization and ordered display. The diagram that accompanies Euclid’s proof of the Pythagorean Theorem is an example of the first; the polynomial equation that represents a circle in Descartes’ Geometry is an example of the second; and the Gödel numbers that name well-formed formulas of a certain specified formal language are an example of the last.
But all icons have a symbolic dimension, as all symbols have an iconic dimension; and all representations to a certain extent organize, order, and display. Which representations we have at our disposal and how we combine them determines how we can formulate and solve problems, discern items and articulate procedures, supply evidence in arguments and offer explanations. And how the representations should be understood, their import and meaning, must be referred to their use in a given tradition of problem-solving. Thus, I believe that mathematical rationality must be studied in historical context. To underline this point, I turn to an example from Ursula Klein’s book Experiments, Models, Paper Tools:
Cultures of Organic Chemistry in the Nineteenth Century.³⁴ I find this example so instructive, along with Robin Hendry’s and Robert Bishop’s studies of the applications of quantum mechanics to chemistry and the insights of Roald Hoffmann into organic chemistry set out in a co-authored paper that forms the basis of Chapter 3, that I have begun my book on philosophy of mathematics with an excursus into chemistry and geometry. Klein’s case study concerns the impact of Berzelian formulas, introduced by the Swedish chemist Jacob Berzelius in 1813, on the emergence of organic chemistry between 1827 and 1840. During that period, plant and animal chemistry—an experimental practice concerned with the extraction and description of organic substances—changed into carbon chemistry, in which quantitative analysis and the experimental study of chemical reactions led to the identification, classification and construction of a wealth of new chemical objects in the late nineteenth century. Her main thesis is that the catalyst of this change was the wide-spread adoption of Berzelian formulas by organic chemists, which exhibit the concatenation of chemical components in a simple and perspicuous way. (Berzelius wrote the numerical subscripts as superscripts.) Its algebraicity is striking. Klein makes two interesting observations in an initial chapter entitled ‘The Semiotics of Berzelian Chemical Formulas,’ one about the ambiguity of the notation and another about its abstractness. The former is that the letters in Berzelian formulas are signs that convey a plurality of information simultaneously. Depending on context, they may refer to macroscopic chemical compounds and their composition from chemical elements; to stoichiometric or volumetric quantitative relations; and to atomic weights or ‘atoms’ in the sense of sub-microscopically small particles. 
In the new theoretical context, a chemical proportion meant a bit or portion of a substance defined by its unique and invariant relative combining weight. Thus, Berzelian formulas became signs for scale-independent chemical portions. The ambiguity of the notation allowed chemists to move back and forth between the macroscopic and microscopic worlds as needed. The abstractness of the notation also freed those who used Berzelius’ formulas from a commitment to an elaborate foundational theory, like that of Dalton. Dalton proposed a (prematurely) iconic notation intended to be read as unequivocal signs for sub-microscopically small atoms that ³⁴ Klein, Experiments, Models, Paper Tools.
have a certain shape, size, orientation in space, and additional chemical properties; he seemed to view his diagrams as realistic images of atoms. By contrast, Berzelius’s formulas are noncommittal. In Chapter 1 of his book Force and Geometry in Newton’s Principia, François De Gandt makes a similar point about Newton’s abstract definition of a center of force as that point with respect to which equal areas are swept out in equal times by a body subject to the action of the centripetal force.³⁵ The definition was metaphysically noncommittal, which meant it could be used in succeeding generations by a gamut of physicists even as they argued over the metaphysics of force. So too Berzelius’s formulas served both anti-atomists like Berthelot and atomists like Wurtz and Grimaux, even as they argued over the reality and nature of atoms, and the relation between physics and chemistry. In Klein’s words, Berzelian formulas ‘had different layers of meaning and conveyed a building-block image of chemical portions without simultaneously requiring an investment in atomic theories.’³⁶ However, Klein disputes the way that historians of chemistry have emphasized the merely symbolic nature of Berzelian formulas, as if his notation were just convenient shorthand that allowed chemists to evade certain issues of explanation. In particular, she challenges François Dagognet, whose books Tableaux et Langages de la Chimie³⁷ and Ecriture et Iconographie³⁸ deal with many of the same issues as her Experiments, Models, Paper Tools. Dagognet insists upon a strong distinction between symbols and icons, and in particular contrasts the ‘logical,’ symbolic notation of Berzelius with the graphical, iconic structural formulas of Archibald Scott Couper and the stereochemical formulas of Jacobus van ’t Hoff. On Dagognet’s account, only the latter are paper tools with a ‘generative function’ above and beyond the mere conveyance and storing of information.
Klein contests this dismissal, first by questioning a rigid dichotomy between iconic and symbolic modes of representation, and second by marshalling the details of her case study, which show convincingly (to my mind) the generative function of Berzelian formulas. Invoking and then thinking beyond Nelson Goodman’s criticism of the icon/symbol distinction in his Languages of Art, Klein points out, quite ³⁵ François De Gandt (Princeton: Princeton University Press, 1995). ³⁶ U. Klein (2003). Experiments, Models, Paper Tools 35. ³⁷ (Paris: Editions du Seuil, 1969). ³⁸ (Paris: Librairie Philosophique J. Vrin, 1973).
rightly, that any iconic notation is incompletely iconic and must involve certain symbolic conventions, and that any symbolic notation is iconic in a rudimentary way: typographical isolation, for example, is iconic in intent. ‘The fact that a letter is a visible, discrete, and indivisible thing (unlike a written name) constitutes a minimal isomorphy with the postulated object it stands for, namely, the indivisible unit or portion of chemical elements.’³⁹ I came to similar conclusions about the iconic dimensions of symbols and the symbolic dimensions of icons in mathematics, as will be evident in the chapters that follow, so I could not agree more with this point. Klein goes on to explain briefly (borrowing Goodman’s terms) that Berzelian formulas have both ‘graphic suggestiveness’ and ‘maneuverability.’ They exhibit clearly the constitution of compounds in ways that verbal descriptions and Daltonian diagrams did not; and, given the goal of constructing models of compounds and chemical reactions, Berzelian formulas were much easier to use. Leibniz, who always insisted on the generative capability of a good ‘characteristic,’ would have been delighted by the new notation of Berzelius. Klein contrasts the old mode of organic chemistry with the new mode that emerged around 1830 by showing that what counted as a scientific object changed drastically, along with the concept of ‘organic substance’ and its material referents in the laboratory; and so too did the system of classification, the type of experiments performed, and the kinds of problems to be solved, the goals and beliefs of scientists. Thus, she argues, pre-1830 plant and animal chemistry and post-1840 carbon chemistry should be considered two different scientific cultures; and her subsequent exposition makes a good case that Berzelian formulae were a centrally significant agent of change in that metamorphosis. 
Examining the interaction between improved laboratory methods of quantitative analysis in the nineteenth century, improvements that nonetheless remained limited, and the recording of laboratory events in terms of Berzelian formulas, Klein writes, Berzelian formulas ... presupposed quantitative chemical analyses of the substances at stake. They were products of the transformation of the analytical results into small integral numbers of ‘portions’ (or ‘atoms’) of elements in that the elemental composition of the substances expressed in weight percentages was divided by the theoretical combining weight (also ‘atomic weight’) of each element.⁴⁰ ³⁹ U. Klein, Experiments, Models, Paper Tools, 26. ⁴⁰ U. Klein, Experiments, Models, Paper Tools, 119.
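The transformation Klein describes, from weight percentages to small integral numbers of portions, can be illustrated with a minimal sketch. It rests on assumptions that are not in Klein's text: modern atomic weights stand in for the theoretical combining weights of the period, the percentages are approximate modern values for alcohol (ethanol), and the function name and tolerance are hypothetical choices made only for illustration.

```python
# Modern atomic weights (an assumption; not the combining weights Berzelius used).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}


def portions_from_percentages(weight_percent, max_multiplier=10, tolerance=0.05):
    """Turn an elemental composition in weight percent into small whole numbers of portions."""
    # Step 1: divide each weight percentage by the element's combining weight.
    quotients = {el: pct / ATOMIC_WEIGHT[el] for el, pct in weight_percent.items()}
    # Step 2: normalise by the smallest quotient, so the scarcest element counts as 1.
    smallest = min(quotients.values())
    ratios = {el: q / smallest for el, q in quotients.items()}
    # Step 3: look for the smallest multiplier that makes every ratio nearly integral.
    for k in range(1, max_multiplier + 1):
        scaled = {el: r * k for el, r in ratios.items()}
        if all(abs(v - round(v)) <= tolerance for v in scaled.values()):
            return {el: round(v) for el, v in scaled.items()}
    raise ValueError("no small integral ratio found within tolerance")


# Approximate modern composition of alcohol (ethanol) by weight, in percent.
alcohol = {"C": 52.14, "H": 13.13, "O": 34.73}
formula = portions_from_percentages(alcohol)
print(formula)  # {'C': 2, 'H': 6, 'O': 1}
```

The tolerance does the work that Klein assigns to theory: the measured quotients are never exactly integral, so the integral formula is a model of the analysis, not a transcription of it.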
In 1827, Jean Dumas and Polydore Boullay revisited an earlier experiment involving the discovery of ‘sulfovinic acid,’ a reaction product that appeared at the beginning of the process of distilling alcohol. The problem of interpreting the reactions in question was the multiplicity of reaction products. Using Berzelian formulas, the two chemists demonstrated that there were two simultaneously occurring but fully distinct reactions between alcohol and sulfuric acid; that is, they brought order to the chaotic cascade of reaction products by distinguishing independent parallel reactions as well as a successive reaction. In the detail of her exposition, Klein shows that the modeling of the reaction depended on the use of Berzelius’s notation, because it exhibited so well the balancing of the masses involved in the interpretive model. The balance of the number of element symbols (C, H, O, and S) on the left and right sides of the chemical equations must exhibit the equality of the masses of the initial substances and the reaction products. Thus in the two parallel reactions, and the reaction subsequent to them, the chemists could show formally that all reaction products had been accounted for. As Ursula Klein argues, in summary, ‘balancing schemata of organic reactions could not be constructed based on the measurement of the masses of the reacting substances, but required the acceptance and application of theoretical combining weights such as those represented by Berzelian formulas.’⁴¹ In the early nineteenth century, it was difficult and often impossible to isolate and measure all the pertinent masses of the substances entering into and resulting from reactions; Berzelian formulas permitted the construction of exact models in the partial absence of experimental data.
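The formal check described here, counting element symbols on each side of an equation, can be sketched in a few lines. The sketch assumes modern formulas and modern formulations of two reactions between alcohol and sulfuric acid. These are not the schemata Dumas and Boullay actually wrote, which used the combining weights of their day, and the helper names are hypothetical.

```python
import re
from collections import Counter


def count_symbols(formula):
    """Count element symbols in a flat formula such as 'C2H6O' (no parentheses)."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def side_total(formulas):
    """Add up the symbol counts of every substance on one side of an equation."""
    total = Counter()
    for formula in formulas:
        total += count_symbols(formula)
    return total


def is_balanced(left, right):
    """An equation balances when both sides carry the same count of each symbol."""
    return side_total(left) == side_total(right)


# Modern formulations (assumptions, for illustration only):
# alcohol + sulfuric acid -> ethyl hydrogen sulfate ('sulfovinic acid') + water
acid_reaction = is_balanced(["C2H6O", "H2SO4"], ["C2H6SO4", "H2O"])
# two portions of alcohol -> ether + water
ether_reaction = is_balanced(["C2H6O", "C2H6O"], ["C4H10O", "H2O"])
# omitting the water leaves a product unaccounted for
incomplete = is_balanced(["C2H6O", "H2SO4"], ["C2H6SO4"])

print(acid_reaction, ether_reaction, incomplete)  # True True False
```

The third case shows what the balance offers the chemist on paper: a missing reaction product appears as a surplus of symbols on one side, even when the corresponding mass was never isolated or weighed.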
Around 1840, Jean Dumas and Auguste Laurent developed a new chemical concept that was incompatible with the theory of the binary constitution of organic substances (a theory that held sway for a couple of decades before 1840), the concept of substitution. Klein shows that the new concept depended just as heavily on ‘work on paper’ with Berzelian formulas as the predecessor theory had. The example given is Dumas’ interpretive models for the formation of ‘chloral,’ a reaction product discovered by Justus Liebig when chlorine gas was introduced into alcohol and heated up. The last of these models, offered around 1835, ⁴¹ U. Klein, Experiments, Models, Paper Tools, 126.
involved a rule that one ‘atom’ of hydrogen of an organic substance can be substituted by one ‘atom’ of chlorine, bromine, or iodine, or half an ‘atom’ of oxygen; this was the basis of what Dumas called his theory of substitution. Moreover, Dumas claimed that the formation of chloral proceeded step-by-step, creating a series of intermediate compounds by a step-wise substitution of one portion of hydrogen by one portion of chlorine. His model, expressed in Berzelian formulas, visually demonstrates the step-wise substitution process and shows that none of the compounds demanded by the theory of binary constitution are involved. Klein defends her bold claim that Dumas and his assistant Laurent owed this theory first and foremost to Berzelian formulas considered as a paper tool on the following grounds. The concept of substitution could not have been derived from any existing chemical theories. No empirical result of experimentation suggested that there was a discrete, step-wise process of substitution. And no prior, explicitly stated goal or aim of Dumas’ contributed directly to the introduction of this rule and theory; on the contrary, it undermined the theory to which he had previously given allegiance. Thus Klein sees here a strong interplay between paper tools and theory construction, precisely the same kind of dialectic that Ian Hacking and Andrew Pickering have noted between laboratory apparatus and the modes of data analysis they foster, and theory construction. Her discussion of this historical process of transformation in the mid-nineteenth century touches on a number of issues that arise in the next chapter. The first is that in nineteenth-century chemistry, alcohol and its derivatives are ‘model substances’ in much the same way that Drosophila is for genetics in the twentieth century.
The reasons for this are both natural (alcohol is a very reactive substance that can be transformed easily) and cultural (human beings have been experts in distilling alcohol cheaply and abundantly since the Neolithic era, for obvious reasons). Also, distilled alcohol is relatively pure, and alcohol contains only carbon, hydrogen, and oxygen, the elements essential to and typical of organic chemistry. Chemists found it relatively straightforward to reconstruct the cascade of reaction products in reactions involving alcohol, and to assimilate the study of alcohol and its derivatives to the methods of inorganic chemistry. The existence and central importance of model objects in science support a distinction between generality and abstraction as a goal of scientific (and mathematical) theories. This is a distinction central to
the research of another philosophical historian of mathematics, Karine Chemla, which I discuss in the next chapter, for the use of canonical items is also essential to the elaboration of general methods of investigation in mathematics. The second point addresses Thomas Kuhn’s notion of theory incommensurability.⁴² Observing that the scientific culture of chemistry in 1827 and its counterpart in 1840 were incompatible, Klein qualifies her claim: ‘I use the term incommensurability not for pointing out a problem of theory choice and theory justification. Rather, I use it as an analytical category to denote the different modes of identifying and classifying the scientific objects, which makes it impossible for us historians of science to compare the two cultures of organic chemistry in any direct way.’⁴³
She adds that her conception of a scientific culture is a fragile, ambiguous unity of heterogeneous elements, which are not fully determined by any unifying theory. Thus (as her historical narrative shows) the incommensurable cultures of 1827 and 1840 are rationally related by overlapping practices. Defending the generative role of Berzelian formulas in this transformation, Klein reiterates her claim that the innovations they precipitated were not derived from previously existing theory or forced upon researchers by experimental data. Scientists neither intended nor foresaw the changes the new notation brought about or the exponential growth of artificial substances in synthetic carbon chemistry it precipitated. ‘Paper tools’ spur and direct human intentions and actions. All the same, Klein challenges the conclusion that a philosopher like Bruno Latour draws from the power of paper tools and laboratory instruments to channel and organize research: for him, human agency is only secondary when science is understood as a series of chains of inscription. By contrast, Klein invokes the tradition of pragmatism, like Robin Hendry and Nancy Cartwright who have worked through the limitations of syntactic and semantic approaches to scientific knowledge. The dialectical interplay between scientists and their conscious objectives, and the tools they employ to realize those objectives, must be reconstructed in terms of the pragmatics of problem-solving. And I shall do the same in this book. Formal languages play an important role in modern mathematics; but formalization is only one kind of representation. ⁴² The Structure of Scientific Revolutions (Chicago: University of Chicago Press, 1996). ⁴³ Klein, Experiments, Models, Paper Tools, 221–2.
And what eludes complete formalization is legion: the intelligible thing, the constitution of ‘combinatorial spaces,’ the knower in relation to both item and representation, the generalization of procedures, how representations may be juxtaposed and exploited in juxtaposition, formalization itself, productive ambiguity, and in general mathematical rationality.
2 Analysis and Experience
In this chapter I explain what I mean by analysis and mathematical experience, variously invoking Leibniz, Locke, Hume, Peirce, and Cavaillès as well as some of my contemporaries. Two figures are conspicuously absent from this historical line-up, Descartes and Kant; my objections to their doctrines in this context have been set forth elsewhere and recur throughout this book, so I won’t pause to rehearse them here.¹ Once I have given my positive account, we will revisit, in Chapter 4, the mid-twentieth-century schemas of theory reduction viewed as the insertion of a reduced theory into a reducing theory, and the deductive-nomological model of explanation, in order to understand in a deeper way why they are inadequate for mathematics, and what alternatives are available.
2.1. Analysis
Analysis is the search for conditions of intelligibility. If we take the notion of term as primary, it is the search for the reasons or causes, the requisites, that are necessary for a thing to be thinkable and possible, to be what it is. To analyze is not to unpack a concept as if it were a bare concatenation of concept-parts, nor to impose externally a set of relations on a thing as if it were a placeholder, but to distinguish or develop or articulate the content included in a thing. This means, against Kant, that what is given to awareness is already unified; unity is not conferred by awareness. Intelligible things have content, which is the ground for their relations with ¹ On Descartes, see my Cartesian Method and the Problem of Reduction (Oxford: Oxford University Press, 1991); on Kant, see my ‘Jules Vuillemin’s La Philosophie de l’algèbre: The Philosophical Uses of Mathematics,’ Philosophie des mathématiques et théorie de la connaissance: L’Oeuvre de Jules Vuillemin, eds. R. Rashed and P. Pellegrin (Paris: Albert Blanchard, 2005), 253–70; as well as related discussions later on in Ch. 9 of this book.
other things: there are only a few surds or placeholders in mathematics (the geometrical point, zero, the empty set) and each of them must be treated separately, and as enclitic or indexical. My position is that to be intelligible is to exist and to be one, though there are different ways in which things can exist and be unified; it is also to be expressive—intelligible things lend themselves to relations of analogy and may stand for other things. Moreover, against Descartes, I note that awareness of intelligible things, even ‘simple’ intelligible things, may be far from complete understanding, for intelligibility is not transparency: intelligible things are problematic and must be investigated. If we take the notion of proposition as primary, analysis is the search for the conditions of solvability of a problem. This process was first formulated in the classical era by the fourth century Greek geometer Pappus: to solve a problem, consider it as solved and then introduce a hypothesis that would be a sufficient condition for its solution, obtained from the problem and plausible with respect to other, pertinent data. This process is open-ended in the sense that the hypothesis itself can be ‘analyzed’ for its plausibility; and it is highly specific, for the pathways leading to the solution of a problem are closely connected with it. Moreover, pathways may traverse different domains; and there is no reason to think that only one, or only one preferred, pathway exists. If we take the notion of argument as primary, analysis is the exploration of the meaning of procedures, algorithms, and methods in terms of paradigmatic problems that may be used to exhibit their correctness, clarify their domain of application, and indicate how that domain can be extended. This is a kind of analysis, a process of generalization not abstraction, which we often find in textbooks; it is as commonplace as axiomatic presentations. 
All discourse involves terms, propositions, and arguments, so I will review analysis by focusing on each correlated interpretation in turn and then consider them together at the conclusion of the discussion, to lead into the discussion of mathematical experience, which of course includes and goes beyond discourse.
2.1.1. Analysis of Terms
Leibniz’s method of analysis is a rational search for grounds of thinkability, and moves from something that is complex but unified to something simple and explanatory. In general, more than one way of proceeding from the complex to the simple is available, and what may play the
role of the ‘simple’ requires reflection, and may be revised. (His late doctrine of monadology may be an exception to this, but I find Leibniz more interesting in his middle period.²) The complex is not, as it is on the reductive account of Descartes, exhaustively defined by means of the simple; rather, for Leibniz, the simple is defined in terms of the complex as a kind of degree zero, different from but standing in rational relation to it. The simple and the complex, while heterogeneous, are held together in virtue of his ‘principle of continuity,’ discussed in the chapter on Leibniz below. Leibnizian analysis is not the unpacking of merely concatenated simples; ‘mere concatenation,’ supposing there is such a thing, rarely appears in the constitution of complexes and when it does it is thoroughly problematized, as when the truths of arithmetic are led back to the primitive notion of the unit.³ Thus Leibniz leads the labyrinth of the continuum back to the point, the phenomenon of motion to rest, vis viva to vis mortua, perception to petites perceptions, the complex forms of biology to rudimentary monads, and the projects of law and science to rational, self-conscious monads. This family of methods related by analogy is fundamentally opposed to the method of Descartes, for the analysis it carries out is not ‘reductionist.’⁴ Descartes presents both the simples and his means of concatenation as if they were obvious, unique, and transparent to reason, not the outcome of philosophical reflection; and he assumes that the complex can be fully recovered from a concatenation of simples, which are taken to be homogeneous with the complex.
By contrast, for Leibniz the discovery of what is simple in a given context is the outcome of philosophically reflective analysis that typically results in a hypothesis that may need to be reconsidered, and the ways in which simples may be associated also provide the occasion for further philosophical reflection and revision. Leibnizian analysis is thus not a trivial spelling out of what has already been thought in the terms involved in a claim, as Kant would have it, nor is it the extracting of an empty form that universally determines the truth of any contents inserted ² See, for example, my ‘Theomorphic Expression in Leibniz’s Discourse on Metaphysics,’ in Studia Leibnitiana, Thematic Issue: Towards the Discourse on Metaphysics, ed. G. H. R. Parkinson, 33, (1) (2001), 4–18. ³ I discuss this example in detail in ‘L’analogie dans la pensée mathématique de Leibniz,’ in L’Actualité de Leibniz: Les deux labyrinthes, eds. D. Berlioz and F. Nef, Studia Leibnitiana, Supplementa 34 (Stuttgart: Steiner, 1999), 511–22. ⁴ See my book, co-authored with Elhanan Yakira, Leibniz’s Science of the Rational, Studia Leibnitiana Sonderheft, 26 (Stuttgart: Steiner, 1998), ch. 1.4.
into it. It is not ‘reductionist,’ and cannot be reversed into a Cartesian process of retrieving the complex from the simples. It is, rather, a method that works variously but analogously in many fields of human endeavor, a philosophical project that searches for the rational conditions underlying manifold complexity, but which can never come to a definitive conclusion. A good example of the analysis of intelligible things may be found in Euclid. The main theme of Book I of Euclid’s Elements is the triangle; its meditation on the triangle is organized around a master problem, the enunciation and proof of the Pythagorean Theorem, which is solved in Proposition 47. The main theme of Books III and IV of the Elements is the circle; its meditation is also, rather more loosely, organized around a master problem, the Squaring of the Circle—or the precise determination of the area of the circle—though the problem is not solved in the Elements. (One can say either that it is insoluble by ruler and compass construction, or that it may be solved by means outside the system of the Elements.) Both problems have to do with the way the whole of the figure constrains its parts, imposing an ordered relationship upon them: how a right triangle constrains its sides, and how a circle constrains its diameter and circumference, as well as the angles inscribed within it.⁵ The nature of this constraint is significant. Its investigation (the kind of analysis that takes place in Book I and Books III and IV) seeks to discover what makes a shape the shape it is: what are the requisites for a right triangle to be a right triangle, or a circle a circle? Though these analyses look into the parts and their interrelations, what they uncover is the way in which the whole (of a right triangle, or a circle) is greater than—heterogeneous with respect to and prior to—the parts, which in turn is the key to the irreducibility of shape as bounded extension. 
The analysis of shape, the search for its conditions of intelligibility, returns us to the integrity of shape. First, figure is not the mere concatenation of parts somehow homogeneous to it: a triangle is not ‘really’ a set of points, or even a set of line segments. Second, when a part of a triangle or circle is altered, other parts must undergo a compensatory alteration, an adjustment so that the triangle may remain a triangle (or, the kind of triangle it is) or so that the circle may remain a circle. No part has a relation to another part that is not mediated by the whole to which they belong.
⁵ Euclid, Elements, ed. T. L. Heath (New York: Dover, 1956), 3 vols.
analysis and experience 37 We are used to calling ‘parts’ the boundaries of a triangle—primarily the lines which are its sides, but sometimes also the points which are its vertices (and the boundaries of its sides) as well as its internal angles. And so they are, but parts in a sense different from units as parts of whole numbers, or atoms as parts of molecules, or bricks as parts of a house, or elements as parts of a set. Analytic methods in different domains must proceed differently. Euclid’s definitions remind us to keep distinct, while still rationally related, things we are used to thinking of as homogeneous. Set theory in connection with modern logic has persuaded us to accept as an equation what is only an analogy, and to suppose that the things of geometry may be unproblematically and unreflectively decomposed into (sets of) points. Many propositions in Book I show that sides and angles in a triangle provide both global and local constraints on each other. Proposition 6 of Book I shows that a triangle, two of whose three angles are equal, must have two sides equal, namely the two sides subtended by the two equal angles. Proposition 18 proves that in any triangle, the greater side subtends the greater angle, and Proposition 19 that the greater angle is subtended by the greater side. The side opposite an angle, we might say, reflects that angle, and vice versa. Proposition 17 and Proposition 20 are related in exhibiting a global condition: the former shows that any two interior angles of a triangle taken together are less than two right angles, and the latter that any two sides taken together are greater than the remaining one. We might say, a triangle may be flattened, but if it is flattened all the way down to a line, it is no longer a triangle. Propositions 28–31, concerned with the angles that straight lines make as they cut parallel lines, have received a great deal of discussion due to their relation with the Parallel Postulate. 
But the point of these propositions may be lost sight of; Euclid’s interest is not so much in the parallel lines themselves, but in the parallelograms they constitute when they are bounded. Euclid is intent on showing how parallelograms constrain the triangles that are inscribed within them, constraints which compound, as it were, the constraints already present within triangles because they are triangles. The right triangle in Proposition 47, the Pythagorean Theorem, is then presented as caught in a web of constraints pictured by the auxiliary constructions that surround it, and which include not only the squares built upon its sides but a series of parallelograms and triangles within them. These
constraints show that no matter how we perturb a right triangle, it will always be the case that ‘the square of the side subtending the right angle is equal to the (sum of the) squares on the sides containing the right angle.’ Likewise, Book IV makes use of a series of results in Book III that show how the circle constrains the angles we may inscribe inside it; these results are often presented in terms of triangles, particularly right triangles, one of whose sides is a diameter of the circle. These results in turn are used to show, in Book IV, how circles constrain the figures we inscribe within them (and, sometimes, circumscribe around them). The general progression of Book IV is the inscription of rectilinear figures with a greater and greater number of sides within the circle, and the tour de force in Proposition 16, the last proposition in the book, is the inscription of a fifteen-angled, equilateral and equiangular figure in the circle. Because the squaring of the circle is insoluble within Euclid’s Elements, the way this unsolved problem organizes Books III and IV is striking. For Proposition 16 is quite clearly not an ending: it is a gesture in a certain direction, made evident by the trajectory of the propositions leading up to it. That gesture points towards Archimedes and the seventeenth century. Even though the analysis here is inconclusive, it proposes the rational relatedness of the curvilinear and the rectilinear by what we might call the inductive direction of its reasoning. In Euclid’s Elements, the canonicity of the straight line (and of the triangle as the simplest figure constructible by straight lines on the plane) and of the circle is clear.
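The direction in which Book IV gestures can be made concrete with a modern formula that is not Euclid’s own and is offered here only as an illustrative sketch. Assuming a circle of radius r and a regular inscribed figure of n sides (the symbols A_n, n, and r are introduced for this illustration), the figure divides into n isosceles triangles with apex angle 2π/n at the center, and its area approaches that of the circle as n grows:

```latex
% Modern illustration, not found in the Elements: area of a regular n-gon
% inscribed in a circle of radius r, as a sum of n isosceles triangles.
\[
  A_n \;=\; \frac{n}{2}\, r^{2} \sin\!\left(\frac{2\pi}{n}\right),
  \qquad
  \lim_{n\to\infty} A_n \;=\; \pi r^{2}.
\]
% Approximate values for r = 1 at the numbers of sides that appear
% in Book IV (regular case): n = 3, 4, 5, 6, 15.
\[
  A_3 \approx 1.299,\quad
  A_4 = 2,\quad
  A_5 \approx 2.378,\quad
  A_6 \approx 2.598,\quad
  A_{15} \approx 3.051,\quad
  \pi \approx 3.1416 .
\]
```

The limit itself lies outside the means of the Elements; the values merely show the rectilinear figures closing in on the curvilinear one.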
When Hilbert, in his role as formalist, claims that all relevant geometrical information is embedded in sets of axioms, so that geometry is only what is common to all interpretations of a theory up to isomorphism, he cannot account for the canonicity of certain objects.⁶ If ‘point,’ ‘line,’ and ‘plane’ can be given alternative interpretations that yet produce a model isomorphic to Euclid’s, then there is no reason to demur; and there is no reason to prefer a Euclidean to a non-Euclidean theory for geometry. Geometry then becomes a kind of smorgasbord of models; philosophers of the late nineteenth century were dismayed because they felt they had lost all grounds for choosing which geometry was ‘true,’ since the appeal to (usually Kantian) intuition had been discredited as subjective, in either a psychologistic or transcendental sense. ⁶ Grundlagen der Geometrie (Teubner 2002); Foundations of Geometry (Chicago: Open Court, 1971). See also David Hilbert’s Lectures on the Foundations of Geometry, 1891–1902, eds. M. Hallett and U. Majer (New York: Springer, 2004).
Yet Hilbert, writing in his role as geometer, acknowledges the canonicity of certain objects as a matter of course, for without appeal to that canonicity a domain like differential geometry could not even be broached. Introducing the chapter on differential geometry in Geometry and the Imagination, a book full of diagrams and photographs, he writes, ‘We will, to start with, investigate curves and surfaces only in the immediate vicinity of any one of their points. For that purpose, we compare the vicinity, or ‘neighborhood,’ of such a point with a figure which is as simple as possible, such as a straight line, a plane, a circle, or a sphere, and which approximates the curve as closely as possible in the neighborhood under consideration ...’⁷
The related, more modern notion of differentiable manifold, central to topology, makes the same appeal to canonical (flat) space. A (Hausdorff) topological space is called ‘locally Euclidean’ when, for every point of the space, there is an open set containing the point and a homeomorphism that maps that open set onto an open set of R^n, that is, when the space looks flat in sufficiently small neighborhoods.⁸ This notion underlies the more general notion of differentiable manifold, and makes possible the definition of functions, and the differentiation and integration of functions, on such manifolds. Both expositions make use of iconic as well as symbolic representations, linked by natural language, in order to refer as needed to canonical items in their specificity. Canonical items don’t present themselves as canonical to ‘intuition,’ but prove to be canonical in the reflective analysis of the mathematician, that is, prove to be conditions of the intelligibility of more general and complex mathematical things. In order to define and integrate a function on a strange new topological space, a mathematician must find a way to lead the situation back to more well-known and tractable situations; understanding the new space as locally like other, canonical spaces is one highly successful method for solving the problem. Here we see the way in which analysis as problem-solving, and analysis as the reflective search for the simple things underlying complexity, or more generally for the conditions of intelligibility, are so closely linked. The canonicity of
⁷ D. Hilbert and S. Cohn-Vossen, Anschauliche Geometrie (Berlin: Springer, 1996); Geometry and the Imagination (Providence: American Mathematical Society, reprint edition, 1999).
⁸ I. M. Singer and J. A. Thorpe, Lecture Notes on Elementary Topology and Geometry (Glenview, IL.: Scott, Foresman & Co., 1967; repr. New York: Steiner), 109–10.
mathematical things is thus not passively encountered, but emerges out of a process of discovery and reflection in which certain things prove to be ineluctably canonical: mathematical experience. What is specific to mathematics is that the canonicity of certain things was apparent 2500 years ago, and time has not dispelled their canonicity, but only revealed deeper explanations of it, and new uses for it.
2.1.2. Analysis of Propositions
The best formulation I know of analysis in this second sense comes from two books by Carlo Cellucci: Le ragioni della logica and Filosofia e matematica.⁹ As my dedication attests, this book owes a great deal to Cellucci’s work. He systematically and polemically contrasts the analytic method of proof discovery with the axiomatic method of justification, which reasons ‘downwards,’ deductively, from a set of fixed axioms. He argues that the primary activity of mathematicians is not theory construction but problem solution, which proceeds by analysis, a family of rational procedures broader than logical deduction. Analysis begins with a problem to be solved, on the basis of which one formulates a hypothesis; the hypothesis may give rise to another problem which, if solved, would constitute a sufficient condition for the solution of the original problem. To address the hypothesis, however, requires making further hypotheses, and this ‘upwards’ path of hypotheses must moreover be evaluated and developed in relation to existing mathematical knowledge, some of it available in the format of textbook exposition, and some of it available only as the incompletely articulated ‘know-how’ of contemporary mathematicians. Indeed, some of the pertinent knowledge will remain to be forged as the pathway of hypotheses sometimes snakes between, sometimes bridges, various domains of mathematical research (some of which may be axiomatized and some not), or when the demands of the proof underway throw parts of existing knowledge into question.
In such situations, I would add, different traditions of representation are typically juxtaposed and superimposed, and their conjunction must be explained in natural language. ⁹ Le ragioni della logica (Rome: Editori Laterza, 1998); Filosofia e matematica, (Rome: Editori Laterza, 2002). A good introduction in English to Cellucci’s position can be found in ‘The Growth of Mathematical Knowledge: An Open World View,’ The Growth of Mathematical Knowledge, eds. E. Grosholz and H. Breger (Dordrecht: Kluwer, 2000), 153–76.
Proof searches, Cellucci argues, do not appeal to fixed axioms, but rather to provisional hypotheses that can be changed in the course of proof and are sought by a trial-and-error process, establishing a path or ordering of propositions that is local, not global, problem-dependent, not problem-independent. Whereas the proof strategy of analysis presupposes background knowledge, much of which must remain tacit, the axiomatic method seeks axioms that presuppose no other knowledge and that will ‘cover’ a host of arbitrary propositions, so that all pertinent knowledge is articulated; to use Cellucci’s formulation, it seeks a ‘closed universe’ of mathematical knowledge. Since the hypotheses made in an analytic proof search may sometimes be discarded or revised in the light of mathematical knowledge brought to bear on the proof, and since proof discovery is so central to mathematical rationality, it is better to think of mathematics as an open rather than as a closed system, Cellucci concludes. Philosophers who restrict rationality to fixed deductive systems can only dismiss discovery as intuition or mystical vision; by contrast, he urges the study of rational discovery procedures. Sometimes Cellucci formulates analysis as reasoning that moves upwards in search of a hypothesis from which a result may be inferred rather than moving downwards from axioms to theorems, that is, as a relation among propositions. This is also the way in which Hintikka and Remes discuss the method in their influential book, The Method of Analysis: Its Geometrical Origin and its General Significance,¹⁰ and it is the way in which Peirce’s notion of abduction is typically presented. This schematization is not incorrect, but it is incomplete; and each of the authors just cited testifies to this incompleteness as they go on writing about it.
Cellucci, for example, argues in a forthcoming paper that the purpose of analysis is explanation, the discovery of explanatory proofs; as examples he gives a series of reasonings about diagrams as well as equations, proportions, infinite series, where expository prose relates the various modes of representation. He notes, ‘Hypotheses need not belong to the same genus as the problem, so no branch of knowledge is a closed system,’ and also, ‘a hypothesis for the solution of a problem in the analytic method is closely connected with it, since it is aimed at it and serves to solve it specifically.’¹¹ This means ¹⁰ J. Hintikka and U. Remes (Stuttgart: Springer, 1974). ¹¹ ‘The Nature of Mathematical Explanation,’ presented at the conference Philosophy of Mathematics Today, Scuola Normale Superiore di Pisa, Pisa, 2006.
that heterogeneous items and traditions of representation are going to be brought very closely together, harnessed by the need to solve a problem in a way that explains why the proof works. Jaakko Hintikka in a forthcoming paper observes that while his own earlier co-authored book discussed analysis in terms of relations among propositions, analysis seems to mean (or at least emphasize) something else, viz. a study of the interrelations of different geometrical objects in certain figures, that is, in certain geometrical configurations. This is the sense of ‘analytic’ in analytic geometry, which came about when interdependencies of different geometrical objects in a given configuration began to be expressed algebraically.¹²
And Marcus Giaquinto, whose earliest work is concerned with Hilbert’s formalism, in his recent and forthcoming books uses a sometimes phenomenological, sometimes rather more neuro-psychological approach, to investigate how ‘configurations’ understood in a broad sense are used in mathematical discovery.¹³ Cellucci often dismisses Peirce’s notion of abduction as ‘banal,’ but I believe that what he dismisses is only the schematic view of abduction as a relation among propositions; a much richer account of abduction may be found scattered among Peirce’s writings on mathematics. Daniel Campos, in his application of Peirce’s ideas to the history of probability theory, defends abduction as a creative and plausible or hypothetical mode of reasoning, strongly colored by pragmatic considerations (not surprisingly!).¹⁴ Campos writes in Chapter 2, Peirce is interested in accounting for the reasoning activity of scientific inquirers within the actual complex situations in which they find themselves thinking, and ¹² ‘Analyzing (and Synthesizing) Analysis,’ draft document. Like Cellucci but in a different key, Hintikka has a deep and interesting way of formulating a logic of discovery; see for example his Inquiry as Inquiry: A Logic of Scientific Discovery (Dordrecht: Kluwer, 1999). ¹³ See his Visual Thinking in Mathematics: An Epistemological Study (forthcoming from Oxford University Press) and his recent The Search for Certainty: A Philosophical Account of Foundations of Mathematics (Oxford: Clarendon Press, 2002), as well as ‘From Symmetry Perception to Basic Geometry,’ and ‘Mathematical Activity,’ Visualization, Explanation and Reasoning Styles in Mathematics, ed. K. Joergensen and P. Mancosu. (Dordrecht: Kluwer, 2003). ¹⁴ Daniel Gerardo Campos, ‘The Discovery of Mathematical Probability Theory: A Case Study in the Logic of Mathematical Inquiry.’ Philosophy PhD thesis, State College, The Pennsylvania State University, Pennsylvania, 2005. 
For a general exposition of abduction, see ch. 2, 30–56; for a historical case study of ‘creative abduction,’ see ch. 6, 272–94. (For the original source of the term, see C. S. Peirce, ‘Pragmatism as the Logic of Abduction,’ in The Essential Peirce, Vol. 2 (Indianapolis: Indiana University Press, 1998), 226–41).
in assessing the merit of their methods of reasoning on the basis of their aim within the context of scientific inquiry. The ‘economy of research’ in scientific inquiry entails that inquirers, at various points, must make abductive conjectures and test them inductively.¹⁵
Campos shows that Peirce distinguishes ‘creative abduction’ from ‘ordinary abduction’ (the conjectural classification of facts by means of available laws) and ‘induction’ (which only serves to test a conjecture, not to suggest it). Creative abduction occurs when the general rule is not given in advance, and the inquirer must forge the rule as a plausible hypothesis. Once again, heterogeneity enters in: Creative abductive inferences are the site of novel conception, as the entities involved in the causes, principles, or laws that plausibly explain the observed phenomena are often of a different nature from the observed phenomena. That is, when we propose a hypothetical explanation for an observed fact, we often conceive causes, principles, or laws that involve entities that are essentially different from the observed fact. In abduction, we begin with an observed particular phenomenon, we suppose a general explanation—a cause, a law, or a principle—that involves entities that are of a different nature than the observed particular phenomenon, and we provisionally conclude that the supposition is plausible.¹⁶
Good examples of this kind of analytic reasoning are discussed in the chapters below: the reduction of problems of genetics to problems of molecular biology; the reduction of problems of geometry to problems about polynomial equations or differential equations; the reduction of problems of planetary motion to problems concerning centers of force. The model of theory reduction proposed by Carnap and Hempel requires that both the reduced domain and the reducing domain be rewritten as formal theories; the axioms of the theory of the reduced domain are then deductively derived as theorems from the theory of the reducing domain. If, however, theory reduction is better viewed as problem reduction, and if problem reduction is a kind of analysis, we may expect that the chapters to come will radically call into question the mid-twentieth century account of theory reduction. If the convergence of heterogeneous traditions in a problem solution is accompanied by the convergence of distinct traditions of representation, we may likewise expect that the multiplicity of modes of ¹⁵ Campos, ‘The Discovery of Mathematical Probability Theory,’ 49. ¹⁶ Campos, ‘The Discovery of Mathematical Probability Theory,’ 52–53.
representation and attendant productive ambiguity will call into question Carnap’s ideal of a scientific language. The generalizing effect of creative abduction noted by Campos carries us to the next section.
2.1.3. Analysis of Arguments
The analysis of intelligible objects, the explanation of problematic facts by plausible explanatory hypotheses, and the convergence of disparate traditions in the service of problem-solving, take place within a mathematical tradition that continually reformulates and reorganizes its results in order to transmit them. Mathematics textbooks sometimes present material in axiomatic form: we are used to the axiomatic presentation of Euclid, and to the highly abstract, highly axiomatized, diagram-free presentations of the twentieth-century Bourbaki school and its offspring. However, there are other ways to analyze collections of problems and procedures and to teach the methods of mathematics. Many modern textbooks use an exposition that prizes generalization over abstraction, and I discuss some of them below, in reference to topology. And if we look at the text that is the analogue of Euclid’s Elements in Chinese culture, we find a striking instance of it as well. Karine Chemla and Guo Shuchun have recently re-edited and translated (into French) the Chinese mathematical classic Les Neuf chapitres, with extensive notes, glossary and bibliography.¹⁷ In the introductory essay, Chemla presents her central thesis, at once historical and philosophical. The scholarly mathematical tradition of ancient China valued generality as much as, if not more than, abstraction. We should not misread Les Neuf chapitres as a mere compendium of concrete problems: the original text taken together with the commentaries exhibits an important kind of organization that is, however, not the axiomatic treatment of Euclid’s Elements.
It is instead the exploration of the meaning of procedures and algorithms in terms of paradigmatic problems that may be used to exhibit their correctness, clarify their domain of application, and indicate how that domain may be extended. Because it is concerned with explaining why a result is correct and not just establishing that it is correct, it employs the
¹⁷ Les Neuf chapitres: Le Classique mathématique de la Chine ancienne et ses commentaires, with a preface by Geoffrey Lloyd (Paris: Dunod, 2004). I wrote a review essay of this book in Gazette des Mathématiciens, 105 (July 2005), 49–56.
approaches discussed in the preceding section. Carl Boyer, in A History of Mathematics (1985), thus misreads Les Neuf chapitres when he writes, ‘Whereas the Greeks of this period were composing logically ordered and systematically expository treatises, the Chinese were repeating the old custom of the Babylonians and Egyptians of compiling sets of specific problems.’¹⁸ On the contrary, Chemla argues, we must examine more reflectively how the Chinese mathematicians make use of the problems. The nine chapters of the work introduce nine fundamental procedures. Each procedure, often described in abstract terms without mention of any particular problem, is placed at or near the beginning of the chapter and gives its name to the chapter. In chapters where a problem leads off, it is exhibited as a paradigm for the clear exhibition of a procedure or the algorithm that can be elicited from it. The working out of problems typically exhibits intermediary steps in a process of reasoning that contributes to the meaning of the final result and indicates how it might be extended. The exhibition of the meaning of procedures and the correctness of algorithms in terms of paradigmatic problems and canonical objects thus involves the combination of distinct modes of representation, descriptions of how to reason both upwards and downwards, and figures that exhibit spatial articulation. The overarching order that organizes Les Neuf chapitres is a progressive and pedagogical search for the reasons that underlie general procedures and the constitution of objects, a search for deeper as well as broader understanding.
Chemla has explored modern instances of similarly structured reasoning in the work of nineteenth century geometers, Lazare Carnot, Jean-Victor Poncelet, Joseph Gergonne, and Michel Chasles; her younger colleague Anne Robadey explores it in Poincaré’s work concerning geodesics on convex surfaces, and celestial mechanics.¹⁹
¹⁸ A History of Mathematics (Princeton: Princeton University Press, 1968/85), 218.
¹⁹ Karine Chemla, ‘Lazare Carnot et la généralité en géométrie. Variations sur le théorème dit de Menelaus,’ Revue d’histoire des mathématiques, 4 (1998), 163–90; ‘Remarques sur les recherches géométriques de Lazare Carnot,’ in Lazare Carnot ou le savant-citoyen (Paris: Presses de l’Université de la Sorbonne, 1990), ed. Jean-Pierre Charnay, 525–41; ‘Le rôle joué par la sphère dans la maturation de l’idée de dualité au début du XIXe siècle. Les Articles de Gergonne entre 1811 et 1827,’ Actes de la quatrième université d’été d’histoire des mathématiques, Lille, 1990 (Lille: IREM, 1994), 57–72. With Serge Pahaut, ‘Préhistoires de la dualité: explorations algébriques en trigonométrie sphérique, 1753–1825,’ Sciences à l’époque de la revolution, ed. Roshdi Rashed (Paris: Librairie A. Blanchard, 1988), 149–200; and ‘Histoire ou préhistoire de la dualité. Relecture des triangles sphériques avec et après Euler,’ Aspects de la dualité en mathématiques, ed. Paul Van Praag, Cahiers du Centre de Logique, Vol. 12 (Université
But we should not suppose, despite the great influence of Bourbaki, that generalization as an ideal has been wholly suppressed by the ideal of abstraction and axiomatization in the twentieth century. On the contrary, the two ideals appear to co-exist dialectically. If we consult, for example, the earlier textbooks cited in the Introduction to the Topologie of Paul Alexandroff and Heinz Hopf,²⁰ we see that topology in 1935 had become divided into two distinct research programs. Pursuing one avenue, Oswald Veblen and Solomon Lefschetz write presentations of algebraic geometry and differential geometry, the study of algebraic varieties and differentiable manifolds by topological methods. Lefschetz’ presentation is generally historical; he shows the student where and why various problems arose, and how they have been addressed.²¹ Veblen explores the foundations of the field and offers an axiomatization of differential geometry.²² The other avenue is explored by Maurice Fréchet and Casimir Kuratowski, who use topology to investigate the abstract function spaces (infinite-dimensional spaces) and infinitary point-sets that arose in real and complex analysis. Fréchet’s presentation, like that of Lefschetz, is historical,²³ while Kuratowski concentrates on an axiomatization of topology. Kuratowski writes at the beginning of his textbook, ‘the methods of reasoning that I use in this volume belong to set theory; the methods of combinatorial topology (homology, Betti groups, etc.) in general don’t intervene in the questions treated here.’²⁴ The authors whose approach is historical (Lefschetz on the one hand and Fréchet on the other) present the analysis of canonical objects in a process of generalization. The authors who aim at logical systematization (Veblen on the one hand and Kuratowski on the other) present axiomatizations that ‘synthesize’ the whole domain by relating principles, rules, and definitions inferentially in a process of abstraction.
Both approaches, in tandem, are needed to bring up students who will do research in topology.
Catholique de Louvain, 2003), 9–25. See also Anne Robadey, ‘Exploration d’un mode d’écriture de la généralité: L’Article de Poincaré sur les lignes géodésiques des surfaces convexes (1905),’ Revue d’histoire des mathématiques, 10 (2004), 257–318.
²⁰ (Berlin: Springer, 1935).
²¹ Lefschetz’ historical presentation in L’analysis situs et la géométrie algébrique (Paris: Gauthier-Villars, 1924) and Géométrie sur les surfaces et les variétés algébriques (Paris: Gauthier-Villars, 1929), begins with Poincaré’s analysis situs, which studies the distribution of curves on an algebraic surface.
²² Foundations of Differential Geometry (Cambridge: Cambridge University Press, 1932).
²³ M. Fréchet, Les espaces abstraits et leur théorie considérée comme introduction à l’Analyse générale (Paris: Gauthier-Villars, 1928).
²⁴ Topologie I (Warsaw/Lvov, 1933), IX.
2.2. Mathematical Experience
2.2.1. Leibniz, Cavaillès, and Breger: Formal Experience as Formal
For Leibniz, as also for Plato, the notion of analysis (both philosophical and mathematical) devolves into a critique of empiricism and materialism, for the analysis of perceived things, or even more generally of material things, invariably leads beyond them, to things that are encountered in reflection as their conditions of intelligibility, explaining not only their orderliness but also what makes them puzzlingly incomplete and unable to account for themselves. The intelligibility of perceived things, for example, depends on their countability and measurability, their integrity and side-by-side-ness, their shape, their persistence, and their development; thus one is often led back in reflection to further intelligible (but not perceptible) unities, that is, to number and figure. Moreover, reflection upon number leads us back to figure, and reflection on figure to number; number and figure are the Adam and Eve of mathematics. Because they are so determinate, the features that number and figure turn out to have, they have necessarily, which allows us to account for the force of demonstration, and to furnish explanations that exhibit the reasons for things being the way they are. For both Leibniz and Plato, analysis guides any attempt to move from perceptual experience to science, or from science to philosophy; and the intelligibility it pursues is objective, in the sense of being independent of the accidents of the empirical world, and in particular of human subjectivity. What is at stake is not ‘necessary conditions of the possibility of human experience,’ i.e. Kant’s transcendental conditions, but rather conditions of the intelligibility of existing things. Mathematical objects do not exist in the way that material or perceived objects exist, but their existence is called for in order to explain the intelligibility of the latter, and indeed of anything thinkable.
However, our awareness of intelligible things proceeds by representations, even though it cannot be identified with any one such representation or even a ‘complete set’ of them, as there are no such complete sets. Intelligible things are inexhaustible—they elude and inspire all our representations of them; yet there is no mathematics without representations. The things of mathematics, as I noted in the Preface of this volume, are determinate but infinite; the natural numbers, the line, the space. As determinate,
they lend themselves to discourse and indeed to a discourse that exhibits necessary connections, but as infinite, they are recalcitrant. Our notations and diagrams are paper tools that represent the infinitary in finitary terms by inducing periodicity and, more generally, repetition from mathematical things; by assigning limits or constraints; by articulating continua; and simply by negating, by excluding. Human discourse frames intelligible but fathomless and endless reality, like a window through which we see the star-struck blackness of the night sky; and this is also true for mathematical discourse. Our notations and diagrams also suggest, and are suggested by, paradigmatic items and problems that focus mathematical activity on one thing rather than another. Mathematical notation is selective, and it makes things ‘compact,’ and so surveyable, but reduced.²⁵ Awareness of an intelligible thing cannot be summed up in its representations, but also cannot be distinguished from them; the best we can do is ‘triangulate,’ that is, investigate it using a variety of representations, each of which may capture distinct and complementary aspects. This is why mathematicians like to re-prove the same theorem by different means, along different analytic pathways; and why the philosophical project of finding the sole correct representation for mathematics is misguided. Leibniz is often praised for his prescient appreciation of the important role that formal modes of representation play in mathematical and scientific discovery. His conviction about the usefulness—indeed the indispensability—of symbolic characteristics to an ars inveniendi certainly stemmed from the great success of his infinitesimal calculus, which expanded the characteristic of Descartes’ algebra to include the symbol for differentiation (dx) and integration (∫ x) as well as notation for infinite sequences and series. 
The admiration bestowed on Leibniz at the turn of the last century by Louis Couturat and Bertrand Russell, however, has obscured two important features of Leibniz’s use of characteristics. Russell, as mentioned above, was committed to a program of logicism, which sought a unified language for logic and procedures for reducing all of mathematics to logic, via a reduction first of arithmetic to logic and then of geometry to arithmetic. Because Russell read Leibniz as a logicist, he missed Leibniz’s emphasis on the multiplicity of formal languages, admirably traced in Hourya Sinaceur’s ‘Ars inveniendi et théorie des modèles’.²⁶ For that matter, he seems to have missed Frege’s late emphasis on the irreducible multiplicity of the Begriffsschrift, also admirably traced in Claude Imbert’s Pour une histoire de la logique.²⁷ More to the point, Russell missed the significance of the broad range of modes of representation that recur throughout the thousands of pages of Leibniz’s Nachlass, where his novel characteristics combine with a stunning variety of tables, sketches, and diagrams. Different formal languages (symbolic and iconic) reveal different aspects of a given domain, and lend themselves better to certain domains than to others. Russell’s narrowly-focused vision remained fixed on a linear symbolic characteristic, and neglected the conceptual possibilities offered by various two- and three-dimensional symbolic representations, as well as the more iconic representations of geometry, topology, mechanics, and chemistry. Indeed, it missed the spatial and iconic aspects (and limitations) of the characteristic he helped to construct on the basis of the Begriffsschrift. Given his emphasis on justification at the expense of discovery, Russell also missed Leibniz’s insight that writing, the use of characteristics and other modes of representation to express thought and analyze the conditions of intelligibility of things, allows us to say more than we know we are saying: the best have a kind of generative power, especially when they are used in tandem. This is the positive converse to the negative results of Gödel’s Incompleteness Theorem.
²⁵ For deep and challenging arguments in favor of realism and the infinitary nature of mathematical reality, see J. Vuillemin, ‘La question de savoir s’il existe des réalités mathématiques a-t-elle un sens?’ Philosophia Scientiae, 2(2) (1997), 275–312. Related arguments can be found in G. Oliveri, ‘Mathematics as a Quasi-Empirical Science,’ Foundations of Science, 11 (2006), 41–79.
A good formal representation advances knowledge not only by choosing especially deep fundamental concepts, but also by allowing us to explore the analogies among disparate things, a practice which in the formal sciences tends to generate new intelligible things, some of which I have called ‘hybrids’ in various articles.²⁸ Moreover, modes of representation may stand for and refer to themselves as intelligible items, and so add to the furniture of the universe. The investigation of conditions of intelligibility not only discovers but also induces order: we add to the non-totalizable infinity of intelligible things as we search for conditions of intelligibility using a spectrum of modes of representation.
²⁶ Dialogue, 27 (1988), 591–613.
²⁷ (Paris: PUF, 1999).
²⁸ See my ‘The Partial Unification of Domains, Hybrids, and the Growth of Mathematical Knowledge,’ The Growth of Mathematical Knowledge, eds. E. Grosholz and H. Breger (Dordrecht: Kluwer, 2000), 81–91; my notion of hybrid is discussed at length in Carlo Cellucci’s Filosofia e matematica, ch. 37.
Mathematical experience is thus the study of intelligible things in a tradition of representation that uses a spectrum of modes of representation to investigate them. We might say, with the great French philosopher of mathematics Jean Cavaillès, that mathematical experience is a formal or discursive experience that mathematicians and students acquire to varying degrees, comparable to the formal experience of the law acquired by lawyers and judges. It is the mastery of ‘combinatorial spaces’ constituted in the history of mathematics; mathematical notation and figures belong to traditions of representation that severely constrain what may be put on the page, how marks may be set next to other marks, and what meanings they may take on.²⁹ In his essay ‘Tacit Knowledge and Mathematical Progress,’³⁰ Herbert Breger observes that such experience, while formal, can never be completely formalized, arguing that the shaping of a new concept, method or theory is a process with two steps. The first is the rise of a specific ‘know-how’ or practical knowledge as mathematicians get to know their way around certain problems and objects, a know-how that is at first only tacit or partly sketched: for example, methods for handling certain classes of problems. Cavaillès would call it the mastery of various combinatorial spaces, and Wittgenstein the mastery of certain language games.³¹ The second step turns tacit knowledge at the meta-level into general principles or new abstract objects along with theorems that govern them. Breger points out that such generalization and abstraction are not pursued for their own sake, but because they reorganize knowledge in deep and fertile ways.
Recognition of this pattern explains why older textbooks often seem long-winded to us, giving lots of examples instead of a general rule, though closer examination may reveal that a particular case is meant to represent or suggest the general case, and that the author is striving to bring out those features of the particular case that point towards the generalization. Moreover, the author may lack notation and abstract concepts needed to state the rule in its full generality, or the problem context may not yet require the articulation of a general rule. In a related essay, Detlef Laugwitz reminds us that the conversion of tacit knowledge into textbook exposition plays an indispensable function within mathematics, because it allows the reliable transmission to succeeding generations of methods that would otherwise die with the clever mathematician who cannot explain his or her own know-how.³² It is part of the rationality of mathematical practice to make the activity of research explicit enough to continue. Breger however also cautions that the drive towards abstraction and systematization, and ultimately axiomatization, may lead to a misunderstanding of mathematical knowledge. I have argued above that axiomatic text-book expositions are typically complemented by presentations that emphasize generalization rather than abstraction. Not all know-how can be translated into abstract formal theory. The know-how that lets human beings interpret symbolic and iconic representations, combine them, and apply them to problems within mathematics and to physical reality, for example, resists formalization. This is one reason why formal modes of representation must be surrounded by natural language that explains their significance. The relation of an axiomatized formal system to a thinking person or to an intelligible thing or to another axiomatized formal system is not itself part of the system, as all futile attempts to encapsulate the conditions of intelligibility reveal. And the advance of mathematical knowledge seems always to carry with it the generation of new kinds of tacit knowledge, both new knowledge at the frontier, a meta-level that will one day call for articulation, and older, more concrete levels of knowledge that, because they have been abstracted away from, have lost their articulation.
²⁹ See the exposition of these points in Pierre Cassou-Noguès’ ‘Signs, Figures, and Time: Cavaillès on “Intuition” in Mathematics,’ Theoria, 55 (2006), 89–104 and in the final section of Hourya Sinaceur’s Jean Cavaillès, Philosophie mathématique (Paris: Presses Universitaires de France, 1994). See also Pierre Cassou-Noguès’ recent book that centers on Cavaillès, De l’expérience mathématique (Paris: Vrin, 2001).
³⁰ Grosholz and Breger, The Growth of Mathematical Knowledge, 221–30.
³¹ L. Wittgenstein, Remarks on the Foundations of Mathematics (Cambridge, MA: MIT Press, 1983); see also E. Grosholz ‘Wittgenstein and the Correlation of Arithmetic and Logic,’ Ratio, 23(1) (1981), 33–45.
Thus, Breger recounts, Leibniz surveyed various idiosyncratic attempts to find the areas of different configurations in the context of a geometry altered by Descartes’ new analytical approach that makes use of axes and algebraic equations, reducing geometrical locus problems to problems that employ equations to construct line segments. Leibniz tried to summarize and articulate that practice in a set of abstract rules. The algorithms of the integral calculus proved both extremely powerful, handling whole classes of problems, and limited, since the operation of integration typically takes algebraic functions to transcendental functions. In his maturity, Leibniz spilled a great deal of mathematical ink trying to understand and represent the new class of curves that his new methods helped to define but not to systematize. That work was carried on by the Bernoullis and Euler, and given a textbook systematization a hundred years later by Lagrange in his celebrated Théorie des fonctions analytiques (1797), which tries to avoid some of the controversies surrounding the use of infinitesimals by restricting functions to formal power series. Cauchy criticized the narrowness of Lagrange’s definition in his Cours d’analyse (1821 and 1823), offering the epsilon-delta approach, but he also there defined functions too narrowly to deal with certain important issues of convergence and continuity. Indeed, in his own mathematical and physical research, he used methods frowned upon by his textbook. Breger’s essay, in sum, shows that problem reduction as a form of analysis and theory construction as a form of retrospective synthesis may be considered two aspects of a single on-going process; he also shows that the need for articulation and systematization guides problem reduction, and that theory construction is in turn guided and revised by the problems it helps to bring to light.
³² Grosholz and Breger, The Growth of Mathematical Knowledge, 177–98.
2.2.2. Leibniz and Hume: Formal Experience as Experience
Leibniz had a much deeper and richer understanding of the nature of formalism than Locke. He understood that ‘formalism’ meant not just logic as the canon of forms of inference about anything at all, but also algebra, and indeed algebras of different kinds that represent formally the peculiar features of different subject matters. In the New Essays on Human Understanding, he treats syllogistic as embedded in a broader enterprise that he calls a kind of universal mathematics or ‘art of infallibility’ in his commentary on Locke’s Chapter VII: Of Reason, in which Locke mounts a sustained attack on the epistemological pretensions of logic.
Leibniz explains that he means by formal arguments, ‘any reasoning in which the conclusion is reached by virtue of the form, with no need for anything to be added.’³³ Leibniz saw something about formal languages more clearly than anyone else in the seventeenth century, and certainly more clearly than Locke. He understood the ‘algebraic’ virtue of form: it can be (provisionally) detached from its applications or instances, allowed to take on a life of its own, and then its rules of procedure may be considered infallible. According to the algebra of arithmetic, a(b + c) infallibly produces ab + ac, no matter what integers we plug into the formula. We can rely on the formalism; in this case, we do not have to perform the operations of adding b + c and then multiplying the result by a, but can instead multiply a and b, then a and c, and finally add the products. The algebraic form ensures that we will arrive at the same result. Leibniz also understood better than anyone else at the time that the mathematical study of forms independent of their applications is rewarding mathematically. He rightly chastises Locke for thinking that the relation between a formal expression and its instances is abstraction or induction from instances. It is not abstraction, because abstraction begins with a range of instances and subtracts what makes them different, leaving only what they have in common. But on the contrary, ‘insofar as you conceive the similarities amongst things, you are conceiving something in addition to the things themselves, and that is all that universality is.’³⁴ Discovery of significant form adds something to the furniture of the universe, that is, intelligible structure conceived in addition to the things compared. Algebra adds its own truths to those of arithmetic and geometry. And the relation is not induction from instances, because the selection of what counts as the instances in induction presupposes that we already have a grasp of the additional ‘something’ of significant form. ‘The instances derive their truth from the embodied axiom, and the axiom is not grounded in the instances.’³⁵ The pattern of reasoning to which Leibniz points here is more like Peirce’s creative abduction.
³³ Ed. P. Remnant and J. Bennett (Cambridge: Cambridge University Press, 1982), 479.
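As a small illustration of the distributive identity mentioned above, one instance can be written out both ways. The particular integers 3, 4, and 5 are chosen here only for the sake of the example and do not come from Leibniz's text.

```latex
% Distributive law, with one worked instance (a = 3, b = 4, c = 5 are illustrative choices)
\[
  a(b + c) = ab + ac
\]
\[
  3(4 + 5) = 3 \cdot 9 = 27,
  \qquad
  3 \cdot 4 + 3 \cdot 5 = 12 + 15 = 27 .
\]
```

The two routes perform different operations in a different order, yet the form guarantees agreement without our having to check each new triple of integers.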
Leibniz was understandably enchanted by his novel insight into autonomous and detachable form, and its importance for both the criticism and the growth of knowledge, an enchantment heightened by his polemic against Cartesian intuition. But the enchantment and the polemic led him to overstate the virtues of form, in ways that led Russell and Couturat to misread him. Locke’s protests against the pretensions of logic have weight after all, and we must take them into account. For example, Leibniz writes, ‘[i]t is by no means always the case that “the mind can see easily” whether something follows: in the reasoning of other people, at least, one sometimes finds inferences which one has reason to view initially with skepticism, until a demonstration is given. The normal use of “instances” is to confirm inferences, but sometimes this is not a very reliable procedure ...’³⁶
³⁴ Leibniz, New Essays on Human Understanding, 485.
³⁵ Leibniz, New Essays on Human Understanding, 449.
His claim echoes the early critique of Descartes: one’s subjective conviction that an idea is clear and distinct may lead to error. He makes the point by referring to two geometric examples, and warns against using images in proof, because the faculty of imagination, drawing on sense-experience, must be prone to the confusions attendant on sensation. The first example shows that Leibniz understood that the parallel postulate in Euclid has a status different from some of the other first principles. (Many pages in Leibniz’s Nachlass are spent examining the logical structure of Euclid’s Elements.)
Euclid, for instance, has included in his axioms what amounts to the statement that two straight lines can meet only once. Imagination, drawing on sense-experience, does not allow us to depict two straight lines meeting more than once, but this is not the right foundation for a science. And if anyone believes that his imagination presents him with connections between distinct ideas, then he is inadequately informed as to the source of the truths, and would count as immediate a great many propositions which really are demonstrable from prior ones.³⁷
Leibniz faults the definition of a straight line given by Euclid. The second example is that of the asymptote to a curve:
It is likely too that by allowing our senses and their images to guide us we would be led into errors; we see something of the sort in the fact that people who have not been taught strict geometry believe, on the authority of their imaginations, that it is beyond doubt that two lines which continually approach each other must eventually meet. Whereas geometers offer as examples to the contrary certain lines which they call asymptotes.³⁸
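A standard textbook instance of the asymptote example, not one given in the passage itself, is the branch of the hyperbola y = 1/x for x > 0 together with the x-axis. Writing d(x) for the vertical distance between curve and axis (a symbol introduced here only for illustration), a sketch of the relevant facts is:

```latex
% Hyperbola branch y = 1/x (x > 0) and its asymptote y = 0.
% d(x) is an illustrative name for the vertical distance between them.
\[
  d(x) = \frac{1}{x} > 0 \quad \text{for all } x > 0,
  \qquad
  d'(x) = -\frac{1}{x^{2}} < 0,
  \qquad
  \lim_{x \to \infty} d(x) = 0 .
\]
```

The distance decreases strictly and has limit zero, so the two lines continually approach each other, yet d(x) is never equal to zero, so they never meet.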
Here I would argue, in defense of Locke, that Leibniz misunderstands the roles of instances and of images in proof. I begin with the assumption that mathematical representation serves the aim of problem-solving, so that (a) problem-solving is often enhanced, or only possible, when a variety of modes of representation are combined; (b) icons and the iconic aspects of symbols as well as natural language are necessary to the denotation required for representation; and (c) some modes of representation are better than others for certain kinds of problem-solving and others better for others. So, as I argued earlier, in the celebrated diagram that accompanies the symbolic ratios and proportions and natural language (Greek) of the Pythagorean Theorem, the diagram exhibits, and must exhibit, this right triangle in order to show that the theorem holds for any right triangle. Due to the irreducibility of shape, in order to denote this triangle we have to present a shape; due to the nature of mathematical induction, the anchor case must be exhibited in its particularity in order to generate what Poincaré calls the ‘cascade’ of other cases. Algebraic symbols and numbers can be correlated with shapes in the service of solving geometrical problems, but successful problem-solving requires both denotation and apt representation, representation that exhibits aspects of the things denoted pertinent to solving the problem. The diagram does in fact successfully and correctly help to exhibit the relation between the squares on the legs of the triangle and the square on its hypotenuse; there is nothing misleading about its contribution to the proof. The diagram, the symbolic notation of ratio and proportion, and the explanatory natural language that links them, provide a combination that leads to a satisfactory proof of the problem. Leibniz reproaches Euclid for offering a definition of the line that does not articulate what happens to parallel lines at infinity. But this frame issue arises from problems that interested Leibniz in the seventeenth century; it does not impugn the truth of the Pythagorean Theorem for the cases to which it was intended to apply, or the cogency of Euclid’s definition of a straight line, which must be taken in conjunction with its representation by a picture of a straight line.
³⁶ Leibniz, New Essays on Human Understanding, 481.
³⁷ Leibniz, New Essays on Human Understanding, 451.
³⁸ Leibniz, New Essays on Human Understanding, 452.
Both diagrams and symbols are found to be ‘incorrect’ when they are applied to cases not envisaged by their original authors. Some of Descartes’ algorithms for the hierarchy of algebraic curves, for example, were disproved when more information was gleaned about algebraic curves higher than the conics and cubics, as were Leibniz’s own attempts to formulate algorithms for infinite series. The use of symbols—as opposed to icons—in mathematics is no guarantee of infallibility; and the use of icons does not doom an argument to confusion, or to a lack of reliability and rigor. Leibniz is, however, right on one point. The genius of algebra is not only the way in which it allows us to combine reasoning about numbers and figures, but also the way in which it allows us to move between the finitary
and the infinitary (and the infinitesimal) in our reasoning. Leibniz was right to celebrate the ‘blindness’ of his new characteristics, because they allowed him to assert rational structural relations between finitary things and things that are too big or too small or too ‘far away’ to be pictured directly. Thus it is significant that the examples he invokes to reproach Euclid involve mathematical things that happen at infinity (as did his original example against Descartes, the ‘fastest velocity’). His own mathematical innovations, like the algorithms for the differential and integral calculus, and his nascent sense that mathematics might be full of algebras, were splendid examples of the exploitation of the blindness of symbols. But in the context of this dispute, we must remember that the yoking of the finitary and infinitary by symbols often involves icons (which then take on new functions and picture indirectly what cannot be pictured directly) and depends on certain spatial and iconic features of the symbols themselves. Icons are no more or less tied to sense perception than symbols. We might take a longer backwards look at the mathematical pathway leading from Euclid to non-Euclidean geometry, which proceeds by the consideration of particular instances essential to both fruitful generalization and the justification and correction of algorithms.³⁹ Though on the one hand it is the record of logical investigations, it is also the investigation of novel curves and surfaces, as well as novel meanings for algebraic forms. 
Poincaré observed that formal analysis in mathematics opens up too many possibilities and also tends to decompose things, so that the mathematician who deploys it must also reinstate the unity of mathematical things and choose one possibility among many: these goals are also achieved by representation, the representation of particular instances.⁴⁰ This, I believe, is what Locke means when he appeals to the use of intuition in mathematics; and although Leibniz downplays its importance as he challenges Locke, his own work in mathematics does not dispense with intuition so understood. In order to negotiate the dispute between Locke and Leibniz, I bring in David Hume’s account of experience as realized in the ‘formal experience’ that lawyers and judges have of the law, and mathematicians have of mathematics. A successful legal system requires not only tough-minded empiricism in its management of evidence, and rigorous formalism in its appeal to principles and rules of inference for the sake of impartiality, but the specialized expertise of legal practitioners. Successful mathematics also requires not only the articulation of rules, principles and structures in well-considered inferential order, and common knowledge of more concrete procedures found in schoolbooks, but the ongoing experience of mathematicians who test and revise rules by applying them to particular objects and problems. Leibniz overstates the virtue of formality, or does so at least in the New Essays; a Humean account of experience, and formal experience in particular, as key to the stability and improvement of knowledge, can moderate Leibniz’s formalism in a way that Leibniz himself might approve, and that indeed finds confirmation in other writings of Leibniz. The late twentieth century reception of Hume’s doctrine was strongly colored by an association with A. J. Ayer’s foundational phenomenalism, and unduly focussed on his critique of causal knowledge as a skeptical argument. Recently, however, various philosophers have tried to locate a more supple and historically accurate reading of empiricism; Bas van Fraassen offers just such an illuminating account in his recent book The Empirical Stance,⁴¹ as does Donald Gillies in his Philosophy of Science in the Twentieth Century.⁴² Both of these books have deeply influenced my thought and also my teaching. For the purposes of this argument, however, I make use of Catherine Kemp’s account of Humean experience in relation to the philosophy of law.
³⁹ J. Gray, Ideas of Space: Euclidean, Non-Euclidean, and Relativistic (Oxford: Clarendon Press, 1989); and ‘The Nineteenth-century Revolution in Mathematical Ontology,’ Revolutions in Mathematics, ed. D. Gillies (Oxford: Clarendon Press, 1992), 226–48.
⁴⁰ H. Poincaré, La Valeur de la science (Paris: Flammarion, 1970), 36–7.
Here and in other essays, she offers a reading of Hume detached from the invocation of sense data as the starting point of all knowledge (which doesn’t get one very far in either legal or mathematical epistemology) and the overstatement of his skepticism.⁴³ Kemp writes that for Hume, ‘belief emerges out of what is reliable in our experience and persists as long as it remains reliable. Its effect on us is to limit the set of things that affect us greatly: the mind runs in the channels established by the custom emerging out of our experience.’ Moreover, ‘the conditions that give rise to custom ... are also conditions under which custom or belief is subsequently altered in our experience.’⁴⁴ For Hume, the emergence of custom out of our experience, construed as a series or succession of perceptions (and for Hume perception has a much broader connotation than sense perception), not only explains the stability and validity of our knowledge, but also the possibility we always have to revise and correct our knowledge. All knowledge, based on no more but also no less than custom, leaves open the imagining of alternatives to received truth. Kemp argues in particular that given Hume’s broad construal of ‘series or succession of perceptions,’ the experience that lawyers and judges gain in the courtroom, and in the discursive review of cases and earlier judgments, is a good illustration of how custom establishes stable but revisable knowledge. She proposes that
we consider the formal or conceptual aspect of law and in particular of the common law ... as that artifactual matter first produced by law’s experience, which in turn facilitates subsequent experience and the development of even more artifactual material, in the form of custom. In this picture the notion of the formal or conceptual makes sense only as part of law’s simultaneous resistance and susceptibility to change, an integrated quality I will call here law’s inertia.
⁴¹ (New Haven, CT: Yale University Press, 2004).
⁴² (Oxford: Blackwell, 1993).
⁴³ Catherine Kemp, ‘Law’s Inertia: Custom in Logic and Experience,’ Studies in Law, Politics, and Society, eds. A. Sarat and P. Ewick (Oxford: Elsevier Science, 2002), 135–49.
⁴⁴ Kemp, ‘Law’s Inertia: Custom in Logic and Experience,’ 137.
This inertia, she observes, raises very interesting questions: ‘What are its conditions? How are stasis and change possible simultaneously in this context? Why is law both susceptible and resistant to argument?’⁴⁵ Following an analogy established by both Locke and Leibniz between the law and mathematics, we may also think of mathematics as an historical artifact, the product of a ‘formal experience,’ which emerges out of what is reliable in the experience of mathematicians, and persists as long as it remains reliable. The effect of mathematical experience (a succession of canonical items, problems, procedures and methods, expressed in traditions of representations and associated ‘combinatorial spaces’) is to establish customs that organize and limit the work of mathematicians. The customs that stabilize mathematical practice, recorded in textbooks and scholarly journals, are also the conditions under which custom or belief is subsequently altered in mathematical experience. Having made his famous distinction between ideas and impressions, Hume observes that some ideas (which are in themselves mere conceptions) can be transformed into lively ideas, as lively and efficacious as impressions, able to move us to action: these lively ideas he calls beliefs.⁴⁶ Hume must then answer two questions: How is it that initially faint ideas can become lively? And why is it that some ideas and not others are enlivened in this way? These two questions may be asked in reference to mathematics. It often happens that in one era certain objects are described as oddities, and then set aside: they only really become part of mathematics when they enter into problems and families of problems where they stand in well understood relation with other, similar objects, and are handled by well understood procedures. Thus, the Greeks knew about the transcendental number pi, and two or three transcendental curves, but for them these entities were mathematically inert, ‘very faint’ as Hume would say. Only in the seventeenth century did they become enlivened; why did this enlivening take place then? Otherwise put, why do instances begin to pile up on the side of a hitherto merely conceived alternative to a customary relation? These are philosophical questions that cannot be answered without historical study, that is, without paying attention to the pragmatic dimensions of mathematical rationality. On a related issue, Hume may be corrected by Leibniz. At his most skeptical, Hume presupposes that regularities establishing relations among certain ideas are externally imposed, so that the ideas play the role of mere placeholders. By contrast, Leibniz claims that relations express internal features of things that are intelligible unities: knowledge of relations arises from the analysis of things we are aware of. The relations a thing can enter into as well as its internal features are controlled and constrained by the intelligible, unified existence of the thing.
⁴⁵ Kemp, ‘Law’s Inertia: Custom in Logic and Experience,’ 136–137.
Thus for Leibniz, what we can think of as possible for a thing is constrained by the nature of the thing; this means that imagination for Leibniz is more constrained than it is for Hume, who seems to assert that we can imagine anything in any relation to anything. Leibniz would say, we might believe we are thinking something when we imagine aRb for arbitrary a and b and arbitrary R, but we are deceived, and further analysis would reveal the hidden contradiction. Moreover, there is no such totality as ‘the set of all relations of a thing’ or ‘the set of all internal features of a thing,’ so that the work of analysis is never finished. In the end I would say, with Leibniz, that Hume’s two good questions about how and why faint ideas become lively can’t be answered without referring to the content of things, to the constraints that their ‘internal constitution’ and characteristic unity impose upon the relations into which they can enter. In the case of the law and mathematics, the things in question are strikingly formal, so that what must be addressed is the content of form as well as the form of content.
⁴⁶ D. Hume, Treatise of Human Nature, ed. L. A. Selby-Bigge (Oxford: Clarendon Press, 1888/1978), 1–10.
PART II
Chemistry and Geometry
3 Bioorganic Chemistry and Biology It makes sense to formalize inference. It doesn’t make sense to formalize a molecule; rather, we say that we represent the molecule using a variety of modes and solve problems about it by various reductive strategies. One such strategy is to analyze the molecule in terms of its symmetries, and then use that group of automorphisms, mapped onto suitable groups of matrices, to investigate it as a dynamical system and construct molecular orbitals in terms of atomic orbitals. The reductive strategies employ the formal languages of group theory, linear algebra, differential equations and arithmetic, but the way in which these languages are combined in problem-solving cannot be completely formalized; and no chemist would be interested in doing so. Chemists, in their research, and teaching, focus on paradigmatic solved problems about canonical items and a toolbox of rules and strategies that can help students solve problems about molecules themselves. The interaction between chemist and molecule is mediated by theories, associated models—some more conceptual and some more material, laboratory equipment, laboratory procedures that produce macroscopic ‘purified substances’ and readable patterns of evidence, and paper tools (some of which are now electronic). I want to argue that the same holds true for mathematics. It makes sense to formalize inference, but it doesn’t make sense to formalize a circle, the number 3, the number pi, or the sine wave. Rather, we should say that we represent mathematical items using a variety of modes and solve problems about them by various reductive strategies. The way in which we combine the formal languages employed in problem-solving cannot be completely formalized, and no mathematician would be interested in doing so. Mathematicians, in their research and teaching, focus on
paradigmatic solved problems about canonical items, and a toolbox of rules and strategies. The interaction between mathematician and determinate, intelligible, but infinitary mathematical things is mediated by theories, associated models (some more conceptual and some more material), and activity in the ‘laboratory,’ where the inaccessible is rendered accessible to thought. How do mathematicians do that? What is their laboratory? They work within ‘combinatorial spaces’ on the page and on the computer screen, which are produced by traditions of representation and the development of procedures. And there they use paper tools: notation and diagrams that allow them to make the infinitary finite, articulate the continuum, select canonical items, render the linear periodic and the unbounded compact, exhibit shape, induce repetition, compose, compare, negate, compute, and limit. The notions of model and nomological machine are more apt here than the notion of formalization. The stubborn determinacy of the natural numbers and of shape plays the same role as the stubborn causal determinacy of molecules: it resists our investigations in similarly informative ways.
3.1. What Lies Between Representing and Intervening

In his book Representing and Intervening, Ian Hacking writes,

Science is said to have two aims: theory and experiment. Theories try to say how the world is. Experiment and subsequent technology change the world. We represent and we intervene. We represent in order to intervene, and we intervene in the light of representations. Most of today’s debate about scientific realism is couched in terms of theory, representation, and truth.¹
Having made the distinction, Hacking urges the importance of experiment and causal intervention for scientific rationality. If the terms representing and intervening are severed, we get two incompatible and unsatisfactory views of science, which I give here in caricature to make my point. On the one hand, we have the view that can be inferred from some pronouncements of Rudolf Carnap, discussed above: scientific rationality is representation. Nature is as it is and we try to describe it truly, in a transparent and univocal language donated by logic to philosophy of science; the true description will be an axiomatized theory, where the first principles are related deductively and inductively to observation statements that report phenomena in the lab and field.² On the other hand, we have a view that might be elicited from the more polemically stated claims of Nancy Cartwright: scientific rationality is intervention. We set up artificial environments as nomological machines, and something happens, causally; in doing so, we change nature. There is moreover no point in pretending that these local effects can be generalized and described truly by a theory whose first principles are universal principles that describe what must happen in all times and all places.³ In his famous essay ‘Mathematical Truth,’ Paul Benacerraf uses a version of this disjunction to show that the enterprise of philosophy of mathematics is hopeless.⁴ If mathematical rationality is representation, then the vehicle of truth (qua derivability) is the axiomatized theory written in the transparent, univocal language donated by logic to philosophy of mathematics. The problem then is that we cannot designate what we are talking about, since any non-trivial first-order theory has an infinity of models. The instrument of designation would be causal procedures, like those employed in experiments; unfortunately, our access to mathematical entities is not causal. Ian Hacking points out that the hand-wringing occasioned by Thomas Kuhn’s book The Structure of Scientific Revolutions isn’t necessary; incommensurability need not lead directly to irrationalism. Ursula Klein’s case study suggests as much. I’d make the same observation about the hand-wringing that followed upon Benacerraf’s essay. We only have to give up hope for a cogent philosophy of mathematics if we cling to a logicist view of representation and a causalist account of intervention, and moreover forget to look for the middle ground between representation and intervention.

¹ (Cambridge: Cambridge University Press, 1983), 31.
² Carnap, Logical Structure of the World, Part I.
³ See, for example, Part II of The Dappled World (Cambridge: Cambridge University Press, 1999).
⁴ P. Benacerraf, ‘Mathematical Truth,’ Journal of Philosophy, 70 (19) (Nov. 1973), 661–79. To illustrate its impact, it is the lead essay in The Philosophy of Mathematics, ed. W. D. Hart (Oxford: Oxford University Press, 1996), and all the other essays, designed to give a composite view of contemporary philosophy of mathematics, respond to it in one way or another. This book was the subject of a study group led by Peter Lipton in the History and Philosophy of Science Department at Cambridge University, which I attended in the fall of 1997.
Hacking himself doesn’t pay sufficient attention to this middle ground in Representing and Intervening, so that his ultimate position is an uneasy blend of ‘entity realism’ and skepticism about ‘theory realism.’ In order to profit from his important insights in that book and to transfer them to philosophy of mathematics, I season his semanticist position with a dose of contemporary pragmatism, and point out some important examples of knowing that occupy the middle ground between representing and intervening. Hacking sometimes forgets that language itself can alter what it is about, analogous to the way in which laboratory set-ups and instruments alter what they investigate. One such example is Austin’s performative utterance.⁵ Another is Ursula Klein’s notion of paper tools, discussed above in Chapter 1. Yet another is the way in which mathematical discourse can be hypostatized to precipitate new items that come to stand in determinate relations with other, previously available items, as Gödel brings the well-formed formulas of predicate logic into novel relation with the natural numbers. The efficacy of performative utterances, paper tools, and hypostatized elements of discourse cannot be explained in material-causal terms alone, but it does show that language (in its many modes) intervenes, constructs, and creates. Scientific rationality, understood as a spectrum that includes representation, a middle ground, and intervention, is a clue to a better epistemology for mathematics. Mathematicians represent, construct and intervene (in a semi-causal way, to be explained) in mathematical reality, as chemists represent nature, construct models on paper and in the lab, and create new molecules. The chemical analogy can be pursued further, in terms set out by the first two chapters.
As chemists require ambiguous modes of representation to bring chemistry into rational relation with biology on the one hand and physics on the other, and to move between the molecular level and the macroscopic level of the lab, so too mathematicians require ambiguous modes of representation to bring different domains into rational relation in order to solve problems, and to move between the finitary and the infinitary. Once we recognize the broad spectrum of ways in which people interact with nature and employ cultural artifacts (including language) broadly construed as modes of representation-to-intervention, we can discern a positive role for ambiguity in language; and this holds as well for the way in which people interact with the things of mathematics. We can be realist about them, and still critical of the truth of any given theory concerning them, and still willing to admit that some mathematical items are creations precipitated by notation and theory, like some molecules. In order to account for the ability of chemistry to refer and describe and construct and intervene, we must look at the manifold uses of language chemists employ and their ability to exploit the ambiguity of some of those modes. To say something true about the energy levels of the benzene molecule, for example, a chemist must use (inter alia) geometric shape, various differential equations, parts of group theory and representation theory, character tables, and the causal record of certain measurements of the behavior of large quantities of benzene molecules, carefully segregated from other kinds of molecules and subject to certain procedures. These representations-to-interventions, juxtaposed and superimposed, must also sometimes be ambiguous in order to allow for meaningful relations between the microscopic and the macroscopic, and between chemical and physical discourse: Berzelian formulas are a signal instance of such ambiguity.

⁵ How to Do Things with Words, eds. J. O. Urmson and M. Sbisà (Oxford: Oxford University Press, 1976).
Likewise, to say something true in number theory, for example in the problem context resulting from Andrew Wiles’s and Kenneth Ribet’s proof of Fermat’s Last Theorem, a mathematician must deploy inter alia parts of group theory and representation theory, deformation theory, complex analysis, Arabic notation for the integers and decimal notation for the reals, novel notation for novel algebras, simple geometric forms as the template for certain kinds of diagrams as mappings, and the quasi-causal creation of new items from novel notation; and these modes of representation-to-intervention must allow for meaningful relations between the infinitary and the finite.⁶ Indeed, the mathematician’s ability to solve problems by profitably relating the infinitary and the finite, or the realms of number and algebraic geometry and complex analysis, entails that some of the modes be ambiguous.

⁶ A. Wiles, ‘Modular elliptic curves and Fermat’s Last Theorem,’ Annals of Mathematics, 141 (3) (1995), 443–551; K. A. Ribet, ‘On Modular Representations of Gal(Q̄/Q) Arising from Modular Forms,’ Inventiones Mathematicae, 100 (1990), 431–76, and ‘From the Taniyama-Shimura conjecture to Fermat’s Last Theorem,’ Annales de la Faculté des Sciences de Toulouse, Mathématiques, 11 (1990), 116–39. My thanks to Wen-Ching (Winnie) Li for her lucid exposition of this proof during Spring 2001 at Penn State.
3.2. The Reduction of a Biological Item to a Chemical Item

A successful way to investigate objects and problems in biology is to look at the underlying chemistry. The reduction of biology to chemistry has thus attracted the interest of philosophers of science for many decades, and has become one of the examples that led to serious questioning of the mid-twentieth century syntactic account of theory reduction. Some areas of biology and chemistry are not axiomatized, and biology is quantified in a manner quite different from that of chemistry; the correlation of items in biology with items in chemistry is far from being one to one; the theoretical pathways between biology and chemistry are multiple and heterogeneous; and reduction seems to take place around problems and items, not theories. The case study in this chapter concerns the construction of an antibody mimic. Antibodies are important to medical science because their function is central to the human immune system, but in general antibody molecules are so large and complicated that their chemical structure is difficult to study. One strategy for better understanding them is to construct molecules that are smaller and simpler, but still model or mimic the activities of the natural antibodies. Thus in the paper here subjected to a philosophical analyse du texte, a group of scientists assemble a molecule with some of the structural features of an antibody, except that it is simplified and scaled down. Note that this strategy is in itself reductive, and answers to the Leibnizian conception of analysis, which approaches the complex in terms of the more simple, expecting from such reductive analysis only better understanding, not a complete conceptual reconstitution of the whole. And of course, this reduction does not just involve notation, concepts, sentences in a formal language, or conceptual models: it involves a real entity, the antibody mimic.
The mimic is a representation, though it is as real as the antibody. To assess a representation and its role in problem reduction, we must look at the context of use. Chemistry is an empirical science that endeavors to describe, classify, and construct molecules, things that are characterized by chemical and physical theory and too small to be perceived directly by human beings.
Chemists must therefore move, habitually and sometimes unreflectively, between two worlds and levels of description. One is the laboratory, with its macroscopic powders, crystals, solutions, intractable sludge, things that are smelly or odorless, toxic or beneficial, pure or impure, colored or white. The other is the invisible world of molecules, each with its characteristic composition and structure, its internal dynamics and its ways of reacting with the other molecules around it. Perhaps because they are so used to it, chemists rarely explain how they are able to hold two seemingly disparate worlds together in thought and practice. And contemporary philosophy of science has had little to say about how chemists are able to pose and solve problems, and in particular to posit and construct molecules, while simultaneously entertaining two apparently incompatible strata of reality. Yet chemistry continues to generate highly reliable knowledge, and indeed to add to the furniture of the universe, with a registry of over ten million well-characterized new compounds. In the problem context described here, there is no single theory that governs the molecular construction. Instead, we find an amalgam of local theoretical knowledge drawn from organic chemistry, molecular biology, and, by implication, quantum mechanics applied to molecular structure. Nor is there a single theory that governs the know-how of the chemist who moves about a laboratory carrying out the experiment: theoretical knowledge informs the scientist’s understanding of how and why the instruments work and what the representations they produce mean, but instructions in natural and formal languages pertinent to the manipulations of instruments and macroscopic substances are also required.
For the working scientist, reality is allowed to include different kinds of things existing in different kinds of ways, levels held in intelligible relation by both theory and experiment, and couched in a multiplicity of languages, symbolic and iconic, formal and natural. No single correct analysis of the complex entities of chemistry expressed in a single adequate language, as called for by the mid-century account of theory reduction, is in evidence here. On the contrary, as I wish to argue, the multiplicity and polysemy of the scientific discourses involved, and the complex ‘horizontal’ interrelations of the sciences, do not preclude but in many ways enhance the reasonableness of the argument and the success of the outcome.
In sum, the case study in this chapter involves reduction in three different senses. First, it centers on a problem reduction: the reduction of the study of an antibody, a complex biological entity, in terms of the construction of an antibody mimic, simpler and smaller than the thing it mimics, using the tools of organic chemistry. Note that the mimic is not a component of the antibody to be studied, but rather a simplified analogue. Note too that it is not a ‘conceptual model’ but exists in the same way as the original antibody. It is what Nancy Cartwright would call a nomological machine. Second, the specific problem reduction is embedded in a general reductive trend, the partial unification of biology and organic chemistry, which its success supports. Alongside the kind of strategy just discussed, this unification also includes analysis in a more strongly mereological sense, studying complex molecules in terms of their atomic components, and sometimes invoking the results of quantum mechanics, which studies molecular orbitals in terms of atomic orbitals. What ‘composition’ means here is, however, a non-trivial issue. Third, bringing experimental knowledge to bear on theoretical knowledge requires a further kind of reduction, the partial integration of microscopic and macroscopic levels of description; this is just as true for biologists as it is for chemists, though the objects at the microscopic and macroscopic levels are not the same. The advantage of the macroscopic level of description is that we have direct perceptual access to it; the disadvantage is that it does not exhibit the causes that give rise to its phenomena, which must be sought indirectly and by analysis at a different level. Thus, in the article from Angewandte Chemie, we find a group of scientists reducing one problem (the study of an antibody) to another (the study of an antibody mimic) while relating biological and chemical items and processes, on both the microscopic and macroscopic levels.
This multiply reductive rational inquiry employs a variety of chemical diagrams and computer generated images, chemical formulae and tables that record experimental results, whose mutual relations are explained in a natural language (English), an employment to which we now turn in detail. I will argue that the combination, and—in more than one crucial instance—the superposition, of different modes of representation is central to the success of the analysis. Superposition typically promotes productive ambiguity; but as we have seen, even the most ordinary representations in chemistry, Berzelian formulas, exhibit it as well.
3.3. Formulating the Problem

The paper drawn from recent literature in chemistry that we shall consider is ‘A Calixarene with Four Peptide Loops: An Antibody Mimic for Recognition of Protein Surfaces,’ authored by Andrew Hamilton, with Yoshitomo Hamuro, Mercedes Crego Calama, and Hyung Soon Park, and published in December 1997 in the international journal Angewandte Chemie (referred to throughout as ‘Hamilton et al.’).⁷ The subfield of the paper could be called bioorganic chemistry. The examination of biology in terms of chemistry is a well-developed program that is both one of the most successful intellectual achievements of the twentieth century, and a locus of dispute for biologists. For many years, organic chemists had let molecular and bio-chemistry get away from chemistry; recently, there has been a definite movement to break down the imagined fences and reintegrate modern organic chemistry and biology. The paper we examine is part of such an enterprise. Scientists know a certain amount about the structure of the large, enigmatic, selectively potent molecules of biology. But describing their structure and measuring their functions does not really answer the question of how or why these molecules act as they do. Here organic chemistry can play an important role by constructing and studying molecules smaller than the biological ones, but which model or mimic the activities of the speedy molecular behemoths of the biological world. The paper opens by stating one such problem of mimicry, important to medical science and any person who has ever caught a cold. The human immune system has flexible molecules called antibodies, proteins of some complexity that recognize a wide variety of molecules including other proteins.

⁷ Angewandte Chemie, International English Edition, 36 (23), 2680–3. I will not give page numbers for each of the quotes, since we will be reading right through the article and discussing almost every paragraph.
An earlier version of this chapter was co-authored with the chemist Roald Hoffmann, and still bears the mark of his vast knowledge of chemistry and his philosophical insight. That paper was the result of an invitation from Françoise Létoublon, Professor of Classics at the University of Grenoble, to collaborate with a scientist on a presentation concerning ‘The Languages of Science,’ the theme of a year-long seminar she organized with her colleagues Yves Bréchet and Philippe Jarry, at the Maison des Sciences de l’Homme-Alpes, Université Stendhal, Grenoble, France. The first volume of papers from that seminar has appeared as Méchanique des signes et langages des sciences (Grenoble: Publications de la MSH-Alpes, 2003). Our paper appeared in the original English in Of Minds and Molecules: New Philosophical Perspectives on Chemistry, eds. N. Bhushan and Stuart Rosenfeld (New York: Oxford University Press, 2000), 230–47.
The design of synthetic hosts that can recognize protein surfaces and disrupt biologically important protein–protein interactions remains a major unsolved problem in bioorganic chemistry. In contrast, the immune system offers numerous antibodies that show high sequence and structural selectivity in binding to a wide range of protein surfaces.
The problem is thus to mimic the structure and action of an antibody; but antibodies in general are very large and complicated. Hamilton et al. ask the question, can we assemble a molecule with some of the structural features of an antibody, simplified and scaled down, and if so will it act like an antibody? But what are the essential structural features in this case? Prior investigation has revealed that an antibody at the microscopic level is a protein molecule that typically has a common central region with six ‘hypervariable’ loops that exploit the flexibility and versatility of the amino acids that make up the loops to recognize (on the molecular level) the immense variety of molecules that wander about a human body. The paper remarks, ‘This diversity of recognition is even more remarkable, because all antibody fragment antigen binding (FAB) regions share a common structural motif of six hypervariable loops held in place by the closely packed constant and variable regions of the light and heavy chains.’ What is ‘recognition’ at the microscopic level? It is generally not the strong covalent bonding that makes molecules so persistent, but is rather a congeries of weak interactions between molecules that may include bonding types that chemists call hydrogen bonding, van der Waals or dispersion forces, electrostatic interactions (concatenations of regions of opposite charge attracting, or like charge repelling), and hydrophobic interactions (concatenations of like regions attracting, as oil with oil, or water with water). These bonding types are the subject of much dispute, for they are not as distinct as scientists would like them to be.⁸ In any case, the interactions between molecules are weak and manifold. Recognition occurs as binding, but it is essentially more dynamic than static. At body temperature, recognition is the outcome of many thermodynamically reversible interactions: the antibody can pick up a molecule, assess it, and then perhaps let it go. 
⁸ See M. D. Joesten, D. O. Johnston, J. T. Netterville, and J. L. Wood, World of Chemistry (Philadelphia: Saunders, 1991).
Whatever happens has sufficient cause, in the geometry of the molecule, and in the physics of the microscopic attractions and repulsions between atoms or regions of a molecule. The paper remarks,

Four of these loops ... generally take up a hairpin conformation and the remaining two form more extended loops.∗⁹ X-ray analyses of protein-antibody complexes show that strong binding is achieved by the formation of a large and open interfacial surface (> 600 Å) composed primarily of residues that are capable of mutual hydrophobic, electrostatic, and hydrogen bonding interactions.∗ The majority of antibody complementary determining regions (CDRs) contact the antigen with four to six of the hypervariable loops.∗
The foregoing passage depends upon a number of theories concerning the structure and function of antibodies, but it is asserted with confidence and in precise detail. Standing in the background, linking the world of the laboratory where small (but still tangible) samples of antibodies and proteins are purified, analyzed, combined, and measured, and the world of molecules, are theories, instrumentation, and languages. There is no single theory here, but rather an overlapping, interpenetrating network of theories backed up by instrumentation and traditions of representation. These include the quantum mechanics of the atom, and a multitude of quantum mechanically defined spectroscopies, chemistry’s highly refined means for destructively or non-destructively plucking the strings of molecules and letting the ‘sounds’ tell us about their features.¹⁰ There are equally ingenious techniques for separating and purifying molecules, which can loosely be termed chromatographies. They proceed at a larger scale, and when traced are also the outcome of a sequence of holding on and letting go, like antibody recognition. Statistical mechanics and thermodynamics also serve to relate the microscopic to the macroscopic. The theories are probabilistic, but they have no exceptions because of the immensity of the number of molecules (10²³ in a sip of water) and the rapidity of molecular motion at ambient temperatures. Thus the average speed of molecules ‘scales up’ to temperature, their minute interactions with light waves into color, the resistance of their crystals to being squeezed to hardness, their multitudinous and frequent collisions into a reaction that is over in a second or a millennium.¹¹ All these theories are silent partners in the experiments described in the paper, taken for granted and embodied, one might say, in the instruments. But a further dimension of the linkage between the two worlds is the languages employed by the chemists, which will now be examined at length.

⁹ The stars in the quoted passages are bibliographic endnotes in the original Hamilton et al. paper.
¹⁰ R. Hoffmann and V. Torrence, Chemistry Imagined (Washington DC: Smithsonian Institution Press, 1993), 144–7.
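The order of magnitude quoted above for the number of molecules in a sip of water can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original argument: the 3 g sip is an assumed figure chosen for illustration, and the 1/√N estimate of relative fluctuations is the standard rule of thumb from statistical mechanics, used here only to suggest why probabilistic theories can behave as if they had no exceptions at this scale.

```python
import math

AVOGADRO = 6.02214076e23   # molecules per mole (exact SI value)
MOLAR_MASS_WATER = 18.015  # grams per mole for H2O


def molecules_in_water(mass_grams):
    """Number of H2O molecules in the given mass of water."""
    return mass_grams / MOLAR_MASS_WATER * AVOGADRO


def relative_fluctuation(n):
    """Rule-of-thumb relative size of statistical fluctuations, 1/sqrt(N)."""
    return 1.0 / math.sqrt(n)


SIP_GRAMS = 3.0  # assumed size of a sip; this figure is not from the text

n = molecules_in_water(SIP_GRAMS)
print(f"molecules in a {SIP_GRAMS} g sip: {n:.2e}")
print(f"relative fluctuation of an average: about {relative_fluctuation(n):.1e}")
```

With any plausible sip size the count stays near 10²³, and the relative fluctuation of a bulk average is then far below anything a laboratory instrument could register.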
3.4. Constructing the Antibody Mimic

The construction of ‘a calixarene with four peptide loops’ serves two functions in this paper. It serves as a simplified substitute for an antibody, though it is doubtful that the intent of the authors is the design of potential therapeutic agents. More important, the calixarene serves to test the theory of antibody function sketched above: is this really the way that antibodies work? The authors note that earlier attempts to mimic antibodies have been unsuccessful, and propose the alternative strategy which is the heart of the paper: ‘ ... the search for antibody mimics has not yet yielded compact and robust frameworks that reproduce the essential features of the CDRs.∗ Our strategy is to use a macrocyclic scaffold to which multiple peptide loops in stable hairpin-turn conformations can be attached.’ The experiment has two stages. The first is to build the antibody mimic, by adding peptide loops to the scaffolding of a calix[4]arene, a cone-shaped concatenation of four benzene rings, strengthened and locked into one orientation by the addition of small length chains of carbon and hydrogen (an alkylation), with COOH groups on top to serve as ‘handles’ for subsequent reaction. The first stage is illustrated by the four diagrams labeled 1–4 in Figure 3.1. (The benzene ring of six carbons is a molecule with a venerable history, whose structure has proved especially problematic for the languages of chemistry; see Chapter 5.) The authors write,

In this paper we report the synthesis of an antibody mimic based on calix[4]arene linked to four constrained peptide loops ... Calix[4]arene was chosen as the core scaffold, as it is readily available∗ and can be locked into the semirigid cone conformation by alkylation of the phenol groups. This results in a projection of the para-substituents onto the same side of the ring to form a potential binding domain.∗

Figure 3.1. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Diagrams 1–4

¹¹ See M. D. Joesten et al., World of Chemistry; P. W. Atkins, The Second Law (New York: Scientific American, 1984) and Molecules (New York: Scientific American, 1987); R. Hoffmann, The Same and Not the Same (New York: Columbia University Press, 1995).
Diagram 1 in Figure 3.1 is given to illustrate this description, as well as the following ‘recipe’. ‘The required tetracarboxylic acid 1 was prepared by alkylation of calix[4]arene∗ (n-butyl bromide, NaH) followed by formylation (Cl₂CHOCH₃, TiCl₄) and oxidation (NaClO₂, H₂NSO₃H).∗’ The
iconic representation offered is of a single, microscopic molecule, but the surrounding language is all about macroscopic matter, and it is symbolic. The symbolic language of chemistry here is Berzelian formulas employed in the laboratory recipe. It lends itself to the chemist’s bridging of the macroscopic and the microscopic because it is thoroughly equivocal, at once a precise description of the ingredients of the experiment (for example, n-butyl bromide is a colorless liquid, with a boiling point of 101.6 °C, and is immiscible with water), and a description of the composition of the relevant molecules. For example, n-butyl bromide is construed by the chemist as CH₃CH₂CH₂CH₂Br; it has the formula C₄H₉Br, a determinate mass relationship among the three atomic constituents, a preferred geometry, certain barriers to rotation around the carbon-carbon bonds it contains, certain angles at the carbons, and so forth. The laboratory recipe is thus both the description of a process carried out by a scientist, and the description of a molecule under construction: a molecule generic in its significance, since the description is intended to apply to all similar molecules, but particular in its unity and reality. There are parallels in mathematics. Thus in algebraic geometry, a polynomial equation (set = 0) with n variables refers equally to an infinite set of n-tuples of real numbers, to a geometric curve or surface C, and further to another infinite set, all the polynomials whose values are identically zero on C, an ideal in the larger ring of such polynomials. The controlled and precise ambiguity of the equation is the instrument that allows resources of number theory and of geometry to be combined in the service of problem-solving. Likewise here the algebra of chemistry allows the wisdom of experience gained in the laboratory to be combined with the (classical and quantum) theory of the molecule, knowledge of its fine structure, energetics, and spectra.
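The ‘determinate mass relationship’ carried by a Berzelian formula can be made concrete with a short computation. The sketch below is an illustration under stated assumptions, not a chemist’s tool: the function names are hypothetical, the parser handles only simple formulas without parentheses, and the atomic masses are rounded standard values. It shows that the condensed structural formula and the molecular formula of n-butyl bromide encode the same composition, and it derives the molar mass and mass fractions from that composition.

```python
import re

# Rounded standard atomic masses in g/mol (only the elements needed here).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "Br": 79.904}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C4H9Br' (no parentheses)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molar_mass(formula):
    """Molar mass in g/mol computed from the atom counts."""
    return sum(ATOMIC_MASS[s] * n for s, n in parse_formula(formula).items())


def mass_fractions(formula):
    """Fraction of the total mass contributed by each element."""
    total = molar_mass(formula)
    return {s: ATOMIC_MASS[s] * n / total
            for s, n in parse_formula(formula).items()}


# The two notations for n-butyl bromide give the same composition.
print(parse_formula("CH3CH2CH2CH2Br") == parse_formula("C4H9Br"))
print(round(molar_mass("C4H9Br"), 2))
print({s: round(f, 3) for s, f in mass_fractions("C4H9Br").items()})
```

Note what the computation cannot recover: once the formula is reduced to atom counts, the order of the groups in CH₃CH₂CH₂CH₂Br, and with it the connectivity of the molecule, is lost.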
But the symbolic language of chemistry is not complete, for there are many aspects of the chemical substance/molecule that it leaves unexpressed. (a) We cannot deduce from it how the molecule will react with the enormous variety of other molecules with which it may come in contact. (b) We cannot even deduce from it the internal statics, kinematics and dynamics of the molecule in space.¹² A philosopher invested in the mid-twentieth century model of theory reduction might argue that given great computing power and perfected quantum mechanical calculations, one could start from a chemical formula and predict observations accurately. But in practice, the number of isomers for a given formula grows very rapidly with molecular complexity, so the goal is not realistic for a molecule the size of the calixarene. Moreover, complete computability may not be equivalent to understanding. Much of what a chemist means by understanding is couched in terms of chemical concepts (the result of horizontal and quasi-circular reasoning) for which a precise equivalent in physics cannot be found. Molecules identical in composition can differ from each other because they differ in constitution, the manner and sequence of bonding of atoms (tautomers), in spatial configuration (optical or geometrical isomers), and in conformation (conformers).¹³ An adequate description of the molecule must invoke the background of an explanatory theory, but to do so it must also employ iconic languages. Thus the very definition of the calixarene core scaffold given above involves a diagram. (It was also necessary for the authors to identify C₄H₉Br as n-butyl bromide, a nomenclature implying a specific connectivity of atoms.) This diagram of calixarene is worth careful inspection, as well as careful comparison with its counterparts in the more complex molecules (for which it serves as core scaffold) furnished to us, the readers, later on in the article by means of computer-generated images. First of all, it leaves out most of the component hydrogens and carbons in the molecule; they are understood, a kind of tacit knowledge shared even by undergraduate chemistry majors. The hexagons are benzene rings, and the chemist knows that the valence of (the number of bonds formed by) carbon is typically four and so automatically supplies the missing hydrogens.

¹² See R. Hoffmann, ‘Nearly Circular Reasoning,’ in American Scientist, 76 (1988), 182–5; and E. Scerri, ‘Has Chemistry Been at least Approximately Reduced to Quantum Mechanics?,’ Philosophy of Science Association, I (1994), 160–70.
But this omission points to an important feature of iconic languages: they must always leave something out, since they are only pictures, not the thing itself, and since the furnishing of too much information is actually an impoverishment. In a poor diagram, one cannot see the forest for the trees. Not only must some things remain tacit in diagrams, but the wisdom of experience that lets the scientist know how much to put in and how much to leave out, wisdom gleaned by years of translating experimental results into diagrams for various kinds of audience, is itself often tacit. It can be articulated now and then, but cannot be translated into a complete set of fixed rules. Second, the diagram uses certain conventions for representing configurations in 3-dimensional space on the 2-dimensional page, like breaking the outlines of molecules that are supposed to be behind other molecules whose delineation is unbroken. (In other diagrams, wedges are used to represent projection outwards from the plane of the page, and heavy lines are used to represent molecules that stand in front of other molecules depicted by ordinary lines.) Sometimes, though not in the context of a journal article, chemists show three-dimensional configuration three-dimensionally, by using such devices as Pauling’s ball-and-stick models. But all of these representations are limited in their precision, and all of them are static. We may want to see more precise angles and inter-atomic distances in correct proportion; in this case, we may resort to the images produced by x-ray crystallography. Or we may want some indication of the motion of the molecules, since all atoms vibrate and rotate: arrows and other iconographies of dynamic motion can be used in such diagrams. The cloudy, false-color, yet informative photographs of scanning tunneling microscopy come in here, as well as assorted computer images of the distribution of electrons in the molecule.¹⁴ The convention of using a hexagon with a perimeter composed of three single lines alternating with three double lines to represent a benzene ring deserves special mention, for this molecule has played a central role in the development of organic chemistry. No single classical valence structure was consistent with the stability of the molecule. Kekulé solved the problem by postulating the coexistence of two valence structures in one molecule.

¹³ P. Zeidler and D. Sobczynska, ‘The Idea of Realism in the New Experimentalism and the Problem of the Existence of Theoretical Entities in Chemistry,’ Foundations of Science, 4 (1995), 517–35.
In time, practitioners of quantum mechanics took up the benzene problem, and to this day it has served them as an equally fecund source of inspiration and disagreement. The electrons in benzene are delocalized (that much people agree on); but the description of its electronic structure continues to be a problem for the languages of chemistry, and part of this debate constitutes the case study in Chapter 5. Philosophers of science working from the mid-twentieth century notion of theory reduction have had little to say about iconic languages. Symbolic languages lend themselves better to regimentation, but pictures tend to be multiform and hard to codify; thus if they proved to be indispensable to human knowledge, the philosopher disposed to the syntactic approach would be troubled. As any student of chemistry will tell you, conventions for producing ‘well-formed icons’ of molecules exist and must be learned, or else your audience will misread them; but no single iconic language is the correct one. The symbolic language of chemistry is, to be sure, a precisely defined international nomenclature that specifies in impressive detail a written sequence of symbols so as to allow the unique specification of a molecule. But, significantly, the iconic representations of a molecule are governed only by widely accepted conventions, and a good bit of latitude is allowed in practice, especially à propos what may be omitted from such representations. Symbolic languages lend themselves to codification in a way that iconic languages do not.¹⁵ Symbolic languages, precisely because they are symbolic, lend themselves best to displaying relational structure. Like algebra, they are tolerant or relativistic in their ontological import: they may pertain to a variety of subject matters, as long as their objects stand in the appropriate relations to each other. Iconic languages can point, more or less directly, to objects; they are not ontologically neutral, but on the contrary are ontologically insistent. They display what I called in Chapter 2 the unity of intelligible existence. However, there is no way to give an exhaustive summary of the ways of portraying the unity of existence; it is too infinitely rich and thought has too many ways of engaging it. We should not therefore jump to the conclusion that knowledge via an iconic language is impossible or incoherent.

¹⁴ See R. Hoffmann and P. Laszlo, ‘Representation in Chemistry,’ Angewandte Chemie, International English Edition, 30 (1991), 1–16; and S. Weininger, ‘Contemplating the Finger: Visuality and the Semiotics of Chemistry,’ Hyle, 4 (1) (1998), 3–25.
Iconic languages, despite being multiform, employ publicly shared conventions; they are constrained by the object itself; and they are explained by their association with symbolic language and natural language. An inference cannot be constructed from icons alone, but icons may play an essential and irreplaceable role in inference. G.-G. Granger has an interesting discussion of the languages of chemistry in chapter 3 of his book Formal Thought and the Sciences of Man, where he focuses on the distinction between natural languages and formal languages.¹⁶ He makes the important observation that scientific language will always be partly vernacular and partly formal. Rejecting the claim that science might someday be carried out in a pure formalism, he writes,

The linguistic process of science seems to me essentially ambiguous: for if science is not at any moment of its history a completely formalized discourse, it is not to be confused with ordinary discourse either. Insofar as it is thought in action, it can only be represented as an attempt to formalize, commented on by the interpreter in a non-formal language. Total formalization never appears as anything more than at the horizon of scientific thought, and we can say that the collaboration of the two languages is a transcendental feature of science, that is, a feature dependent on the very conditions of the apprehension of an object.

¹⁵ L. Kvasz has many interesting arguments about the distinction and the interactions between symbolic and iconic languages in mathematics and chemistry, though I find he argues too strongly in favor of the codification of iconic languages; see, for example, ‘Changes of Language in the Development of Mathematics,’ Philosophia Mathematica, 8 (2000), 47–83; and ‘Similarities and Differences between the Development of Geometry and the Development of Algebra,’ Mathematical Reasoning and Heuristics, eds. C. Cellucci and D. Gillies (London: King’s College Publications, 2005), 25–47.
However, Granger does not go on to consider the further linguistic aspect of chemistry, that is, its iconic aspect. How does the iconic form of the chemical structure, expressed as a diagram that displays atom connectivities and suggests the three-dimensionality of the molecule, bridge the two worlds of the chemist? The most obvious answer is that it makes the invisible visible, and does so, within limits, reliably. But there is a deeper answer. The iconic chemical structure diagram seems at first to refer only to the level of the microscopic, since after all it depicts a molecule. In conjunction with symbolic formulas, however, the diagram takes on an inherent ambiguity that gives it an important bridging function. In its display of unified existence, it stands for a single particular molecule. Yet we understand molecules of the same composition and structure to be equivalent to each other, internally indistinguishable. (In this, the objects of physics and chemistry are like the objects of mathematics.) Thus the icon (hexagonal benzene ring) also stands for all possible benzene rings, or for all the benzene rings (moles or millimoles of them!) in the experiment, depending on the way in which it is associated with the symbolic formula for benzene. The philosopher in search of a univocal representation might call this obfuscating ambiguity, a degeneracy in what ought to be a precise scientific language that carries with it undesirable ontological baggage. And yet, the iconic language is powerfully efficient and fertile in the hands of the chemist. Now we can understand better why the kind of world-bridging involved in posing a problem or molecule construction in chemistry requires both symbolic and iconic languages for its formulation. On the one hand, the symbolic language of chemistry captures the composition of molecules, but not their structure (constitution, configuration, and conformation); these aspects are dealt with better, though fragmentarily, by the many iconic idioms available to chemists. Moreover, the symbolic language of chemistry fails to convey the ontological import, the realism, intended by practitioners in the field. Hamilton et al. are not reporting on social constructions or mere computations, but useful realities: the iconic diagrams in Figure 3.1 confidently posit their existence. On the other hand, icons are too manifold and singular to be the sole vehicle of scientific discourse. Their use along with symbolic language embeds them in demonstrations, and gives to their particularity a representative and well-defined generality, sometimes even universality. Hamilton et al. chose cyclic hexapeptides to mimic the ‘arms’ of the antibody because they can be modified so as to link up easily with the core scaffold, and because they form hairpin loops. ‘The peptide loop was based on a cyclic hexapeptide in which two residues were replaced by a 3-aminomethylbenzoyl (3amb) dipeptide analogue∗ containing a 5-amino substituent for facile linkage to the scaffold.’ The recipe for constructing the peptide loops is then given; the way in which it couples the macroscopic and the microscopic is striking, for it describes a laboratory procedure and then announces that the outcome of the procedure is a molecule, pictured in Diagram 2 of Figure 3.1.

¹⁶ G.-G. Granger, Formal Thought and the Sciences of Man (Dordrecht: Reidel, 1983), 33.
The 5-nitro substituted dipeptide analogue was formed by selective reduction (BH3 ) of methyl 3-amidocarbonyl-5-nitrobenzoate, followed by deesterification (LiOH in THF) and reaction sequentially with Fmoc-Asp-(tBu)-OH and H-GlyAsp(tBu)-Gly-OH (dicyclohexyl carbodiimide (DCC), N-hydroxysuccinimide) to yield Fmoc-Asp(tBu)-5NO2 3amb-Gly-Asp(tBu)-GlyOH. Cyclization with 4-dimethylaminopyridine (DMAP) and 2-1H-benzotriazole-1-yl-1,1,3,3,-tetramethyluronium tetrafluoroborate (TBTU) was achieved in 70% yield, followed by reduction (H2 , Pd/C) to give the amino-substituted peptide loop 2.
Working in the lab, the chemist has constructed a molecule, at least a dizzying 20 orders of magnitude ‘below’ or ‘inward.’ To be sure, what was made was a visible, tangible material—probably less than a gram of it—but the interest of what was made lies in the geometry and reactivity of the
molecule, not the properties of the macroscopic substance. So it is not by accident that the leap to the level of the molecule is accompanied by iconic language. Such language also accompanies the final step in the assembly of the antibody mimic. Four of the peptide loops are attached to the core scaffold; the laboratory procedure begins and ends with a pictured molecule. But this time the resultant new molecule is pictured twice in complementary iconic idioms. Amine 2 was coupled to the tetraacid chloride derivative of 1 ((COCl)2 , DMF) and deprotected with trifluoroacetic acid (TFA) to give the tetraloop structure 3. The molecular structure of this host (Figure 1 [here Figure 3.2]) resembles that of the antigen binding region of an antibody but is based on four loops rather than six.∗
To someone who understands chemical semiotics, the iconic conventions in Diagram 3 of Figure 3.1 (the tetraloop molecule called structure 3 in the quote above) do allow a mental reconstruction of the molecule. But the shape of the molecule is so important that the authors decide to give it again, in another view, in Figure 3.2. The figure is even printed in color in the original! Why should the reader be offered another iconic representation? In part, it is part of a rhetorical strategy to persuade the audience of the cogency of a research program that involves mimicry. The computer-generated image of Figure 3.2 is actually the result of a theoretical calculation in which the various molecular components are allowed to wiggle around any bonds that allow rotation, and to reach a geometry that is presumably the most stable. In that image, the general shape of the molecule (in particular the loopiness of the loops) is beautifully exhibited, emphasizing its resemblance to an antibody. Note that the experimentalist trusts the ability of a theoretical computer program to yield the shape of a molecule sufficiently to insert it in color into a paper; that would not have been the case twenty-five years ago. Diagram 3 in Figure 3.1, and Figure 3.2, are intended to be seen in tandem; they complement each other. Both representations are iconic, though perhaps Figure 3.2 is more so. Diagram 3 in Figure 3.1 has a symbolic dimension due to the labels, and thus serves to link Figure 3.2 to the symbolic discourse of the prose argument. Together with the reproducible laboratory procedure—given in more detail at the end of the article—Hamilton et al. give a convincing picture of this new addition to the furniture of the universe. There it stands: Ecce.
Figure 3.2. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Figure 1
3.5. Testing the Antibody Mimic

Once the antibody mimic has been assembled, it can be tested to see whether it in fact behaves like an antibody, a test which, if successful, in turn provides evidence supporting the theory of the action of antibodies invoked by Hamilton et al. Note the usefully (as opposed to viciously) circular reasoning here.¹⁷ The antibody mimic correctly mimics an antibody if it behaves like an antibody; but how an antibody behaves is still a postulate, which stipulates what counts as the correctness of the antibody mimic’s mimicry. To see if the antibody mimic, the base scaffold of calixarene with four peptide loops, will bind with and impair the function of a protein (the essence of what an antibody does), Hamilton et al. chose the protein cytochrome c, an important molecule that plays a critical role in energy production and electron transport in every cell and has thus been thoroughly investigated. Moreover, it has a positively charged surface region that would likely bond well with the negatively charged peptide loops.

We chose cytochrome c as the initial protein target, since it is structurally well-characterized and contains a positively charged surface made up of several lysine and arginine residues.∗ In this study the negatively charged GlyAspGlyAsp sequence was used in the loops of 3∗ to complement the charge distribution on the protein.
Note that the antibody mimic is referred to by means of Diagram 3 in Figure 3.1. In a sense, this is because the diagram is shorthand, but its perspicuity is not trivial or accidental: as a picture that can be taken in at a glance, it offers schematically the whole configuration of the molecule in space. Its visual unity stands for, and does not misrepresent, the unity of the molecule’s existence. Does the antibody mimic in fact bind with the cytochrome? Their affinity is tested by an experiment that is simply a matter of careful physical measurement, an aspect of chemical practice central to chemistry since the time of Lavoisier. The ‘affinity chromatography’ involves a column filled with some inert cellulose-like particles and cytochrome c linked to those particles.¹⁸ The concentration of NaCl, simple salt, controls the degree of binding of various other molecules to the cytochrome c that is in that column. If the binding is substantially through ionic forces (as one thinks it is for the antibody mimic) then only a substantial concentration of ionic salt solution will disrupt that binding. At the top of the column, one first adds a control molecule (Diagram 4 in Figure 3.1). It is eluted easily, with no salt. But the antibody mimic 3 turns out to be bound much more tightly: it takes a lot of salt to flush it out. A second kind of chromatography, ‘gel permeation chromatography,’ gives more graphic evidence for the binding of cytochrome c to 3. In this ingenious chromatography the column is packed with another cellulose-like and porous fiber, called Sephadex G-50. The ‘G-50’ is not just a trade name; it indicates that molecules of a certain size will be trapped in the column material, but molecules both larger and smaller will flow through the column quickly. The results of this experiment are shown in Figure 3.3, replete with labeled axes. The vertical axis measures the absorption of light at a certain wavelength; this is related to the concentration of a species, the bound cytochrome c - 3 complex. The horizontal axis is a ‘fraction number’ that is related to the length of time that a given molecule (or compound? the chemical discourse here is equivocal) resides on the column. The pores in the Sephadex retard cytochrome c; it stays on the column longer, that is, it has a higher fraction number. The molecular complex of the mimic and cytochrome c comes out in a different peak, at lower fraction number. This means it is too large to be caught in the pores of the Sephadex, which in turn constitutes evidence for some sort of binding between the cytochrome c and the antibody mimic, creating a larger molecular entity.

¹⁷ See R. Hoffmann, ‘Nearly Circular Reasoning.’
¹⁸ See ‘Chromatographie,’ by P. Laszlo in Trésor. Dictionnaire des sciences (Paris: Flammarion, 1997).

Figure 3.3. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Figure 2

So there is binding: but does it impair the function of the cytochrome c? Evidence for that is provided by reacting the cytochrome c with ascorbate (vitamin C), with which it normally reacts quite efficiently; here, on the contrary, it doesn’t.

We have investigated the effect of complexation with 3 on the interaction of Feᴵᴵᴵ-cyt c with reducing agents.∗ In phosphate buffer Feᴵᴵᴵ-cyt c (1.57 × 10⁻⁵ M) is rapidly reduced by excess ascorbate (2.0 × 10⁻³ M) with a pseudo-first-order rate constant 0.1090 ± 0.001 (Figure 4 [here labeled Figure 3.4]). In the presence of 3 (1.91 × 10⁻⁵ M) the rate of cyt c reduction is diminished tenfold (k_obs = 0.010 ± 0.001 s⁻¹), consistent with the calixarene derivative’s binding to the protein surface and inhibiting approach of ascorbate to the heme edge (Figure 3 [here labeled Figure 3.5]).
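The comparison in the quoted passage can be made concrete with a little arithmetic. The following minimal Python sketch assumes simple pseudo-first-order decay, in which the fraction of unreduced cytochrome c after time t is exp(−k·t), and it assumes that the first rate constant carries the same unit (s⁻¹) as the second, which the quotation as reproduced here does not state. The function and variable names are illustrative only and do not come from Hamilton et al.

```python
import math

# Values as quoted from Hamilton et al. in the passage above.
K_FREE = 0.109       # rate constant without 3; the unit s^-1 is an assumption
K_BOUND = 0.010      # k_obs in the presence of 3, in s^-1
CYT_C = 1.57e-5      # Fe(III)-cyt c concentration, M
ASCORBATE = 2.0e-3   # ascorbate concentration, M


def half_life(k: float) -> float:
    """Half-life (s) of a first-order decay with rate constant k (s^-1)."""
    return math.log(2) / k


def fraction_unreduced(k: float, t: float) -> float:
    """Fraction of Fe(III)-cyt c not yet reduced after t seconds."""
    return math.exp(-k * t)


# Ascorbate is in large excess, so its concentration is treated as
# constant: this is the "pseudo" in pseudo-first-order.
excess = ASCORBATE / CYT_C

# Ratio of the two rate constants: the "tenfold" slowdown in the text.
slowdown = K_FREE / K_BOUND

if __name__ == "__main__":
    print(f"ascorbate excess: about {excess:.0f}-fold")
    print(f"slowdown factor:  {slowdown:.1f}")
    print(f"half-life without 3: {half_life(K_FREE):.1f} s")
    print(f"half-life with 3:    {half_life(K_BOUND):.1f} s")
    print(f"unreduced after 30 s without 3: {fraction_unreduced(K_FREE, 30):.3f}")
    print(f"unreduced after 30 s with 3:    {fraction_unreduced(K_BOUND, 30):.3f}")
```

On these assumptions the half-life of unreduced cytochrome c lengthens from roughly 6 seconds to roughly 69 seconds, which is the macroscopic signature from which the authors infer binding at the molecular level.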
Figure 3.4 records another measurement, with the concentrations measured on the vertical axis, the time on the horizontal; it displays the outcome of an experiment on the kinetics of the reduction of cytochrome c by ascorbate, which supports the claim that the antibody mimic does impair the function of the protein, in this case its ability to react with ascorbate.

Figure 3.4. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Figure 4

Figure 3.5 is a picture of the antibody mimic binding with cytochrome c. Since the authors admit, ‘The exact site on the surface of the cytochrome that binds with 3 has not yet been established,’ this image is a conjecture; and it is the outcome of the same computer program that generated Figure 3.2. It ‘docks’ ‘a calculated structure for 3’ at the most likely site on the cytochrome c, where the four peptide loops ‘cover a large area of the protein surface.’ Figure 3.5 is a remarkable superposition of several types of iconic representation. The antibody mimic (at top) is shown essentially as it was in Figure 3.2, but from the side. The atoms of cytochrome c are legion, and so are mostly not shown; instead, the essential helical loops of the protein are schematically indicated. But in the contact region, the atoms are again shown in great detail, not by ball-and-stick or rod representations but by tenuous spheres indicating roughly the atomic sizes or electron densities. The reader can make sense of these superimposed iconic idioms only by reference to a cognitive framework of words and symbols.
Figure 3.5. Hamilton et al., ‘A Calixarene with Four Peptide Loops’, Figure 3
Iconic representations in chemical discourse must be related to a symbolic discourse; our access to the microscopic objects of chemistry, even our ability to picture them, is always mediated by that discourse rather than by our ordinary organs of perception. So the objects of chemistry may seem a bit ghostly, even to the practitioners for whom acquaintance with their existence is especially robust.¹⁹ Conversely, symbolic discourse in chemistry cannot dispense with iconic discourse as its complement, nor can it escape its own iconic dimension. The side-by-side distinction, iteration, and concatenation of letters in chemical formulae echo the spatial array of atoms in a molecule. Otherwise put, the juxtaposition of symbols often articulates otherness; side-by-sideness is a figure for distinctness. And spatial, graphic isolation of a group of symbols from other things around it is a figure for unity. Thus O=C=O represents a molecule of carbon dioxide, with two distinct oxygen atoms bonded to the carbon. The array in Figure 3.4 also represents, as spatial relations among symbols, the temporal spread of stages of a chemical event, where otherness is priority or posteriority. Just as icons evoke existence, the unity of existence, so they evoke otherness as side-by-side-ness, as externality. Identity and difference, pace the logicians, cannot be fully represented without exploiting the iconic dimensions of symbolic languages. The icon in Figure 3.5 stands for a molecular complex that may or may not exist. It is an intelligible possibility, a guide to future research. For the authors of the paper, it is something they very much hope does exist, a wish that can perhaps be read in the bright, imaginary colors of the original image. And yet chemical icons work their magic of asserting and displaying the unity of intelligible existence only when the symbolic discursive context and the experimental background allow them to do so. 
Whatever remains still to be worked out, the authors of the paper declare a positive result, and its generalization to a broader research program.

The new type of synthetic host 3 thus mimics antibody combining sites in having several peptide loops arrayed around a central binding region. The large surface area in the molecule allows strong binding to a complementary surface on cytochrome c and disrupts, in a similar way to cytochrome c peroxidase, the approach of reducing agents to the active site of the protein. We are currently preparing libraries of antibody mimics from different peptide loop sequences and screening their binding to a range of protein targets.

¹⁹ See P. Laszlo, ‘Chemical Analysis as Dematerialization,’ Hyle, 4 (1998), 29–38.
3.6. Conclusions

Angewandte Chemie, where Roald Hoffmann found the article closely read in this paper, is no longer especially concerned with applied chemistry; indeed, it is arguably the world’s leading ‘pure’ chemistry journal. The December 15, 1997, issue of the journal in which the Hamilton article appears contains one review, two comments or highlights, several book reviews, and 38 ‘communications,’ articles one to three pages in length that, in principle, present novel and important chemistry. Without question, the Hamuro, Calama, Park, and Hamilton article is a beautiful piece of work, deserving of the company it keeps in the pages of Angewandte Chemie. But is this work typical of chemistry, and sufficiently so that any close reading of it might elicit generalities valid for the field? After all, it is not clear what counts as typical in a science whose topics range from cytochrome c, to reactions occurring in femtoseconds, to inorganic superconductors. And perhaps work that strives to redefine the boundaries of a science cannot fully represent what Kuhn called normal science. Nonetheless, the Hamilton et al. article exhibits many of the important features of most work in modern chemistry, especially in the way that it moves between levels of reality. On one line the authors of the article talk of a molecular structure, and on the next of a reaction; a certain linguistic item (symbol or icon) may stand for either or both. Theory and experiment, expressed in artfully intertwined symbolic and iconic languages, relate the world of visible, tangible substances and that of the molecule. This ambiguity is clearly not a kind of intellectual sloppiness that hard science must ultimately abolish. Intelligible things that exist may be analyzed, the tacit may be articulated, but never completely and all at once: certain indeterminacies and logical gaps always remain, even as scientists achieve a consensual understanding of complex reality.
Chemists habitually think at both the level of macroscopic substances and their transformations in the laboratory, and the level of the statics and dynamics of microscopic molecules; and here we see them bringing chemistry into the service of biology. The resultant paper is then multiply polyvalent, and this is a source of its fertility. The logical gap
between the microscopic and macroscopic, and between the biological and chemical levels of description is never closed (by some kind of reduction), but rather is constantly and successfully negotiated by a set of theories embodied in instruments and nomological machines, and expressed in the paper tools of symbolic and iconic languages. Precisely because these languages are abstract and incomplete (in the sense of being non-categorical, not capturing all there is to say and know about the entities they describe) they are productively ambiguous, and can be understood in reference to both the macroscopic and microscopic. The bridging function—carried out in different but complementary ways by symbolic and iconic idioms—allows chemists to articulate and to solve problems, a task that often takes the form of imagining and then trying to put together a certain kind of molecule. The account in this chapter emphasizes what happens at the frontiers of knowledge rather than retrospective codification, and the investigation and creation of objects rather than the testing of theories.
4 Genetics and Molecular Biology

The locus classicus of philosophical discussion about reduction is Ernest Nagel’s account in The Structure of Science.¹ Nagel describes reduction as a relation between two theories formalized in first-order predicate logic, where the axioms of the reduced theory are deduced as theorems from the axioms of the reducing theory. The reducing theory is taken to explain the reduced theory, for Nagel interprets explanation in terms of Hempel’s deductive-nomological model.² In this model, the fact or principle to be explained is deduced from a set of covering laws, axioms that ‘cover’ the given domain, in conjunction with a set of appropriate boundary conditions. However, if the theories thus brought into relation are initially about different kinds of things, the vocabulary of the reduced theory will not be found in the reducing theory, in particular in its axioms. When the reduced theory contains terms that do not occur in the reducing theory (as is most often the case), the derivation of the reduced theory is blocked. There must thus be a way of translating the axioms of the reduced theory into the language of the reducing theory; Nagel calls a set of rules or definitions that would effect this translation ‘bridge laws.’³ Depending on context, the bridge laws may be construed as conventional or factual; and philosophers like Nagel hope that the bridge laws may be given as a straightforward and trivial isomorphism, at the same time suppressing the issue of what language can be used to relate the two heterogeneous formal languages.
¹ The Structure of Science (New York: Harcourt, Brace & World, 1961). ² C. Hempel and P. Oppenheim, ‘Studies in the Logic of Explanation,’ Philosophy of Science, 15 (1948), 491–9. ³ Nagel, The Structure of Science (New York: Harcourt, Brace & World, 1961), 97–105.
4.1. Objections to Hempel’s Model of Theory Reduction

Ever since Hempel put this model forward, philosophers of science have contested it. The historical example of the ‘reduction’ of Newtonian to Einsteinian mechanics has often been used in the literature to call the Hempel/Nagel model of reduction into question. Einstein’s theory is said to supersede and to explain Newton’s theory, but as Kenneth Schaffner argues in an early essay, what is strictly derivable from Einstein’s theory is only a corrected and reconstructed version of Newton’s theory.⁴ Schaffner later argues in addition that the reducing theory may also be corrected by the reduced theory, which after all constitutes some of the evidence that the former theory must take into account.⁵ The two original theories are logically disjunct; bringing them into rational relation requires that both be reformulated. More recently, Schaffner has enriched his picture of reduction by including models as well as (original and corrected) theories, as he examines rational discovery procedures in biology.⁶ Fritz Rohrlich takes a stronger position in his essay ‘Pluralistic Ontology and Theory Reduction in the Physical Sciences’,⁷ arguing that Newton’s theory and Einstein’s theory are logically disjunct because they are about different things. Newton’s universal gravitational force is not a central term of Einstein’s theory, but emerges only in a suitably chosen limit under suitable assumptions; the notion of curved space-time is simply not in Newton’s theory. If Newton’s theory and Einstein’s theory are logically disjunct, then it is tempting to say that we should discard the former and retain the latter; this is just a case of theory displacement. But Rohrlich argues convincingly that the history of science weighs against this conclusion. Newtonian mechanics has not been discarded, like the theories of phlogiston or catastrophism.
It is widely used by physicists and astronomers, for it offers a model for the explanation of gravitational phenomena that, within very wide but specified limits, is empirically adequate, and much simpler than the model offered by the Einsteinian theory. The philosopher Thomas Nickles, reflecting on the same example, observes that although two theories may be logically disjunct, they may still be formally related, that is, related in mathematically precise terms.⁸ For example, special relativity theory is reduced to classical mechanics when an appropriate limit is taken, for instance, when the speed of light is counterfactually supposed to go to infinity. Notice that with this interpretation of reduction, the roles of reducing theory and reduced theory have been switched, and that the formal relation between the theories cannot be empirically construed, but is ideal in a strong sense. Moreover, there is nothing in predicate logic that corresponds to the operation of taking limits. Rohrlich offers a compelling solution to the dilemma to which that philosophical conversation seems to have led. There ought to be some way in which scientific theories can be understood as standing in (broadly construed) rational relationship to each other, while yet maintaining a certain mutual autonomy. Nagel’s model has the disadvantage of not characterizing historical relations among theories; moreover, his model sacrifices mutual autonomy to rational relation. Schaffner’s and Nickles’s models put so much emphasis on mutual autonomy that they attenuate the rational relatedness of theories. A good philosophical account of reduction should explain why both the rational relatedness of the theories (and their models, and what they are about) and their mutual autonomy contribute to our understanding.

⁴ ‘Approaches to Reduction,’ Philosophy of Science, 34 (1967), 137–47.
⁵ ‘The Peripherality of Reductionism in the Development of Molecular Biology,’ Journal of the History of Biology, 7 (1974), 111–29.
⁶ Schaffner, Discovery and Explanation in Biology and Medicine (Chicago: University of Chicago Press, 1994).
⁷ British Journal for the Philosophy of Science, 39 (1988), 295–312.
Rohrlich characterizes Newton’s theory and Einstein’s theory as ‘mature’ theories: each is accepted by a majority of scientists, expressed in terms of a formal mathematical theory (which allows for the derivation of equations, for precise predictive and explanatory power, and for the transcendence of perceptual experience), and supported by sufficient empirical evidence. Finally, each is rationally coherent with other theories. To keep this definition from rendering his argument about reduction merely circular, Rohrlich must exhibit what he means by rational coherence, and therein lies the interest of his essay.⁹ Rohrlich argues that the formalized structure M of a mature theory is associated with a conceptual model that may be informal, indeterminate and shifting, for that model is developed by confronting M with empirical evidence and with neighboring theories. A mature theory also may have a validity domain D, which he characterizes in these terms:
The ‘boundaries’ of D are characterized by the relative magnitudes of physical variables. Typically, there exists a characteristic parameter p, which is the ratio of two physical variables of the same dimensions of length, time, and mass. The boundary of D is reached when p is no longer negligible ... The classic example is D (Newtonian mechanics) which is given by p = (v/c)², where v is a typical velocity and c is the speed of light.
⁸ ‘Two Concepts of Inter-Theoretic Reduction,’ Journal of Philosophy, 70 (1973), 181–201.
⁹ Rohrlich, ‘Pluralistic Ontology and Theory Reduction in the Physical Sciences,’ British Journal for Philosophy of Science, 39 (1988), 295–312.
Then we can construe the rational relation of Newtonian mechanics to Einstein’s special relativity theory as
\[ \lim_{(v/c)^2 \to 0} M(\mathrm{SR}) = M(\mathrm{NM}) \]
Thus we can say that Newton’s theory reduces to Einstein’s theory; in the limit, the speed of light c disappears from the reduced theory, and Lorentz invariance becomes Galilean invariance. To use Rohrlich’s vocabulary, a coarser-grained view of reality reduces to a finer-grained view of reality. But although this rational relation can be articulated, Rohrlich urges the mutual autonomy of the two theories: Newtonian mechanics survives within a certain domain D characterized by the parameter p, even though in some sense it has been superseded by the finer theory of special relativity. And he characterizes this autonomy in strong terms. Just as there would be a loss of understanding if the finer-grained theory had not been invented, so too forgetting the coarser-grained theory would lead to loss of understanding. ‘The qualitative character of the coarser theory demands recognition in its own right,’ for ‘these truths complement each other, each theory making a contribution on its own level.’ To make this point especially forceful, Rohrlich reminds us that the coarser theory cannot be reconstructed from the finer theory, despite the existence of a formal reduction like the one just mentioned. The physical parameter p, for instance, cannot be discerned within the finer theory, but becomes evident only when the coarser theory is known; thus, the coarser theory, defined in some fundamental way by its characteristic parameter p, cannot be deduced from the finer theory alone. Moreover, not even the formal structure M of the coarser theory can be written as a function of the finer one; it is impossible to express every central term of the coarser theory as a function of the terms of the finer theory.
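As an illustration of what taking such a limit involves, one may consider the Lorentz transformation for a frame moving with velocity v along the x-axis. This is a standard textbook computation offered here only as a sketch; it is not a derivation given by Rohrlich, and the symbols x, t, x′, t′ and γ are introduced for the illustration alone. Holding v and x fixed and letting c grow without bound sends (v/c)² to zero:

```latex
% Requires amsmath for \text. Illustrative sketch only.
\[
x' = \gamma\,(x - vt), \qquad
t' = \gamma\Bigl(t - \frac{vx}{c^{2}}\Bigr), \qquad
\gamma = \frac{1}{\sqrt{1 - (v/c)^{2}}} .
\]
\[
\gamma \to 1, \qquad \frac{vx}{c^{2}} \to 0
\qquad \text{as } c \to \infty \text{ with } v \text{ and } x \text{ fixed},
\]
\[
x' \to x - vt, \qquad t' \to t .
\]
```

The limiting equations are the Galilean transformation, in which c no longer appears. The limit itself is imposed from outside the formalism: nothing in the Lorentz equations singles out (v/c)² as the quantity to be sent to zero.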
Although the mathematical structures of the two theories are rationally coherent because of the reductive relation established by taking the limit to zero of the characteristic parameter p of the reduced theory, so Rohrlich argues, the reduced theory cannot be deduced from the reducing theory. Formally, the reduction is blocked because the laws of formal logic do not include processes like taking a limit to zero. The reduction is also formally blocked by the absence of a set of isomorphisms that would serve as bridge laws, expressing the central terms of the reduced theory by those of the reducing theory. Thus the picture of science with which Rohrlich concludes is not convergence to an ultimate exactly true theory but rather convergence to a rationally coherent net of theories, each with its own domain of validity. The syntactic or formal connectedness of these theories may be made precise, and yet they are deductively independent and semantically heterogeneous: they are about different levels or aspects or textures of physical reality. The truths of these theories are complementary. The case Rohrlich considers is rather special because there is an elegant way to formalize the relation between the ‘characteristic parameters’ of the two domains, a condition that does not typically obtain among disparate areas of research. But I find Rohrlich’s example important precisely because it includes formalization: the exhibition of a formal relation between domains is no guarantee that their subject matters can be identified, or even set in isomorphic relation. All the more disparate are the domains that Nancy Cartwright illustrates on the cover of The Dappled World by a series of loosely knotted balloons.
Contesting the alleged universality of the laws of physics by arguing that all theory borrows its meaning from the models and nomological machines that make it applicable, she writes, ‘In general, we construct models with concepts from a variety of different disciplines, the arrangements in them do not fit any rules for composition we have anywhere, and the regular behavior depicted in the model does not follow rigorously from any theory we know.’¹⁰ Like Cellucci, Cartwright champions a view of knowledge that is open-ended; like Leibniz, she insists on the capacities of intelligible things, capacities that can never be exhausted by the laws that articulate them. Breger’s examples of the articulation of tacit knowledge in the development of mathematical theories remind us that bringing objects at higher levels of abstraction into focus entails the forgetting or obscuring by formal systems of the objects that exist at more concrete levels, levels that cannot be retrieved from the formal systems that forget them.
¹⁰ Cartwright, The Dappled World (Cambridge: Cambridge University Press, 1999), 58.
Lindley Darden and Nancy Maull, in their article ‘Interfield Theories,’ likewise call into question Hempel’s notion of bridge laws, assumed to be trivial formal conventions or factual compendia.¹¹ Darden and Maull (revealing the influence of Dudley Shapere) argue first of all that a scientific domain is much richer than a formal theory, for it includes
a central problem, ... items taken to be facts related to that problem, general explanatory factors and goals providing expectations as to how the problem is to be solved, [experimental] techniques and methods, and sometimes but not always, concepts, laws and theories which are related to the problem and which attempt to realize the explanatory goals.
They then show by detailed case studies drawn from biology that the work of bringing domains into relation in order to solve problems is usually nontrivial, and is carried out, not by a compendium of mere isomorphisms, but by novel inter-domain theories and experimental practices that make substantive claims which must in turn be investigated. Their close examination of the links between genetics and biochemistry reveals that the links are conditional and many-many (not one-one) correlations between terms that belong to different descriptive levels; moreover, when domains are defined in this way, a domain clearly cannot be ‘derived’ from another domain in any case, since derivation holds among sets of propositions. On the basis of the closely allied case study in this chapter, I would add that if we want to assert that the unification of domains arises around the solving of specific problems and requires substantive additions to existing theory and experimental practice, we must also note that domains come already furnished with traditions of representation, certain characteristic modes of representing. Thus the combination of those domains involves the juxtaposition of formerly unassociated modes of representing; the juxtaposition itself may contribute strongly to the growth of knowledge, and it will also require explanation and exposition in natural language.¹² As Robin Hendry observes,
Logical empiricist models of inter-theory relations involved the explicit formulation of ‘bridge laws’ linking the vocabularies of different theories. Whatever the adequacy of those views of inter-theoretic reduction, to forge any connection between the tools of different disciplines will require the formation of new links between their central traditions of representation. Interactions are dynamically important too. Used together, consortia of representational tools of different kinds can sometimes achieve more than any one alone. Where one representational tool breaks down (where equations are intractable, for instance), others, perhaps visual, and with complementary ‘epistemic virtues,’ may be used to fill the gap.¹³
¹¹ Philosophy of Science, 44 (1977), 43–64. My thanks go belatedly to Nancy Maull, who first introduced me to these controversies.
¹² Klein, Tools and Modes of Representation in the Laboratory Sciences, vii–xv.
The notion of analysis, operating at the level of term, proposition, and argument, provides a more supple and accurate way to understand situations that have been called instances of theory reduction, especially when we recall that analysis employs a variety of modes of representation as it brings domains into novel alignment.
¹³ ‘Mathematics, Representation, and Molecular Structure,’ in Klein, Tools and Modes of Representation in the Laboratory Sciences, 229.
4.2. The Transposition of Genes: McClintock and Fedoroff
Iconic representations, so Peirce’s useful commonplace runs, look like what they represent while symbolic representations do not. This means that iconic representations, whatever else they may be, are spatial and visual. Because of the doctrine in his Transcendental Aesthetic and the section on the schematism of the pure concepts of the understanding in the Critique of Pure Reason, Kant has taught us to associate the iconic with spatiality and the symbolic with temporality. Logicians, perhaps under the influence of the neo-Kantian Frege, have thus supposed that their symbolic formal languages have negligible spatial and iconic features, and those philosophers and historians concerned with iconicity like Cassirer, Panofsky and Wölfflin (also neo-Kantians) have neglected temporality in images. Icons are also supposed to have a strong resemblance to what they represent. However, all icons are unlike as well as like what they represent: if an icon perfectly reproduced the thing it represented, then either it would simply be that thing and therefore not a representation, or it would violate the principle of the identity of indiscernibles, and therefore (if we agree with Leibniz) wouldn’t exist. So the qualification ‘strong’ must be one of degree, related to the success of an icon’s representing something in the context of a problem solution or scheme of use, to use Robin Hendry’s phrase. The modulation in the strength of icons also suggests, as I argued in the last chapter, that there is a continuum between iconic and symbolic representations, a continuity that does not undermine the usefulness of the distinction but does require us to pay attention to the unlikeness as well as the likeness of icons in the problem contexts that constrain their use. Representative icons distort, abridge, and idealize; they call attention to their status as representations (in contrast to the thing represented) and hence their metaphorical ambiguity; they exhibit ‘style’ referable to their author or cultural context; and they often function symbolically as well as iconically. The scientific relationship between Barbara McClintock and Nina Fedoroff, in which Fedoroff extends and transforms McClintock’s results concerning the transposition of genes (the displacement of genes within the genome) is both a good example of the reduction of genetics to molecular biology, and an episode in scientific reasoning where iconic representations of many kinds play a central role. Evidence for the transposition of genes includes inter alia images taken via light and electron microscopes of microscopic objects like cells and broken chromosomes, photographs of macroscopic objects like corn kernels and experimental results displayed on plates of gel, and iconic transcriptions of DNA sequences. The episode (ongoing in the work of Fedoroff) makes visible the issue of what happens to modes of representation when scientific domains are brought into novel alignment, and in particular the effect of translating problems that arose in genetics into the idiom of molecular biology.
I will argue that this novel alignment is accompanied by a significant shift in modes of representation (both iconic and symbolic), which clarify and help to solve the engendering problems. Chemistry uses a broad spectrum of iconic representations, as we saw in Chapter 3. It includes picturings of molecules (microscopic objects) like two-dimensional stereochemical formulae, sketches of molecular structure, or three-dimensional ball-and-stick models, as well as images composed by light microscopes, electron microscopes, and x-ray crystallography, and ‘imaginary’ computer-generated images that propose the most likely disposition of the parts of a molecule given certain evidence. It also includes pictures of the results of experiments (macroscopic objects) like the autoradiographs used in the determination of nucleotide sequences, discussed below. I would like to ask what role these modes of representation play when a domain like molecular biology borrows (and modifies) them, and then employs them in the service of other domains, in this case to solve a set of problems that genetics could formulate but could not solve on its own. In the case study under consideration, the expository writings and experimental work of the molecular biologist Nina Fedoroff develop the work in genetics of Barbara McClintock, who discovered the phenomenon of the transposition of genes and thus threw into question the stability of the genome.¹⁴ This was a great and rather shocking scientific innovation; Nina Fedoroff compared it to a homeowner’s discovery that every now and then the basement has a tendency to jump up into the attic. Fedoroff ‘applied the new techniques of molecular biology to isolate and study McClintock’s mobile maize genes,’ analyzing their structure and DNA sequences to deepen understanding of how these elements work, and to answer (and raise) questions. In so doing, she at least temporarily puzzled and displeased McClintock.¹⁵ This was not so much because McClintock was committed as a geneticist to the study of complete organisms, Fedoroff claims; the methods of the geneticist are after all highly abstract and reductive. And at that point in her life, while McClintock herself was not interested in trying to master a whole new set of techniques, she welcomed the application of molecular biology to her work.
Rather, what made her uncomfortable with (for example) Fedoroff’s redescription of the significance of McClintock’s work in the lead essay in Mobile Genetic Elements, edited by James Shapiro, was that it suggested substantive additions to and, in a sense, corrections of her work.¹⁶ In particular, McClintock wanted to view the relations among (to use Fedoroff’s vocabulary) autonomous and non-autonomous transposing elements, like the Ac-Ds system, as a hierarchical regulatory system analogous to that of the lac operon. But this analogy was only superficial. Transposons do play some role in the regulatory systems that guide the development of the organism, but in a less central and more complex way than McClintock supposed. The vocabulary, instrumentation and lab techniques, and modes of representation peculiar to molecular biology were needed to make precise the various roles played by transposons and the transposase they encode. Nonetheless, despite McClintock’s reservations, Fedoroff’s recasting of her work did much to persuade the recalcitrant scientific community to take McClintock’s ideas seriously. It was hard for most scientists to believe that a transposable segment of DNA could literally be inserted into another gene until the inserted nucleotide sequence was laid before their eyes (Fedoroff was the first to demonstrate this for McClintock’s elements) or until a biochemical mechanism, the activity of transposase in excising the transposon from its site, was suggested.¹⁷ That acceptance also depended on demonstrating that transposition was a widespread phenomenon, not just peculiar to maize. Indeed, the extension of McClintock’s results to other organisms is the organizing principle of Shapiro’s volume. But the phenotypical expression of transposition differs so much from organism to organism that it can be identified as the same kind of phenomenon only by appealing to evidence at the molecular level. In an article on pictorial evidence, Carla Keirns does an excellent job of explaining why the visual means employed by McClintock in her articles (mostly photographs of the maize kernels which were her primary evidence, as well as camera lucida images drawn from microscopic images of maize chromosomes in the pachytene stage) inhibited her ability to communicate with the broader community of scientists.¹⁸ In this chapter, I will concentrate on why the broader spectrum of images employed by Fedoroff, in combination with the latter, was more persuasive.
¹⁴ This episode first came to my attention when I taught Reflections on Gender and Science (New Haven: Yale University Press, 1996) by Evelyn Fox Keller ten years ago; I enjoyed talking about an earlier draft of this paper with her as well as with the electron microscopist Audrey Glauert at Clare Hall, University of Cambridge, in the summer of 2001.
¹⁵ N. Fedoroff, private correspondence. I thank Nina Fedoroff for the information she supplied me with as I wrote the first draft of this paper, and for helpful comments along the way.
¹⁶ N. Fedoroff, ‘Controlling Elements in Maize,’ in Mobile Genetic Elements, ed. J. Shapiro (Orlando, FL, and London: Academic Press, Inc., 1983), 1–63.
¹⁷ N. Fedoroff, J. Mauvais, and D. Chaleff, ‘Molecular Studies on Mutations at the Shrunken Locus in Maize Caused by the Controlling Element Ds,’ Journal of Molecular and Applied Genetics, 2 (1983), 11–29; N. Fedoroff, S. Wessler, and M. Shure, ‘Isolation of the Transposable Maize Controlling Elements Ac and Ds,’ Cell, 35 (Nov. 1983), 235–42; N. Fedoroff, R. Pohlman, and J. Messing, ‘The Nucleotide Sequence of the Maize Controlling Element Activator,’ Cell, 37 (1984), 635–43.
¹⁸ ‘Seeing Patterns: Models, Visual Evidence, and Pictorial Communication in the Work of Barbara McClintock,’ Journal of the History of Biology, 32 (1999), 163–96.
This episode where genetics and molecular biology intersect in complex ways shows not that the study of genetics and the natural processes that drive evolution comes down merely to the sequencing of genomes, but rather that the techniques of molecular biology have become an indispensable part of a repertory. To speak ontologically, the organic world is organized in a stunningly multi-level fashion, worthy of the metaphysical vision of Plato and Leibniz, and every level has its characteristic unities, which in turn have their own characteristic effects. To invoke Rohrlich’s article on theory reduction just discussed, the characteristic parameters of the theories that study ‘coarser’ objects cannot be retrieved from theories that study ‘finer-grained’ objects. Molecular biology supersedes but does not replace the field and greenhouse studies of genetics.¹⁹ And this seems to have been McClintock’s view: discussing the latter’s assessment of the operon explanation of gene regulation, Keirns argues, ‘McClintock’s vision of life was in layers of complexity, so even if the operon were present in every organism, it did not begin to explain evolution, development, or organismal complexity. McClintock sought a new synthesis.’²⁰ Philosophers of science who treat the relations between scientific domains as relations of logical derivation between formal theories assume that the language of science is, or ought to be, symbolic rather than iconic, and have had little to say about pictures or picturing. They have also tended to look at diachronic relations between predecessor and successor theories; the philosophical literature, as I noted above, is studded with discussions of the logical relation between Newtonian Mechanics and Einstein’s Theory of Relativity. In the rest of this chapter, by contrast, I examine an important case of scientific reduction in which iconic representation is central, and the two fields co-exist historically. Science includes many such examples of problem-solving, where one domain does not supplant another but rather domains are brought into mutual relation by interdomain theories, models, and modes of representation, and persist in relative autonomy.
¹⁹ See, for example, the main argument in L. Keller’s Levels of Selection in Evolution (Princeton: Princeton University Press, 1999).
²⁰ Keirns, ‘Seeing Patterns,’ 304.
Once we look at the history of science (or mathematics) as a series of problems to be solved rather than as a set of completed theories to be set in inferential relation, certain features of the advance of scientific knowledge leap to the eye. Scientific (and mathematical) domains are often constituted around a paradigmatic problem, and just as often this problem proves to transcend the problem-solving abilities of that domain. Darden and Maull in the article discussed above give a good example: ‘In genetics, the question arose: where are the genes located? But no means of solving that question within genetics were present since the field did not have the techniques or concepts for determining physical location; cytology did have such means.’²¹ Scientific domains often become linked when one is called in to aid the other, against prior background knowledge that indicates there is an overlap in the phenomena they study. The items of linked domains may coincide, though each domain studies different aspects of them for different ends; or the items of one may serve as parts of the items of the other viewed as wholes, or as causes of the items of the other viewed as effects, or as underlying structures of the items of the other viewed as functions or processes or even behavior. When Bas van Fraassen or Kenneth Schaffner urges the consideration of models as well as theories, they seem to be thinking of models in a conservative sense, as models that depart only slightly from the concept of model in mathematical logic; thus for them a model ‘instantiates’ a theory and indeed only one theory at a time. Nancy Cartwright notes that parts of various theories can be brought to bear (in ways that depart from instantiation) simultaneously on a model. I am trying to explain why this is so, and why it is indeed typical of scientific and mathematical practice; the combination of not only various theories, but also various modes of representation, is useful for solving problems. The work of Barbara McClintock in genetics, centered on her discovery of the surprising phenomenon of transposition, raised problems that McClintock herself could not solve, and could formulate only with difficulty or incompletely. A few such problems were: What is the chemical basis on which genes may be excised in transposition?
And, what is the relation of processes of transposition to the processes of gene regulation (which leave the structure of the genome intact but turn genes off and on in transitory and reversible ways)? How precisely does the insertion of a transposon change the structure of the genome? The solution to and reformulation of these problems, which called for the importation of the theory, experimental techniques, and modes of representation of molecular biology and biochemistry, guided by nascent ‘inter-domain theories,’ finally persuaded the broader scientific community to see the central importance of the phenomenon of transposition.
²¹ L. Darden and N. Maull, ‘Interfield Theories.’
4.3. McClintock’s Studies of Maize
For the purposes of this chapter, I want to pay close attention to McClintock’s language, and the images in the articles that appeared when she first announced her discovery. McClintock used a technique that induced breakage in chromosomes, after which the broken arms fused with each other: these rearrangements disturbed the genome in significant ways, and brought about ‘an unusual and unexpected series of new mutants ... , characterized by types of instability known in genetic literature as mutable genes, variegation, or mosaicism.’²² The patterns on maize (which is popularly, rather imprecisely, known as ‘Indian corn’) are a striking example of such variegation; indeed, maize is an especially apt experimental plant, for each ear, with its hundreds of kernels, is like the microbiologist’s petri dish: each kernel represents the outcome of a distinct mating event, and the pigmentation leaves a highly ‘readable’ record of genetic events in the development of the kernel’s tissue.²³ In the 1930s, Marcus Rhoades had determined that a mutant gene can become unstable in the presence of another particular gene, that is, its instability is conditional upon the presence of another gene; but he never supposed the genes moved in the genome.²⁴ Rhoades’ predecessor, R. A. Emerson, had entertained the hypothesis that ‘distinct gene elements’ might be transferred from one allele to another, but could not see how to pursue that hypothesis.²⁵ In the mid-1940s, McClintock realized that the chromosome breakage she had been studying was occurring repeatedly at a single site, which she named the Ds (Dissociation) locus; she also realized that a second locus was needed for breakage at the Ds site to occur, which she named Ac (Activator).²⁶ Shortly thereafter, McClintock realized that Ds could move:
It is now known that the Ds locus may change its position in the chromosome ... One very clear case has been analyzed, and, through appropriate selection of crossover chromatids, strains having morphologically normal chromosome 9 have been obtained. As a consequence of this aberration, the Ds locus in these strains has been shifted from a position a few units to the right of Wx to a position between I and Sh. This is a very favorable position for showing the nature of the Ds mutation process.²⁷
²² ‘Maize Genetics,’ Carnegie Institution of Washington Year Book, 45 (1946), 178.
²³ McClintock, ‘Controlling Elements in Maize,’ 2.
²⁴ M. Rhoades, ‘Effect of the Dt gene on the mutability of the a1 allele in maize,’ Genetics, 23 (1938), 377–97.
²⁵ R. A. Emerson, ‘The Frequency of Somatic Mutation in Variegated Pericarp in Maize,’ Genetics, 14 (1929), 488–511.
²⁶ ‘Maize Genetics,’ Carnegie Institution of Washington Year Book, 45 (1946), 176–86; ‘Cytogenetic Studies of Maize and Neurospora,’ Carnegie Institution of Washington Year Book, 46 (1947), 146–52.
She also surmises the transposition of the Ac locus on the basis of its interaction with the Ds locus:
The Ac locus may have been removed from its former position and inserted into a new position in chromosome 9 in a manner similar to that observed for the transposition of the Ds locus, described above. Because Ac induces breaks at specific loci and gives evidence of undergoing a specific breakage process itself, this latter explanation is not improbable.²⁸
As Fedoroff summarizes,
Unstable mutations of the type analyzed by both Emerson and Rhoades could be understood as the result of transposable element insertions into a locus, from which it frequently transposed during development, restoring gene function. McClintock was able to make the connection between transposition of a genetic element, the Ds locus, and the origin of a mutable gene giving a variegated phenotype, because the particular Ds element she first isolated had a second property, chromosome breakage, by which she was able to track the Ds element independently.²⁹
Soon thereafter, McClintock could show that Ac did indeed move.³⁰ These papers of the late 1940s include no diagrams or images, only reports of the breeding experiments and the visible patterns on kernels that resulted from them, and comments about the accompanying cytological examination of the chromosomes. All the same, the import of her results, and the evidence on which her claims are based (gleaned from experiments using perfectly conventional experimental methods), are quite clear. McClintock presented her results to the larger scientific community a number of times: ‘The Origin and Behavior of Mutable Loci in Maize’ in the Proceedings of the National Academy of Sciences in 1950, ‘Chromosome Organization and Genic Expression’ in Cold Spring Harbor Symposia on Quantitative Biology in 1951, and ‘Induction of Instability at Selected Loci in Maize’ in the widely read journal Genetics in 1953 (the year that Watson, Crick, and Franklin elucidated the double helical structure of DNA).³¹ The article presented in the journal of the National Academy of Sciences is remarkable for its boldness and clarity. McClintock announces her thesis that transposition occurs, and that certain systems of autonomous and nonautonomous loci make possible a compelling explanation for ‘mutable loci’ and the variegation or mosaicism they produce.
The origin and behavior of this mutable c locus has been interpreted as follows: Insertion of the chromatin composing Ds adjacent to the C locus is responsible for complete inhibition of the action of C. Removal of this foreign chromatin can occur. In many cases, the mechanism associated with this removal results in restoration of the former genic organization and action. The Ds material and its behavior are responsible for the origin and the expression of instability of the mutable c locus. The mutation-producing mechanisms involve only Ds. No gene mutations occur at the C locus; the restoration of its action is due to the removal of the inhibiting Ds chromatin.³²
²⁷ ‘Mutable Loci in Maize,’ Carnegie Institution of Washington Year Book, 47 (1948), 158.
²⁸ McClintock, ‘Mutable Loci in Maize’ (1948), 159.
²⁹ ‘Marcus Rhoades and Transposition,’ Genetics, 150 (1998), 957–61.
³⁰ ‘Mutable Loci in Maize,’ Carnegie Institution of Washington Year Book, 48 (1949), 142–54.
She forcefully suggests that since mutable loci have been recognized in a wide variety of organisms (especially Drosophila melanogaster, that most crucial of all experimental beasts), the occurrence of transposition must be considered not as peripheral or aberrant, but as central to the understanding of biological processes. ‘The author believes that the behavior of these new mutable loci in maize cannot be considered peculiar to this organism. The author believes that the mechanism underlying the phenomenon of variegation is basically the same in all organisms.’³³
The 1951 article is richly illustrated with two kinds of photographs. The first set are photographs of various chromosomes (magnified 1800×) at the pachytene stage of meiosis, accompanied by diagrams that select and highlight certain elements of the photograph, in particular, the strand of Ds-carrying chromosome and the position of the break, as well as the centromere region on the chromosome. (See Figure 4.1.) The need for such diagrams, explained in natural language in the text that accompanies each figure, shows that the image of a set of chromosomes at pachytene is informative only when the scientist explains it for the reader, bringing out one element from the complex image presented and explaining its features: then it counts as compelling evidence for the theory. Recall too that the chromosomes have been killed, dyed and fixed to make them visible; the photograph is of material on a slide, a slice of a dead and frozen knot of chromosomes that represents what the chromosomes once were in their living state. A whole theory and practice of the preparation of slides that makes precise in what way the slide can and can’t give information about organisms and their parts is brought to bear in interpreting such photographs as reliable witnesses.³⁴
Figure 4.1. McClintock, ‘Chromosome Organization and Genic Expression,’ Figure 8
The other set of images are photographs of macroscopic items, an ear of Indian corn (slightly reduced) and various kernels (slightly enlarged). (See Figure 4.2.) One immediate consequence of McClintock’s theory that genes may transpose into and out of a locus on the chromosome was a causal explanation of mosaicism in corn; and thus corn exhibiting mosaicism, viewed in the context of the proper commentary, could count as evidence for her theory. Because of the way in which the endosperm and pericarp develop in corn, each kernel of corn records on its surface interruptions of the production of various phenotypical features due to Ds transposing in and out of a locus in the presence of Ac. But this makes each photograph of a corn kernel an oddly symbolic icon, because the information it conveys in this context is not how a kernel of corn looks, but signs of where and when bio-chemical processes turned off and on; and obviously corn pigmentation does not ‘resemble’ a bio-chemical process. In sum, pictures of microscopic things must be highly constructed in order to make those things visible at the macroscopic level; and pictures of macroscopic things may become symbolic in order to yield information about microscopic things. The context of use determines whether an image is to be read as symbolic, or iconic, or both. In these cases, living things that move and develop in time are arrested and frozen by the process of picturing, even though the picturing is designed to yield information about processes.
³¹ ‘The origin and behavior of mutable loci in maize,’ Proceedings of the National Academy of Sciences USA, 36 (1950), 344–55; ‘Chromosome Organization and Genic Expression,’ Cold Spring Harbor Symposia on Quantitative Biology, 1951/52, 13–47; ‘Induction of Instability at Selected Loci in Maize,’ Genetics, 38 (1953), 579–99.
³² McClintock, ‘The Origin and Behavior of Mutable Loci in Maize,’ 350.
³³ McClintock, ‘The Origin and Behavior of Mutable Loci in Maize,’ 345.
Figure 4.2. McClintock, ‘Chromosome Organization and Genic Expression,’ Photographs 10 and 12, p. 23
The 1953 presentation in Genetics likewise includes no images, but instead a series of tables, six in all, that set out in perspicuous form McClintock’s experimental crosses and the resultant kernel phenotypes. (Figure 4.3 shows Tables 4 and 5.) The paper concludes,
Extra-genic units, carried in the chromosomes, are responsible for altering genic expression. When one such unit is incorporated at the locus of a gene, it may affect genic action. The altered action is detected as a mutation ... The extra-genic units undergo transposition from one location to another in the chromosome complement. It is this mechanism that is responsible for the origin of instability at the locus of a known gene; insertion of an extragenic unit adjacent to it initiates the instability. The extragenic units represent systems in the nucleus that are responsible for controlling the action of genes. They have specificity in that the mode of control of genic action in any one case is a reflection of the particular system in operation at the locus of the gene.³⁵
³⁴ My many conversations with Audrey Glauert during 1997–98 at Clare Hall, University of Cambridge, stand behind this observation. She is one of the world’s experts on the application of electron microscopy to biological problems, and a pioneer in fixation and embedding procedures for slides.
³⁵ ‘Induction of Instability at Selected Loci in Maize,’ Genetics, 38 (1953), 598.
Figure 4.3. McClintock, ‘Induction of Instability at Selected Loci in Maize,’ Tables 4 and 5
She illustrates her claim by the example of the Ds-Ac system and two loci it controls, Sh (shrunken) and Wx (waxy). The control by the Ds-Ac system of the expression of the dominant and recessive forms of these traits is set out in her six tables. One thing to note in these tables is the ambiguity of the terms ‘Sh’ and ‘Wx,’ each of which stands for both a genic unit on the chromosome, and a phenotypical trait; by contrast, the term ‘Ds’ refers to a mobile extragenic unit on the one hand, while the term ‘Ac’ refers, at the phenotypical level, to no single trait but rather to a modality of the appearance of a trait. What ‘Ds’ and ‘Ac’ refer to phenotypically is rather abstract and elusive; indeed, ‘Ds’ doesn’t refer to anything on its own at the phenotypical level, since it is a non-autonomous element.
The philosophers of science who wanted genetics to reduce to biochemistry or molecular biology according to the standard model of theory reduction posited a one-to-one map between terms of molecular biology (presumably referring to entities at the molecular level) and terms of genetics (presumably referring to entities at the visible, macroscopic level). But the present example reveals the oversimplification involved in this view. First of all, the terms of molecular biology, like the terms of chemistry, often refer ambiguously to both microscopic entities (macromolecules) and macroscopic entities (the ‘purified’ materials that are sorted and analyzed in the molecular biologist’s lab). So too do the terms of genetics, as we have just seen. Second, the theoretical move from structure to function (so characteristic of inter-domain theories and experimental practices) is typically much more complicated than any one-one mapping, for many structures may have no function or a diffuse function; some structures may have a function only in conjunction with other structures; some structures may be functional only in response to the environment that surrounds all the structures in question; and some of the same functions may be taken over by different structures.
Both these presentations, in two centrally important venues, seem to have been entirely ignored.
As McClintock observed, rather dryly, ‘It was clear from responses to this report that the presented thesis, and evidence for it, could not be accepted by the majority of geneticists or by other biologists ... I had already concluded that no amount of published evidence would be effective.’³⁶ While it is quite probably true that the puzzled (and puzzling) reception of McClintock’s ideas was due in part to the fact that she was a woman, and that her work ran counter to central assumptions held at the time by most geneticists, these explanations do not go far enough. The acceptance and recognition that McClintock did in fact get all along (on a restricted scale, to be sure) was because she was recognized as a great scientist; no one ever denied she was a great scientist because she was a woman, though the fact that she was a woman probably led many scientists simply not to pay attention to her work. Moreover, we must wonder how assumptions about the stability of the genome (as we might put it now) got entrenched so quickly.³⁷ Genetics as a scientific enterprise in the work of Hugo de Vries, Carl Correns, and Emerson and Rhoades had only been going for a few decades, and the results of the latter showed that examples of instability were known and studied early on. Why was there so little response to McClintock’s discovery of transposition?
³⁶ The Discovery and Characterization of Transposable Elements: The Collected Papers of Barbara McClintock (New York: Garland Publishers, 1987), x; repr. in The Dynamic Genome: Barbara McClintock’s Ideas in the Century of Genetics, eds. N. Fedoroff and D. Botstein (Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 1992), 208.
4.4. J. D. Watson’s Textbook
In 1965, J. D. Watson published a college textbook entitled Molecular Biology of the Gene, intended to report the state of the art to the educated public. The first chapter, entitled ‘The Mendelian View of the World,’ is an exposition of classical genetics, and ends with the observation, ‘In general, the tools of the Mendelian geneticists, organisms such as the corn plant, the mouse, and even the fruit fly, Drosophila, were not suitable for chemical investigations of gene-protein relations. For this type of analysis, work with much simpler microorganisms became indispensable.’³⁸ This dogmatic statement or rather prophecy has proven unfounded; but it indicates that McClintock’s organism, maize, was regarded as distinctly unfashionable. The second chapter is entitled ‘Cells Obey the Laws of Chemistry’; it presents the student with many chemical diagrams, including those of ‘small biological chemicals,’ like purine and pyrimidine, aromatic hydrocarbon, alcohol, nucleotide, amino acid, and sugar; ‘important functional groups,’ like phosphate, methyl, amino and carbonyl; and illustrations of oxidation and reduction, and various metabolic pathways. It ends with this programmatic claim:
Complete certainty now exists among essentially all biochemists that the other characteristics of living organisms (for example, selective permeability across cell membranes, muscle contraction, nerve conduction, and the hearing and memory processes) will all be completely understood in terms of the coordinative interactions of small and large molecules.³⁹
³⁷ This observation is due to the botanist and geneticist Ruth Geyer Shaw, private correspondence. My thanks go to my friend Ruth, her husband Frank Shaw, and her brothers Charlie and Pleas Geyer, as well as to my own brother the marine biologist Ted Grosholz, for many illuminating conversations about science and method. My thanks also go to my other brother, Rob Grosholz, for many illuminating conversations about our family magazine National Geographic; and to Ohad Nachtomy for helpful insights into the philosophy of biology, especially in discussions about my brother’s and Ruth Shaw’s work, and that of Leibniz.
³⁸ Molecular Biology of the Gene (New York and Amsterdam: W. A. Benjamin, Inc., 1965), 31.
(One wonders if Watson intends ‘the hearing and memory processes’ to include the works of Mozart and Proust.) And the subsequent chapter summary ends with the modest claim, ‘So far the greatest impact on biological thought has come from the realization that DNA has a complementary double-helical structure. This structure immediately suggested a mechanism for the replication of the gene, and initiated a revolution in the way biologists think of heredity.’ Watson’s reductionism here is quite striking and, as history shows, unfounded.⁴⁰
In the third chapter, ‘A Chemist’s Look at the Bacterial Cell,’ the reader finds chemical diagrams of the twenty amino acids from which proteins are built, the four main nucleotide building blocks of DNA, and a portion of a polynucleotide chain. Later on, the arrangement of genes on a chromosome is discussed, as well as the structure and function of DNA and RNA, the synthesis of proteins, and the replication of viruses. Despite an admiring discussion of the work of Jacob and Monod, operon theory and the theory of allosteric regulation in a chapter on protein synthesis and function, and despite an admission at the end of a chapter on cell differentiation that ‘virtually nothing is known about the molecular basis of the control of protein synthesis in the cells of the multicellular higher organisms,’ Watson has nothing to say about McClintock’s discovery of transposition.⁴¹ The terms ‘transposition’ and ‘transposon’ are not in the index of Molecular Biology of the Gene, and there are no citations of her work in the references given at the end of each chapter.
Watson’s textbook shows the extent to which the study of genetics had been transformed by its complex new relations with physical chemistry, biochemistry, and molecular biology. It also suggests some reasons why Watson was unfamiliar with McClintock’s work, ignored it, or found it unworthy of mention if he was aware of it. Watson’s remarks about the relation of the study of living systems to chemistry, quoted above, reveal his commitment to a view of reduction that is totalizing and dogmatic: the geneticist must adopt the idiom of chemistry. McClintock, by contrast, was a geneticist, and thirty years older than Watson; while she was interested in biochemistry and molecular biology and a decade later welcomed its use in behalf of her theory of transposition, she herself was not about to pick up a whole new set of scientific tools. Moreover, Watson may well have believed that McClintock’s results were linked to an idiosyncratic feature of maize, an organism he regarded as peripheral in any case. If we recall that Watson was able to acknowledge the importance of the work of Jacob and Monod, we might note that what distinguished it from McClintock’s is that it spoke in the idiom of molecular biology and was anchored in research on bacteria and viruses. Indeed, though McClintock had claimed explicitly that the phenomenon of transposition was to be found in many other organisms, the sameness underlying quite different phenotypic manifestations of transposition could only really be demonstrated by means of molecular biology.
³⁹ Watson, Molecular Biology of the Gene, 67.
⁴⁰ Watson, Molecular Biology of the Gene, 69.
⁴¹ Watson, Molecular Biology of the Gene, 438.
4.5. Fedoroff’s Translation of McClintock
McClintock needed a translator. One of the most faithful and successful was Nina Fedoroff, whose own work shows very clearly how such rewriting by means of novel paper tools is not merely formal and certainly not trivial, for it substantively extends McClintock’s results (solving problems and allowing unforeseen problems to be formulated) and in certain ways corrects them. In the late 1970s, Fedoroff was a postdoctoral fellow in the laboratory of Donald D. Brown, where the techniques of molecular biology were widely employed. Two papers authored jointly by Fedoroff and Brown, for example, map out the nucleotide sequence of an especially important gene cluster, ‘the repeating unit in the oocyte 5S ribosomal DNA’ on the chromosomes of a frog, Xenopus laevis.⁴² The diagram that presents the final outcome of their work as a linear sequence of nucleotides includes a multiple articulation.⁴³ (See Figure 4.4.) First, the sequence is distinguished into two regions, A and B, one of which is AT rich and one of which is GC rich. The first region is further divided into two regions which are quite stable, A1 and A3, and a third region which may vary dramatically in length from instance to instance, A2. The second region is divided into four sections: region B1 (thought to encode directions that initiate and guide transcription); the gene itself; region B2 (which duplicates the end of A and B1); and the ‘pseudogene,’ which duplicates most of the gene but seems inactive. Note how the display of the sequence, horizontal but also carefully given a vertical articulation, makes clear the duplications within it. In the 1978 paper, the same strategy of using vertical as well as horizontal display to highlight duplications is employed.⁴⁴ (See Figure 4.5.) Here, the strategy also displays possible deletions. That is, the mode of presentation makes visible not only what is present in the sequence but also what is, arguably, absent. The paper concludes that ‘“spacers” like A, because they tend to enhance the overall duplication/deletion rate, may be critically important to both the stability and the evolutionary flexibility of the multigene family [the repeating unit A-B].’⁴⁵ Note that while horizontal side-by-side-ness is iconic in import (these nucleotides do lie next to each other on the gene), vertical side-by-side-ness is much more symbolic in import, indicating theoretically significant repetitions and deletions. Indeed, the interplay of duplication and deletion in the dynamic processes of evolution is the focal point of the discussion section at the end of that paper.
Figure 4.4. Fedoroff and Brown, ‘The Nucleotide Sequence of the Repeating Unit in the Oocyte 5S Ribosomal DNA of Xenopus laevis,’ Figure 1
Figure 4.5. Fedoroff and Brown, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-Rich Spacer,’ Figure 11
⁴² ‘The Nucleotide Sequence of the Repeating Unit in the Oocyte 5S Ribosomal DNA of Xenopus laevis,’ Cold Spring Harbor Symposium on Quantitative Biology, 42 (1977), 1195–200; ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-rich Spacer,’ Cell, 13 (1978), 701–15.
⁴³ Brown and Fedoroff, ‘The Nucleotide Sequence of the Repeating Unit in the Oocyte 5S Ribosomal DNA of Xenopus laevis,’ 1196.
⁴⁴ Brown and Fedoroff, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-rich Spacer,’ 712.
⁴⁵ Brown and Fedoroff, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-rich Spacer.’
The ability of discourse to represent negation, nothingness, absence, unrealized possibles, and so forth, is one of its most spectacular and troublesome features, as Plato was among the first to expound philosophically in the Parmenides and elsewhere.⁴⁶ The chemist’s Table of Elements, also a horizontal display with careful vertical articulation, illustrates the same point.
The linearity, or rather the rectilinearity, of this sequence suggests other philosophically interesting aspects of the representation here. Significantly, this representation is iconic (it is a picture that looks like what it represents, in this case a string of code elements) but appears to be symbolic. This is because what it pictures is itself symbolic: that is, the nucleotide sequence is a ‘linear’ code that functions as a language that ‘refers to’ things that it does not resemble. Both words I have just put in quotation marks deserve explanation. The way the nucleotide sequence refers to things is by guiding the construction of messenger RNA, which in turn guides the construction of proteins with a complex spatial configuration. Thus the distinction between symbolic and iconic representations is blurred when the object represented (in this case the nucleotide sequence) has a communicative function that is carried out in a mediated fashion. Yet while the nucleotide sequence CAAAGCTTCA ... etc. is a picture, it is also a highly stylized picture that leaves out a great deal of the original chemical complexity and moreover distorts it. And this distortion and stylization are due in part to our own human, Western, English conventions of the printed word.
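The contrast drawn above between horizontal and vertical side-by-side-ness can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the repeat unit is a toy string borrowed from the text rather than data from Fedoroff and Brown, and the deletion is invented for the example. Stacking copies of a repeat in columns, with gaps marking what is arguably absent, is a convention of display; collapsing the rows gives back the single string in which the nucleotides actually lie next to each other.

```python
def stack_repeats(reference, deletions_per_copy):
    """Return one row per repeat copy, aligned column by column to `reference`.

    Each entry of `deletions_per_copy` is a list of (start, length) pairs,
    0-based, marking stretches of the reference that are absent from that copy.
    """
    rows = []
    for deletions in deletions_per_copy:
        row = list(reference)
        for start, length in deletions:
            for i in range(start, start + length):
                row[i] = "-"  # a gap: a symbolic mark for what is not there
        rows.append("".join(row))
    return rows


def linear_sequence(rows):
    """Collapse the stacked display back into the one physical string."""
    return "".join(row.replace("-", "") for row in rows)


# Toy repeat unit (hypothetical), with an invented two-base deletion in copy 2.
rows = stack_repeats("CAAAGCTTCA", [[], [(3, 2)], []])
for row in rows:
    print(row)                 # vertical alignment: repetition and deletion
print(linear_sequence(rows))   # horizontal order: physical adjacency only
```

The gap characters exist only in the stacked display; nothing in the collapsed string corresponds to them, which is one way of seeing why the vertical dimension is symbolic rather than iconic.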
The lines in an English book are read from left to right, and a page of horizontal lines is read from top to bottom; so here, the direction in which ribosomes, for example, ‘read’ parts of the nucleotide sequence, is depicted as moving from left to right, and also (line by line) from top to bottom. Moreover, the lines are straight. While the nucleotide sequence is ‘linear’ in that its nucleotides are read one by one, in order, the ‘linearity’ of CAAAGCTTCA ... is itself an artifact of certain conventions of writing and printing, for the shape of DNA is certainly not a straight line. First of all, each nucleotide has its own complex chemical structure, which moreover sits in a complex chemical scaffolding (a repeating series of a phosphate group and a sugar). Watson’s textbook representation of it, for example, elides all that for the sake of clarity. Second, the nucleotides-with-scaffolding sit in a double helix, which itself is wound around molecular bobbins and then braided and rebraided into chromatin, the material of which chromosomes are made. The straight lines of the printed nucleotide sequence are a tidy abstraction.
In the 1978 article, Fedoroff and Brown describe at length the experimental processes that allowed them to map the region A-B in X. laevis oocyte 5S DNA, citing in particular their adaptation of the methods of Sanger and Coulson, and Maxam and Gilbert. These methods for determining nucleotide sequences are emphatically chemical, for they require processes of purification, the cleaving of complex molecules by chemical means, the initiation and termination of reactions, fractionation, electrophoresis, and autoradiography of the resultant substances on acrylamide gel at the end. The Maxam and Gilbert abstract summarizes the process:
DNA can be sequenced by a chemical procedure that breaks a terminally labeled DNA molecule partially at each repetition of a base. We describe reactions that cleave DNA preferentially at guanines, at adenines, at cytosines and thymines equally, and at cytosines alone. When the products of these four reactions are resolved by size, by electrophoresis on a polyacrylamide gel, the DNA sequence can be read from the pattern of radioactive bands.⁴⁷
⁴⁶ On this topic, see K. Wood, Troubling Play: Meaning and Entity in Plato’s Parmenides (Albany: State University of New York, 2006).
⁴⁷ F. Sanger and A. R. Coulson, ‘A Rapid Method for Determining Sequences in DNA by Primed Synthesis with DNA Polymerase,’ Journal of Molecular Biology, 94 (1975), 441–8; A. M. Maxam and W. Gilbert, ‘A new method for sequencing DNA,’ Proceedings of the National Academy of Sciences USA, 74 (1977), 560–4.
Figure 4.6. Sanger and Coulson, ‘A Rapid Method for Determining Sequences in DNA,’ Plate 1
(See Figure 4.6.) Fedoroff and Brown present a number of similar autoradiographs in the 1978 paper, and describe in some detail how they were used to map out the A-B repeating unit. (See Figure 4.7.) The autoradiographs of the radioactive bands are icons of a macroscopic object in a laboratory, a pattern on gel; however, the point of the image is not to show a container of gel on a shelf but rather to convey the pattern, which like the trace of ocean waves left on the beach records a process. And the process, because it is an experimental process, has been designed to yield specific information about microscopic structure, as the imposition of the letters A, T, C, and G on the autoradiograph of the pattern of bands indicates. The intent of the photograph is to depict a nomological machine, to use Nancy Cartwright’s vocabulary, as the accompanying prose tells us. The iconic dimension that remains in the photographs of McClintock’s kernels of corn is missing here. In the latter, the organic unity of the kernels is biologically significant and pertinent to the evidence: when the kernel of corn is mature, the process of pigmentation is finished; moreover, the carrier of the phenotypical traits is the kernel of corn.
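How the letters A, T, C, and G come to be imposed on a pattern of bands can be sketched in a few lines. What follows is a deliberately idealized toy model, not the laboratory procedure: it assumes four clean lanes named after the reactions in the quoted abstract (guanines, adenines, cytosines and thymines together, cytosines alone), treats each band simply as a position along the molecule, and ignores band intensity, partial specificity, and the details of fragment length. The function and lane names are hypothetical.

```python
LANES = ("G", "A", "C+T", "C")


def simulate_lanes(sequence):
    """Return, for each idealized reaction lane, the positions showing a band."""
    lanes = {name: set() for name in LANES}
    for position, base in enumerate(sequence, start=1):
        if base == "G":
            lanes["G"].add(position)
        elif base == "A":
            lanes["A"].add(position)
        elif base == "C":
            # Cytosines are cleaved by both the C+T and the C-only reaction.
            lanes["C"].add(position)
            lanes["C+T"].add(position)
        elif base == "T":
            # Thymines show up in the C+T lane only.
            lanes["C+T"].add(position)
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return lanes


def read_sequence(lanes):
    """Read a sequence off the band pattern, one position at a time."""
    letters = []
    for position in sorted(set().union(*lanes.values())):
        if position in lanes["G"]:
            letters.append("G")
        elif position in lanes["A"]:
            letters.append("A")
        elif position in lanes["C"]:
            letters.append("C")
        else:
            # A band in C+T with no partner in the C lane is read as T.
            letters.append("T")
    return "".join(letters)


pattern = simulate_lanes("CAAAGCTTCA")
print(read_sequence(pattern))
```

The point of the sketch is that the sequence is never seen directly: it is inferred from which lanes carry a band at each position, so the letters printed beside the gel already embody a rule of interpretation.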
Figure 4.7. Fedoroff and Brown, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-Rich Spacer,’ Figure 4
In a spin-off paper, Fedoroff investigated the effects of the transposon Tn9 (transposed from bacteriophage P1) when it is transposed into and deleted from the spacer sequence (region A, above) which she and Brown had mapped out, exhibiting precisely the resultant changes in the nucleotide sequences.⁴⁸ Towards the end of the postdoctoral phase of her research, Fedoroff met McClintock and became deeply interested in her program, and decided to make use of her own skills in molecular biology to take McClintock’s research on maize ‘to the next stage.’ A series of studies of maize followed. One is a description at the molecular level of the effects of the controlling element Ds on the structure and expression of the Sh locus; this article is dominated by images of the results of experimental processes like electrophoretic analysis and blot hybridization analysis.⁴⁹ These images, like the autoradiographs just discussed, are almost entirely symbolic in import, and the processes that produce them (pulverization, purification, freezing, hybridization, linearization, fractionation, digestion by enzymes, precipitation, drying, and so forth) are highly chemical in the sense that they break down biological organization in order to free up the chemical components. Another is a report of the molecular similarity of Ac and Ds (‘the 4.1 kb Ds is almost completely homologous to the Ac element, differing by a central deletion of less than 0.2 kb’) in light of a description of their insertion in the Wx locus; here too images of blot hybridization analysis play a key role, as well as site maps of pertinent DNA fragments and drawings based on electron micrographs of those fragments.⁵⁰ A last example in the series is a determination of the full nucleotide sequence of Ac along with evidence that its ‘open reading frame 1’ (the first of two) encodes a transposase, as well as evidence of strong structural similarity between the maize transposon and the bacterial transposon Tn3.⁵¹ It is dominated by a nucleotide sequence which takes up two full pages of text. Fedoroff began to work on the maize Spm transposon when she became a staff member at the Carnegie Institution of Washington.
⁴⁸ N. Fedoroff, ‘Structure of Deletion Derivatives of a Recombinant Plasmid Containing the Transposable Element Tn9 in the Spacer Sequence of Xenopus laevis 5S DNA,’ Cold Spring Harbor Symposium on Quantitative Biology, 43 (1979), 1287–92.
The Spm transposon is involved both in processes of transposition and in a regulatory system, and so provides an instructive case study of how these two distinct functions (transposition and regulation) may be interrelated.⁵² The exposition of the Spm controlling element family plays an important role in the last pages of the essay ‘Controlling Elements in Maize’ mentioned above, which leads off the Shapiro volume designed to give a full-dress presentation of McClintock’s work on transposition to the scientific community. Investigating the Spm element at the molecular level, Fedoroff came to see that she must respectfully differ with McClintock, even as she tried to elaborate McClintock’s program in genetics in terms of molecular biology.
The article begins with an extended discussion of the reproductive biology of maize, which counters Watson’s claim that ‘organisms such as the corn plant, the mouse, and even the fruit fly, Drosophila, were not suitable for chemical investigations of gene-protein relations’ by showing how and why the development of maize in fact allows it to wear genetically significant events on its sleeve, or more precisely, on its endosperm and pericarp.⁵³ It is replete with pictures of corn kernels exhibiting various kinds of significant mosaicism. This is followed by an overview of the Ac-Ds, Spm, and Dt controlling element families, and then a long section on Ds and Ac, where only a few images typical of molecular biology appear, although the footnotes lead from McClintock’s own work to that of her successors, including articles by Fedoroff in which such images frequently appear.⁵⁴ The section on the Spm controlling element family and the concluding pages counter or qualify a claim that McClintock often made in the 1960s and 1970s, namely that systems of transposable elements were regulatory systems.⁵⁵ For example, in a paper presented to the Brookhaven Symposium on Genetic Control of Differentiation, ‘The Control of Gene Action in Maize,’ McClintock writes, ‘That genetic mechanisms are involved in the control of actions of genes is now well established,’ and that in particular ‘This report will be concerned mainly with control of gene action by regulatory systems whose elements have been identified and characterized.’ And she concludes,
It should again be stated that limitations of space preclude a comprehensive review of the properties of known controlling elements in maize, or of the relation of the described systems to those in other organisms. It is hoped, nevertheless, that the outlined modes of operation of these systems may serve to indicate not only their extraordinary versatility in regulating gene action during development, but also their potential economy. A number of different genes, related or unrelated in function, can come under the control of a single regulatory system; and neither their times of action during development nor the levels of their action at any one time need be the same.⁵⁶
⁴⁹ N. Fedoroff, J. Mauvais, D. Chaleff, ‘Molecular Studies on Mutations at the Shrunken Locus in Maize Caused by the Controlling Element Ds,’ Journal of Molecular and Applied Genetics, 2 (1983), 11–29.
⁵⁰ N. Fedoroff, S. Wessler, and M. Shure, ‘Isolation of the Transposable Maize Controlling Elements Ac and Ds,’ Cell, 35 (1983), 235–42.
⁵¹ N. Fedoroff, R. Pohlman, and J. Messing, ‘The Nucleotide Sequence of the Maize Controlling Element Activator,’ Cell, 37 (1984), 635–43.
⁵² N. Fedoroff, M. Schlappi, and R. Raina, ‘Epigenetic regulation of the maize Spm transposon,’ Bioessays, 17 (1994), 291–7.
⁵³ Fedoroff, ‘Controlling Elements in Maize,’ 1–11.
⁵⁴ Fedoroff, ‘Controlling Elements in Maize,’ 16–40.
⁵⁵ Fedoroff, ‘Controlling Elements in Maize,’ 40–53.
Here McClintock conflates the action of transposition and that of regulation, as if she hoped that transposition would be the key to gene regulation. But what Fedoroff (and others involved in similar research) discovered was that transposition and regulation must be distinguished, and described at the molecular level, before their interrelation or interaction could be properly discerned. At the end of ‘Controlling Elements in Maize,’ Fedoroff writes,
McClintock focused on the interactions between transposable elements and genes, attributing a regulatory function to the elements ... Although it is evident that controlling elements have genetic mechanisms for sensing developmental time and position, the view that they are fundamental to gene regulation has not been widely accepted.⁵⁷
There are a few well-documented cases where the rearrangement of DNA is known to regulate gene expression; indeed, Fedoroff’s own studies of Spm pointed in that direction. But Fedoroff emphasizes that this is a set of problems still to be explored, and not to be decided by fiat. Her 1994 article ‘Epigenetic Regulation of the Maize Spm Transposon,’ published a decade later, describes the regulatory function of the Spm transposon in molecular terms, a function which is clearly to be distinguished from its ability to transpose:
Spm is epigenetically inactivated by C-methylation near its transcription start site. We have investigated the interaction between TnpA, an autoregulatory protein that can reactivate a silent Spm, and the promoter of the element. The promoter undergoes rapid de novo methylation and inactivation in stably transformed plants, but only if it includes a GC-rich sequence downstream of the promoter. TnpA activates the inactive, methylated promoter and leads to reduced methylation. By contrast, TnpA represses the active, unmethylated Spm promoter ... TnpA is therefore a unique regulatory protein with a conventional transcriptional repressor activity and a novel ability to activate a methylated, inactive promoter.⁵⁸
⁵⁶ ‘The control of gene action in maize,’ Brookhaven Symposia on Biology, 18 (1965), 162 and 181.
⁵⁷ Fedoroff, ‘Controlling Elements in Maize,’ 56.
⁵⁸ Fedoroff et al., ‘Epigenetic regulation of the maize Spm transposon,’ Summary.
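The double role that the quoted summary assigns to TnpA can be laid out as a small truth table. This is only a schematic reading of the summary, with a hypothetical function name and labels of my own choosing; it says nothing about kinetics, degrees of methylation, or the GC-rich sequence, and it is certainly not a model of the experiments.

```python
def spm_promoter_state(methylated: bool, tnpa_present: bool) -> str:
    """Qualitative state of the Spm promoter, following the quoted summary.

    methylated   -- True if the promoter is C-methylated (epigenetically silent)
    tnpa_present -- True if the TnpA protein is available
    """
    if methylated:
        # TnpA activates the inactive, methylated promoter.
        return "activated" if tnpa_present else "inactive"
    # TnpA represses the active, unmethylated promoter.
    return "repressed" if tnpa_present else "active"


for methylated in (True, False):
    for tnpa_present in (False, True):
        print(methylated, tnpa_present, spm_promoter_state(methylated, tnpa_present))
```

Written this way, it is plain that the effect of TnpA depends entirely on the state of the promoter it meets, and that none of the four cases mentions transposition at all.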
The figures in this article include both schematic representations, cartoons, of the action of TnpA on the nucleotide sequence, and highly ‘chemical’ charts of the results of an ongoing experimental research program, used in tandem to support the authors’ conclusions. (Figures 4.8a and b include Figures 1 and 2 from the article.)
Figure 4.8a. N. Fedoroff et al., ‘Epigenetic Regulation of the Maize Spm Transposon,’ Figure 1
4.6. The Future of Molecular Biology
In the early 1980s, Nina Fedoroff published two important expositions of Barbara McClintock’s work: the 1983 essay discussed above aimed at the scientific community, and another from 1984, a popularization published in Scientific American.⁵⁹ The spectrum of figures and illustrations that Fedoroff employs in these articles reflects the bridging that she accomplished in translating the work of McClintock into the idiom of molecular biology. For example, the 1983 essay begins with drawings of a maize kernel, a mature plant, a pair of spikelets with anthers, and the growing egg cell in its embryo sac, as Fedoroff explains at length why maize was a particularly
⁵⁹ ‘Transposable Genetic Elements in Maize,’ Scientific American, 250 (1984), 84–98.
Figure 4.8b. N. Fedoroff et al., ‘Epigenetic Regulation of the Maize Spm Transposon,’ Figure 2
happy choice of experimental plant for McClintock, and illustrates her point with a photograph of an ear of maize. These photographs are at once iconic, recording the biologically significant organism and its phenotypical features that are the concern of genetics, and symbolic, since the patterns on the kernels record events that are the subject of molecular biology. In the pages that follow, cartoons like that given in Figures 4.8a and b are conventions for representing genetic mechanisms that molecular biologists habitually use. They hover between ‘summaries of experimental data and diagrams of real microstructure’⁶⁰ and are therefore also symbolic and iconic, though in this case what is pictured is microscopic, and the macroscopic experimental data being summarized is symbolic in import. The Scientific American article begins with many beautiful color illustrations of maize ears and individual kernels, and then turns to the ‘cartoons’ of molecular biology, including especially impressive depictions of the Ac and several Ds elements of maize, and the mechanism of cycling Spm. ⁶⁰ Keirns, ‘Seeing Patterns,’ 177–80.
Fedoroff’s molecular biology (as well as, of course, the work of other colleagues) was able to answer certain questions that McClintock’s genetics could formulate but not fully investigate. This is a pattern of a localized but not thoroughly global or completable relation between domains of research that Darden and Maull aptly dub ‘interfield theories.’ And indeed it is still true that genetics continues to generate problems (and solutions) that cannot be directly formulated or solved in terms of molecular biology, but must be addressed by other means. The ‘mechanisms’ of selection appear to operate at the level of the genome (where there are biochemical processes analogous to reproduction), the individual, and the group. To investigate precisely the rates and effects of spontaneous mutation and its evolutionary consequences may require, for example, statistical methods (and a great deal of counting and measuring in the laboratory, field, or greenhouse) to infer the genomic mutation rate and distribution of mutational effects in successive generations of a large population of macroscopic individuals, taking into account differences in proliferation as well as differences in survival.⁶¹ Much of this research may, and indeed must, be conducted without paying attention to changes at the molecular level, even when it deals with an organism whose genome has been completely mapped.
⁶¹ See, for example, R. Shaw, D. Byers, and E. Darmo, ‘Spontaneous mutational effects on reproductive traits of Arabidopsis thaliana,’ Genetics, 155 (2000), 369–78.
5 Chemistry, Quantum Mechanics, and Group Theory
The mid-twentieth-century account of theory reduction, like the deductive-nomological model of explanation, served the useful function of providing a simple model of an important kind of scientific reasoning expressed in a certain symbolic language, the theories of predicate logic. However, I have been arguing that philosophers of science were carried away by their enthusiasm for predicate logic as an instrument of philosophical analysis, as were philosophers of mathematics a generation earlier, an enthusiasm that limited their ability to address several important features of scientific and mathematical rationality, and indeed of predicate logic itself as a vehicle of that rationality. The case study that I investigate in this chapter, that of a widely used methodology in chemistry applied to a particular problem, is designed to reveal a striking short-sightedness in the philosophical account of the use of symbols. The method ‘reduces’ a physical system (like a molecule) to its group of symmetries, thence to a collection of matrices, thence to sets of numbers which can be used to compute important features of the physical system.
5.1. Symbols, Icons, and Iconicity
My examination of how this analysis helps to solve problems in chemistry supports the following claims.
(1) Other important formal idioms besides predicate logic organize science and mathematics, reducing spatial configuration and dynamic processes to numerical computation; they make possible analyses that are quite different from those offered by predicate logic. When different formal languages are used to analyze intelligible objects, they reveal different kinds of conditions of intelligibility.
(2) The use of symbolic notation to investigate, e.g., a chemical object typically makes use of iconic representations in tandem with the symbolic notation, and their conjunction is mediated and explained by natural language.
(3) While symbolic notations may in certain carefully defined situations be treated as uninterpreted and manipulated in ‘purely formal’ ways, their rational deployment in the sciences, as in mathematics, requires that their interpretations be present and reinstated into the problem context, and these presences and reinstatements are often, though not always, indicated by means of icons.
(4) Symbolic notations themselves have spatial and iconic dimensions that play important (and irreducible) roles in the knowledge they help to generate. We saw this in the chemical table and in the nucleotide sequences of genes, where not only horizontal but also vertical correspondences in the representations exhibited important features of the gene; horizontal correspondences were rather more iconic (representing spatial side-by-sideness or addition of component parts), and vertical correspondences were more esoteric.
(5) A representation that is symbolic with respect to one kind of thing may become iconic with respect to another kind of thing depending on context. The numerals used in chemical applications of group theory, for example, are symbolic with respect to molecules but have iconic features with respect to numbers; and if a computer programmer is inspecting them as part of a formal language (considered as an object of mathematical study) then they are in a different way even more iconic.
Whether, or to what extent, a representation is iconic or symbolic cannot be read directly off the representation, but must take into account the discursive context—the context of use. To be effective in the reductive study of molecules by representation theory, those numerals for example must refer ambiguously to molecules in one sense and to numbers in another. Likewise, a hexagon must represent ambiguously a geometric figure, a molecule, a purified substance in the laboratory, a cyclic group, and a character table. The tincture of iconicity that nuances every symbolic representation is important to my claim that there is no single correct symbolic representation of an intelligible thing. Icons, precisely because they are like the things they represent, are distortions, and distortions may take many forms; iconicity introduces style, and styles are manifold and changeable. Symbols, though
they are relatively unlike the things they represent, must nonetheless share certain structural features with what they represent in order to count as representations at all, and just as there may be more than one set of structural features essential to the thing (hence different symbolizations that capture them), there may also be more than one way to represent a given set of structural features. Consider the representation of the natural numbers by Roman and Arabic notation, and by the notation 0, S(0), SS(0), ... , and so on. (I discuss this issue in Chapter 10.) Moreover, as we have seen, in general symbolic representations must be supplemented by icons and natural language to function as symbols. Another way to make the point is to observe that symbols are always icons of idealized versions of themselves: the symbol (x)f(x) is also an icon of a certain well formed formula of predicate logic, considered as a mathematical thing. Gödel shows us that every well formed formula can also be represented (much more symbolically) by an integer whose prime decomposition encodes information about it, and that this representation yields important facts about the axiomatized systems of predicate logic, in particular, the results known as Gödel’s two Incompleteness Theorems. In many areas of human endeavor, discourse precipitates new intelligible things (laws, university charters, Hamlet, the sonnet) to add to the furniture of the world. What is distinctive about mathematics is that things precipitated by its discourse (like the well formed formulas of predicate logic, considered as objects of study rather than just as modes of representation) are highly determinate, and come to stand in highly determinate relations with things that were already there, like the natural numbers, as happens in Gödel’s proof.
And they are moreover not exhausted or summarized by the notations that precipitated them; they typically turn out to have further features whose investigation requires other kinds of representations in order to succeed.
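The encoding by prime decomposition mentioned above can be made concrete in a few lines. What follows is a minimal sketch and not Gödel's own coding: the symbol codes in `SYMBOL_CODES` and the function names are assumptions chosen only for illustration. A formula, taken as a sequence of symbols, is sent to the product of successive primes raised to the codes of its symbols, and unique factorization lets the sequence be read back off the integer.

```python
# Hypothetical symbol codes for illustration; Gödel's own assignment differs.
SYMBOL_CODES = {"0": 1, "S": 2, "(": 3, ")": 4, "=": 5, "x": 6}
CODE_SYMBOLS = {code: symbol for symbol, code in SYMBOL_CODES.items()}


def primes():
    """Yield 2, 3, 5, 7, ... by trial division (adequate for short formulas)."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1


def encode(formula):
    """Send a string of symbols to the product of p_k ** code(symbol_k)."""
    number = 1
    for prime, symbol in zip(primes(), formula):
        number *= prime ** SYMBOL_CODES[symbol]
    return number


def decode(number):
    """Recover the symbol string from the exponents of successive primes."""
    symbols = []
    for prime in primes():
        if number == 1:
            break
        exponent = 0
        while number % prime == 0:
            number //= prime
            exponent += 1
        symbols.append(CODE_SYMBOLS[exponent])
    return "".join(symbols)
```

For instance, `decode(encode("SS(0)"))` gives back `"SS(0)"`: the integer is a far more symbolic representation of the formula than the string, yet it carries the same structure.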
5.2. Representation Theory
Representation theory studies a group by mapping it into a group of invertible matrices; it is a kind of linearization of the group. (This technical term in mathematics unfortunately uses a philosophical term central to this book, which appears on almost every page, but I think it will be clear from context when I am using the technical term.) Another way to formulate this definition is to say that a representation of a group G, which maps elements homomorphically from G into a group of n × n invertible matrices with entries in R (for example, though it could be other fields), sets up the action of G on the vector space $R^n$ and so directly yields an RG-module. (These terms are explained below.) Representation theory thus combines the results of group theory and linear algebra in a useful way.¹ The application of representation theory is an inherently reductive procedure, because it offers a means of studying highly infinitary and nonlinear mathematical entities (for example, all the automorphisms of the algebraic completion of the rationals that leave the rationals invariant), as well as complex physical entities like molecules, in terms of finite and linear entities, matrices, that also lend themselves well to computation. And since automorphism groups can be used to measure the symmetry not only of geometrical figures, but also of physical systems and algebraic systems, the applications of representation theory are many and various. The reductive study of molecules in terms of symmetry groups is not a theory reduction in the sense of explaining molecular structure and behavior in terms of group theory by assuming that molecules are ‘really’ mathematical objects. (This is the odd thesis found in the cosmogony of Plato’s Timaeus, where atoms are literally supposed to be composed of triangles).² It is instead reductive analysis: a problem involving molecules can be approached by first solving a problem about certain groups and matrices. In order for the problem reduction to be effective, some of the representations involved must in certain respects stand for molecules, in other respects for mathematical objects, and in yet other respects for macroscopic things in the laboratory.
Moreover, it must be true that a molecule has a certain geometrical configuration in space which exhibits certain symmetries; a molecule has its configuration truly but approximately, since it vibrates and since its components are ‘here, now’ only in a highly qualified sense. And its components are a heterogeneous collection of atoms, themselves energetic systems interacting in complex ¹ See, for example, G. James and M. Liebeck, Representations and Characters of Groups (Cambridge: Cambridge University Press, 1993), ch. 3 and 4. ² See my discussion in Sec. II of ‘Plato and Leibniz against the Materialists,’ Journal of the History of Ideas, 57(2) (April 1996), 255–76.
ways. Despite all these qualifications, geometric shape is an indispensable condition of intelligibility for chemical investigations of molecular structure. Reductive analysis—what Peirce called creative abduction—is distinctive, because it tries to understand something in terms of what it is not: molecules in terms of groups of symmetry operations, groups in terms of matrices, matrices in terms of numbers (characters and determinants). Reality presents things that exist; and to exist is to be intelligible, to be possible for thought. However, intelligibility is not transparency, for the intelligibility of things is always problematic: things don’t explain themselves. Things must be investigated to be understood, and investigation requires precisely articulated and organized representations. In the cases reviewed here, the representations have an algebraicity that allows them both to be provisionally detached from their referents and to refer ambiguously, allowing unlike things to be thought together. Formal languages may precipitate further novel, intelligible things, with their own expressive limitations and their own problematicity. The instruments of reorganization may add something to the inventory of the universe, which, while helpful in solving problems, can also introduce new kinds of problems. Because algebraic notations refer ambiguously and also draw attention to themselves as they refer, they stand both for ways of organizing other things (and typically more than one kind of thing) and for themselves as intelligible things. 
As we shall see in the case study of this chapter, a hexagonal representation of a benzene molecule is an iconic representation of the shape of that molecule, but it is also the key to a series of symbolic representations in a highly refined strategy, the LCAO (linear combination of atomic orbitals) method, for making quantum mechanical calculations about the structure of that molecule more tractable, a method that exploits the constraints imposed by the symmetries of the system. Thus too predicate logic precipitates both the study of certain kinds of recursively constructed formulas in the theory of recursive functions, and also model theory, the study by logical means of other mathematical fields, reorganized when they are recast into the idiom of a theory in predicate logic. This duality of formal languages is what Gödel exploits in his great proofs. Whether the employment of a representation is symbolic or iconic cannot be read off the representation: pictures have both symbolic and iconic uses depending on context, and so do obviously ‘algebraic’ representations.
5.3. Molecules, Symmetry, and Groups
Symmetry is measured by the number of automorphisms under which an object or system remains invariant. The bigger the automorphism group, the more symmetrical the thing is. For example, if we consider a molecule to be a certain figure in $R^3$, the pertinent set of automorphisms is all the isometries of $R^3$ (structure-preserving mappings of $R^3$ to itself that in particular preserve distances); and the isometries under which that figure remains invariant form a group, which is called its symmetry group. Every physical system has a symmetry group G, and certain vector spaces associated with the system turn out to be RG-modules. (An RG-module is a vector space over R in which multiplication of vectors by elements of the group G is defined, satisfying certain conditions; there is a close connection between RG-modules and representations of G over R.)³ For example, the vibration of a molecule is governed by various differential equations, and the symmetry group of a molecule acts on the space of solutions of these equations. Another example concerns electronic internal energy: the kinds of energy levels and degeneracies that a molecule may have are determined by the symmetry of the molecule. Symmetry conditions alone will tell us what the qualitative features of a problem must be: how many energy states there are, and what interactions and transitions among them may occur under the action of a given field. The chemist qua mathematician (deploying group theory, representation theory, geometry, and analysis) can determine that transitions between certain quantized states (leading to absorption or emission of light) in the electronic or vibrational spectrum of a molecule are ‘allowed’ (and others are ‘forbidden’), but to learn how great their intensity will be requires experiment and calculation aided by computer.
A further example is molecular orbital theory, which seeks out solutions of the Schrödinger equation by approximating them as ‘molecular orbitals,’ linear combinations of atomic orbitals that extend over the entire molecule, so that the electrons occupying these orbitals may be delocalized over the entire molecule. Considerations of molecular symmetry properties are extremely useful in this theory and allow the chemist
³ James and Liebeck, Representations and Characters of Groups, ch. 4, 5, and 30; see also K. Mainzer, Symmetries of Nature: A Handbook for Philosophy of Nature and Science (Berlin and New York: Walter de Gruyter, 1996), ch. 2 and 4.
to draw many conclusions about bonding with few or highly simplified quantum computations.⁴ In the case study, I examine an instance of the latter kind of problem solving. It is the study of benzene, C₆H₆ (and a family of related carbocyclic molecules, C₄H₄, C₈H₈, C₁₀H₁₀, and so on) by means of a theory that competes with the valence bond theory: the molecular orbital theory, which analyzes molecular orbitals as linear combinations of atomic orbitals. I borrow my account from textbooks as well as from some of the original chemical articles (the interaction of the mathematics and empirical results is tidier in the former and messier in the latter) and show how different traditions of ‘paper tools,’ some mathematical and some chemical, are composed or superimposed on the same page. So that the meaning of the symbols and icons will be clearer, I will briefly explain some aspects of representation theory, and the real and complex analysis used by quantum mechanics. As is well known, group theory was articulated in the work of Felix Klein in his inaugural address at Erlangen in 1872, and his research project that became known as the Erlanger Programm. It took its inspiration from the study of symmetries in geometry, and classified geometries in terms of those properties of figures that remain invariant under a particular group of transformations. Thus, Euclidean geometry is characterized by the group of rigid transformations (rotations and translations) of the plane, and projective geometry by the group of projective transformations. These infinite groups contain finite groups that can be associated with the symmetries of individual figures, like the square or equilateral triangle.⁵ Modern textbook expositions of group theory include very little in the way of iconic representations of the geometrical figures that gave rise to the theory in the first place.
A Survey of Modern Algebra by Garrett Birkhoff and Saunders MacLane⁶ makes use of a picture of a square, a regular hexagonal network, and a rectangle to introduce group concepts, but the chapter on group theory in I. N. Herstein’s Topics in Algebra⁷ has no figures at all and even avoids geometrical examples in the problem sets. Representations and Characters of Groups by Gordon James and Martin Liebeck contains only one ⁴ F. A. Cotton, Chemical Applications of Group Theory, 3rd edn (New York: John Wiley and Sons, 1990), part II (Applications). ⁵ Mainzer, Symmetries of Nature, ch. 2, sec. 2.31. ⁶ (New York: MacMillan, 1953). ⁷ (Waltham, MA, and Toronto: Xerox College Publishing / Ginn and Company, 1964).
figure (a regular n-sided polygon, only schematically indicated) in the first 29 of its 30 chapters. Geometrical figures finally appear in the last chapter, which is devoted to the application of representation theory to chemistry! And indeed, Chemical Applications of Group Theory by F. Albert Cotton (a popular textbook whose many editions testify to its continued utility in chemistry) contains more geometrical figures than all the textbooks put together that I used as a student of mathematics at the University of Chicago forty years ago when Bourbaki held sway. But of course the geometry is reorganized because the geometrical figures have been reinstated in order to represent various symmetries and spatial configurations pertinent to molecules; what Euclid sought in his triangles, hexagons, and tetrahedra was quite different from what these authors seek in theirs, though like Descartes (see Chapter 6) they must presuppose and use Euclid’s results. Only five types of symmetry elements and operations need to be considered in the study of the molecular symmetry of individual molecules. A symmetry plane passes through a molecule; if, when one drops a perpendicular line from each atom in the molecule to the plane, extends that line an equal distance in the other direction, and moves the atom to the other end of the line, an equivalent configuration results, that plane is a symmetry plane. The operation of reflection, which takes every (x, y, z) to (x, y, −z) given an appropriate choice of coordinate system, is called σ, and the identity operation is designated E. Clearly, $\sigma^n = E$ when n is even and $\sigma^n = \sigma$ when n is odd. A regular tetrahedral molecule AB₄ possesses six planes of symmetry; a regular octahedral molecule AB₆ possesses nine.
If the mapping (x, y, z) to (−x, −y, −z) (again with an appropriate choice of coordinates) takes the molecule into an equivalent configuration, then it has an inversion center and the operation is inversion, designated i. Once again, $i^n = E$ when n is even and $i^n = i$ when n is odd. Note that octahedral AB₆ and planar AB₄ each have a center of inversion, but tetrahedral AB₄ does not. If, when a molecule rotates about an axis through an angle of 2π/n radians, it moves into an equivalent configuration, then the axis is called a proper axis and the operation of rotation a proper rotation: it is symbolized $C_n^m$, which means that the rotation by 2π/n has been carried out m times. The molecule H₂O possesses a single twofold axis; a regular tetrahedral molecule AB₄ possesses three twofold axes and four threefold axes. Improper axes and rotations also exist: they are the result of effecting first a proper rotation and then a reflection in a plane perpendicular to the rotation axis. These operations
are analogously designated $S_n^m$. It is clear that $C_n^n = E$ and that, for odd n, $S_n^n = \sigma$ while $S_n^{2n} = E$. Tetrahedral AB₄ possesses three $C_2$ axes, as just mentioned, and each of these is simultaneously an $S_4$ axis (see Figure 5.1).⁸
Figure 5.1. F. A. Cotton, Chemical Applications of Group Theory, Figure 3.2
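The identities among powers of the symmetry operations stated above can be checked numerically. The following is a minimal sketch under stated assumptions: the rotation axis is taken to be the z axis and the reflection plane the xy plane, and the helper names (`matmul`, `power`, `close`, `proper_rotation`, `improper_rotation`) are illustrative, not drawn from Cotton's text.

```python
import math

E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]             # identity
SIGMA = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]        # (x, y, z) -> (x, y, -z)
INVERSION = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]  # (x, y, z) -> (-x, -y, -z)


def matmul(a, b):
    """Product a.b, read right to left: apply b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def power(matrix, m):
    """Carry out the same operation m times in succession."""
    result = E
    for _ in range(m):
        result = matmul(matrix, result)
    return result


def close(a, b, tol=1e-9):
    """Compare entrywise, allowing for floating-point rounding."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


def proper_rotation(n):
    """C_n: rotation through 2*pi/n radians about the z axis (assumed axis)."""
    theta = 2 * math.pi / n
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def improper_rotation(n):
    """S_n: C_n followed by reflection in the plane perpendicular to the axis."""
    return matmul(SIGMA, proper_rotation(n))
```

With these definitions, `close(power(improper_rotation(3), 3), SIGMA)` and `close(power(improper_rotation(3), 6), E)` both hold, whereas for even n the n-th power of the improper rotation is already the identity, as `close(power(improper_rotation(4), 4), E)` confirms.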
All symmetry operations on a given molecule return the molecule to an equivalent state; if we think of the product of two such operations as the effecting of first one and then the other, written right to left (so that YX = Z means first X is applied and then Y), then clearly a complete set of symmetry operations for a particular molecule can be thought of as a closed algebraic structure, and indeed as a group that acts on an n-dimensional vector space: an RG-module. There is a nomenclature for the kinds of symmetry groups that chemists encounter among the molecules they study. (The symmetry groups pertinent to the molecules of interest to chemists are for the most part finite groups, though some are infinite, like the group of proper rotations of a linear molecule, and the linear and planar space ⁸ Cotton, Chemical Applications of Group Theory, ch. 3. I would like to thank F. Albert Cotton for granting me permission to follow the exposition in his textbook very closely, here and in the next section. I could not have made my philosophical points without this borrowing. I would also like to thank Roald Hoffmann for extensive comments on this chapter, and for introducing me to Cotton’s book via the link of Representation Theory, which is so important to Wiles’ and Ribet’s proof of Fermat’s Last Theorem and also the key to Cotton’s textbook.
groups of crystals, or one-, two-, and three-dimensional polymers, or extended metallic structures.) The special groups corresponding to linear molecules are designated $C_{\infty v}$ and $D_{\infty h}$; those corresponding to molecules with multiple higher-order axes are designated $T$, $T_h$, $T_d$, $O$, $O_h$, $I$, and $I_h$. Groups corresponding to molecules with no proper or improper rotation axes are designated $C_1$, $C_s$, $C_i$, and those with only $S_n$ (n even) axes $S_{2n}$. Finally, groups corresponding to molecules with a $C_n$ axis that is not a simple consequence of an $S_{2n}$ axis are designated $C_{nh}$, $C_{nv}$, $C_n$ and $D_{nh}$, $D_{nd}$, and $D_n$, depending on various conditions. For example, water (H₂O) belongs to the group $C_{2v}$; allene (C₃H₄) belongs to the group $D_{2d}$.⁹
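The assignment of water to its point group can be illustrated by testing which candidate operations carry the molecule into an equivalent configuration. This is a hedged sketch: the coordinates in `WATER` are rough illustrative values, the placement of the molecule in the yz plane with its twofold axis along z is an assumed convention, and the function names are hypothetical.

```python
# Illustrative coordinates (roughly in angstroms); only their symmetry matters.
# Assumed convention: molecule in the yz plane, twofold axis along z.
WATER = [
    ("O", (0.0, 0.0, 0.117)),
    ("H", (0.0, 0.757, -0.469)),
    ("H", (0.0, -0.757, -0.469)),
]

OPERATIONS = {
    "E": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "C2(z)": [[-1, 0, 0], [0, -1, 0], [0, 0, 1]],
    "sigma(xz)": [[1, 0, 0], [0, -1, 0], [0, 0, 1]],
    "sigma(yz)": [[-1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "i": [[-1, 0, 0], [0, -1, 0], [0, 0, -1]],
    "sigma(xy)": [[1, 0, 0], [0, 1, 0], [0, 0, -1]],
}


def apply(matrix, point):
    """Multiply the 3 x 3 matrix into the column vector (x, y, z)."""
    return tuple(sum(matrix[r][c] * point[c] for c in range(3))
                 for r in range(3))


def same_point(p, q, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(p, q))


def is_symmetry(matrix, atoms):
    """True if every atom is carried onto an atom of the same element."""
    for element, position in atoms:
        image = apply(matrix, position)
        if not any(element == other and same_point(image, other_position)
                   for other, other_position in atoms):
            return False
    return True


def symmetry_operations(atoms):
    """Names of the candidate operations that leave the molecule equivalent."""
    return [name for name, matrix in OPERATIONS.items()
            if is_symmetry(matrix, atoms)]
```

Among the six candidates, only the identity, the twofold rotation, and the two vertical reflections survive; inversion and reflection in the horizontal plane move the oxygen atom to a place where no atom sits.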
5.4. Symmetry Groups, Representations, and Character Tables
As just noted, the convention has been established to employ five types of operations in describing the symmetry of a molecule: $E$, $\sigma$, $i$, $C_n$, $S_n$ (along with various subscripts, superscripts, and primes, details not pertinent here). Each of these operations can be associated, for example, with a 3 × 3 matrix (multiplying a vector describing an arbitrary point (x, y, z)) in a straightforward way. The identity operation E may be expressed as
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
and the inversion operation i as
\[ \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]
while the matrix for the reflection operation σ depends on which Cartesian plane we choose as the plane of reflection. Appropriate 3 × 3 matrices for the rotations $C_n$ and $S_n$ can easily be constructed if we recall that the matrix for clockwise rotation around one axis through an angle θ can be expressed as
\[ \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]
⁹ Cotton, Chemical Applications of Group Theory, ch. 3, sec. 3.11.
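and for counter-clockwise rotation as
\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
All of these matrices are invertible: their inverses are obtained by transposing rows and columns. Thus, a possible representation of the group $C_{2v}$ (the point group of water, H₂O), which contains the four operations $E$, $C_2$, $\sigma_v$, and $\sigma_v'$, will be the following group of matrices:
\[
E: \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad
C_2: \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad
\sigma_v: \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad
\sigma_v': \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]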
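That these four matrices form a group under multiplication, and that each is inverted by transposition, can be checked mechanically. The sketch below is illustrative only; the dictionary keys and helper names are assumptions of the example.

```python
# The four diagonal matrices given above for E, C2, sigma_v, and sigma_v'.
C2V = {
    "E": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "C2": [[-1, 0, 0], [0, -1, 0], [0, 0, 1]],
    "sigma_v": [[1, 0, 0], [0, -1, 0], [0, 0, 1]],
    "sigma_v'": [[-1, 0, 0], [0, 1, 0], [0, 0, 1]],
}


def matmul(a, b):
    """Product a.b, read right to left: apply b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def name_of(matrix):
    """Find which operation a matrix stands for; fail if the set is not closed."""
    for name, candidate in C2V.items():
        if candidate == matrix:
            return name
    raise ValueError("product falls outside the set of four matrices")


def multiplication_table():
    """Map each ordered pair (Y, X) to the name of the product YX."""
    return {(y, x): name_of(matmul(C2V[y], C2V[x])) for y in C2V for x in C2V}
```

Calling `multiplication_table()` raises no error, so the set is closed; every operation squares to `E`, and the product of `C2` with `sigma_v` is `sigma_v'`, which is the multiplication table one expects for this group.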
It is straightforward to check that the group multiplication table for the four operations maps in the right way to the multiplication of the matrices; and each operation and its inverse map in the right way to a matrix and its inverse. The representation we have just constructed takes each element of the group C2ν to a certain 3 × 3 matrix, allowing us to interpret the action of C2ν on the vector space R3 in terms of matrix multiplication (of a 3 × 1 matrix by a 3 × 3 matrix), and constitutes RG as a module.¹⁰ But this representation is only one of an unlimited number of possible representations that use n × n matrices for some n, depending on how many dimensions we need to characterize the molecule in a given situation. In cases where we want to characterize the molecular orbital in terms of n atomic orbitals (or, in terms of certain kinds of linear combinations of atomic orbitals), the matrices may be very big. Of the many possible representations of a given symmetry group, those that are of fundamental importance, however, are called ‘irreducible.’ An irreducible representation is an RG-module that is non-zero and has no RG-submodules apart from {0} and itself; ρ : G → GL (n, R) (the n × n matrices with entries in R) is irreducible if the corresponding RG-module is irreducible. Any reducible representation (a set of n × n matrices) can be rewritten in terms of irreducible representations. The canonical representation will be a set of m × m matrices all block-factored along the diagonal in the ¹⁰ Cotton, Chemical Applications of Group Theory, ch. 4, sec. 4.2.
same way, where corresponding blocks are irreducible representations of possibly varying size and with possible repetitions. Thus if for a given symmetry group there are four one-dimensional irreducible representations and one two-dimensional irreducible representation, any reducible n × n representation of that group can be mapped by a similarity transformation to the canonical representation, a set of 6 × 6 matrices with zeroes everywhere except for four appropriately chosen 1 × 1 matrices (just numbers) and one 2 × 2 matrix along the diagonals of each. The canonical representation is thus itself reducible, but exhibits the irreducible representations in what we might call iconic fashion. The role played by canonical forms in symbolic systems is often to exhibit structure in an especially perspicuous way; and this notion of perspicuity has little to do with physical vision. It has to do rather with the way in which complex things (here, reducible sets of matrices) may be analyzed in terms of simpler things (irreducible sets of matrices). Canonical forms exhibit the results of analysis; and this display may be called, in an abstract sense, iconic. The trace of an n × n matrix (the function from the set of n × n matrices to R indicated by tr) is the sum of its diagonal entries. The function χ : G → R, called the character of a representation ρ : G → GL(n, R), is defined in terms of the function tr. Independent of the choice of basis for the vector space, it is defined as the character of the corresponding RG-module, χ(g) = tr(gρ) for all g in G; we then say that χ is an irreducible character of the group G if χ is the character of an irreducible RG-module.
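The claim that the character does not depend on the choice of basis can be illustrated with the 3 × 3 matrices given earlier for the point group of water. In this sketch the change-of-basis matrix `P` (with its inverse `P_INV`) is an arbitrary choice made for the example, and the helper names are hypothetical.

```python
# The 3 x 3 matrices given earlier for the four operations of the water group.
C2V = {
    "E": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "C2": [[-1, 0, 0], [0, -1, 0], [0, 0, 1]],
    "sigma_v": [[1, 0, 0], [0, -1, 0], [0, 0, 1]],
    "sigma_v'": [[-1, 0, 0], [0, 1, 0], [0, 0, 1]],
}

# An arbitrary invertible matrix and its inverse (illustrative choice).
P = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
P_INV = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(matrix):
    """Sum of the diagonal entries."""
    return sum(matrix[k][k] for k in range(len(matrix)))


def change_basis(matrix):
    """Similarity transformation: the same operation written in another basis."""
    return matmul(matmul(P_INV, matrix), P)


def characters(representation):
    """The character assigns to each operation the trace of its matrix."""
    return {name: trace(matrix) for name, matrix in representation.items()}
```

After the change of basis the matrices are no longer diagonal, yet `characters` returns the same values (3, −1, 1, 1) for both sets: the character records what survives every similarity transformation.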
Chemists construct ‘character tables’ that exhibit the characters of each symmetry operation of a given group (like C2ν above) at each irreducible representation, which can be used in light of the following remarkable theorems: The sum of the squares of the dimensions of the irreducible representations of a group is equal to the order of the group (how many distinct elements it has), and so is the sum of the squares of the characters in any one irreducible representation of the group. The number of irreducible representations of a group is equal to the number of distinct classes in the group. Finally, if you think of the characters of two distinct irreducible representations as components of two vectors, those vectors are orthogonal. The canonicity of irreducible representations means that a great deal of important information can be read directly out of or into character tables.
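These theorems can be verified directly on the standard character table of the group of water, which the text works out below. The sketch assumes, as holds for this group, that every class contains a single operation; for groups with larger classes each term would have to be weighted by the size of its class. The names used are illustrative.

```python
# Characters listed in the order E, C2, sigma_v, sigma_v'.
CHARACTER_TABLE = {
    "A1": (1, 1, 1, 1),
    "A2": (1, 1, -1, -1),
    "B1": (1, -1, 1, -1),
    "B2": (1, -1, -1, 1),
}
GROUP_ORDER = 4    # four distinct operations
CLASS_COUNT = 4    # each operation is a class by itself in this group


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dimension(irrep):
    """The character of the identity E is the dimension of the representation."""
    return CHARACTER_TABLE[irrep][0]


def check_theorems():
    rows = list(CHARACTER_TABLE.values())
    return {
        "dimensions squared sum to order":
            sum(dimension(name) ** 2 for name in CHARACTER_TABLE) == GROUP_ORDER,
        "characters squared sum to order":
            all(dot(row, row) == GROUP_ORDER for row in rows),
        "as many irreducibles as classes":
            len(CHARACTER_TABLE) == CLASS_COUNT,
        "distinct rows orthogonal":
            all(dot(rows[i], rows[j]) == 0
                for i in range(len(rows)) for j in range(i + 1, len(rows))),
    }
```

Every entry of `check_theorems()` comes out true for this table, which is a small instance of how much can be read directly out of the characters.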
Symmetry considerations and the theory of group representations immensely reduce the labor required for effective calculation of the properties of molecules, as we will see in the next section. A change in the geometry of a molecule results in significant physical alterations: things that did not interact before suddenly begin to interact, and these changes are reflected, or predicted, in the ‘register’ of the irreducible representations, which of course shift with the geometry. The antireductionist drift of this observation can be made even more strongly, as Robert Bishop does in his essay ‘Patching Physics and Chemistry Together.’ He writes, ‘Molecular structure or shape plays a crucial causal role in chemistry. It dominates the interpretations of the calculations and experiments of chemists. More importantly, it has empirical and practical import.’ Citing various examples where the causal import of the shape and in particular the chirality of molecules is paramount, he goes on to remind us that ‘the ‘‘true’’ molecular Hamiltonian at the level of quantum mechanics—could we actually write it [its solutions] down—would not exhibit any features corresponding to molecular structure ... ’ Thus, the requirement of a variety of modes of representation in applications of quantum mechanics to chemistry is neither a superficial demand nor an accident of history: it expresses a fundamental differentiation (both epistemological and ontological) between the two sciences.¹¹ The textbook Chemical Applications of Group Theory comes equipped with ‘Character Tables for Chemically Important Symmetry Groups’ twice: once as an appendix and once reprinted as a handy booklet in the pocket of the cover. Most applications worked out in detail in the book (with the exception of the treatment of crystals) involve a character table at some point. Here is a simple example. Recall that the symmetry group for water is C2ν ; it has four elements, E, C2 , σν , and σν . 
Each element is in a separate conjugacy class of the group, and since (as noted above) the number of irreducible representations of a group is equal to the number of conjugacy classes in the group, there are four irreducible representations for this group. Moreover, since (as noted above) the sum of the squares of the dimensions of the irreducible representations of a group is equal to the order of the group, these four irreducible representations must all be one-dimensional. The character table for the group can be easily worked out and is:

C2v    E     C2    σv    σv′
A1     1     1     1     1
A2     1     1    −1    −1
B1     1    −1     1    −1
B2     1    −1    −1     1

¹¹ R. C. Bishop, ‘Patching Physics and Chemistry Together,’ Philosophy of Science, 72(5) (2006).
The nomenclature in the leftmost column is a set of so-called Mulliken symbols: one-dimensional representations are designated A or B, depending on whether they are symmetric or antisymmetric with respect to rotation by 2π/n around the principal Cn axis; the subscripts 1 and 2 indicate that the representation is symmetric or antisymmetric with respect to a C2 axis perpendicular to the principal axis or to a vertical plane of symmetry. Two-dimensional representations (there aren’t any in this case) are designated E, with a further set of subscripts and superscripts to distinguish kinds of two-dimensional representations; three-dimensional representations are designated T with appropriate subscripts and superscripts, and so forth. Characters are used to facilitate the reduction of a reducible representation, that is, they can be used to find, or circumvent the need to find, the similarity transformation which will reduce each matrix of a reducible representation to another belonging to the set of matrices in canonical form, with mostly zero entries and block-factored in the appropriate way along the diagonal. These tables are thus abridgments of an enormous amount of information from geometry, group theory, matrix algebra, and representation theory in relation to the field of the real (and sometimes complex) numbers.
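The way characters ‘circumvent the need to find’ the similarity transformation can be sketched with the standard reduction formula, in which the number of times an irreducible representation occurs is the sum over group elements of the product of characters, divided by the order of the group. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the reducible representation used as an example (the two hydrogen 1s orbitals of water, with the molecular plane taken to be σv′) is an illustration supplied here, not an example from the text.

```python
from fractions import Fraction

# Character table of C2v as given in the text; columns follow the
# order of the operations E, C2, sigma_v, sigma_v'.
C2V = {
    "A1": (1, 1, 1, 1),
    "A2": (1, 1, -1, -1),
    "B1": (1, -1, 1, -1),
    "B2": (1, -1, -1, 1),
}
ORDER = 4  # four group elements, each in its own conjugacy class


def reduce_representation(chars):
    """Return how often each irreducible representation occurs in `chars`.

    Uses the reduction formula a_i = (1/h) * sum_R chi(R) * chi_i(R),
    which is valid here because all characters are real.
    """
    return {
        name: Fraction(sum(c * x for c, x in zip(chars, irrep)), ORDER)
        for name, irrep in C2V.items()
    }


def rows_orthogonal():
    """Check that distinct rows are orthogonal and each row has norm h."""
    names = list(C2V)
    return all(
        sum(a * b for a, b in zip(C2V[p], C2V[q])) == (ORDER if p == q else 0)
        for p in names
        for q in names
    )


# Hypothetical example (not from the text): characters of the
# representation spanned by the two hydrogen 1s orbitals of water,
# assuming the molecular plane is the sigma_v' plane.
H_1S_PAIR = (2, 0, 0, 2)

if __name__ == "__main__":
    print(rows_orthogonal())
    print(reduce_representation(H_1S_PAIR))
```

Under the assumed axis convention, the pair of hydrogen orbitals reduces to one A1 and one B2 combination, and no matrix has to be block-diagonalized by hand to see this.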
5.5. The Benzene Ring and Carbocyclic Systems

I come now at last to the case study. The investigation of benzene and related carbocyclic molecules by means of LCAO-MO theory, where molecular orbitals are conjectured in terms of linear combinations of atomic orbitals, is treated in Chapter 7 of Cotton’s Chemical Applications of Group Theory. The exposition is based on a series of chemical articles by Robert Mulliken, Robert Parr, F. Hund, Erich Hückel, and others beginning in the 1930s and continuing into the 1960s when the first edition of Cotton’s textbook appeared. The leading competitor theory was valence bond theory, developed in the late 1920s by Walther Heitler and Fritz London, and then by John Slater and Linus Pauling. Valence bond theory actually provided the first quantum mechanical description of the benzene molecule, and in the 1930s fared better than the LCAO-MO explanation provided by E. Hückel, thanks mainly to the charismatic presentation offered by Pauling. But later the tide turned; valence bond theory was not easily extended to molecules larger than benzene, and LCAO-MO theory made important, speculative predictions that were then verified.

Apropos the benzene molecule, the LCAO-MO theory entails that the six p orbitals (perpendicular to the plane of the molecule) interact by lateral overlap to give six molecular orbitals, three bonding and three anti-bonding; taken together, they constitute the ‘extended π bond.’ According to the ‘pi system’ and Pauli exclusion principle, the six pertinent electrons will fill the three bonding orbitals, each of which is delocalized over all six carbon atoms.

The application of LCAO-MO theory to benzene is an especially nice example for my purposes, because it clearly exhibits the scientific importance and fruitfulness of the reduction under consideration, while illustrating the theses I am trying to develop about symbolic and iconic representations. The representations used here are obviously highly symbolic; numbers, differential equations, and constructed orbitals are not ‘like’ and do not picture molecules. Yet these symbolic representations must be used in tandem with iconic representations (geometric figures, in this case a hexagon) which are essential to their use, given the causal importance of shape, and which require a context in natural language to explain the rational relation between icons and symbols.
Moreover, the symbolic expressions themselves have an iconic dimension, in a sense that need not involve the visual-perceptual, as when canonical expressions exhibit the results of analysis. The use of molecular orbital theory, using the LCAO method, requires character tables; differential equations (wave equations) involving the Hamiltonian operator, the energy of the system and pertinent wave functions; geometrical forms augmented to represent molecules and their orbitals; and ‘theoretical’ energy level diagrams: the chemist strives to bring the theoretical computations into relation with tables of empirical results produced in lab.
To bring quantum mechanics to bear on chemistry, in order to solve certain especially puzzling problems, atoms and molecules must be thought of as functions that are solutions to certain differential equations. Quantum mechanics calls such functions ‘orbitals’, and much ink has been spilled over how orbitals are to be interpreted physically; in broad terms, they indicate where electrons may possibly be located given a certain energy state of an atom. A set of diagrams of increasingly energetic states of the hydrogen atom (one proton, one electron), from a 1931 article in a physics journal, shows the orbitals of the atom (see Figure 5.2). These images should be read as three-dimensional clouds surrounding the nucleus of the atom; note that they exhibit determinate symmetries, even though the 1966 textbook in which they are reproduced warns in italics:

the shapes of orbitals (electron cloud diagrams) ... are pictorial presentations of the mathematical solutions of the Schrödinger equation. They do not represent reality; the shapes are not pictures of electric charges or matter. Also, in wave mechanics, the electron may be regarded as neither particle nor wave. It is an indescribable entity whose properties are (presently) best elucidated by the Schrödinger equation.¹²
¹² The image is taken from H. E. White, Physical Review, 37 (1931), 1416, reproduced in F. Brescia, J. Arents, H. Meislich, A. Turk, Fundamentals of Chemistry: A Modern Introduction, (1966), 184, next to the quotation on 184–6.

Figure 5.2. F. Brescia et al., Fundamentals of Chemistry: A Modern Introduction, Figure 8.20

The textbook claims that the shape of the orbitals doesn’t represent reality. The arguments presented by Robert Bishop, cited briefly above, suggest that while ‘shape’ cannot be properly attributed to orbitals, it may be properly attributed to molecules; and this only shows that the representations of chemistry must be given not only in terms of differential equations but also in terms of geometry and (more broadly) topology. Important structural features of molecules, which may be truly attributed to them, are lost when they are represented by differential equations. Another difficulty also arises: the application of quantum mechanics to chemistry is vexed because the differential equations, which are hard enough to solve in the case of the simplest atom, hydrogen, become rapidly intractable as one moves up the periodic table, even given recent advances in computer technology; and also as one considers the combination of different kinds of atoms in a molecule, trying to determine what the molecular orbitals might be. A good strategy for simplifying the computations involved is to consider the symmetries of a molecule (taking into account its component atoms) as constraints on the solutions of such problems, for these symmetries allow one to conclude expeditiously that certain integrals in the calculations must be identically zero and to exhibit other integrals in an especially perspicuous form. Despite the fervid disclaimer in the 1966 textbook, orbitals apparently do belong to configurations that are to a certain extent well-depicted by geometrical figures and their associated symmetry groups and irreducible representations, as well as by differential equations. Indeed, if orbitals were really ‘indescribable,’ scientists would have a hard time investigating them. However, to be well described, they require a multiplicity of modes of representation working in tandem.
The central equation of quantum mechanics is the wave equation

$$H\psi = E\psi \qquad \text{(Eq. 5.1)}$$

which states that if the Hamiltonian operator $H$ is applied to an eigenfunction $\psi$, the result will be that same function multiplied by a constant called an eigenvalue, $E$.¹³ We can initially think of $\psi_i$ as an atomic orbital and the corresponding $E_i$ as its energy. If several eigenfunctions give the same eigenvalue, the eigenvalue is called ‘degenerate,’ as when three 2p orbitals in an atom have equal energy; in this case, any linear combination of the initial set of eigenfunctions is a correct solution of the wave equation giving the same eigenvalue. If a symmetry operation (like E, C2, σv, or σv′ discussed above) is carried out on a system, clearly the Hamiltonian and the energy of the system must remain unchanged (since the system before is physically indistinguishable from the system after); thus the Hamiltonian operator commutes with any symmetry operation, as well as with any constant factor c. Eigenfunctions are always constructed to be orthonormal, that is, orthogonal to each other and one ‘unit’ in length, so that the integral of the product of a pair of eigenfunctions $\psi_i$ and $\psi_j$ is identically 0 if $i \neq j$ and 1 if $i = j$. The eigenfunctions of a molecule, which are linear combinations of the atomic orbitals, are bases for irreducible representations of the symmetry group to which the molecule belongs. In the case study at hand, the chemist takes the atomic orbitals of a molecule as a set of orthonormal functions, and seeks to make orthonormal linear combinations of them so that the combinations form bases for irreducible representations of the symmetry group of the molecule, corresponding to the molecular orbitals and so to the energy levels of the molecule. These ‘symmetry adapted’ linear combinations of atomic orbitals (SALCs) are acceptable formal solutions to the wave equation of the molecule, which must then however be cashed out in order to calculate the energies.
Once we can write $\psi_k = \sum_i c_{ik}\phi_i$ (denoting the kth molecular orbital by a linear combination of atomic orbitals), this expression can be plugged into the wave equation written as $(H - E)\psi_k = 0$, and the left side integrated over all the spatial coordinates of the wave function. This results, however, in a rather daunting series of equations involving certain integrals called ‘matrix elements’ (described below) which are re-written as a system of homogeneous linear equations for which nontrivial solutions are sought. This process is sometimes streamlined by approximations such as the Hückel approximation, which makes certain simplifying assumptions about the integrals, or matrix elements, in the wave equation. The Hückel approximation posits that all the $H_{ij} = \int \phi_i H \phi_j \, d\tau$, which record the energies of interaction between pairs of atomic orbitals, as well as all the overlap integrals, $S_{ij} = \int \phi_i \phi_j \, d\tau$, which in a sense constitute the metric of the basis set, can be assumed to be 0 unless the ith and jth orbitals are on adjacent atoms. ($H_{ii} = \int \phi_i H \phi_i \, d\tau$ gives the energy of the atomic orbital $\phi_i$. $H_{ii}$ is conventionally abbreviated $\alpha$; $H_{ij}$ is conventionally abbreviated $\beta$.) The Hückel approximation, in its attempt to reduce complexity as much as possible, even assumes that the $S_{ij}$ between nearest neighbors is zero. Because the Hückel approximation sets so many integrals equal to zero, the difficulty of computation is greatly reduced. Symmetry considerations further simplify the computations.

¹³ The equation given here is time independent. Robert Bishop notes that ‘the fundamental equation of quantum mechanics is Schrödinger’s time dependent equation; in quantum chemistry, the time independent form is the equation of interest. It only holds when $\psi$ is in a stationary state, but that is the case of most interest to chemists’ (private correspondence).

In the case study at hand, the problem is to describe the internal structure of the planar, hexagonal molecule benzene (C6H6) in terms of the internal bonding of its atoms and in relation to empirical evidence about energy levels or states of the molecule; and this problem can be generalized to include the investigation of other carbocyclic molecules, CnHn. Despite a certain revival of valence bond theory in recent times, the present consensus is that, in the case of benzene and related carbocyclic systems, molecular orbital theory provides greater insight into molecular structure than does the valence bond theory.
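The simplifying assumptions of the Hückel approximation described above can be written out for the six pπ orbitals of benzene. The following LaTeX sketch assumes $H_{ii} = \alpha$, $H_{ij} = \beta$ for orbitals on adjacent atoms and 0 otherwise, and $S_{ij} = \delta_{ij}$; the auxiliary variable $x$ is introduced here for convenience and does not appear in the text.

```latex
% Sketch of the Hückel secular problem for the six p-pi orbitals of benzene.
% Assumptions: H_ii = alpha, H_ij = beta for adjacent atoms (0 otherwise),
% S_ij = delta_ij; x is an auxiliary symbol introduced for this sketch.
\[
\sum_{i=1}^{6} c_{ik}\,\bigl(H_{ij} - E\,S_{ij}\bigr) = 0
\qquad (j = 1,\dots,6)
\]
% Nontrivial solutions exist only if the secular determinant vanishes:
\[
\begin{vmatrix}
x & 1 & 0 & 0 & 0 & 1\\
1 & x & 1 & 0 & 0 & 0\\
0 & 1 & x & 1 & 0 & 0\\
0 & 0 & 1 & x & 1 & 0\\
0 & 0 & 0 & 1 & x & 1\\
1 & 0 & 0 & 0 & 1 & x
\end{vmatrix} = 0,
\qquad x = \frac{\alpha - E}{\beta}
\]
% Roots of the determinant and the corresponding energies E = alpha - x*beta:
\[
x = -2,\,-1,\,-1,\,+1,\,+1,\,+2
\;\Longrightarrow\;
E = \alpha + 2\beta,\;\; \alpha + \beta\ (\text{twice}),\;\;
\alpha - \beta\ (\text{twice}),\;\; \alpha - 2\beta
\]
```

Only two parameters survive the approximation, which is why the symmetry of the ring alone is enough to fix the pattern of levels.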
Molecular orbital theory begins, as we have just noted, with the atomic orbital wave functions, and then uses the symmetry constraints imposed by the configuration of the molecule as a whole to determine how the atomic orbitals combine to make the molecular orbital wave functions. Mathematically, the theory makes use of certain projection operators to construct the requisite ‘symmetry-adapted’ linear combinations of atomic orbitals, each of which forms a basis for an irreducible representation of the symmetry group of the molecule, corresponding to a molecular orbital. In order to calculate the energies of the molecular orbitals, the chemist must solve the wave equation for that particular LCAO; the Hückel approximation, which as we just saw employs certain simplifying assumptions about integrals in the wave equation, is a useful device for simplifying that calculation. Each molecular orbital then corresponds to a line on an energy level diagram.

In the analysis of benzene, the essential atomic orbitals are the six so-called pπ orbitals, perpendicular to the plane of the molecule at every vertex of the hexagon. All the essential symmetry properties of the LCAOs sought are determined by the operations of the uniaxial rotational subgroup C6 (see Figure 5.3).¹⁴ When the set of six pπ orbitals (one on each carbon atom) is taken as the basis for a representation of the group C6, this character table results:¹⁵

C6     E     C6    C3    C2    C3²   C6⁵
A      1     1     1     1     1     1
B      1    −1     1    −1     1    −1
E1     1     ε    −ε*   −1    −ε     ε*
       1     ε*   −ε    −1    −ε*    ε
E2     1    −ε*   −ε     1    −ε*   −ε
       1    −ε    −ε*    1    −ε    −ε*
φ      6     0     0     0     0     0

φ = A + B + E1 + E2

The reducible representation φ is decomposed into the irreducible representations A, B, E1, and E2. (These are Mulliken symbols, briefly described above.) This decomposition becomes A2u, B1g, E1g, and E2u in the group D6h when the full symmetry of the benzene molecule is studied, but the advantage of working with the subgroup is that the computations are greatly simplified and the generality of the resulting rule is not impugned. The character table produces, by means of an application of the projection operator technique which in this case is especially straightforward, another array in which each irreducible representation is associated with a wave function that is the sum of the n pπ orbitals, each multiplied by the nth character entry in that row.¹⁶

¹⁴ Brescia et al., Fundamentals of Chemistry, 289.
¹⁵ Cotton, Chemical Applications of Group Theory, 144.
¹⁶ Cotton, Chemical Applications of Group Theory, 145.
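The decomposition φ = A + B + E1 + E2 stated above can be checked numerically from the character table. This is a minimal Python sketch under two assumptions that are supplied here rather than stated in the text: that ε = exp(2πi/6), the usual convention for this table, and that the two complex rows of each E representation are counted separately (labelled E1a, E1b, E2a, E2b, names chosen for this sketch).

```python
import cmath

# Assumption: epsilon is the primitive sixth root of unity exp(2*pi*i/6).
EPS = cmath.exp(2j * cmath.pi / 6)
EPS_C = EPS.conjugate()

# Rows of the C6 character table; columns: E, C6, C3, C2, C3^2, C6^5.
C6_ROWS = {
    "A": (1, 1, 1, 1, 1, 1),
    "B": (1, -1, 1, -1, 1, -1),
    "E1a": (1, EPS, -EPS_C, -1, -EPS, EPS_C),
    "E1b": (1, EPS_C, -EPS, -1, -EPS_C, EPS),
    "E2a": (1, -EPS_C, -EPS, 1, -EPS_C, -EPS),
    "E2b": (1, -EPS, -EPS_C, 1, -EPS, -EPS_C),
}

# Characters of the reducible representation spanned by the six p-pi
# orbitals: only the identity leaves any orbital in place.
PHI = (6, 0, 0, 0, 0, 0)


def multiplicity(row, chars=PHI):
    """Number of times the one-dimensional row occurs in `chars`."""
    total = sum(complex(r).conjugate() * c for r, c in zip(row, chars))
    return round((total / 6).real)


def decompose(chars=PHI):
    """Multiplicity of every row of the table in the given characters."""
    return {name: multiplicity(row, chars) for name, row in C6_ROWS.items()}


if __name__ == "__main__":
    print(decompose())
```

Each row occurs exactly once, so each of the two complex rows of E1 and of E2 contributes one wave function, which accounts for all six combinations of the six atomic orbitals.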
Figure 5.3. F. Brescia et al., Fundamentals of Chemistry: A Modern Introduction, Figure 11.27

$$
\begin{aligned}
A &: \quad \psi_1 = \phi_1 + \phi_2 + \phi_3 + \phi_4 + \phi_5 + \phi_6 \\
B &: \quad \psi_2 = \phi_1 - \phi_2 + \phi_3 - \phi_4 + \phi_5 - \phi_6 \\
E_1 &: \quad \begin{cases}
\psi_3 = \phi_1 + \varepsilon\phi_2 - \varepsilon^{*}\phi_3 - \phi_4 - \varepsilon\phi_5 + \varepsilon^{*}\phi_6 \\
\psi_4 = \phi_1 + \varepsilon^{*}\phi_2 - \varepsilon\phi_3 - \phi_4 - \varepsilon^{*}\phi_5 + \varepsilon\phi_6
\end{cases} \\
E_2 &: \quad \begin{cases}
\psi_5 = \phi_1 - \varepsilon^{*}\phi_2 - \varepsilon\phi_3 + \phi_4 - \varepsilon^{*}\phi_5 - \varepsilon\phi_6 \\
\psi_6 = \phi_1 - \varepsilon\phi_2 - \varepsilon^{*}\phi_3 + \phi_4 - \varepsilon\phi_5 - \varepsilon^{*}\phi_6
\end{cases}
\end{aligned}
$$

This array is then rewritten to get rid of imaginary coefficients and to normalize the molecular orbital wave functions to unity, which in turn produces the array on the left hand page (see Figure 5.4a).¹⁷ Here we have a set of wave functions that we can regard as the molecular orbitals: they are linear combinations of the atomic orbitals, they are orthonormal (as planned), and they correspond to the six irreducible representations, and to the energy levels of the molecule. Their energies are calculated making use of the Hückel approximation, and the result is as follows; we recall that $H_{ii}$ is conventionally abbreviated $\alpha$ and $H_{ij}$ is conventionally abbreviated $\beta$.

$$
\begin{aligned}
E_A &= \tfrac{1}{6}(6\alpha + 12\beta) = \alpha + 2\beta \\
E_B &= \alpha - 2\beta \\
E_{E_1 a} &= E_{E_1 b} = \alpha + \beta \\
E_{E_2 a} &= E_{E_2 b} = \alpha - \beta
\end{aligned}
$$

By convention, $\alpha$ is taken to be the zero of energy and $\beta$ is taken to be the unit of energy. As we see on the right-hand page (see Figure 5.4b), these wave functions are correlated with a set of six hexagonal schemata that exhibits the molecular orbitals and their nodal planes; and an energy level diagram, which shows the six electrons located in the three bonding orbitals. Notice how the articulation of rows of numbers in the character table has been mapped onto the articulation of the lines of the energy level diagram (in a different sequence), via the array of the molecular orbital wave functions. The problem is solved by a consortium of modes of representation working in tandem.
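The passage from the array of wave functions to the energy levels can be sketched numerically. The following Python sketch assumes the convention just mentioned (α as the zero of energy, β as the unit), takes ε = exp(2πi/6), and uses the complex, unnormalized combinations directly rather than the real, normalized ones of Figure 5.4a; the function names are hypothetical.

```python
import cmath

N = 6
EPS = cmath.exp(2j * cmath.pi / N)  # assumed value of epsilon


def huckel_apply(coeffs):
    """Apply the Hückel matrix (alpha = 0, beta = 1) to ring coefficients.

    Each atom interacts only with its two neighbours on the ring.
    """
    n = len(coeffs)
    return [coeffs[(i - 1) % n] + coeffs[(i + 1) % n] for i in range(n)]


def salc(k):
    """Coefficients eps**(k*j) on atoms j = 0..5.

    k = 0, 1, 2, 3 reproduce psi_1, psi_3, psi_5 and psi_2 of the array.
    """
    return [EPS ** (k * j) for j in range(N)]


def energy(coeffs):
    """Return <psi|H|psi> / <psi|psi> in units of beta, with alpha = 0."""
    h_psi = huckel_apply(coeffs)
    numerator = sum(c.conjugate() * h for c, h in zip(coeffs, h_psi))
    norm = sum(abs(c) ** 2 for c in coeffs)
    return (numerator / norm).real


LEVELS = {
    "A": energy(salc(0)),
    "E1": energy(salc(1)),
    "E2": energy(salc(2)),
    "B": energy(salc(3)),
}

if __name__ == "__main__":
    for label, value in LEVELS.items():
        print(f"E_{label} = alpha + ({value:+.3f}) beta")
```

Because β is a negative quantity, the A level is the most strongly bonding and the B level the most strongly antibonding, which is the ordering shown in the energy level diagram.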
¹⁷ Cotton, Chemical Applications of Group Theory, 146–7.

Figure 5.4a. F. A. Cotton, Chemical Applications of Group Theory, 146

Figure 5.4b. F. A. Cotton, Chemical Applications of Group Theory, 147

5.6. Measuring Delocalization Energy in the Benzene Molecule

The delocalization of electrons, that is, the ability of electrons to spread out more in a molecule than in a single atom, drives down the energy of the system and so makes the molecule more stable. Valence bond theory notes that the experimentally measured energy of the system of a benzene molecule differs from the energy calculated for either of the two Kekulé structures, depicted as the middle pair in Figure 5.5;¹⁸ this difference is called the ‘resonance energy’ and it is valence bond theory’s way of measuring the stabilization of benzene. LCAO-MO theory instead assesses that stabilization by a ‘delocalization energy.’ One task for LCAO-MO theory is to verify that the delocalization energy it theoretically calculates for benzene accords with the results of experiment. In Cotton’s textbook presentation, an empirical procedure for making this comparison is briefly sketched and the author notes ‘the LCAO method is at least empirically valid.’ However, he also adds in a footnote, ‘See Appendix III for some important qualifications concerning the evaluation of β.’ Appendix III is entitled ‘Some Remarks about the Resonance Integral,’ and explains why ‘the seemingly straightforward and obvious method for evaluating the integral β’ is not so straightforward after all; examination of the scientific articles that lie behind this qualification confirms the point.¹⁹

¹⁸ Cotton, Chemical Applications of Group Theory, 158.

Figure 5.5. F. A. Cotton, Chemical Applications of Group Theory, unnumbered figure, 158 (an excerpt headed ‘The 4n + 2 Rule,’ showing the valence bond canonical forms for C4H4, C6H6, and C8H8)
In the suite of articles that address themselves to this project, we see that the process of bringing the theoretical model in line with empirical data does not simply serve to validate the theory; nor is it a suspect attempt to glue ‘ad hoc hypotheses’ onto LCAO-MO theory in order to reproduce the good correspondence between the valence bond model and empirical results. Rather, using LCAO-MO theory in tandem with the differential equations of quantum mechanics, chemists generate new hypotheses and extensive articulation of the model. That is, what counts as the LCAO-MO model of the benzene molecule is rather plastic, and the modifications it undergoes in order to accord with empirical data are substantive and add content to the theory.

¹⁹ Robin Hendry analyzes this debate in section 4 of his ‘The Physicists, the Chemists, and the Pragmatics of Explanation,’ forthcoming. He argues that John Clarke Slater, trained as a physicist, and Charles Coulson, trained as a chemist, approached the application of quantum mechanics to the study of molecular structure differently because they held different explanatory ideals. For the former, it was more important to get the fundamentals right in terms of physics; for the latter, it was more important to provide workable models of chemically interesting molecules. Like Bishop, quoted above, Hendry emphasizes the autonomy and explanatory success of chemical structure theory.
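Before turning to the detailed calculations, the notion of delocalization energy can be illustrated at the level of the simple Hückel picture of Section 5.5. The Python sketch below is an illustration under stated assumptions, not the antisymmetrized-product computation of Mulliken and Parr: it takes α as the zero and β as the unit of energy, uses the standard Hückel ring levels α + 2β cos(2πk/n) (a formula supplied here, not given in the text), and models a Kekulé structure as three isolated ethylene-like π bonds with levels α ± β.

```python
import math


def ring_levels(n):
    """Hückel levels of a planar C_nH_n ring in units of beta (alpha = 0).

    Assumes the standard result 2*cos(2*pi*k/n); sorted with the most
    bonding level first (beta itself is a negative quantity).
    """
    return sorted(
        (2 * math.cos(2 * math.pi * k / n) for k in range(n)), reverse=True
    )


def pi_energy(levels, electrons):
    """Fill the levels two electrons at a time; return the total in beta."""
    total, remaining = 0.0, electrons
    for level in levels:
        occupancy = min(2, remaining)
        total += occupancy * level
        remaining -= occupancy
    return total


# Six pi electrons delocalized over the ring.
BENZENE = pi_energy(ring_levels(6), 6)

# Hypothetical localized reference: three ethylene-like two-center bonds,
# each with a bonding level at +1 and an antibonding level at -1.
KEKULE = 3 * pi_energy([1.0, -1.0], 2)

# Difference between the delocalized and the localized description.
DELOCALIZATION = BENZENE - KEKULE

if __name__ == "__main__":
    print(f"benzene: {BENZENE:.3f} beta, Kekule reference: {KEKULE:.3f} beta")
    print(f"delocalization energy: {DELOCALIZATION:.3f} beta")
```

On these assumptions the stabilization comes out as two units of β, so any comparison with experiment turns on how β itself is evaluated, which is the difficulty flagged in Cotton’s Appendix III.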
An important paper published by R. S. Mulliken and R. G. Parr in The Journal of Chemical Physics, ‘LCAO Molecular Orbital Computation of Resonance Energies of Benzene and Butadiene, with General Analysis of Theoretical versus Thermochemical Resonance Energies,’ rehearses the earlier stages of the process of getting the model to answer to the data.²⁰ In 1938, M. Goeppert-Mayer and A. L. Sklar used the LCAO approximation to calculate the lower excited levels of benzene; ‘using no other empirical data than the carbon-carbon distance and considering the six π-electrons alone, they obtained excitation energies which agreed fairly well with experiment.’²¹ This result was, however, corrected a number of times over the course of a decade because of initially neglected terms and incorrect values for several integrals; in a later article by Parr, Craig, and Ross²² the theoretical predictions of valence bond theory seem to accord better with experiment. Apparently, however, the calculations of the energies of MO’s using LCAO approximation were good enough to encourage Mulliken and Parr to use them to compute resonance (or rather, delocalization) energies.

The Mulliken and Parr article begins with an equation:²³

$$W_R = W_K - W_N$$

The authors observe that in valence bond theory, this means that the ‘resonance energy’ $W_R$ of benzene is the difference between the energy $W_K$ of a single Kekulé valence bond wave function $\psi_K$ and the actual ground-state energy $W_N$. But in LCAO-MO theory, it means something different.
They write,

In the LCAO MO method, following Hückel, conjugation or resonance energy in olefinic or aromatic molecules may be conceived as the difference $W_R$ between the energy $W_K$ of a wave function in which the $\pi_x$ electrons are assigned to two-center or ‘localized’ LCAO MO’s (like those in ethylene) and the energy $W_N$ of one in which they are assigned to the best n-center LCAO MO’s obtained by solving a secular equation (n = number of atoms carrying $\pi_x$ electrons).²⁴

²⁰ Journal of Chemical Physics, 19 (10) (Oct. 1951), 1271–8.
²¹ This is cited by C. C. J. Roothaan and R. G. Parr in ‘Calculations of the Lower Excited Levels of Benzene,’ Journal of Chemical Physics, 17 (July 1949), 1001.
²² Parr et al., Journal of Chemical Physics, 18 (12) (Dec. 1950), 1561–3.
²³ Mulliken and Parr, ‘LCAO Molecular Orbital Computation of Resonance Energies,’ 1271.
We recall that Cotton calls $W_R$ ‘delocalization energy.’ Attaining good theoretical values for the terms in the master equation given above requires an analysis of all three, an analysis that is not a mere ‘unpacking of what is already there,’ but one that adds theoretical content made possible by the interaction of the modes of representation with each other and, ultimately, by constraints imposed by experiment. Section 2, ‘The Normal State of Benzene,’ addresses the term $W_N$, rewriting it as the sum of a core energy and a $\pi_x$-electron energy: $W_N = W_{N,\mathrm{core}} + W_{N\pi}$. The $\pi_x$-electron molecular orbitals are expressed as orthonormal linear combinations of $2p\pi_x$ atomic orbitals on the six carbon atoms:

$$\phi_j = \sum_p C_{jp}\,\chi_p$$

The term $W_{N\pi}$ is expanded into a twelve term sum involving $I_i$, the energy of an electron in the MO $\phi_i$ in the field of the core, $J_{ij}$, the ‘coulomb integral’ between the MO’s $\phi_i$ and $\phi_j$, and $K_{ij}$, the ‘exchange integral’ between these MO’s. The coefficients $C_{jp}$ in the equation given above are then found by minimizing the energy $W_{N\pi}$ with respect to them, subject to orthonormalization of the $\phi_j$. Section 3, ‘The LCAO MO Kekulé Structure of Benzene,’ addresses the term $W_K$ in much the same way, arriving at a twelve term sum for $W_{K\pi}$ formally analogous to the one just described for $W_{N\pi}$. Section 4, ‘Compression Energy Corrections and the Theoretical LCAO Resonance Energy of Benzene,’ analyzes $W_R$ in terms of the following equation, in which the constant bond length of 1.39 Å for normal benzene is invoked.

$$W_R = W_R^{1.39} - C_K$$

²⁴ Mulliken and Parr, ‘LCAO Molecular Orbital Computation of Resonance Energies.’
The term $C_K$ denotes ‘compression energy’; Mulliken, Rieke, and Brown computed a reliable value for it.²⁵ The term $W_R^{1.39}$ is called the ‘gross vertical resonance energy,’ an electronic change with no accompanying change in internuclear distances. Section 5, ‘Comparison of Thermochemical and Theoretical Resonance Energies of Benzene,’ explains an empirical value of $W_R$ that can usefully be compared with the theoretical value, but section 6, ‘Theoretical Analysis of Meaning of Thermochemical Resonance Energies, with Application to Benzene,’ cautions that further correction terms must be included before the comparison is meaningful. These involve ‘a perhaps very appreciable amount of what may be called higher order resonance energy, corresponding to numerous minor contributions of miscellaneous VB [valence bond] structures present in the actual wave function.’²⁶ In particular, the authors are concerned with ‘second order hyperconjugation energy,’ which they then elaborate upon, introducing further new terms, some of them correction terms, into the analysis of $W_K$. Mulliken and Parr summarize the results of their efforts in the abstract of the paper:

The decrease in $\pi_x$-electron energy for the change from a Kekulé to a proper benzene structure is computed purely theoretically by the method of antisymmetrized products of MO’s (molecular orbitals), in LCAO approximation, using Slater $2p\pi_x$ AO’s (atomic orbitals) of effective charge 3.18, and assuming a carbon–carbon distance of 1.39 Å. The result (73.1 kcal/mole) is a theoretical value for the gross (vertical) resonance energy of benzene taken for constant C–C distances of 1.39 Å. In order to make a comparison with the net or ordinary empirical resonance energy, several corrections to the latter are required. The principal one is for the ‘compression energy’ required to compress the single and stretch the double bonds of the Kekulé structure from normal single and double-bond distances to 1.39 Å. The others (not hitherto clearly recognized) involve hyperconjugation and related effects. The corrections are discussed and their magnitudes estimated, but a reliable value can be obtained only for the compression energy. Allowing for this alone, the computed net resonance energy is 36.5 kcal. This agrees, within the uncertainties due to the omitted correction terms, with the value (41.8 kcal) of the ‘observed’ resonance energy based on thermochemical data.²⁷

²⁵ Mulliken, R. S. et al., ‘Hyperconjugation,’ Journal of the American Chemical Society, 63 (1941), 41–56.
²⁶ Mulliken and Parr, ‘LCAO Molecular Orbital Computation of Resonance Energies of Benzene and Butadiene,’ 1276.
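The figures quoted in the abstract can be set side by side with the equation for the net resonance energy. The LaTeX sketch below is only arithmetic on the quoted numbers: the value of the compression energy is inferred here from the gross and net figures and is not itself quoted in the passage, and the labels ‘computed’ and ‘observed’ are introduced for this sketch.

```latex
% Arithmetic on the figures quoted in the abstract (all in kcal/mole).
% The value of C_K is inferred from the quoted numbers, not quoted itself.
\[
W_R = W_R^{1.39} - C_K
\;\Longrightarrow\;
C_K \approx 73.1 - 36.5 = 36.6
\]
% Residual gap between the thermochemical and the computed net value,
% which the authors attribute to the omitted correction terms:
\[
W_R^{\text{observed}} - W_R^{\text{computed}} \approx 41.8 - 36.5 = 5.3
\]
```

On this reading roughly half of the gross vertical resonance energy is offset by the compression correction, which is why the estimate of that single term matters so much to the comparison.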
What we see here is a rational process that cannot be characterized either as inductive justification (the model is supported by the empirical data) or as deductive falsification (the model is falsified by the empirical data). Rather, it is not clear what counts as the best theoretical model for evaluating the resonance energy of benzene, especially since the model suggests a search for other things, unexpected by either Kekulé or the valence bond theorists: delocalization energy $W_R$ that redefines what is meant by $W_K$ and $W_N$, as well as ‘compression energy’ and ‘hyperconjugation energy.’ It is moreover not clear what counts as empirical evidence for resonance or delocalization energy. The article just discussed records a process of mutual adjustment, where empirical data serves as a somewhat plastic set of constraints on the model, and the model offers somewhat plastic means of articulating the components of the molecule’s energy, which in turn suggest new ways of empirically detecting the molecule’s energy. Here we see that the paper tools affect both theory construction and experimental practice, in a way that lies between representation and intervention. And, as in Klein’s account of the role of Berzelian formulas as paper tools in the development of the theory of the binary constitution of organic substances and later in the articulation of the concept of substitution, the effectiveness of the paper tools is not explained by appeal to theory alone or by appeal to empirical data. The paper tools are a third something, a tertium quid, which shapes both theory and the collection of data. In this case, LCAO-MO theory does not seem more reliable than valence bond theory, though the authors suggest that the latter does not lead as clearly into the investigation of hyperconjugation energy.
Indeed, in the paper cited above and written ten years earlier, Mulliken, Rieke, and Brown tell us that chemists earlier had conjectured that groups like CH3 should have the power to conjugate with other groups containing double or triple bonds, but all previous discussion had been qualitative. However, as they demonstrate in this article, quantum mechanical methods using the LCAO-MO method with numerical parameters derived from empirical data open up this conjecture to computation and experiment.

²⁷ Mulliken and Parr, ‘LCAO Molecular Orbital Computation of Resonance Energies of Benzene and Butadiene,’ 1271.
Mulliken and Parr, ‘unpacking’ the terms of the equation $W_R = W_K - W_N$, exhibit a process of analysis iconically, even though what is exhibited by means of spatial articulation term by term is not visual but rather the highly abstract dissection of the energy of the benzene molecule. The iconicity of the elaborate equation that emerges from this article might be compared to Dante’s spatial articulation of human vice and virtue in the Commedia. The conceptual work of icons need not be confined to the domain of human perception.

LCAO-MO theory does, however, seem superior to valence bond theory in a different context, when we examine a series of molecules in which benzene figures: planar, carbocyclic molecules CnHn, where n is even. Cotton observes, ‘According to valence bond theory, any such system in which the number of carbon atoms is even would be expected to have resonance stabilization, because of the existence of canonical forms,’ of the following types depicted for the first three members of the series: two squares for C4H4; two hexagons (the two Kekulé structures for benzene, C6H6); and two octagons for C8H8 (see Figure 5.5). However, LCAO-MO theory does not make this prediction. The energy level diagrams generated by LCAO-MO theory for C4H4 and C8H8 exhibit, as Cotton tells us,

the same general arrangement of levels [as for C6H6], namely, a symmetrical distribution of a strongly bonding, non-degenerate A level and a strongly antibonding, non-degenerate B level, with a set of E levels between them. It can be shown that such a pattern will always develop in an even-membered CnHn system ... ²⁸
But to attain a closed configuration, 4x + 2 electrons are required to fill the lowest nondegenerate A level and then to fill completely the first x pairs of degenerate levels above it. Systems with 4n electrons, by contrast, are more stable with a set of alternating single and double bonds. Experiment shows that a planar, carbocyclic system C₄H₄ is too unstable to have any permanent existence; cyclobutadiene proves instead to be a reactive rectangular molecule with a singlet electronic ground state, two short double bonds, and two long single bonds.²⁹ And C₈H₈, cyclooctatetraene, is not planar, but takes more or less a tub or boat form, and has no resonance stabilization of the kind predicted.³⁰ LCAO-MO theory offers a direct explanation of these facts, and moreover leads to the ‘4n + 2 Rule’ just cited, which generalizes the result for benzene to other carbocyclic systems with 4n + 2 electrons.³¹ This is Hückel’s Rule, one of the great theoretical achievements of LCAO-MO Theory.
²⁸ Cotton, Chemical Applications of Group Theory, 158–9.
²⁹ P. Reeves, J. Henery, and R. Pettit, Journal of the American Chemical Society, 91 (1969), 5888–90; M. J. S. Dewar and G. J. Gleicher, Journal of the American Chemical Society, 87 (1965), 3255.
³⁰ W. B. Person, G. C. Pimentel, and K. S. Pitzer, Journal of the American Chemical Society, 74 (1952), 3437.
³¹ Some chemists have objected that Cotton overstates the case, because full valence bond treatment of these molecules has improved in the recent past, though he himself (and I) remain unconvinced. See R. Hoffmann, S. Shaik, and P. C. Hiberty, ‘A Conversation on VB or MO Theory: A Never-Ending Rivalry?’ Accounts of Chemical Research, 36 (10) (Oct. 2003), 750–6.
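The level pattern that Cotton describes, and the electron count needed for a closed configuration, can be illustrated numerically. What follows is a minimal sketch, assuming the simple Hückel approximation (which the text does not spell out), in which the π-orbital energies of a planar CₙHₙ ring are E_k = α + 2β cos(2πk/n) for k = 0, ..., n − 1. The function names `huckel_levels` and `is_closed_shell` are hypothetical, chosen only for this illustration.

```python
import math

def huckel_levels(n):
    """Energies of the n pi orbitals of a planar C_nH_n ring (simple Hueckel model).

    Each value x stands for E = alpha + x*beta. Since beta is negative,
    a larger x means a lower (more bonding) level, so we sort descending.
    """
    return sorted((2 * math.cos(2 * math.pi * k / n) for k in range(n)),
                  reverse=True)

def is_closed_shell(n, electrons=None, tol=1e-9):
    """True if filling from the lowest level leaves no partly filled degenerate set."""
    if electrons is None:
        electrons = n                      # assumption: one pi electron per carbon
    levels = huckel_levels(n)
    remaining, i = electrons, 0
    while i < n and remaining > 0:
        j = i
        while j < n and abs(levels[j] - levels[i]) < tol:
            j += 1                         # collect the degenerate levels
        capacity = 2 * (j - i)             # two electrons per orbital
        if remaining < capacity:
            return False                   # degenerate set only partly filled
        remaining -= capacity
        i = j
    return remaining == 0

if __name__ == "__main__":
    for n in (4, 6, 8, 10):
        print(n, [round(x, 3) for x in huckel_levels(n)], is_closed_shell(n))
```

Under these assumptions, the rings with 6 and 10 electrons come out closed-shell while those with 4 and 8 do not, which is the pattern the 4n + 2 rule summarizes. The sketch says nothing about actual geometry or stability; it only reproduces the counting argument.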
PART III
Geometry and Seventeenth Century Mechanics
6 Descartes’ Geometry
Locke and Leibniz had no admiration for ‘enthusiasm,’ a kind of dogmatism grounded in alleged religious revelation. Enthusiasm is a self-reinforcing subjectivity that refuses to examine its own grounds by objective methods; the enthusiast is above reproach, or criticism. Both philosophers, in different ways, offer an epistemology where improvement and correction are always possible and indeed required, where the one who knows must always be interested in the correction of moral and scientific knowledge. Locke believes that methods for assessing, criticizing, and improving knowledge claims must be empirical in order to be objective; Leibniz believes they must be formal in order to be objective. Yet Locke misunderstands the virtue of formality, and Leibniz overstates it, or does so at least in the New Essays on Human Understanding.
In the preceding chapters, I have tried to show how the management of good modes of representation in science makes possible improvements in acquiring empirical data and formulating theory, in order to find a middle way between the interventionist ideals of empiricism and the theoretical ideals of rationalism. And I have tried to mitigate twentieth century enthusiasm for predicate logic by arguing that it is one mode of representation among many, in need of reflective criticism and the examination of its own grounds. Thus a return to the debate between Locke and Leibniz should be instructive, as we turn our attention from chemistry to mathematics.
6.1. Locke’s Criticism of Syllogistic
For Locke, the formal schema for finding new knowledge is the discovery of ‘middle terms.’ This vocabulary comes variously from the theory of ratios and proportions found in Eudoxus and Euclid, and from Aristotle’s theory
of the syllogism. If we know A and C, and assert the proportion A : B :: B : C, then ‘B’ is the middle term to be discovered, which brings A and C into rational relation. If we assert the standard valid syllogism, ‘All S is M and All M is P, then All S is P,’ then ‘M’ is the middle term to be discovered, bringing the minor term S and the major term P into rational relation.
Towards the end of Book Four: Of Knowledge and Opinion in An Essay Concerning Human Understanding (sections that begin with chapter XII: ‘Of the Improvement of Our Knowledge’), Locke asserts that the improvement of knowledge depends on two things. First is the development of a good taxonomy: ‘determined ideas of those things whereof we have general or specific names.’ As Descartes urged, our complex ideas should be complete concatenations of clear and distinct simple ideas; and the spelling out of complex ideas in terms of simple ideas is what leads to good taxonomy. Second is the ‘art of finding out those intermediate ideas, which may show us the agreement or repugnancy of other ideas, that cannot be immediately compared.’¹
Locke begins chapter XII with the claim, ‘Knowledge is not from Maxims.’ That is, a satisfactory account of truth cannot simply be deduction from first principles which serve as ‘foundations.’ It has appeared to some thinkers, he allows in section 2, that this way of proceeding is successful in mathematics; but in section 3 he disputes this appearance.
He claims instead that even advances in mathematics were not made by derivation from maxims laid down in the beginning, but rather ‘from the clear, distinct, complete ideas their thoughts were employed about, and the relation of equality and excess so clear between some of them, that they had an intuitive knowledge, and by that a way to discover it in others; and this without the help of those maxims.’² We might take the celebrated proof of the Pythagorean Theorem in Book I of Euclid’s Elements, discussed above, as an example of what Locke means here.
Later on, in chapter XVII: ‘Of Reason,’ section 2, Locke argues that we need reason ‘both for the enlargement of our knowledge and regulating our assent,’ and therefore must not identify it with the use of formal logic. For Locke as for Leibniz, whose common inspiration was Descartes, the work of reason must involve procedures that discover and at the same time improve knowledge.
¹ John Locke, An Essay Concerning Human Understanding, ed. A. D. Woozley (New York: New American Library, 1974), 395–424.
² Locke, An Essay Concerning Human Understanding, 395–6.
Good
discovery procedures not only exhibit new knowledge but also explain it, that is, they articulate and organize our reasons for believing it. In section 4, he asserts, ‘Syllogism not the great Instrument of Reason,’ and argues for this claim in two ways. On the one hand, the formal system of syllogistic does not show us how to find the middle terms we seek; it only gives us a convenient way of arranging them once we have them in hand, setting a series of middle terms in order as a kind of bookkeeping. The ability to find middle terms depends on acquaintance with the peculiar nature of the given subject matter (in the case of the Pythagorean Theorem, lines, triangles, trapezoids, and circles), as formal rules of inference do not. Indeed, formal rules of inference, to carry out their proper role, should not express the peculiar nature of any subject matter. On the other hand, the formal system of syllogistic is awkward and limited even in the way it represents the rules of inference we do use in thinking (about anything). Locke writes, ‘If we will observe the actings of our own minds, we shall find that we reason best and clearest, when we only observe the connexion of the proof, without reducing our thoughts to any rule of syllogism.’³ Locke develops his arguments against the hegemony of syllogistic by making use of both Cartesian intuitionism and his own nominalism. Descartes’ epistemology begins from intuitions, the seeing of things that must exist due to their intelligibility: God, the soul, figures and numbers. Thus Locke writes, God has not been so sparing to men to make them barely two-legged creatures, and left it to Aristotle to make them rational ... He has given them a mind that can reason, without being instructed in methods of syllogizing; the understanding is not taught to reason by these rules; it has a native faculty to perceive the coherence or incoherence of its ideas, and can range them right, without any such perplexing repetitions.
The better way of attaining knowledge, he sums up, is ‘not by the forms themselves, but by the original way of knowledge, i.e., by the visible agreement of ideas.’⁴
³ Locke, An Essay Concerning Human Understanding, 415–21.
⁴ Locke, An Essay Concerning Human Understanding, 418.
Descartes’ and Locke’s intuitionism stands in uneasy tension with the strongly reductive account of knowledge they offer; Locke borrows from Descartes the view that every complex object of knowledge ought to be
‘led back’ to the simples that compose it. Descartes’ simples are, e.g., God, the soul, geometrical figures and numbers in the Meditations; straight line segments in the Geometry; and particles of matter in inertial motion in the Principles. Locke’s simples are simple ideas of perception and reflection. Both of them, however, use the term intuition to assert a ‘seeing’ that is more fundamental than and prior to any formalization of discursive reasoning. In section 14 of chapter XVII, Locke argues, ‘Our Highest Degree of Knowledge is intuitive, without Reasoning’:
Some of the ideas that are in the mind are there so that they can be by themselves immediately compared one with another; and in these the mind is able to perceive that they agree or disagree as clearly as that it has them. In this consists the evidence of all those maxims which nobody has any doubt about, but every man ... knows to be true, as soon as ever they are proposed to his understanding.⁵
If we refer this claim to the proof of the Pythagorean Theorem, Locke is here asserting that our ability to see the equality of the area of the square raised on the hypotenuse of the right triangle with the sum of the areas of the two squares raised on the legs is prior to any formalization of the proof in terms of maxims and rules of inference. This entails, of course, that we be able to ‘see’ a triangle as a triangle, an intelligible, unified, existing shape. Whether this triangle-vision can be explained in terms of a reconstruction out of simple ideas of sense or reflection (Locke) or straight line segments (Descartes) is another question, which I here leave aside.
Locke also invokes nominalism in his attack on the hegemony of syllogistic, which is also, of course, an attack on Scholasticism. In section 8 he announces, ‘We reason about Particulars,’ and goes on,
it is fit ... to take notice of one manifest mistake in the rules of syllogism: viz. that no syllogistical reasoning can be right and conclusive, but what has at least one general proposition in it. As if we could not reason, and have knowledge about particulars; whereas, in truth, the matter rightly considered, the immediate object of all our reasonings and knowledge is nothing but particulars ... Universality is but accidental to [our reasoning], and consists only in this, that the particular ideas about which it is are such as more than one particular thing can correspond with and be represented by.⁶
⁵ Locke, An Essay Concerning Human Understanding, 422.
⁶ Locke, An Essay Concerning Human Understanding, 421–2.
Once again, this reflection may be referred to the proof of the Pythagorean Theorem. The proof proceeds by the combined use of an icon, the diagram; of symbols, e. g. lines and figures tagged by letters of points on the diagram, and conjoined in ratios and proportions; and of natural language that explains the combination of the icon and symbols. The icon denotes a specific right triangle, which can stand as a representative of all right triangles because its symbolic handling does not make use of any of its peculiar features, features that distinguish it from other right triangles. The denotation of a particular, dependent on the use of an icon, is essential to the proof. If the proof holds for this right triangle, then it must hold for any other; yet it must hold for this triangle or the proof has no content. The proof requires the triangle both to be an icon of a particular triangle, and to represent symbolically, in conjunction with the ratios and proportions, all other right triangles. The natural language exposition is therefore also needed to guide the reader in dealing with the ambiguity of the diagram, that is, to distinguish and relate its iconic and symbolic roles. I can sum up Locke’s arguments against ‘truth as derivation from maxims’ and ‘syllogistic as the key to reason’ in the following way. When we re-write our best efforts to attain knowledge—like the proof of the Pythagorean Theorem in mathematics—in the formalism of maxims and syllogisms, we substitute a mode of representation that exhibits correct inference quite well for a combination of modes of representation that exhibits geometrical knowledge quite well, allowing for denotation and successful problem solving by displaying the kinds of things we are dealing with, their characteristic unity, and the reasons why they are the kinds of things they are (the middle terms). 
Moreover, even if our main concern were exhibiting correct inference, the formalism of syllogistic is defective, since there are many forms of inferring that we cannot capture in that idiom. Geometry, Euclid tells us, is concerned first of all with certain distinguished objects, like the straight line, circle, and triangle on the plane; and with certain distinguished procedures like the use of ruler and compass to construct points and figures. To represent his investigation of the objects by means of the procedures, he uses diagrams where wholes and parts are indicated by letters; ratios and proportions, involving wholes and parts thus indicated; and natural language that expresses the inferential progress of the argument as well as the relations among the different kinds of representations. In the work of Descartes, some geometrical things are represented by
algebraic equations, whose constants and variables are used to indicate parts and wholes of objects represented by diagrams; and by ratios and proportions, where parts and wholes are also still indicated by letters. Twentieth century philosophers of mathematics have impugned the epistemological status of diagrams on the grounds that the knowledge they furnish is allegedly intuitive, and so metaphysically suspect. However, their criticism often turns on the assumption that ‘intuition’ is immediate, self-evident, and incorrigible; and I have urged in the opening chapters of this book that the notion of intuition should be replaced by that of experience generated in productive analysis, where diagrams are never used alone but in tandem with other modes of representation. Then the objection to diagrams loses its strength. We can admit that knowledge claims in mathematics sometimes require diagrams, without supposing that such knowledge is conferred by the diagram alone. The use of diagrams is part of mathematical experience, but it need not involve the discredited connotations of intuition. When we look at the use of geometrical diagrams in tandem with other modes of representation, I argue that philosophically interesting features of their use appear. First, clearly, diagrams are icons. Some objects of mathematics are shapes, and many shapes cannot be adequately represented and studied without icons of them; and even shapes that elude iconic representation, like infinite-sided polygons or ‘monster’ functions, may be indicated indirectly by analogy, often by icons or symbols with strongly iconic aspects. Second, diagrams may represent symbolically, depending on their discursive context: that is, they may represent things they do not resemble. Third, some diagrams may represent both iconically and symbolically within one and the same argument, where a determinate and contained ambiguity is exploited by the mathematician to further the argument. 
Finally, the interaction of iconic and symbolic modes of representation within geometrical arguments repays philosophical study. It often exhibits the confluence of different domains, each with its own traditions of representation, as well as the development of new means of expression, novelties that affect how mathematical knowledge is both justified and discovered. This kind of interaction has not, for obvious reasons, attracted the attention of philosophers of mathematics whose concern is to purify mathematics and re-write it in a single idiom. In the rest of this chapter, I will illustrate all of these points by looking at Descartes’ Geometry.
6.2. Descartes’ Geometry as the Exemplar of Cartesian Method
The first and second Books of Descartes’ Geometry include diagrams in which circles are depicted, but there is something noteworthy in their depiction. For the most part they are represented not by continuous lines, but by dotted lines, as if they were ghosts of, or references to, circles, instead of the circles themselves. And this is in a sense the case, for Descartes intends these circles to function either as constructing curves or as loci. As a constructing curve, a circle is something of a revenant in the house of Cartesian method and can enter only through the back door or windows; and as a constructed locus it is not all there. Although the Geometry inaugurates the study of higher algebraic curves, Descartes’ reductionism prevents curves from coming into focus as objects of study and curtails his own investigation of them. This makes his depiction of them as symbolic as it is iconic: they are the instrument of construction (and as such their unity is not examined) or the product of construction (and as such their unity is problematic). Mathematical things become objects of knowledge for Descartes when they are constructed according to the ‘order of reasons,’ but in the case of the circle such construction presupposes the prior availability of the circle; it also involves the disparity between rational and real numbers, which Descartes however does not want to take up.⁷
In the late Middle Ages, the progression of algebra from a problem-oriented treatment to a more purely symbolic, abstract treatment carried with it a change in the conception of its uses and the objects to which it was applied. Though originally designed for problems involving numbers, algebra also came to be used in the treatment of geometrical problems, as well as in various practical applications.
As abstract notation for constants and variables was developed, algebra increasingly came to be thought of as a science of abstract magnitude in general, mathesis universalis. The increasing polysemy of algebra led to a change both in the conception of geometrical figure and in the conception of number, which ultimately resulted in the nineteenth century hybrid, the real number line.
⁷ These issues are discussed at length in chapters 1 and 2 of my Cartesian Method and the Problem of Reduction (Oxford: Clarendon Press, 1991). The phrase ‘order of reasons’ is taken from M. Gueroult, Descartes selon l’ordre des raisons, 2 vols. (Paris: Aubier, 1968).
Thus geometry around 1600 was undergoing a transformation. On the one hand, fundamental differences existed between arithmetical and geometrical methods and operations: in particular, multiplication and division were linked by problematic, limited, and unpromising analogies. The product of two line segments was interpreted as an area, and that of three line segments as a volume; thus multiplication was seen to involve a change of dimension and moreover could not be represented or interpreted graphically for cases of more than three dimensions. Even Descartes treated multiplication this way in the Regulae (composed around 1628). Division was often interpreted as resulting in a ratio, that is, a relation rather than another magnitude. And yet, on the other hand, mathematicians like Simon Stevin and François Viète were increasingly interested in the application of algebra, as well as the use of numbers, in problems of geometrical analysis.⁸
Descartes’ Discourse on Method was first published in 1637 as an introduction to three mathematical and scientific treatises, one of which was the Geometry. The Discourse includes a précis of the argument of the Meditations on First Philosophy; and that argument is in turn recapitulated at the beginning of the Principles of Philosophy. Descartes carefully locates his mathematics on the one hand and his physics on the other with respect to the Meditations, which is prior in the methodological ‘order of reasons’ and provides their metaphysical justification and legitimation.⁹ His conception of method is intuitionist and reductionist in the following sense, as I argued in my book Cartesian Method and the Problem of Reduction.
Cartesian method organizes items of knowledge within a domain, and domains within the sphere of human knowledge as a whole, according to an order of reasons which begins with self-evidently indubitable, clear and distinct ideas, and proceeds by a chain of reasoning intended to be both truth-preserving and ampliative. Another formulation of the order of reasons is that the unfolding of knowledge must begin with simple things, which are in themselves transparent to reason, and move on by means of a constructive procedure that brings the simple things into relation to constitute complex things.¹⁰
⁸ See my historical review in ‘The Cartesian Revolution. La Géométrie. Understanding Descartes: Reception of the Géométrie,’ (in Italian) Storia della scienza, ed. S. Petruccioli, 10 vols. (Rome: Istituto della Enciclopedia Italiana), vol. V (2002), 440–52.
⁹ See the opening pages of M. Gueroult’s Descartes selon l’ordre des raisons.
¹⁰ Gueroult, Descartes selon l’ordre des raisons, vol. 1, ch. 1.
Thus, it is hoped, any more mediately known, complex item can
always be led back to, and indeed recovered from, the simples from which it arose. Descartes presents both the simples, and his relational means of concatenating them, as if they were obvious, unique, and transparent, not requiring further explanation; and he moreover assumes that the simples along with the complex things are homogeneous. In one sense, these ‘reductionist’ assumptions are the key to Descartes’ mathematical success.
According to Descartes, the simple starting points of geometry are straight line segments, as he announces at the beginning of the Geometry: ‘All the problems of Geometry can easily be reduced to such terms that knowledge of the lengths of certain straight lines is sufficient for its construction.’¹¹ The reductionist thrust of his method makes it an admirable problem-solving device, for his next revelation is that both the product and the quotient of straight line segments ought to be interpreted as further straight line segments. His algebra of geometrical magnitudes is closed; in one brilliant insight, he has freed the algebra of magnitude from the constraints and complications that hampered Viète. His construction procedure (how simples are combined) is given first of all by his interpretation of the arithmetic operations, including the extraction of roots, and then by his methods for constructing more algebraically complex problems on the basis of the solutions of simpler problems; in relation to that hierarchy, methods are then also given for constructing more algebraically complex curves on the basis of algebraically simpler curves.
But the assessment of Descartes’ success must be nuanced, for in many respects he does not adhere to his own strict requirements for reduction; moreover, his reductionism inhibits his mathematical discoveries.
In every one of his constructions, Descartes is forced to smuggle in entities (especially curves, and first and foremost the circle) that are not strictly licit at that stage of construction; his ‘starting points,’ the straight line segments, require the prior availability of curves as means of construction, just as later the construction of certain higher curves requires as means of construction curves that are strictly speaking not yet available. His construction procedure itself
¹¹ G 297, AT 369. The first page number refers to R. Descartes, ‘La Géométrie’ in Discours de la méthode pour bien conduire sa raison et chercher la verité dans les sciences (Leiden, 1637), 297–413. (This first edition is reproduced in facsimile in The Geometry of René Descartes, eds. D. E. Smith and M. L. Latham (New York: Dover, 1954).) The second page number refers to the edition of this text reproduced in Oeuvres de Descartes, eds. C. Adam and P. Tannery (Paris: Vrin, 1897–1913/1964–74), 367–485.
turns out to be a combination of algebraic, geometrical, and mechanical means that cannot be contained by a linear ‘order of reasons.’ The primary aim of the Geometry is to exhibit geometry as an ordered domain of construction problems that can be solved not by mere ingenuity, but by a method; algebra is important for the resources and the order it offers, but plays an auxiliary role. The simplest such problems are inherited from the Greek canon: for example, given two line segments, construct the line that is the mean proportional between them. The three famous problems of classical antiquity were the squaring of the circle, the duplication of the cube, and the trisection of the angle. The Greeks considered construction by means of ruler and compass (the intersections of lines and circles) to be the most rigorous and well-defined, and of course could not solve those three problems by such means. For the Greeks, acceptable constructing curves were primarily the circle and straight line; Descartes, by contrast, wished to generalize the very means of construction in an orderly way, and postulated a hierarchy of problems associated with constructing curves of increasing complexity. According to Descartes, a geometrical problem had to be translated into an algebraic equation in one unknown; then the roots of the equation had to be constructed by geometrically acceptable means: the constructing curves he allowed and catalogued included the conic sections and a few cubic curves, but he envisaged the use of higher algebraic curves. 
The task of making precise what he meant by ‘increasing complexity’ led Descartes to combine speculatively considerations about the properties of the curves themselves, the structure of the algebraic equations associated both with the problems and with the curves, and various novel mechanical devices for tracing curves that had interested him twenty years before.¹² Thus in the Geometry Descartes embarks on a program of classifying problems according to the complexity of the curves needed for their construction. He has first of all to explain which curves are acceptable as constructing curves, and then to find criteria according to which such curves could be set in order. Finally, he has to show how to choose the simplest constructing curves for a given sort of problem.
¹² See the interesting discussion in Chikara Sasaki’s Descartes’s Mathematical Thought (Dordrecht: Kluwer, 2003), ch. 3.2.
Overall, he hoped to provide a universal method, using algebra to analyze problems,
of finding the constructions for any problem that arose within the tradition of geometrical problem-solving, as well as to identify and order all means beyond ruler and compass for these constructions. In the end, Descartes accomplished both more and less than he intended.
6.3. Diagrams as Procedures
On the very first pages of the Geometry, Descartes shows how the operations of addition, subtraction, multiplication, division, and the extraction of square roots can be interpreted geometrically, as operations on straight line segments that produce straight line segments. The order of reasons dictates that geometry begins with straight line segments, and proceeds to more complex entities by the following operations. Addition and subtraction are straightforward, but multiplication and division are not (see Figure 6.1, top). To multiply BD by BC, take AB as the unit; join the points A and C, and draw DE parallel to CA; then BE is the product of BD and BC. To divide BE by BD, take AB as the unit, join E and D, and draw AC parallel to DE; the segment BC is the quotient. Note that Descartes sets up two proportions, AB : BD :: BC : BE and BE : BD :: BC : AB, where AB = 1. This diagram thus represents a procedure for finding products and quotients, and as such is a schema for a simple tracing device.¹³ In this capacity, it is symbolic: it represents something it does not resemble. However, the cogency of the procedure rests on Euclidean results that follow from the similarity of the two triangles ACB and DEB; in this capacity the diagram is iconic. And it must function as both to serve Descartes’ purposes.
The same ambiguity holds for Figure 6.1, bottom, which explains how to find a line segment that represents the square root of a given line segment GH. To find the square root of GH, add to GH the line segment FG taken equal to the unit; bisect FH at K and describe the circle FIH around K taken as the center. Draw a perpendicular at G, intersecting the circle at I, and GI is the required root.¹⁴ This diagram is symbolically the schema of a procedure and a tracing device, and iconically a circle and a right triangle divided into two similar right triangles.
¹³ G 298, AT 370.
¹⁴ G 298, AT 370.
Descartes draws his inference because of
Figure 6.1. Descartes, Geometria, 2
the Euclidean result that the triangle FIH is similar to the triangles GFI and GIH, due to the way the circle constrains inscribed triangles (the angle FIH must be a right angle) and to the Pythagorean Theorem. The triangles and the circle are not present as objects of investigation; they are auxiliary constructions that help to advance the argument, not what the argument is about, that is, the relation between GH and the sought-for line segment GI. All the same, they and the Euclidean results about them must be assumed to be available. The role of the circle in the diagram is both iconic and symbolic, and moreover its iconic role undermines Descartes’ reductionist intention that we read the diagram as a configuration of straight line segments. Descartes fails to draw in the remaining sides of the triangles FIH, GFI, and GIH; their absence testifies to Descartes’ wish to downplay
their iconic role. In sum, the construction of the square root requires similar triangles as well as a curve (a circle), and the construction of other roots will require other curves of higher degree. To get all the line segments he needs as representatives of rational and algebraic numbers, Descartes requires antecedently available curves to serve as means of construction; but he also wants curves to be introduced as complex constructions derived from proportions holding among line segments and does not wish to admit them as starting points. Descartes also never explains how the Greek theory of ratio and proportion based on similitude stands in rational relation to the algebra of arithmetic based on the equation. He simply posits the link between geometrical problems given in terms of ratios and proportions, and algebraic equations, explaining how a geometer can use the foregoing interpretation of the operations to derive an algebraic equation, whose solution will yield the solution to the problem.
The construction of plane problems by means of algebra is exemplified in Figure 6.2, where the circle is represented by a dotted line. A problem that can be constructed by means of ruler and compass can be expressed by a quadratic equation in one unknown: the square of an unknown quantity, set equal to the product of its root by some known quantity, increased or diminished by some other quantity also known, or $z^2 = az + b^2$. A solution of this equation can then be found by the following geometrical construction. The side LM of the right triangle NLM is equal to b (the square root of the known quantity $b^2$) and the other side LN is equal to half of a. Prolong MN to O, so that NO is equal to NL, and the whole line OM will be z, the required line and the root of the equation.
A simple application of the Pythagorean Theorem shows that this construction is expressed by the formula: $z = \tfrac{1}{2}a + \sqrt{\tfrac{1}{4}a^2 + b^2}$.¹⁵ The construction, again, depends on both the prior availability of the circle as a constructing curve, and the validity of the Pythagorean Theorem. The circle is dotted because of its symbolic and auxiliary role.
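The construction just described can be checked numerically. The following is a minimal sketch, assuming the right angle of triangle NLM sits at L and placing the triangle in hypothetical coordinates chosen only for illustration; the function name is likewise an assumption, not anything found in Descartes' text.

```python
import math

def root_by_construction(a: float, b: float) -> float:
    """Length OM for the construction with LM = b and LN = a/2.

    Assumes the right angle of triangle NLM is at L; the coordinates
    below are a hypothetical placement used only to measure lengths.
    """
    L = (0.0, 0.0)
    M = (b, 0.0)           # side LM has length b
    N = (0.0, a / 2.0)     # side LN has length a/2
    nm = math.dist(N, M)   # hypotenuse NM, by the Pythagorean Theorem
    no = math.dist(N, L)   # NO is laid off equal to NL
    return no + nm         # O lies on MN prolonged beyond N, so OM = ON + NM

a, b = 3.0, 2.0
z = root_by_construction(a, b)
# The constructed length should satisfy the quadratic z^2 = a z + b^2.
print(z, math.isclose(z * z, a * z + b * b))
```

With a = 3 and b = 2 the hypotenuse is 2.5 and NO is 1.5, so the constructed segment has length 4, which does satisfy 16 = 12 + 4.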
¹⁵ G 302–3, AT 374–5.
6.4. Generalization to the Construction of a Locus
Having established his novel geometrical interpretation of the arithmetic operations, Descartes must then consider two different ways of generalizing
his approach. The first is the generalization to problems whose algebraic expression requires two variables, not just one. This he does by discussing Pappus’ problem, a family of problems where what must be constructed is not just a single point (for example, O or I on the two preceding diagrams) and therefore a line segment determined by it, but rather a locus, which must be constructed point-wise. The second, which may also be illustrated in terms of Pappus’ problem, is the generalization to problems whose algebraic expression requires exponents higher than 2; when the degree of the equation is, for example, 3 or 4, there are general formulas for the roots, but they involve the extraction of cube roots, and these cannot be constructed by ruler and compass.
Figure 6.2. Descartes, Geometria, 6
Descartes was introduced to Pappus’ problem in 1631 by the Dutch mathematician Jacob van Gool (Golius), who thought Descartes might want to try out his new method on it. In Book VII of his Collectio, Pappus of Alexandria proposed the generalization of a problem that had been around since Euclid’s time, and which implied a whole new class of curves. Since it was a problem that the Greek mathematicians could neither solve in a methodical way nor properly generalize, it exhibited particularly well the power of Cartesian method. It asks for the determination of a locus whose points satisfy one of the conditions illustrated by Figure 6.3.
Figure 6.3. Bos, ‘On the Representation of Curves in Descartes’ Géométrie,’ 299
Let the $d_i$ denote the length of the line segment from point P to $L_i$ which makes an angle of $\phi_i$ with $L_i$. Choose α/β to be a given ratio and $a$ a given line segment. The problem is to find the points P which satisfy the following conditions. If an even number (2n) of lines $L_i$ are ‘given in position,’ the ratio of the product of the first n of the $d_i$ to the product of the remaining n $d_i$ should be equal to the given ratio α/β, where α and β are arbitrary line segments. If an odd number (2n − 1) of lines $L_i$ are ‘given in position,’ the ratio of the product of the first n of the $d_i$ to the product of the remaining (n − 1) $d_i$ times $a$ should be equal to the given ratio α/β. The case of three lines is exceptional, since it arises when two lines coincide in the four-line problem; the condition there is $(d_1 \cdot d_2)/d_3^2 = \alpha/\beta$. There are in fact
points that satisfy each such condition, and they will form a locus on the plane.¹⁶ In the middle of Book I of the Geometry, Descartes announces, ‘I believe I have in this way completely accomplished what Pappus tells us the ancients sought to do,’ as if he had solved the problem in a thoroughgoing way for any number of lines.¹⁷ While it is true that his combination of algebraic-arithmetical and geometrical devices produces an important advance in the solution of the problem, it is not true that his treatment of the problem is complete. In fact, there turn out to be important and unforeseen complexities which arise with respect to geometrical curves, polynomial equations, and number itself, as well as the relations among them. Descartes’ explanation of his method for solving the family of problems collected under Pappus’ problem is given at the end of Book I, accompanied by a diagram of the four-line version (see Figure 6.4). He reduces the problem, which concerns proportions among lines ‘given in position’ (as well as areas and volumes) and the loci they determine, to problems that are part of his geometrical program of the construction of equations, and so involve equations among line segments.¹⁸ Treating the problem as an analysis in Pappus’ sense, he depicts it as if it were already solved, and reduces the complexity of the diagram ‘by considering one of the lines given in position and one of those to be drawn (as, for example, AB and BC) as the principal lines’ in terms of which all the other lines will be expressed. That is, all the lines must be labeled, and the segments whose lengths we know carefully distinguished from those we don’t; and we must write down all the equations we can that express the relations between known and unknown segment lengths. Thus, he chooses y equal to BC ($d_1$) and x equal to AB, and then shows how all the other $d_i$ can be expressed linearly in x and y.
¹⁶ See H. J. M. Bos’ discussion of Pappus’ problem in his ‘On the Representation of Curves in Descartes’ Géométrie’ in Archive for History of Exact Sciences, 24 (1981), 298–302, from which Figure 6.3 is reproduced. Apropos a conference on the Geometry at REHSEIS in honor of Henk Bos, Karine Chemla, Jean-Jacques Szczeciniarz, Marie-José Durand-Richard, Paolo Mancosu, Marco Panza, and Ken Manders gave me useful feedback on earlier drafts of this paper in the spring of 2005.
¹⁷ G 309, AT 382.
¹⁸ G 310–12, AT 382–4.
Figure 6.4. Descartes, Geometria, 26
For 2n lines, the equation will be of degree at most n; for 2n − 1 lines, it will be of degree at most n, but the highest power of x will be at most n − 1. (For 2n and 2n − 1 parallel lines, where y is the sole variable involved, the result is an equation in y of degree
at most n.) One highly significant feature of this diagram is that there is no locus depicted, only the point C which functions as its representative. By its minimally schematic depiction, the point C seems to promise that all curves that ought to belong to geometry can be understood as a nexus of line segments, which is of course the dogma of Descartes’ reductionist method applied to mathematics.¹⁹
¹⁹ Bos, ‘On the Representation of Curves in Descartes’ Géométrie,’ 298–302.
The point-wise construction of the locus is then undertaken as follows. One chooses a value for y and plugs it into the equation, thus producing an equation in one unknown, x. For the case of three lines, the equation
in x is in general of degree 2; for 2n lines, it is of degree at most n; and for 2n − 1 lines, it is of degree at most n − 1. (For 2n − 1 parallel lines, the equation already has only one variable, y, and is of degree n.) The roots of this equation can then be constructed by means of intersecting curves which must be decided upon. This procedure generates the curve point by point and is thus potentially infinite, yet it also clearly does not generate all the real points on the line. It seems that Pappus’ problem has been reduced to the geometrical construction of roots of equations in one unknown: the construction of line segments on the basis of rational relations among other line segments. However, this reduction has an odd effect upon Descartes’ own understanding of what he has accomplished. One of the first things that he says about Pappus’ problem is that he has found a methodical way to classify cases of the problem; however, his classification is based not on some feature of the locus generated but rather on what kind of constructing curve ought to be chosen in the point-wise construction of the locus, that is, in the construction of the line segment x given the relevant equation in x and y and a definite value of y. He sets out this classification of cases at the very end of Book I in more explicitly algebraic terms; it does not pertain to curves (describable by indeterminate equations in two unknowns) but to problems (describable by determinate equations in one unknown). Recall that Figure 6.4 contains no hint of the locus, but consists only of the nexus of line segments with their specified relations to an arbitrary point C of the locus. The official subject of these diagrams is line segments; constructing curves also intervene, but the choice of such curves is vexed for Descartes, and they are not what the diagram is about. 
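The point-wise procedure described above (fix a value of y, then solve the resulting equation in one unknown x) can be sketched for a four-line case. This is only an illustration under simplifying assumptions: each distance is taken to be a signed linear form $d_i = p x + q y + r$, the four lines and the ratio k = α/β are hypothetical values chosen so that the locus is the circle $x^2 + (y+1)^2 = 2$, and the roots are found by the quadratic formula rather than by intersecting constructing curves.

```python
import math

# Hypothetical four-line configuration: each signed distance is written as
# d_i = p*x + q*y + r (an assumption made for illustration only).
LINES = [
    (1.0, 0.0, 1.0),   # d1 = x + 1
    (-1.0, 0.0, 1.0),  # d2 = 1 - x
    (0.0, 1.0, 2.0),   # d3 = y + 2
    (0.0, 1.0, 0.0),   # d4 = y
]

def roots_for_y(lines, k, y):
    """Real x satisfying d1*d2 = k*d3*d4 once y has been fixed."""
    # With y fixed, each distance reduces to p*x + s.
    (p1, s1), (p2, s2), (p3, s3), (p4, s4) = [(p, q * y + r) for p, q, r in lines]
    a2 = p1 * p2 - k * p3 * p4
    a1 = p1 * s2 + p2 * s1 - k * (p3 * s4 + p4 * s3)
    a0 = s1 * s2 - k * s3 * s4
    if a2 == 0:                      # the equation in x is at most linear
        return [] if a1 == 0 else [-a0 / a1]
    disc = a1 * a1 - 4 * a2 * a0
    if disc < 0:                     # no real point of the locus at this height
        return []
    root = math.sqrt(disc)
    return sorted({(-a1 - root) / (2 * a2), (-a1 + root) / (2 * a2)})

def sample_locus(lines, k, ys):
    """Collect locus points (x, y) for a finite list of chosen y values."""
    return [(x, y) for y in ys for x in roots_for_y(lines, k, y)]

points = sample_locus(LINES, 1.0, [-2.0, -1.0, 0.0, 1.0])
print(points)
```

However many values of y are tried, the result is a finite list of points, which is one way to see why the text says the procedure is potentially infinite yet never delivers the whole curve.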
Book II, entitled ‘On the Nature of Curved Lines,’ attempts to classify curves, but it classifies them first of all in their role as means of construction for problems. Descartes begins by observing,
The ancients were familiar with the fact that the problems of geometry may be divided into three classes, namely, plane, solid, and linear problems. This is equivalent to saying that some problems require a conic section and still others require more complex curves. I am surprised, however, that they did not go further, and distinguish between different levels of these more complex curves, nor do I see why they called the latter mechanical, rather than geometrical.²⁰
²⁰ G 315, AT 388.
Descartes criticizes the Greeks for failing to generalize their means of construction, and to subject them to rational constraint and methodical classification. In his view, the Greeks were rather haphazard in their approach, experimenting with the spiral and the quadratrix (curves which we now call transcendental, since they cannot be expressed by an algebraic equation) as constructing curves, while at the same time failing to generalize on ruler and compass.
6.5. Generalization to Higher Algebraic Curves
We might expect Descartes to make his generalization simply on the basis of the algebraic equation. But at the beginning of Book II, rather than invoking equations Descartes appeals to tracing machines. For he understands his project not only as the exploration of the new algebra in the service of geometrical problem-solving, but also, and primarily, as the geometrical construction of the roots of algebraic equations, the discovery of line segments on the basis of a given configuration of line segments. Moreover, as Descartes knew very well, while the point-wise construction of loci articulated in Book I might look like a good way to generate more and more complex curves starting from rational relations among line segments, it is not a satisfactory way to generate curves that will be used as constructing curves. The indefinitely iterated, point-wise construction of loci does not guarantee the existence of all the points of intersection required when curves are used as means of construction; stronger continuity conditions are required.²¹
²¹ H. J. M. Bos, Redefining Geometrical Exactness: Descartes’ Transformation of the Early Modern Concept of Construction (Frankfurt: Springer-Verlag, 2001), ch. 24.
It is instructive to compare the classification of cases that occurs at the end of Book I, which seems rather more algebraic, and the classification of curves by genre at the beginning of Book II, which invokes tracing machines. The classification at the end of Book I stems from a prior classification of problems: when three or four lines are given in position, the required point may be found by using circle and straight line; when there are five, six, seven, or eight lines given in position, the required point may be found by a curve ‘of next higher level,’ which includes the conic
sections; and when there are nine, ten, eleven, or twelve lines, the required point may be found by a curve of ‘next highest level,’ though it is unclear what that level includes. Descartes’ definition of genre occurs right in the midst of his discussion of two tracing machines in the first pages of Book II. He explains why they are acceptable generalizations of ruler and compass, and therefore likely sources of constructing curves for the higher levels of problems. Descartes’ first tracing machine (see Figure 6.5) is a system of linked rulers that allows the user to find one, two, three, or more mean proportionals between two given line segments. This is, clearly, an ordered series of problems. As it is opened, the machine traces out certain curves AD, AF, AH, and so on, which then function as constructing curves, since their intersections with the circles determine the sought-for line segments, the mean proportionals. Descartes recognizes that these constructing curves form a series of higher algebraic curves of increasing complexity, though he does not give an equation for any of them; nor does he explain the relation between this series of problems and the series of problems united under Pappus’ problem. Descartes cannot rely on his tracing machine alone to generate a complete series of problems and constructing curves; the series they generate is too special. His hinged rulers do not, for example, produce the curve he is most interested in exhibiting as the fruit of his method, the ‘Cartesian parabola,’ which is not a parabola at all but a cubic curve.
Figure 6.5. Descartes, Geometria, 20
Figure 6.6. Descartes, Geometria, 22
Descartes does trace the Cartesian parabola by using another tracing machine (see Figure 6.6), which generates new, more complex curves from the motion of simpler curves and straight lines. A ruler GL is linked at L to the device NKL, which can be moved along the vertical axis while the direction of KN is kept constant; as L slides up or down the vertical axis, GL turns around G, and the line KN is moved downwards remaining parallel to itself. The intersections of the line KNC and GL trace out a curve GCE, which is represented by a dotted line. When the line KN is a straight line, GCE is a hyperbola, whose equation Descartes derives. When KN is a circle, GCE is a conchoid; when KN is a parabola, GCE is the Cartesian parabola, for which he also derives an equation. Once again there is a series of constructing curves, and once again it is rather special. Descartes cannot guarantee that all the curves needed to trace Pappian loci, or the loci themselves, can be traced by such machines. Indeed, later on in Book II, in the second case of the five line locus problem, Descartes describes a curve which he did not know how to trace by continuous motion.²²
²² H. J. M. Bos, Redefining Geometrical Exactness, ch. 16, 23, 24.
In sum, the point-wise construction of the Pappian loci does not give enough points to underwrite those curves as constructing curves; the tracing machines guarantee continuity, but produce series that are too special and
incompletely understood algebraically; and in any case, though Descartes believed that all algebraic curves could be described as Pappian loci, this is in fact not true. Moreover, not all tracing procedures are acceptable to Descartes. He distinguishes between ‘geometrical’ and ‘mechanical’ curves, a distinction that corresponds to the modern distinction between algebraic and transcendental curves. The former are all and only those that correspond to algebraic equations; the latter are not acceptable in geometry. Yet transcendental curves can be traced by well-specified tracing methods. Since in his exposition he affirms that tracing procedures supplement algebraic approaches, Descartes must give a demarcation criterion. At the beginning of Book II, Descartes rejects the spiral and quadratrix (two transcendental curves) by specifying that they belong not to geometry but to mechanics, ‘because we imagine them described by two separate motions, which have no relation between them which can be precisely measured.’²³ Thus, in the tracing motions Descartes finds acceptable, the tracing point is the intersection of two moving straight or curved (algebraic) lines, and the motions of the lines are continuous and strictly coordinated by an initial motion. However, exactly what Descartes intended by continuous motion and strict coordination is never really spelled out in the Geometry. 
Ultimately, Descartes’ segregation of algebraic from transcendental curves (and his belief that the coordination of moving lines in the tracing of the latter is not strict) rests on his belief that straight and curved lines do not stand in rational relation to each other, a belief that underwrites his reductive method for geometry, where everything in the order of reasons is built up from the concatenation of straight line segments.²⁴
²³ G 317, AT 390.
²⁴ Bos, Redefining Geometrical Exactness, ch. 28.
So when Descartes announces, ‘Having now made a general classification of curves, it is easy for me to demonstrate the solution that I have already given of the problem of Pappus,’ he is overstating his case. Referring to the diagram in Figure 6.7, which is reprinted no fewer than four times in the pages that follow, he explains under what conditions the locus will be a circle (depicted in Figure 6.7), a parabola, a hyperbola, and an ellipse, and what features its associated equation will have. Note that only the four lines ‘given in position’ are printed as continuous lines; all the others, including the locus, are printed as dotted lines. As a solution to this problem, the locus is constructed point-wise; it is a sort of compendium of solutions to
an indefinitely infinite series of problems about relations among straight line segments, and so is the associated algebraic equation. They both sum up and exhibit that set of problems very nicely, and in that capacity they are both symbolic rather than iconic. As a symbolic representation of that set of problems, however, the circle is not continuous.
Figure 6.7. Descartes, Geometria, 28
Thus when Descartes presents his most novel curve, the Cartesian parabola (the curve that he discovered and investigated), he does not present it simply as a locus, that is, as a solution to Pappus’ problem. He also presents it as the result of a tracing machine.
Figure 6.8. Descartes, Geometria, 36
Figure 6.8 exhibits the Cartesian parabola as a locus, a solution to Pappus’ problem when there are
five lines and four of them are parallel with the fifth perpendicular to the first four. In addition, the second tracing machine (Figure 6.6) is superimposed on the representation of Pappus’ problem, and it is labeled so as to underscore the superposition. The ruler GL in Figure 6.6 corresponds to the line GL in Figure 6.8; the line CNK in Figure 6.6 (which, Descartes tells us on the following page, can be replaced by various conic sections in order to trace out other curves) corresponds to the parabola CKN; the line GCE traced out in Figure 6.6 corresponds to the Cartesian parabola CEGC. Note that the parabola CKN is printed as a dotted line, for its role is that of a constructing curve, but the Cartesian parabola CEGC is a continuous
line. Descartes presents his new discovery as a Pappian locus, as the result of a tracing machine, and as an algebraic equation, $y^3 - 2ay^2 - a^2y + 2a^3 = axy$. Three distinct modes of representation, related in the natural language exposition of the text, are needed for the inauguration of this novel curve, to indicate what it is, as well as to begin the process of analyzing it.²⁵
²⁵ I would like to thank David Reed for his comments on various issues in this essay, which have been helpful in a pervasive and not easily foot-notable way, as were the first two chapters (on Euclid and Descartes) of his book Figures of Thought (London: Routledge, 1994). We often share the same presuppositions, perhaps due to having the same teachers at the University of Chicago, David Smigelskis and Eugene Garver, and the same dissertation advisor, Angus Macintyre, variously at Yale and Oxford, to whom I am also grateful. Daniel Garber and Roger Ariew have helped me clarify my ideas on Descartes for many years. And I thank Mickael Fontan for translating portions of this essay into French.
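Of the three modes of representation, the algebraic equation is the easiest to explore numerically. Because the equation of the Cartesian parabola is linear in x, each chosen value of y (other than 0) yields exactly one point. The sketch below assumes a hypothetical value a = 1 and a hypothetical function name; it is an illustration of the equation quoted above, not a reconstruction of Descartes' own procedure.

```python
def cartesian_parabola_x(y: float, a: float) -> float:
    """Solve y^3 - 2a y^2 - a^2 y + 2a^3 = a x y for x.

    Requires y != 0 and a != 0, since x is obtained by dividing by a*y.
    """
    return (y**3 - 2 * a * y**2 - a**2 * y + 2 * a**3) / (a * y)

a = 1.0  # hypothetical value of the given segment a
for y in (-1.0, 0.5, 1.0, 2.0, 3.0):
    x = cartesian_parabola_x(y, a)
    # Each computed point should satisfy the original equation.
    residual = y**3 - 2 * a * y**2 - a**2 * y + 2 * a**3 - a * x * y
    print((x, y), abs(residual) < 1e-12)
```

The left-hand side factors as $(y - 2a)(y - a)(y + a)$, so the curve meets the line x = 0 at y = 2a, y = a, and y = −a; the sample values 2, 1, and −1 above land on those three crossings.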
7 Newton’s Principia
Molecules cannot be perceived directly by human beings, though their appearance is inferred indirectly from traces detected by various laboratory instruments. One important feature they share with macroscopic objects is shape, and, as we saw in Chapters 3, 4, and 5, geometrical shape may thus serve an important role in linking the microscopic and macroscopic, the molecular world and the world of the laboratory, field, and classroom. Geometrical shape also serves to bring the finite realm of Euclidean geometry and the infinitary/infinitesimalistic realm of Leibniz’s and Newton’s new calculus (representing dynamical processes) into rational relation. When icons are used to do this kind of conceptual bridging, their significance must be carefully contextualized and explained in symbolic and natural language, not least because the import of such icons is often ambiguous. That ambiguity must be carefully constrained and exploited to be effective in the growth of knowledge. A pyramidal sketch of the molecule NH₃ refers to both the configuration of the molecule and a purified substance in a lab (which as a gas has no macroscopic shape); the roundness of a kernel of corn in McClintock’s photographs refers to both the surface of a macroscopic object and the completeness of a microscopic process; the triangles SBC and SBc in the diagram to Proposition I, Book I of Newton’s Principia (see Figure 7.1) refer to both finite and infinitesimal configurations. The use of each of these images in its attendant argument depends upon and exploits its ambiguity. Ambiguity, as every philosopher knows, must be carefully managed in order to avoid confusion and contradiction; but in these arguments it is in fact successfully managed. Such successful management of ambiguity is of great philosophical interest, as I have been arguing, for a theory of knowledge in general and scientific and mathematical knowledge in particular.
Figure 7.1. Newton, Principia, Book I, Section II, Proposition I, Theorem I
Many philosophers—and mathematicians—object to the use of diagrams because they are not ‘rigorous’ and may be misleading when used as evidence in an argument. The objection rests, I believe, on the unfortunate Kantian assumption that images belong to intuition (construed as he construes it) in tandem with the Cartesian assumption that intuition is self-evident. Thus philosophers tend to assume that an image means only one thing and wears its meaning, as it were, on its face. When an image fails to meet this Cartesian standard (as it inevitably must) it is rejected as insufficiently rigorous. But in the case studies we have examined so far, the images are, and must be, framed by explanation in
symbolic and natural language; and they are often, ineluctably, ambiguous. Moreover, some of the most important images are present in the argument because they play an indispensable role, that is, to represent shape as shape. Shape is irreducible. It is true that nineteenth century theory of functions and twentieth century quantum mechanics have posited some important objects that cannot be directly pictured, though our investigation of them typically involves pictures used indirectly and by analogy. But many other objects of mathematics and physics can be pictured, and the use of iconic representation in their investigation is still rigorous in context. The indispensability and irreducibility of shape explain why the evolute of the ellipse (a star-shaped curve called the astroid), the cycloid (which is its own evolute), the spiral (the involute of the circle) and the catenary (the evolute of the tractrix) are all pictured in striking diagrams at the end of Giuseppe Peano’s Formulario mathematico, his formalization of mathematical knowledge, in the eighth and final section, entitled ‘Theory of Curves.’¹ Their presence is noteworthy, since Peano is often closely associated with a group of mathematicians and philosophers intent upon the arithmetical-logical formalization of mathematics. Yet Peano acknowledged that his exposition was incomplete without the presence of the diagrams, for he knew that these algebraic and transcendental curves, as shapes in space, are fundamental to mathematics. Current philosophy of mathematics either assimilates geometry to arithmetic and then to logic, or refers it to a sense-data account of perception, a tendency that has almost completely banished the things of geometry from philosophical discourse.
We have lost sight of the importance of the integrity of shape, and of geometrical form generally, an integrity that becomes clearer when we view the development of mathematical knowledge as a process of Leibnizian analysis and the things of geometry as intelligible unities in need of analysis. Before we look at Newton’s great proof of the inverse square law, in which geometric shape is central and the polyvalent use of diagrams is indispensable, I will discuss the approach of an influential contemporary philosopher of mathematics, Philip Kitcher, and show why, in this case study, he cannot explain certain significant features of the proof.
¹ G. Peano, Formulario mathematico (Turin: C. Guadagnini, 1894; repr. Rome: Edizioni cremonese, 1960).
7.1. Philip Kitcher on History
What use should the philosophy of mathematics make of history? Kitcher’s The Nature of Mathematical Knowledge (1983) in an admirable way intends to bring the philosophy of mathematics into relation with the history of mathematics, as a quarter of a century earlier Kuhn, Toulmin, and Lakatos aimed to do for the philosophy of science.² Yet Kitcher’s book seems in spirit quite ahistorical, and contains no philosophical meditation on history itself, but rather what he calls a ‘defensible empiricism’ opposed to Kantian apriorism and Platonism. In an earlier paper leading up to the book, he writes, ‘a very limited amount of our mathematical knowledge can be obtained by observations and manipulations of ordinary things. Upon this small basis we erect the powerful general theories of modern mathematics.’³ (This is the same general strategy of building upon a perceptual-empiricist basis employed by Penelope Maddy in her books Realism in Mathematics and Naturalism in Mathematics⁴ and by Donald Gillies in his Philosophy of Science in the Twentieth Century and Philosophical Theories of Probability.⁵) Kitcher continues:
My solution to the problem of accounting for the origins of mathematical knowledge is to regard our elementary mathematical knowledge as warranted by ordinary sense perception ... Yet to point to the possibility of acquiring some kind of knowledge on the basis of observation is not to dispose of the worry that, properly speaking, mathematical statements cannot be known in this way. Hence a complete resolution of the question of the origin of mathematical knowledge should provide an account of the content of mathematical statements, showing how statements with the content which mathematical statements are taken to have can be known on the basis of perception.
He does admit that ‘a full account of what knowledge is and of what types of inferences should be counted as correct is not to be settled in advance ... ’ especially since most current epistemology ‘is still dominated by the case of perceptual knowledge’ and restricted to ‘intra-theoretic’ reasoning.⁶
² (New York: Oxford University Press, 1983). I wrote a review essay of this book in The British Journal for the Philosophy of Science, 36 (1985), 71–8.
³ Kitcher, The Nature of Mathematical Knowledge, 92.
⁴ (Oxford: Oxford University Press, 1990/2003); (Oxford: Oxford University Press, 1997/2006).
⁵ (London: Blackwell, 1993); (London: Routledge, 2000).
⁶ Kitcher, The Nature of Mathematical Knowledge, 96–7.
Nonetheless, his own ‘epistemological preliminaries’ seem to be so dominated and restricted:
On a simple account of perception, the process would be viewed as a sequence of events, beginning with the scattering of light from the surface of the tree, continuing with the impact of light waves on my retina, and culminating in the formation of my belief that the tree is swaying slightly; one might hypothesize that none of my prior beliefs play a causal role in this sequence of events. ... A process which warrants belief counts as a basic warrant if no prior beliefs are involved in it, that is, if no prior belief is causally efficacious in producing the resultant belief. Derivative warrants are those warrants for which prior beliefs are causally efficacious in producing the resultant belief.
A warrant is taken to refer to processes that produce belief ‘in the right way.’ Then ‘I know something iff I believe it and my belief was produced by a process which is a warrant for it.’⁷ This is an account of knowledge with no historical dimension. It also represents belief as something which is caused, for a basic warrant is a causal process which produces a physical state in us as the result of perceptual experience and which can (at least in the case of beliefs with a basic warrant) be engendered by a physical process. In Kitcher’s book The Nature of Mathematical Knowledge, he makes a different kind of claim about mathematical knowledge, characterizing ‘rational change’ in mathematics as that which maximizes the attainment of two goals:
The first is to produce idealized stories with which scientific (and everyday) descriptions of the ordering operation that we bring to the world can be framed. The second is to achieve systematic understanding of the mathematics already introduced, by answering the questions that are generated by prior mathematics.⁸
⁷ Kitcher, The Nature of Mathematical Knowledge, 18.
⁸ Kitcher, The Nature of Mathematical Knowledge, 530–1.
He then goes on to propose a concept of ‘strong progress,’ in which optimal work in mathematics would tend towards an optimal state of mathematics:
We assume that certain fields of mathematics ultimately become stable, even though they may be embedded in ever broader contexts. Now define the limit practice by supposing it to contain all those expressions, statements, reasonings,
and methodological claims that eventually become stably included and to contain an empty set of unanswered questions.
So there are two different kinds of assumption, in his book and the later article, that render Kitcher’s account ahistorical. One is that mathematical knowledge has its origins in physical processes that cause fundamental beliefs in us (and these processes, while temporal, are not historical). The other is that mathematics should optimally end in a unified, universal, axiomatized system where all problems are solved and have their place as theorems. This unified theory has left history behind, like Peirce’s ‘end of science’ or Hegel’s ‘end of history,’ and viewed in light of it, history no longer matters—its intervention between the ahistorical processes and objects of nature, and the ahistorical Ultimate System seems accidental. Indeed, in Kitcher’s account of rational mathematical change or ‘rational interpractice transitions,’ the emphasis is on generalization, rigorization, and systematization, processes that sweep mathematics towards the Ultimate System, with its empty set of unanswered questions.
7.2. Jean Cavaillès on History
At this point, for the sake of contrast I would like again to bring up the philosophy of mathematics of Jean Cavaillès. With Emmy Noether, he edited and translated into French the correspondence between Cantor and Dedekind; he also wrote important works on the axiomatic method, logic, and the history of set theory. The method of Cavaillès, like that of his teacher at the École Normale Supérieure Léon Brunschvicg, is historical. He rejects the logicism of Russell and Couturat, but rejects as well the appeal of Brouwer and Poincaré to a specific mathematical intuition, referring the autonomy of mathematics to its internal history, ‘un devenir historique original’⁹ which can be reduced neither to logic nor to physics.
⁹ ‘Réflexions sur le fondement des mathématiques,’ Travaux du IXe Congrès international de philosophie, t. VI/no. 535 (Paris: Hermann, 1937), 136–9.
In his historical researches (as for example into the genesis of set theory), Cavaillès is struck by the ability of an axiomatic system to integrate and unify, and by the enormous ‘autodevelopment’ of mathematics attained by the increase of abstraction. The nature of mathematics and its progress are one and the
190
geometry and 17 th century mechanics
same thing for him: the movement of mathematical knowledge reveals its essence—its essence is the movement. For Cavaill`es, history is a kind of discipline for the philosopher, preventing him from indulging in a discourse which is too general and finally not rigorous. It allows him to retrieve lost links, and to examine the cross-fertilization of methods and the translation of one theory into another, the transversal nature of mathematics. But Cavaill`es (unlike his Hegelian teacher Brunschvicg) resists the temptation to totalize. History itself, he claims, while it shows us an almost organic unification en acte, also saves us from the illusion that the great tree may be reduced to one of its branches. The irreducible dichotomy between geometry and arithmetic always remains, and the network of tranversal links engenders multiplicity as much as it leads towards unification. Moreover, the study of history reminds us that experience is work—activity—not the passive reception of a given.¹⁰ What interests me most is Cavaill`es’ claim that a mathematical result exists only as linked to both the context from which it issues, and that which it produces, a link which seems to be both a rupture and a continuity.¹¹ The ‘unforeseeability’ is not merely psychological, not merely subjective, not merely human, any more than the creation of the novelty is merely a human construction. The disruption, like the new creation, lies in the mathematical objects as well as in the mind of the mathematician. 
Thus Cavaillès writes, ‘I want to say that each mathematical procedure is defined in relation to an anterior mathematical situation upon which it partially depends, with respect to which it also maintains a certain independence, such that the result of the act [geste] can only be assessed upon its completion.’¹² A significant mathematical act, like Descartes’ solution to Pappus’ problem or Newton’s proof of the inverse square law discussed below, is related both to the situation from which it issues and to the situation it produces, extending and modifying the pre-existing one. To invent a new method, to establish a new correlation, even to extend old methods in novel ways, is to go beyond the boundaries of previous applications; and at the same time in a proof the sufficient conditions for the solution of the problem are revealed. What Cavaillès calls ‘the fundamental dialectic of mathematics’ is an alliance between the necessary and the unforeseeable: the unforeseeability of the mathematical result is not appearance or accident, but essential and originary; and the connections it uncovers are not therefore contingent, but truly necessary. Mathematical necessity is historical, but it is necessity nonetheless.

We need to learn the lessons of Hegel and Peirce without borrowing their tendencies to totalize the processes of history they so brilliantly addressed. As I have argued at length, the pragmatist correction of the semantic approach to philosophy of science and mathematics (itself a correction of the syntactic approach) reinstates history in philosophical reflection. Summarizing his arguments against the primary epistemic role given to the notion of ‘isomorphism’ in the semantical account of truth as a relation among a theory and its models, my fellow pragmatist Robin Hendry writes,

Firstly, representation cannot be identified with isomorphism, because there are just too many relation-instances of isomorphism. Secondly, a particular relation-instance of isomorphism is a case of representation only in the context of a scheme of use that fixes what is to be related to what, and how. Thirdly, in reacting to the received [syntactic] view’s linguistic orientation, the semantic view goes too far in neglecting language, because language is a crucial part of the context that makes it possible to use mathematics to represent. Natural languages afford us abilities to refer, and equations borrow these abilities. We cannot fully understand particular cases of representation in the absence of a ‘natural history’ of the traditions of representation of which they are a part.¹³

¹⁰ Sinaceur, Jean Cavaillès, ch. 1.
¹¹ ‘La pensée mathématique,’ Bulletin de la Société française de philosophie, 40 (1), 1–39.
¹² Cavaillès, ‘La pensée mathématique,’ 9.
Important problems often reorganize and extend mathematics via the processes by which they are solved. All of them work by indirection, and establish new correlations that transcend or traverse the boundaries of domains as they had been settled up to that point. This reorganization and extension is unprecedented, but once established the determinacy of the things correlated renders it determinate as well and far from subjective. Emerging conditions make new things and new alignments possible. And many problem-solving methods depend upon, or depart from, the canonicity of certain mathematical things vis-à-vis others. The unit in one sense and the prime numbers in another sense are canonical vis-à-vis the natural numbers; the triangle is canonical vis-à-vis a large family of surfaces; sine and cosine are canonical vis-à-vis a large family of functions; and so forth. These relations of primacy are also not ‘merely subjective,’ and they, like the importance of important theorems, are absent in the presentation given by formal systems invoked by Kitcher. In fact, a formal system can’t explain anything; it only formalizes explanations that are brought to it. Mathematics is rational because of the way in which it furnishes explanations.

¹³ Hendry, ‘Mathematics, Representation and Molecular Structure,’ 227.
7.3. Book I, Propositions I and VI in Newton’s Principia

In the Principia one task of the symbolic languages of the theory of proportions and (in a submerged way) algebra, and the iconic language of diagrams is to express the likeness of things that are different: geometrical things and the solar system. Moreover, the task of bringing these things into rational relation rebounds upon those languages and alters them. The theory of proportions is reconceived to include ‘ultimate’ ratios between elements which are infinitely or indefinitely small, ‘nascent’ or ‘evanescent.’ Algebra is pushed in the direction of differential equations, a tendency that is realized on the Continent, not in England. And the geometrical diagram is altered in its import: not only are evanescent segments and arcs introduced, but curves and surfaces are thought of as generated by various motions, each of which has its velocity or ‘fluxion.’ These alterations are essential to the correlation of geometrical things with time and force. As François De Gandt observes,

The argumentation of the Principia remains resolutely geometric: the reasoning consists in interpreting the figure, in reading there certain relations of proportionality, which are then transformed according to the usual rules. The grand innovation, in relation to the ancients, consists in studying what these relations become when certain elements of the figure tend toward limiting positions or become infinitely small.
A page later he adds,

The deepest rupture is in the very definition of magnitudes or quantities: the mathematics of the ancients speaks of fixed or determinate quantities, while Newton treats of quantities that can ‘tend ... to,’ ‘approach,’ up to an ultimate situation. Variation and time are here essential to the very meaning and definition of magnitude.¹⁴

A related development stems from the algebraization of geometry. For geometers of the late seventeenth century like Barrow and Newton, or Schooten, Huygens and Leibniz, a curve is understood to embody relations among several variable geometrical quantities, which are defined with respect to a variable point (x, y) on the curve. These quantities include the abscissa, ordinate, arclength, tangent, subtangent, normal, radius and polar arc, as well as the area between the curve and the x-axis, and the area of the rectangle xy.¹⁵ The relations among these quantities are represented, if possible, by equations: Leibniz’s attempts to expand this kind of representation lead to the theory of differential equations and the exploration of transcendental relations. In short, the curve analyzed by algebra induces a family of related quantities constrained by the shape of the curve: it creates and expresses a system of geometrical quantities. Thus when a curve is thought of as a trajectory, it may also be thought to express a system of physical parameters which mutually constrain each other: for Newton a trajectory is the nexus of an interplay of forces.

A trajectory is an odd object of thought. We are used to thinking about the shape of a trajectory, because many objects leave traces behind them as they move: a smoking torch or a jet plane leaves clouds behind it marking its path, a fish leaves a trail of bubbles in water, a boat leaves stream lines, a pen leaves a trail of ink on paper. Yet such a trace is a hypostatization, since a trajectory is the lingering and fixed record of a temporal and evanescent process; it is an intelligible object, like a melody, but a highly abstract one. Since it is so abstract and yet located precisely in time and space, a trajectory is a good locus for bringing geometry and mechanics into rational relation. For Leibniz and Newton it is not only temporal but also the record of a dynamic process that expresses the interplay of forces.

By examining how natural and formal languages link geometry and the solar system, I hope to shed light more generally on how mechanics brings mathematics into rational relation with physical reality (material or perceptible or phenomenal). This is not the same as bringing it into relation with the ‘given,’ pace Descartes, Locke, Kant, and A. J. Ayer. Recall that Descartes treats ‘res extensa’ as the datum in his physics; Locke treats ‘ideas of sense’ as data that ground his philosophical defense of science; A. J. Ayer revives that empiricist notion in the ‘sense data’ that ground his phenomenalism; and Kant treats the manifold of sensible intuition as the given in his account of science in the first Critique. All of them treat the given as a surd that must be worked up by the mind, organized and unified, before it can be thought. By contrast, I side with Leibniz and the principle of reason in holding that anything that might be merely ‘given’ would be by its very definition unintelligible, and so could not be encountered as an existing whole. Otherwise put, there is no merely material or perceptible stuff or a ‘manifold of sense’ that thought must then organize. Moreover, structure itself (the structure expressed by language) is intelligible only in relation to existing things, things that express the structure. (To express structure is not the same as to instantiate it; instantiation is a notion special to logic that deserves closer scrutiny within current philosophy of logic and should only very cautiously be exported.) Structure cannot be thought by itself. Briefly stated, I am trying to avoid the myth of the given as well as the myth of pure syntax.¹⁶

Proposition XI in Book I of the Principia forges a novel relation between the solar system and geometry by explaining why the ellipse qua trajectory is a condition of intelligibility for the stably persistent solar system, and the process of explanation registers important changes in mechanics as well as mathematics.¹⁷ The proposition builds on a process that is already more than a century old and involves the reconstruction of understanding the earth and the sun, and of geometry, in the work of Copernicus, Tycho Brahe, Kepler, Descartes, Galileo, Huygens, and others.

¹⁴ De Gandt, Force and Geometry in Newton’s Principia, 225–6. The influence of this book is apparent throughout my exposition here. I would also like to recommend the philosophical treatment of Newton in Michel Blay’s Les raisons de l’infini: Du monde clos à l’univers mathématique (Paris: Gallimard, 1993), translated by M. B. DeBevoise into English as Reasoning with the Infinite: From the Closed World to the Mathematical Universe (Chicago: University of Chicago Press, 1998); and in Marco Panza’s Isaac Newton (Paris: Les Belles Lettres, 2003).
¹⁵ See the opening pages of H. J. M. Bos, ‘Differentials, higher-order differentials, and the derivative in the Leibnizian calculus,’ Archive for History of Exact Sciences, 14 (1974/75), 1–90.
¹⁶ For a deep and suggestive account of this issue, see Dale Jacquette, ‘Intentionality and the Myth of Pure Syntax’, Protosoziologie, 6 (1994), 76–89, 331–3.
¹⁷ Sir Isaac Newton’s Mathematical Principles of Natural Philosophy and his System of the World, tr. A. Motte, ed. F. Cajori, vol. 1 (Berkeley: University of California Press, 1934) 56–7. Hereafter referred to as Principia.

To understand the genealogy of Proposition XI, we must go back a bit in Book I, to Proposition I (which shows how to represent time by means of geometry) and Proposition VI (which shows how to represent force). Proposition I is Newton’s generalization of Kepler’s law of areas: ‘The areas which revolving bodies describe by radii drawn to an immoveable center of force do lie in the same immoveable planes, and are proportional to the times in which they are described.’¹⁸ (Recall that Kepler’s formulation of this law, like his claim that planetary orbits are elliptical, rests both on his sense of geometrical propriety and on the vastly improved astronomical data that he inherited from Tycho Brahe.)

Newton’s proof of Proposition I is accompanied by a diagram (see Figure 7.1) where S is the center of force. A body proceeds on an inertial path from A to B in an interval of time; if not deflected, it would continue on in a second, equal interval of time along the virtual path Bc. However, Newton continues, ‘when the body is arrived at B, suppose that a centripetal force acts at once with a great impulse’ so that the body arrives not at c, but at C. Then cC (= BV) represents the deflection of the body due to the force; indeed, as will become apparent, cC = BV becomes the geometrical representative of the force. The perimeter ABCDEF ... is the trajectory of the body as it is deflected at the beginning of each equal interval of time by discrete and instantaneous impulsions from S. Newton then uses the Euclidean theorem that triangles with equal bases and equal elevations have equal areas, to show that the area SAB = area SBc = area SBC; this equality extends to triangles SDC, SED, SFE ... by the same reasoning, so that equal areas are described in equal times. We have only, as Newton says, ‘to let the number of those triangles be augmented, and their breadth diminished in infinitum’ for this result to apply to a continuously acting force and a curved trajectory. Notice that this result holds for any kind of central force. The regularity of the sweeping out of areas is linked only to the directionality of deviation; if the deviation Cc is always directed towards a fixed point, then the areas swept out are proportional to the time.
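The polygonal construction of Proposition I lends itself to a quick numerical check. The following Python sketch is a modern illustration and not Newton's own procedure: the coordinates, the function names, and the three sample force laws are all assumptions introduced here. It moves a body inertially over equal intervals of time, applies an instantaneous impulse directed toward S at each vertex, and compares the areas of the successive triangles SAB, SBC, SCD, and so on.

```python
import math

def triangle_area(s, p, q):
    """Area of triangle s-p-q (half the absolute cross product)."""
    return abs((p[0] - s[0]) * (q[1] - s[1]) - (p[1] - s[1]) * (q[0] - s[0])) / 2.0

def impulse_polygon(s, start, velocity, impulse, steps, dt=1.0):
    """Vertices A, B, C, ... of the polygon in Newton's diagram.

    `impulse(r)` is a hypothetical force law: the size of the velocity
    change imparted at distance r from the center s. Its direction is
    always toward s, which is the only feature the proof relies on.
    """
    pos, vel = start, velocity
    vertices = [pos]
    for _ in range(steps):
        # Inertial motion for one equal interval of time (A to B, and so on).
        pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
        vertices.append(pos)
        # Instantaneous impulse toward s: it changes the velocity, not the position.
        dx, dy = s[0] - pos[0], s[1] - pos[1]
        r = math.hypot(dx, dy)
        k = impulse(r)
        vel = (vel[0] + k * dx / r, vel[1] + k * dy / r)
    return vertices

def swept_areas(s, vertices):
    """Areas of the triangles SAB, SBC, SCD, ... in order."""
    return [triangle_area(s, p, q) for p, q in zip(vertices, vertices[1:])]

if __name__ == "__main__":
    S = (0.0, 0.0)
    # Three arbitrary illustrative laws; none is taken from the Principia.
    laws = {
        "inverse square": lambda r: 0.1 / r**2,
        "linear": lambda r: 0.05 * r,
        "constant": lambda r: 0.02,
    }
    for name, law in laws.items():
        verts = impulse_polygon(S, (1.0, 0.0), (0.0, 0.5), law, 12)
        print(name, [round(a, 6) for a in swept_areas(S, verts)])
```

Under each of the sample laws the triangle areas agree to within rounding error, which is the numerical counterpart of the remark that the result depends only on the direction of the deflection and not on the form of the force.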
Force is defined only as the ability of a ‘center of force’ (whatever that is) to cause a body to deviate from inertial motion. Thus it appears that the trajectory of a planet, whatever its mathematical form, in turn has a condition of intelligibility that is defined formally and causally: a center of force that leads to its deflection in a regular way from uniform motion in a straight line. The shape of the trajectory testifies to the center of force and the law it obeys; the lawful center of force explains the shape of the trajectory.
¹⁸ Newton, Principia, 40–2.
The diagram and its accompanying proportions are thoroughly ambiguous, and must be read in two incompatible ways. Read as a collection of finite line segments and areas, where the perimeter is a polygon, they allow for the application of Euclidean theorems to the problem; read as a collection of infinitesimal as well as finite lines and areas, where the perimeter is a curve, they become pertinent to accelerated motion, time, and force. Thus the diagram, whose meaning and intent cannot be understood unless it is read in both ways, could not have arisen in Euclidean geometry; indeed, even if construed as a finite configuration, it could not have arisen there or in the geometry of Archimedes. It does not even resemble a classical problem of quadrature, for if it were, the perimeter would be a known curve, and there would be no reason to single out the point S, or to study the line segments Bc and cC = BV. The diagram makes sense only with reference to a problem about force, time, and motion, in which geometry enters as an auxiliary means to help solve it. What Newton finds in Kepler’s law of areas is a way to express time in terms of non-uniform rather than uniform motion, as the sweeping out of equal areas in equal times, and a way to identify in formal terms a center of force.

Proposition XI is a special case of the general result that Newton works out in Proposition VI, where he shows that for any kind of revolution APQ of a body P around a center of force S, the centripetal force will be inversely proportional to the quantity SP² × QT²/QR (see Figure 7.2).
Figure 7.2. Newton, Principia, Book I, Section II, Proposition VI, Theorem V

Newton writes, ‘In a space void of resistance, if a body revolves in any orbit about an immoveable center, and in the least time describes any arc just then nascent; and the versed sine of that arc is supposed to be drawn bisecting the chord, and produced passing through the center of force: the centripetal force in the middle of the arc will be directly as the versed sine and inversely as the square of the time.’¹⁹ The deviation QR is proportional to the intensity of the force tending to S, and also proportional to the square of the time. The proof runs as follows. PR is the virtual trajectory the body would have followed if it had not been deflected by S, and by the First Law that governs inertial motion it is directly proportional to time t. By Proposition I, the curvilinear area SPQ is also proportional to t; and since in ‘the least time’ it may be considered a triangle, 2 SPQ = SP × QT. Then PR, proportional to t, is also proportional to SP × QT, an evanescent area. The segment QR represents the virtual deviation of the body as a result of an ‘impulse’ of force in ‘the least time,’ and is thus directly proportional to the force. It is also directly proportional to t², by Lemma X: ‘The spaces which a body describes by any finite force urging it ... are in the very beginning of the motion to each other as the squares of the times.’ Here Newton asserts an analogy between any such force, and gravity, generalizing Galileo’s result that in free fall the space traversed is proportional to the square of the time, in the first instant of motion, ‘the very beginning of the motion.’ In sum, F is proportional to QR/(SP² × QT²). In terms of the diagram and Newton’s preference for writing his result so that the ratio has three dimensions, the force is inversely as SP² × QT²/QR: ‘the centripetal force will be inversely as the solid SP² × QT²/QR, if the solid be taken of that magnitude which it ultimately acquires when the points P and Q coincide.’

¹⁹ Newton, Principia, 48–9.

In the sequence of propositions that follow, Newton exploits the peculiar geometrical properties of various possible trajectories APQ to transform this latter expression into another expression containing only constants multiplied by the distance SP raised to a certain power. That is, Newton explores how the geometry of the curve may characterize the force; and this is what leads to Proposition XI.

Note that very little, in either the diagram of Proposition VI or the reasoning about it, is drawn from Euclidean geometry: only that SP is a line segment, PY is the tangent to the curve at P, and the area of a triangle is half the product of its base and altitude. (And indeed the application of the latter theorem is quite un-Euclidean.) The exploitation of the geometry of the trajectory comes later, as Proposition VI is applied to various cases, only one of which is ‘real.’ The meaning of the diagram is determined for the most part by the way it represents a physical situation, since why PR, QR, and SP × QT are chosen and how they are related can only be explained by theorems of mechanics developed by Kepler, Galileo, Descartes, and Newton. Of course, these theorems also geometrize mechanics in the sense of discovering geometrical forms as conditions of intelligibility for physical things in novel ways. And Newton’s way of expressing his result, ‘the centripetal force will be inversely as the solid SP² × QT²/QR’ allows him to exploit the proportion idiom of the Eudoxian tradition to relate and yet discriminate heterogeneous physical magnitudes, lines, and areas, and finite and infinitesimal magnitudes.
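The chain of proportionalities in the proof of Proposition VI can be set out compactly in modern notation. This is only a sketch: the symbol ∝ and the algebraic rearrangement are modern conveniences assumed here, whereas Newton argues throughout in the idiom of ratios between evanescent magnitudes; F and t are the symbols already used in the exposition.

```latex
\begin{align*}
PR &\propto t, \qquad \text{area } SPQ = \tfrac{1}{2}\, SP \times QT \propto t
   && \text{(First Law; Proposition I)} \\
QR &\propto F\, t^{2}
   && \text{(deviation as the force; Lemma X)} \\
F &\propto \frac{QR}{t^{2}} \;\propto\; \frac{QR}{SP^{2} \times QT^{2}}
   && \text{(substituting } t \propto SP \times QT\text{)}
\end{align*}
```

Inverting the last expression gives the three-dimensional ‘solid’ SP² × QT²/QR to which the force is inversely proportional.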
7.4. Book I, Proposition XI in Newton’s Principia

In Proposition XI, Newton applies Proposition VI to the case where the trajectory is an ellipse, the form that Kepler brought to light but could not explain or put together with his law of areas. Clearly, this is a crucial step in the application of Newton’s results to the System of the World in Book III. The diagram of Proposition XI (see Figure 7.3) combines the physical-geometrical schema of the diagram of Proposition VI with the ‘pure geometry’ of the ellipse, but it is instructive to examine the combination in detail. The latus rectum L = 2BC²/AC and the diameters of the ellipse, BC, SA, DK, and PG have no physical import; but the perimeter APBDGK is also the orbit of a revolving body, S also the center of force, SP also the distance of the revolving body to the center of force, and PR, QR, and SPQ = 1/2(SP × QT) retain their physical significance. By contrast, the auxiliary lines PH, IH, and PF, like the ellipse’s diameters, enter the reasoning only insofar as they are geometrical. What the figure depicts is thus a thorough hybrid, a creature of both geometry and mechanics.

Figure 7.3. Newton, Principia, Book I, Section III, Proposition XI, Theorem VI

The proof proceeds by establishing proportions between the segments and products of line segments by means of theorems about similar triangles, isosceles triangles, and ellipses, and culminates in an elaborate, two-sided compounding of these ratios. Newton directs, ‘Let S be the focus of the ellipse. Draw SP cutting the diameter DK of the ellipse in E, and the ordinate Qv in x; and complete the parallelogram QxPR,’ and then begins by proving that EP = AC, using auxiliary lines HI and HP (H is the other focus, besides S). Since IHS is similar to ECS, SE = EI and EP = 1/2(PS + PI), or 1/2(PS + PH), since PIH is isosceles. PS + PH = 2AC by the nature of ellipse construction, so EP = AC.²⁰

Then Newton sets out to establish certain extended proportions: L × QR : L × Pv :: QR : Pv (where the constant latus rectum L = 2BC²/AC). Since this refers to a physical situation that is ‘nascent,’ Newton is profiting from the potential openness of ‘:’ and ‘::’ to relate non-Archimedean magnitudes. QR (= Px) : Pv :: PE : PC because Pxv is similar to PEC, another highly non-Euclidean application of a Euclidean theorem. Finally, PE : PC :: AC : PC because PE = EP = AC, so that L × QR : L × Pv :: AC : PC. Next, Newton asserts that L × Pv : Gv × Pv :: L : Gv and that Gv × Pv : Qv² :: PC² : CD², a fact about ellipses, except that Pv and Qv are infinitesimal magnitudes.
²⁰ For a more detailed account of the proof, see my ‘Some Uses of Proportion in Newton’s Principia, Book I,’ Studies in History and Philosophy of Science, 18 (2) (1987), 209–20.

It is only at this point in the proof that Newton says explicitly, let Q → P: ‘when the points P and Q coincide, Qv² = Qx².’ One might then take the foregoing reasoning to be about very small finite magnitudes, so that the application of Euclidean magnitudes is straightforward. However, as we shall shortly see, the final compounding of ratios even-handedly combines ratios established before and after this step in the proof. Newton uses the ambiguity of the diagram with its accompanying proportions: read as finite, it allows the application of Euclidean results; read as ‘nascent,’ it provides a mathematical schema for force, time, and accelerated motion. When points P and Q coincide, Newton claims that Qv² = Qx² and so Qx² (= Qv²) : QT² :: EP² : PF², since the infinitesimal triangle QxT is similar to PEF. Then EP² : PF² :: CA² : PF² by the first result, and CA² : PF² :: CD² : CB² by a previously established result about ellipses. Thus Qx² (= Qv²) : QT² :: CD² : CB². Newton is now ready to carry out the final compounding, which may be summarized in the following perspicuous array.²¹

L × QR   :  L × Pv    ::  AC   :  PC
L × Pv   :  Gv × Pv   ::  L    :  Gv
Gv × Pv  :  Qv²       ::  PC²  :  CD²
Qv²      :  Qx²       ::  1    :  1
Qx²      :  QT²       ::  CD²  :  CB²
²¹ See E. Sylla, ‘Compounding Ratios,’ 16.

Edith Sylla, in her article just cited, notes that Newton has set up the left-hand ratios as a continuous series, and compounds them according to the Eudoxian tradition, taking the extreme terms and forming the new ratio L × QR : QT². The right-hand ratios he compounds according to the medieval tradition, by multiplication: AC × L × PC² × 1 × CD² : PC × Gv × CD² × 1 × CB² or (substituting 2BC²/AC for L and cancelling) 2PC : Gv.

The foregoing discussion of Newton’s analysis explains why Newton makes this distinction in his compounding. The magnitudes on the left-hand side have physical import and are evanescent, that is, they are just the kind of magnitude which should be treated in a manner that respects the heterogeneity of terms and (given the indeterminacy of the sign ‘:’) allows for the manipulation of infinitesimal magnitudes. Those on the right-hand side, if we recall that Gv = GP as Q → P, are all constant finite geometrical line lengths with no physical import. Having no reason to think of them other than as numbers, even when they are squared, Newton handles them according to the second tradition, multiplying them as rational numbers. The compounding thus yields the proportion: L × QR : QT² :: 2PC : Gv. As Q → P, 2PC = Gv, so L × QR : QT² :: 1 : 1, so in turn L × QR is proportional to QT². Multiplying both these terms by SP²/QR, we find that SP² × QT²/QR is proportional to L × SP², or, since L is a constant, to SP². Thus, the central force in this problem is inversely proportional to the square of the distance SP.

Nature presents us with shapes that command our attention and invite explanation. The shape of a hanging chain is immediately visible; the path of the sun through the daytime sky is (rather slowly) visible as the ecliptic and the path of the planets through the night sky is visible too, in a more constructed and indirect sense, given the way they retrogress. Leibniz’s analysis discovers the catenary (the function cosh x) as a condition of intelligibility of the hanging chain, when it is understood as an equilibrium of forces and expressed in terms of his novel, revised algebra, both as a differential equation and as the family of solutions to that equation. Newton’s analysis discovers the ellipse as a condition of intelligibility for the solar system; but the ellipse serves as such a condition only in tandem with another condition of intelligibility, the formal cause of a center of force. This yoking means that the ellipse must be thought not only as a geometrical unity (which it is, and does not cease to be) but also as the trajectory of a moving body constrained by a central force. Proposition XI on the one hand and Propositions XXXIX–XLI on the other exhibit how these two conditions, one stemming from classical geometry and the other from a radically novel conception of mechanics, determine each other. The diagrams in the latter set of propositions, significantly, show how geometrical diagrams may be used in a way that is mostly symbolic and only barely iconic.
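The limiting relation reached in Proposition XI, that L × QR and QT² come into the ratio 1 : 1 as Q approaches P, can be checked numerically. The Python sketch below is a modern illustration under stated assumptions and is not Newton's method: it places the center C at the origin, puts the focus S on the major axis, and parametrizes P and Q by eccentric angle; the function and variable names are hypothetical. QR is drawn from Q parallel to SP to meet the tangent at P, and QT is the perpendicular from Q onto SP, as in the diagram.

```python
import math

def cross(u, v):
    """Two-dimensional cross product (signed parallelogram area)."""
    return u[0] * v[1] - u[1] * v[0]

def newton_segments(a, b, theta, h):
    """Return (SP, QR, QT) for an ellipse with semi-axes a > b.

    Assumed modern setup: center at the origin, focus S at
    (sqrt(a^2 - b^2), 0), P at eccentric angle theta, Q at theta + h.
    """
    c = math.sqrt(a * a - b * b)
    S = (c, 0.0)
    P = (a * math.cos(theta), b * math.sin(theta))
    Q = (a * math.cos(theta + h), b * math.sin(theta + h))
    SP = math.hypot(S[0] - P[0], S[1] - P[1])
    u = ((S[0] - P[0]) / SP, (S[1] - P[1]) / SP)    # unit vector from P toward S
    t = (-a * math.sin(theta), b * math.cos(theta))  # tangent direction at P
    PQ = (Q[0] - P[0], Q[1] - P[1])
    # Decompose PQ = mu * t + lam * u; |lam| is then the length of QR.
    QR = abs(cross(PQ, t) / cross(u, t))
    # Distance from Q to the line SP.
    QT = abs(cross(PQ, u))
    return SP, QR, QT

if __name__ == "__main__":
    a, b = 2.0, 1.0                 # arbitrary sample ellipse
    L = 2.0 * b * b / a             # latus rectum, L = 2BC^2/AC in the text
    for theta in (0.3, 1.0, 2.0, 4.0):
        for h in (1e-2, 1e-3, 1e-4):
            SP, QR, QT = newton_segments(a, b, theta, h)
            print(f"theta={theta} h={h}: QT^2/QR={QT * QT / QR:.6f}  L={L:.6f}")
```

As h shrinks, the printed ratio QT²/QR settles toward L at every sampled position on the ellipse, so that SP² × QT²/QR tends to L × SP², and the force, being inversely as that solid, varies inversely as SP².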
Icons, especially icons that must be read in two or three different ways, do not wear their meaning on their faces; the interpretation of icons is not direct or ‘intuitive.’ And a geometrical figure can be used not as an icon but as a symbol. In one sense analysis tries to find simples vis-à-vis a complex thing; sometimes this means finding the parts of a whole, but of course mereology takes many forms and not all analysis is mereology. Newton doesn’t identify individual heavenly bodies as the simples of the solar system, but rather has the subsystem of a planet and the sun and a body in inertial motion deflected in an orderly way by a center of force. The relation of this subsystem-nomological-machine, and model, to the System of the World is far from additive, unlike the relation of bricks to a wall, and the reconstitution of the whole from the simples is both difficult and in principle incomplete. The simples of the model, exhibited by the diagram-proof and the ‘perspicuous array’ given above, are the ratios and proportions associated with the elements of the ellipse. Here the simples are not parts in any sense. In particular, Newton explores what becomes of the important ratio SP² × QT²/QR in the case where the trajectory is an ellipse, and discovers that it is the constant L × SP², which is the key to the whole problem. In brief, Newton reduces the complexity of the situation by treating time, force, accelerated motion, uniform motion, physical distance and merely geometrical distance evenhandedly as magnitudes and then considering the proportions in which they must stand to each other. Some of the heterogeneity amongst these terms is reinstated, as we have seen, when the ratios are compounded. Another nice example of Newtonian analysis is the result from Proposition I used in Proposition XI where two virtual (and indeed evanescent) trajectories, compounded in accordance with Corollary I of the Axioms of Motion and understood as inertial motion and as the first moment of free fall, yield the actual motion of the revolving body. So too is the way the mathematical form of the ellipse and the mechanical form of a center of force become, as the means by which geometry is brought into alignment with the solar system, conditions for the intelligibility of the world. Analysis searches for requisites, that without which the thing could not be what it is, or in mathematics, that without which a problem could not be solved.
The wholes that present themselves as intelligible existents always require analysis, though they can never be completely reconstituted or retrieved from the elements that analysis discovers. This is true of the Euclidean triangle, and Newton’s System of the World, and the solar system in which we find ourselves.
8 Leibniz on Transcendental Curves

When Louis Couturat and Bertrand Russell enlisted Leibniz as the champion of logicism, in the search for a single perfect idiom for the truths of mathematics and science, they dismissed at the same time much of his writing on theology and metaphysics. This was of course a brutal triage that generations of scholars throughout the twentieth century have critically examined and rejected, but in the Anglophone world we have only recently begun to assess properly how it distorts our understanding of Leibniz’s account of mathematics and science.¹ In this chapter, I argue that Leibniz believed that mathematics is best investigated by means of a variety of modes of representation, often stemming from a variety of traditions of research, like our investigations of the natural world and of the moral law. I expound this belief with respect to two of his great metaphysical principles, the Principle of Perfection and the Principle of Continuity, both versions of the Principle of Sufficient Reason; the tension between the latter and the Principle of Contradiction is what keeps Leibniz’s metaphysics from triviality. I’ll then illustrate my exposition with two case studies from Leibniz’s mathematical research, his development of the infinitesimal calculus, and his investigations of transcendental curves.

¹ Good examples of this assessment include C. Wilson, Leibniz’s Metaphysics: A Historical and Comparative Study (Princeton: Princeton University Press, 1989); E. Yakira, Contrainte, nécessité, choix (Zurich: Editions du Grand Midi, 1989), esp. ch. VI and VII; F. Duchesneau, Leibniz et la méthode de la science (Paris: Presses universitaires de France, 1993) and La dynamique de Leibniz (Paris: Vrin, 1994); M. Fichant, La réforme de la dynamique (Paris: Vrin, 1994) and Science et métaphysique dans Descartes et Leibniz (Paris: Presses Universitaires de France, 1998); D. Rutherford, Leibniz and the Rational Order of Nature (Cambridge: Cambridge University Press, 1998); C. Mercer, Leibniz’s Metaphysics, Its Origins and Development (Cambridge: Cambridge University Press, 1998), part IV. Herbert Breger and Eberhard Knobloch have written various important essays on this topic in the last couple of decades; Daniel Garber is currently working on a study of Leibniz’s physics and philosophy; and Ursula Goldenbaum is completing a book on the influence of Hobbes on Leibniz’s metaphysics and science. All of these scholars are indebted to the mid-century work of Pierre Costabel and Martial Gueroult.
8.1. The Principle of Continuity
Leibniz wrote a public letter to Christian Wolff, in response to a controversy over the reality of certain mathematical items sparked by Guido Grandi; it was published in the Supplementa to the Acta Eruditorum in 1713 under the title ‘Epistola ad V. Cl. Christianum Wolfium, Professorem Matheseos Halensem, circa Scientiam Infiniti.’² Towards the end, he presents a diagram (discussed below in Section 8.2) and concludes,
All this accords with the Law of Continuity, which I first proposed in the Nouvelles de la République des Lettres of Bayle, applied to the Laws of Movement.³ It entails that with respect to continuous things, one can treat an external extremum as if it were internal [ut in continuis extremum exclusivum tractari possit ut inclusivum], so that the last case or instance, even if it is of a nature completely different, is subsumed under the general law governing the others.
He cites as illustration the relation of rest to motion and of the point to the line: rest can be treated as if it were evanescent motion and the point as if it were an evanescent line, an infinitely small line. Indeed, Leibniz gives as another formulation of the Principle of Continuity the claim that ‘the equation is an infinitesimally small inequality.’⁴ The Principle of Continuity, he notes, is very useful for the art of invention: it brings the fictive and imaginary (in particular, the infinitely small) into rational relation with the real, and allows us to treat them with a kind of rationally motivated tolerance. For Leibniz, the infinitely small cannot be accorded the intelligible reality we attribute to finite mathematical entities because of its indeterminacy; yet it is undeniably a useful tool for engaging the continuum, and continuous items and procedures, mathematically. The Principle of Continuity gives us a way to shepherd the infinitely small, despite its indeterminacy, into the fold of the rational. It is useful in another sense as well: not only geometry but also
² A. E. Supplementa 1713, t. V, section 6; M.S. V, 382–7. The latter abbreviation stands for Mathematische Schriften, ed. C. I. Gerhardt, 7 vols. (Berlin: A. Asher/Halle: H. W. Schmidt, 1848–63; repr. Hildesheim: Georg Olms, 1962).
³ ‘Réplique à l’abbé D.C. sous forme de lettre à Bayle,’ Feb. 1687; P.S. III, 45. The latter abbreviation stands for Die Philosophischen Schriften, ed. C. I. Gerhardt, 7 vols. (Berlin: Weidmann, 1875–90; repr. Hildesheim: Georg Olms, 1978).
⁴ M.S. VII, 25, for example.
geometry and 17 th century mechanics
nature proceeds in a continuous fashion, so the Principle of Continuity guides the development of mathematical mechanics. But how can we make sense of a rule that holds radically unlike (or, to use Leibniz’s word, heterogeneous) terms together in intelligible relation? I want to argue that two conditions are needed. First, Leibniz must preserve and exploit the distinction between ratios and fractions, because the classical notion of ratio presupposes that while ratios link homogeneous things, proportions may hold together inhomogeneous ratios in a relation of analogy that is not an equation. This allowance for heterogeneity disappears with the replacement of ratios by fractions: numerator, denominator, and fraction all become numbers, and the analogy of the proportion collapses into an equation between numbers.⁵ However, Leibniz’s application of the Principle of Continuity is more strenuous than the mere discernment of analogy: the relation between 3 and 4 is analogous to the relation between the legs of a certain finite right triangle. But the relation between the legs of a finite and those of an infinitesimal 3-4-5 right triangle is not mere analogy; the analogy holds not only because the triangles are similar but also because of the additional assumption that as we allow the 3-4-5 right triangle to become smaller and finally evanescent, ‘the last case or instance, even if it is of a nature completely different, is subsumed under the general law governing the others.’ Thus, the notation of proportions must co-exist beside the notation of equations; but even that combination will not be sufficient to express the force of the Principle of Continuity. The expression and application of the principle requires as a second condition the adjunction of geometrical diagrams. 
They are not, however, Euclidean diagrams, but have been transformed by the Principle of Continuity into productively ambiguous diagrams whose significance is then explicated by algebraic equations, differential equations, proportions, and infinite series, and the links among them in turn explicated by natural language. In these diagrams, the configuration can be read as finite or as infinitesimal (and sometimes infinitary), depending on the demands of the argument; and their productive ambiguity, which is not eliminated but made meaningful by its employment in problem-solving, exhibits what it means for a rule to hold radically unlike things together. This is a pattern of reasoning, constant throughout Leibniz’s career as a mathematician, which the Logicists who appropriated Leibniz following Louis Couturat and Bertrand Russell could not discern, much less appreciate. As Herbert Breger argues in his essay ‘Weyl, Leibniz und das Kontinuum,’ the Principle of Continuity and indeed Leibniz’s conception of the continuum (indebted to Aristotle on the one hand, and seminal for Hermann Weyl, Friedrich Kaulbach and G.-G. Granger on the other) is inconsistent with the Logicist program, even the moderate logicism espoused by Leibniz himself, not to mention the more radical versions popular in the twentieth century. The intuition (Anschauung) of the continuous, as Leibniz understood it, and the methods of his mathematical problem-solving, cannot be subsumed under the aegis of logical identity. Breger adds, ‘I can’t go into this conjecture here, and would like simply to assert that although Leibniz did advocate a philosophical program corresponding to Logicism, he also distanced himself a great deal from it in his mathematical practice.’⁶ In the three sections that complete this chapter, I will show that this pattern of reasoning characterizes Leibniz’s thinking about, and way of handling, non-finite magnitudes throughout his active life as a mathematician.
⁵ Some commentators have been puzzled by Leibniz’s allegiance to the notion of ratio and proportion. Marc Parmentier, for example, writes, nous devons nous rappeler que les mathématiques de l’époque n’ont pas encore laïcisé les antiques connotations que recouvre le mot ratio. A cette notion s’attache un archaïsme, auquel l’esprit de Leibniz par ailleurs si novateur, acquitte ici une sorte de tribut, en s’obstinant dans une position indéfendable. La ratio constitue à ses yeux une entité séparée, indépendante de la fraction qui l’exprime ou plus exactement, la mesure. En ce domaine l’algèbre n’a pas encore appliqué le rasoir d’Occam. La preuve en est que la ratio était encore le support de la relation d’analogie, équivalence de deux rapports, toute différente de la simple égalité des produits des extrêmes et des moyens dans les fractions. (G. W. Leibniz, La naissance du calcul différentiel, ed. M. Parmentier (Paris: Vrin, 1989), 42)
8.2. Studies for the Infinitesimal Calculus
In 1674, Leibniz wrote a draft entitled ‘De la Methode de l’Universalité,’⁷ in which he examines the use of a combination of algebraic, geometric
⁶ ‘Ich kann dieser Vermutung hier nicht nachgehen und möchte mich mit der Feststellung begnügen, dass Leibniz zwar ein dem Logizismus entsprechendes philosophisches Programm vertreten hat, dass er aber durch seine Mathematik selbst sich weit von diesem Programm entfernt hat ... ’ H. Breger, ‘Weyl, Leibniz, und das Kontinuum,’ Studia Leibnitiana, Supplementa 26 (1986), 316–30.
⁷ C, 97–122; Bodemann V, 10, f, 11–24. The former abbreviation stands for Opuscules et fragments inédits de Leibniz, ed. L. Couturat (Paris, 1903; repr. Hildesheim: Georg Olms, 1961). Bodemann is the catalogue published in 1895 for the collection of manuscript papers of Leibniz at the G. W. Leibniz Bibliothek/Niedersächsische Landesbibliothek in Hannover, Germany.
and arithmetic notations, and defends a striking form of ambiguity in the notations as necessary for the ‘harmonization’ of various mathematical results, once treated separately but now unified by his new method. He discusses two different kinds of ambiguity, the first dealing with signs and the second with letters. The simplest case he treats is represented this way:
[Diagram: the points A, C, B, C marked in that order along a line, with C shown on either side of B.]
The point of the array is to represent a situation where A and B are fixed points on a line; this means that if the line segment AC may be determined by means of the line segment AB and a fixed line segment BC = CB, there is an ambiguity: the point C may logically have two possible locations, one on each side of B. Leibniz proposes to represent this situation by a sole equation, which however involves a new kind of notation. He writes it this way: AC = AB = | BC, and goes on to suggest a series of new signs for operations, corresponding to cases where there are three, four, or more fixed points to begin with. He generates the new symbols by a line underneath (which negates the operation) or by juxtaposing symbols. (One sees some nascent group theory here.)⁸ Re-expressing the same point in algebraic notation, he writes that = | a + b, or +a = | b = c means that +a + b, or −a + b, or +a + b, or +a − b, and goes on to give a more complex classification for ambiguous signs. The important point is that the ambiguous signs can be written as a finite number of cases involving unambiguous signs.⁹ The treatment of ambiguous letters, however, is more complex, truly ambiguous, and fruitful. He illustrates his point with a bit of smoothly curved line AB(B)C intersected at the two points B and (B) by a bit of straight line DB(B)E. The notation AB(B)C and DB(B)E is ambiguous in two different senses, he observes. On the one hand, the concatenated letters may stand for a line, or they may stand for a number, ‘since the numbers are represented by divisions of the continuum into equal parts,’¹⁰ and because, by implication, Descartes has shown us how to understand products, quotients, and nth roots of line segments as line segments. On the ⁸ C, 100.
⁹ C, 102.
¹⁰ C, 105
other hand (and this is a second kind of ambiguity), lines may be read as finite, as infinitely large, or as infinitely small. The mathematical context will tell us how to read the diagram, and he offers the diagram just described as an example: ‘ ... Thus in order to understand that the line DE is the tangent, one has only to imagine that the line B(B) or the distance between the two points where it intersects the curve is infinitely small: and this is sufficient for finding the tangents.’¹¹ In this configuration, reading B(B) as finite so that the straight line is a secant, and as infinitesimal so that the straight line is a tangent, is essential to viewing the ‘harmony’ among the cases, or, to put it another way, to viewing the situation as an application of the Principle of Continuity. The fact that they are all represented by the same configuration, supposing that B(B) may be read as ambiguous, exhibits the important fact that the tangent is a limit case subject to the same structural constraints as the series of secants that approach it. And this is the key to the method of determining tangents. A good characteristic allows us to discern the harmony of cases, which is the key to the discovery of general methods; but such a characteristic must then be ambiguous. To further develop the point, Leibniz returns to his original example, adumbrated.
[Diagram: a line with fixed points A and B and the movable point C marked at the positions 1C, (3C), 3C, ((3C)), and 2C.]
Once again, A and B are fixed points on a line. When we set out the conditions of the problem where a line segment AC is determined by two others, AB and BC, the point C may fall not only to the left or right of B, but directly on B: ‘the point C which is moveable may fall on the point B.’¹² Since we want the equation AC = +AB = | BC to remain always true, we must be sure to include the case where B and C coincide, that is, where BC is infinitely small, ‘so that the equation may not contradict the equality between AC and AB.’¹³ In other words, the equality AC = AB is a limit case of the equation just given. In order to exhibit its status as a limit case, or (to use Leibniz’s vocabulary) to exhibit the harmony among these ¹¹ C, 105.
¹² C, 106.
¹³ C, 106.
arithmetic facts and thus the full scope of the equation, we must allow that BC may be infinitely small. Here, Leibniz observes, the ambiguity of the sign = | is beside the point and doesn’t matter; but the ambiguity of the letters is essential for the application of the Principle of Continuity, and so cannot be resolved but must be preserved.
Since one may place 3C, not only directly under B, in order to make AC = AB and BC equal to zero, but over towards A at (3C), or over on the other side of B at ( (3C) ) in order to make the equation AC = +AB − BC true on the one hand or on the other to make the equation AC = +AB + BC true, provided that the line (3C)B or ( (3C) )B be conceived as infinitely small. You see how this observation can serve the method of universality in order to apply a general formula to a particular case.¹⁴
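The contrast between the two kinds of ambiguity in this example can be made concrete with a small numerical sketch. The function name `positions_of_c` and the coordinate convention (A at the origin, lengths as floating-point numbers) are assumptions introduced here for illustration; they are not Leibniz's notation. The ambiguous sign resolves into two unambiguous cases, while letting BC shrink shows both cases converging on the limit case AC = AB.

```python
def positions_of_c(ab: float, bc: float) -> tuple[float, float]:
    """Return the two candidate values of AC for AC = +AB (-/+) BC.

    A is placed at 0 and B at `ab` on a number line (an illustrative
    convention); C lies at distance `bc` from B, on either side of it.
    """
    return (ab - bc, ab + bc)


if __name__ == "__main__":
    ab = 5.0
    # As BC shrinks, the two readings of the ambiguous sign draw together,
    # and at BC = 0 both give the limit case AC = AB.
    for bc in (2.0, 0.5, 0.01, 0.0):
        left, right = positions_of_c(ab, bc)
        print(f"BC = {bc}: AC = {left} or AC = {right}")
```

On this reading the sign is an equivocation that a case split removes, whereas the magnitude BC has to remain free to be finite or vanishing for the single formula to cover the boundary case.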
In the diagram, (3C) and 3C, or 3C and ((3C)) may be identified when AC = AB, as B and (B) are in the preceding diagram when the secant becomes the tangent. Leibniz’s intention to represent series or ranges of cases so as to include boundary cases and maximally exhibit the rational interconnections among them all depends on the tolerance of an ineluctable ambiguity in the characteristic. Some of the boundary cases involve the infinitesimal, but some involve the infinitary. Scholars often say that while Aristotle abhorred the infinite and set up his conceptual schemata so as to exclude and circumvent it, Leibniz embraced it and chose conceptual schemata that could give it rational expression. This is true, and accounts for the way in which Leibniz devises and elaborates his characteristics in order to include infinitary as well as infinitesimalistic cases; but it has not been noticed that this use renders his characteristic essentially ambiguous. And he says as much. He notes that the use of ambiguously finite/infinitesimal lines had been invoked by Guldin, Gregory of St. Vincent and Cavalieri, while the use of ambiguously finite/infinite lines was much less frequent, though not unknown: For long ago people noticed the admirable properties of the asymptotes of the hyperbola, the conchoid, the cissoid, and many others, and the geometers knew ¹⁴ C, 106.
that one could say in a certain manner that the asymptote of the hyperbola, or the tangent drawn from the center to that curve, is an infinite line equal to a finite rectangle ... and in order to avoid trouble apropos the example we are using in order to try out this method, we will find in what follows that the latus transversum of the parabola must be conceived as an infinite length.¹⁵
Leibniz alludes to the fact that if we examine a hyperbola (or rather, one side of one of its branches) and the corresponding asymptote, the drawing must indicate both that the hyperbola continues ad infinitum, as does the asymptote, and that they will meet at the ideal point of infinity; moreover, a rule for calculating the area between the hyperbola and the asymptote (identified with the x-axis) can be given. The two lines may both be infinite, but their relation can be represented in terms of a finite (though ambiguous) notation, involving both letters and curves, and can play a determinate role in problems of quadrature. In the spring of 1673, Leibniz had traveled to London, where John Pell referred him to Nicolaus Mercator’s Logarithmotechnia, in which Leibniz discovered Mercator’s series. Taking his lead from the result of Gregory of St. Vincent, that the area under the hyperbola y = 1/(1 + t) from t = 0 to t = x is what we now call ln(1 + x), Mercator devised the series that bears his name, ln(1 + x) = x/1 − x²/2 + x³/3 − x⁴/4 + ...
The more important example is that of the parabola; at stake are its relations to the other conic sections. Leibniz gives the following account of how to find a ‘universal equation’ that will unify and exhibit the relations among a series of cases. He offers as an illustration the conic sections, and what he writes is an implied criticism of Descartes’ presentation of them in the Geometry, which does not sufficiently exhibit their harmony:
The formation of a universal equation which must comprehend a number of particular cases will be found by setting up a list of all the particular cases.
Now in order to make this list we must reduce everything to a line segment or magnitude, whose value is sought, and which must be determined by means of certain other line segments or magnitudes, added or subtracted; consequently there must be certain fixed points, or points taken as fixed, and others which move, whose possible different locations give us the catalogue of all the possible cases ... having found this list, we must try to reduce all the possible cases to a general formula, by means of ambiguous signs, and of letters whose values are sometimes finite, ¹⁵ C, 106–7.
sometimes infinitely large or small. I dare to claim that there is nothing so mixed up or ill-assorted that can’t be reduced to harmony by this means.¹⁶
He gives a diagram, with a bit of curved line representing an arbitrary conic section descending to the right from the point A, ABYE; a vertical axis AXDC descending straight down from A; and perpendicular to that axis at X another axis XY which meets the curve in Y; the line DE is drawn parallel to XY. Two given line segments a and q represent the parameters of the conic section. Leibniz asserts that the general equation for all the cases, where AX = x and XY = y, must then be, 2ax = | (a/q)x² − y² = 0. When a and q are equal and = | is explicated as −, we have the circle of radius a = q; when a and q may be equal or unequal and = | is explicated as −, we have an ellipse where a is the latus rectum and q is the latus transversum; when = | is explicated as +, the conic section is the hyperbola. However, in order to include both the parabola and the straight line as cases of the conic section, Leibniz asserts, one must make use of infinite or infinitely small lines. ‘Now supposing that the line q, or the latus transversum of the parabola be of infinite length, it is clear that the equation 2axq = | ax² = qy², will be equivalent to this one: 2axq = qy² (which is that of the parabola) because the term ax² of the equation is infinitely small compared to the others 2axq, and qy² ... ’¹⁷ And with respect to the straight line, he asserts, we must take both a and q as being infinitely small, that is, infinitesimal. ‘Consequently, in the equation: 2ax = | (a/q)x² = y², the term 2ax will vanish as it is infinitely small compared to (a/q)x² and y², and that which remains will be +(a/q)x² = y² with the sign = | changed into +. Now the ratio of two infinitely small lines may be the same as that of two finite lines and even of two squares or of two rectangles; thus let the ratio a/q be equal to the ratio e²/d² and we will have (e²/d²)x² = y² or (e/d)x = y whose locus is the straight line.’¹⁸ Leibniz concludes that this equation, by exhibiting the conic sections as limit cases of one general equation, not only displays their mutual relations as a coherent system, but also explains many peculiar features of the special cases: why only the hyperbola has asymptotes, why the parabola and the straight line do not have a center while the others do, and so forth. At the end of the essay, Leibniz notes that we must distinguish between ambiguity which is an equivocation, and ambiguity which is a ‘univocation.’
¹⁶ C, 114–15.
¹⁷ C, 116.
¹⁸ C, 116.
The ambiguity of the sign = | is an example of equivocation which must be eliminated each time we determine the general equation with respect to the special cases. But the ambiguity of the letters must be retained; it is the way the characteristic expresses the Principle of Continuity, for Leibniz believed that the infinitesimal, the finite, and the infinite are all subject to the same rational constraints. One rule will embrace them, but it must be written in an irreducibly ambiguous idiom.
With respect to signs [for operations], the interpretation must free the formula from all equivocation. For we must consider the ambiguity that comes from letters as giving a ‘univocation’ or universality but that which comes from signs as producing a true equivocation, so that a formula that only contains ambiguous letters gives a truly general theorem ... The first kind of interpretation is without difficulty, but the other is as subtle as it is important, for it gives us the means to create theorems and absolutely universal constructions, and to find general properties, and even definitions and subaltern kinds common to all sorts of things which seem at first to be very distant from each other ... it throws considerable illumination on the harmony of things.¹⁹
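The universal equation for the conic sections discussed above lends itself to a numerical sketch of how the 'univocal' letters behave. The following is a modern illustration, not Leibniz's procedure: the function names, the encoding of the ambiguous sign as +1 or −1, and the use of very large or very small floating-point numbers as stand-ins for infinite and infinitesimal lines are all assumptions made here.

```python
import math


def universal_conic_y(x: float, a: float, q: float, sign: int) -> float:
    """Positive ordinate y satisfying 2ax + sign*(a/q)x^2 - y^2 = 0.

    `sign` resolves the ambiguous sign: -1 for circle and ellipse,
    +1 for the hyperbola (and for the straight-line limit).
    """
    return math.sqrt(2 * a * x + sign * (a / q) * x * x)


if __name__ == "__main__":
    x = 1.0

    # Circle: a = q = r with the minus sign gives (x - r)^2 + y^2 = r^2.
    r = 3.0
    y = universal_conic_y(x, r, r, -1)
    print("circle residual:", (x - r) ** 2 + y ** 2 - r ** 2)

    # Parabola: as q grows, the term (a/q)x^2 becomes negligible and
    # the ordinate approaches that of y^2 = 2ax.
    a = 2.0
    for q in (1e2, 1e6, 1e12):
        gap = universal_conic_y(x, a, q, -1) - math.sqrt(2 * a * x)
        print(f"q = {q:g}: difference from parabola = {gap:.3e}")

    # Straight line: a and q both tiny with a/q = e^2/d^2 = 1/4, so the
    # term 2ax is negligible and y approaches (e/d)x = x/2.
    for scale in (1e-3, 1e-6, 1e-9):
        y_line = universal_conic_y(3.0, scale, 4 * scale, +1)
        print(f"a = {scale:g}: y = {y_line:.9f} (limit 1.5)")
```

The sign has to be fixed before anything can be computed, which matches its status as an equivocation, while a and q range over finite, very large, and very small values within one and the same formula.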
We should not think that Leibniz wrote this only in the first flush of his mathematical discoveries, and that the more sophisticated notations and more accurate problem-solving methods which he was on the threshold of discovering would dispel this enthusiasm for productive ambiguity. A look at his celebrated investigations of transcendental curves by means of his new notation will prove my point.
8.3. The Principle of Perfection
Leibniz’s definition of perfection is the greatest variety with the greatest order, a marriage of diversity and unity. He compares the harmonious diversity and unity among monads as knowers to different representations or drawings of a city from a multiplicity of different perspectives, and it is often acknowledged that this metaphor supports an extension to geographically distinct cultural groups of people who generate diverse accounts of the natural world, which might then profitably be shared. However, it is less widely recognized that this metaphor concerns not
¹⁹ C, 119.
only knowledge of the contingent truths of nature but also moral and mathematical truths, necessary truths. As Frank Perkins argues at length in chapter 2 of his Leibniz and China: A Commerce of Light, the human expression of necessary ideas is conditioned (both enhanced and limited) by cultural experience and embodiment, and in particular by the fact that we reason with other people with whom we share systems of signs, since for Leibniz all human thought requires signs. Mathematics, for example, is carried out within traditions that are defined by various modes of representation, in terms of which problems and methods are articulated.²⁰ After having set out his textual support for the claim that on Leibniz’s account our monadic expressions of God’s ideas and of the created world must mutually condition each other, Perkins sums up his conclusions thus: We have seen ... that in its dependence on signs, its dependence on an order of discovery, and its competition with the demands of embodied experience, our expression of [necessary] ideas is conditioned by our culturally limited expression of the universe. We can see now the complicated relationship between the human mind and God. The human mind is an image of God in that both hold ideas of possibles and that these ideas maintain set relationships among themselves in both. Nonetheless, the experience of reasoning is distinctively human, because we always express God’s mind in a particular embodied experience of the universe. The human experience of reason is embodied, temporal, and cultural, unlike reason in the mind of God.²¹
Innate ideas come into our apperception through conscious experience, and must be shaped by it. With this view of human knowledge, marked by a sense of both the infinitude of what we try to know and the finitude of our resources for knowing, Leibniz could not have held that there is one correct ideal language. And Leibniz’s practice as a mathematician confirms this: his mathematical Nachlass is a composite of geometrical diagrams, algebraic equations taken singly or in two-dimensional arrays, tables, differential equations, mechanical schemata, and a plethora of experimental notations. Indeed, it was in virtue of his composite representation of problems of quadrature in number theoretic, algebraic and geometrical terms that Leibniz was able to formulate the infinitesimal calculus and the differential ²⁰ Leibniz and China, A Commerce of Light (Cambridge: Cambridge University Press, 2004). ²¹ Perkins, Leibniz and China, A Commerce of Light, 96–7.
equations associated with it, as well as to initiate the systematic investigation of transcendental curves.²² Leibniz was certainly fascinated by logic, and sought to improve and algebraize logical notation, but he regarded it as one formal language among many others, irreducibly many. Once we admit, with Leibniz, that expressive means that are adequate to the task of advancing and consolidating mathematical knowledge must include a variety of modes of representation, we can better appreciate his investigation of transcendental curves, and see why and how he went beyond Descartes.
8.4. The Isochrone and the Tractrix
Leibniz’s study of curves begins in the early 1670s when he is a Parisian for four short years. He takes up Cartesian analytic geometry (modified and extended by two generations of Dutch geometers including Schooten, Sluse, Hudde, and Huygens) and develops it into something much more comprehensive, analysis in the broad eighteenth-century sense of that term. Launched by Leibniz, the Bernoullis, L’Hôpital and Euler, analysis becomes the study of algebraic and transcendental functions and the operations of differentiation and integration upon them, the solution of differential equations, and the investigation of infinite sequences and series. It also plays a major role in the development of post-Newtonian mechanics. The intelligibility of geometrical objects is thrown into question for Leibniz in the particular form of (plane) transcendental curves: the term is in fact coined by Leibniz. These are curves that, unlike those studied by Descartes, are not algebraic, that is, they are not the solution to a polynomial equation of finite degree. They arise as isolated curiosities in antiquity (for example, the spiral and the cycloid), but only during the seventeenth century do they move into the center of a research program that can promise important results. Descartes wants to exclude them from geometry precisely because they are not tractable to his method, but Leibniz argues for their admission to mathematics on a variety of grounds, and over
²² Emily Grosholz, ‘Was Leibniz a Mathematical Revolutionary?’ in Revolutions in Mathematics, ed. D. Gillies, (Oxford: Clarendon Press, 1992), 117–33. One of the best examples of my claim here is the manuscript ‘De quadratura arithmetica circuli ellipseos et hyperbolae cujus corollarium est trigonometria sine tabulis,’ edited by E. Knobloch (Göttingen: Vandenhoeck & Ruprecht, 1993), and recently translated into French, with commentary, by M. Parmentier (Paris: Vrin, 2004).
a long period of time. This claim, of course, requires some accompanying reflection on their conditions of intelligibility. For Leibniz, the key to a curve’s intelligibility is its hybrid nature, the way it allows us to explore numerical patterns and natural forms as well as geometrical patterns; he was as keen a student of Wallis and Huygens as he was of Descartes. These patterns are variously explored by counting and by calculation, by observation and tracing, and by construction using the language of ratios and proportions. To think them all together in the way that interests Leibniz requires the new algebra as an ars inveniendi. The excellence of a characteristic for Leibniz consists in its ability to reveal structural similarities. What Leibniz discovers is that this ‘thinking-together’ of number patterns, natural forms, and figures, where his powerful and original insights into analogies pertaining to curves considered as hybrids can emerge, rebounds upon the algebra that allows the thinking-together and changes it. The addition of the new operators d and ∫, the introduction of variables as exponents, changes in the meaning of the variables, and the entertaining of polynomials with an infinite number of terms, are all examples of this. Indeed, the names of certain canonical transcendental curves (log, sin, sinh, etc.) become part of the standard vocabulary of algebra. This habit of mind is evident throughout Volume I of the VII series (Mathematische Schriften) of Leibniz’s works in the Berlin Akademie-Verlag edition, devoted to the period 1672–76.²³ As M. Parmentier admirably displays in his translation and edition Naissance du calcul différentiel, 26 articles des Acta eruditorum, the papers in the Acta Eruditorum taken together constitute a record of Leibniz’s discovery and presentation of the infinitesimal calculus.²⁴ They can be read not just as the exposition of a new method, but as the investigation of a family of related problematic things, that is, algebraic and transcendental curves. In these pages, sequences of numbers alternate with geometrical diagrams accompanied by ratios and proportions, and with arrays of derivations carried out in Cartesian algebra augmented by new concepts and symbols. For example, ‘De vera proportione circuli ad quadratum circumscriptum in numeris rationalibus expressa,’²⁵ which treats the ancient problem of the squaring of the circle, moves through a consideration of the series π/4 = 1 − 1/3 + 1/5 − 1/7 + 1/9 ... to a
²³ G. W. Leibniz’s Mathematische Schriften, Geometrie – Zahlentheorie – Algebra 1672–1676, Series VII, Vol. 1, ed. E. Knobloch and W. Contro (Berlin: Akademie Verlag, 1990).
²⁴ (Paris: J. Vrin, 1995).
²⁵ A. E. Feb. 1682; M. S. V, 118–22.
number line designed to exhibit the finite limit of an infinite sum. Various features of infinite sums are set forth, and then the result is generalized from the case of the circle to that of the hyperbola, whose regularities are discussed in turn. The numerical meditation culminates in a diagram that illustrates the reduction: in a circle with an inscribed square, one vertex of the square is the point of intersection of two perpendicular asymptotes of one branch of a hyperbola whose point of inflection intersects the opposing vertex of the square. The diagram also illustrates the fact that the integral of the hyperbola is the logarithm. Integration takes us from the domain of algebraic functions to that of transcendental functions; this means both that the operation of integration extends its own domain of application (and so is more difficult to formalize than differentiation), and that it brings the algebraic and transcendental into rational relation. During the 1690s, Leibniz investigates mathematics in relation to mechanics, deepening his command of the meaning and uses of differential equations, transcendental curves and infinite series. In this section I will discuss two of these curves, the isochrone and the tractrix, and in the next, the catenary. The isochrone is the line of descent along which a body will descend at a constant velocity. Leibniz publishes his result in the Acta Eruditorum in 1689 under the title, ‘De linea isochrona in qua grave sine acceleratione descendit et de controversia cum Dn. Abbate D. C.’²⁶ However, the real analysis of the problem is found in a manuscript published by Gerhardt in the Mathematische Schriften V,²⁷ and accompanied by two diagrams: the first, reversed, is incorporated in the second. (Figures 8.1 and 8.2 are labeled 119 and 120 in Gerhardt.)
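Two of the numerical facts cited in this discussion, the series for π/4 and the claim that the integral of the hyperbola is the logarithm, can be checked with a short sketch. The function names and the midpoint-rule quadrature are choices made here for illustration; they are not taken from Leibniz's text.

```python
import math


def leibniz_partial_sum(n_terms: int) -> float:
    """Sum the first n_terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))


def hyperbola_area(x: float, steps: int = 100_000) -> float:
    """Midpoint-rule area under y = 1/(1 + t) from t = 0 to t = x."""
    h = x / steps
    return sum(h / (1 + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    # The partial sums approach pi/4, though slowly.
    for n in (10, 1_000, 100_000):
        error = abs(leibniz_partial_sum(n) - math.pi / 4)
        print(f"{n} terms: error = {error:.2e}")

    # The area under the hyperbola agrees with ln(1 + x).
    for x in (0.5, 1.0, 2.0):
        print(f"x = {x}: area = {hyperbola_area(x):.10f}, "
              f"ln(1 + x) = {math.log1p(x):.10f}")
```

The slow convergence of the alternating series is visible in the first loop: the error shrinks roughly in proportion to the reciprocal of the number of terms.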
Figure 8.1. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 119
²⁶ Acta Eruditorum April 1689, M. S. V, 234–7.
²⁷ M. S. V, 241–3.
geometry and 17 th century mechanics
Figure 8.2. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 120
On the first page of this text, Figure 8.1 is read as infinitesimal. It begins:

The line of descent called the isochrone YYEF is sought, in which a heavy body descending on an incline approaches the plane of the horizon uniformly or isochronously, that is, so that the times are equal, in which the body traverses BE, EF, the perpendicular descents BR, RS being assumed equal. Let YY be the line sought, for which AXX is the straight line directrix, on which we erect perpendiculars; let us call x the abscissa AX, and let us call y the ordinate XY, and ₁X₂X or ₁Y₁D will be dx and let ₁D₂Y be called dy.²⁸
The details of the analysis are interesting, as Leibniz works out a differential equation for the curve and proves by means of it what was in fact already known, that the curve is a quadrato-cubic paraboloid. However, what matters for my argument here is that we are asked to read the diagram as infinitesimalistic, since ₁X₂X, ₁Y₁D, and ₁D₂Y are identified as differentials. (In Figure 8.1, ₁D is misprinted ₁B.) Immediately afterwards, in the section labeled ‘Problema, Lineam Descensoriam isochronam invenire,’ exactly the same diagram is used, but reversed, incorporated into a larger diagram, and with some changes in the labeling. Here, by contrast, Figure 8.2 is meant to be read as a finite configuration; but it is intended to be the same diagram. Note how Leibniz begins: ‘Let the line BYYEF be a quadrato-cubic paraboloid, whose vertex is B and whose axis is BXXRS ... ’ There is no S in Figure 8.2; but the argument that follows makes sense if we suppose that ‘G’ ought to be ‘S’ as it is in Figure 8.1. Leibniz shows, using a purely geometrical argument cast in the idiom of proportions, that if the curve is the quadrato-cubic paraboloid, then it must be the isochrone. A heavy object falling from B along the line BYY, given its peculiar properties, must fall in an isochronous manner: ‘namely, the ratio between the time in which the heavy object runs down along line BYY from B to E, and the time in which it runs down from E to F, will be [the same as] the ratio of BR to RS; and then if BR and RS are equal, so also the intervals of time, in which it descends from B to E and from E to F, will be equal.’²⁹ What we find here is the same diagram employed in two different arguments that require it to be read in different ways; what a diagram means depends on its context of use. We might say that in the second use here, the diagram is iconic because it resembles the situation it represents directly, but in the first use it is symbolic because it cannot directly represent an infinitesimalistic situation. Yet the sameness of shape of the curve links the two employments, and holds them in rational relation.

²⁸ M. S. V, 241. In diagram 119, ₁D is mislabeled ₁B.

We can find other situations in which the same diagram is read in two ways within the same argument. The tractrix is the path of an object dragged along a horizontal plane by a string of constant length when the end of the string not joined to the object moves along a straight line in the plane; you might think of someone walking down a sidewalk while trying to pull a recalcitrant small dog off the lawn by its leash. In fact, in German the tractrix is called the Hundekurve. The Parisian doctor Claude Perrault (who introduces the curve to Leibniz) uses as an example a pocket watch attached to a chain, being pulled across a table as its other end is drawn along a ruler. The key insight is that the string or chain is always tangent to the curve being traced out; the tractrix is also sometimes called the ‘equitangential curve’ because the length of a tangent from its point of contact with the curve to an asymptote of the curve is constant.
The evolute of the tractrix is the catenary, which thus relates it to the quadrature of the hyperbola and logarithms.³⁰ So the tractrix is, as one might say, well-connected. Leibniz constructs this curve in an essay that tries out a general method of geometrical-mechanical construction, ‘Supplementum geometriae dimensoriae seu generalissima omnium tetragonismorum effectio per motum: similiterque multiplex constructio lineae ex data tangentium conditione,’ published in the Acta Eruditorum in September, 1693.³¹ His diagram, like the re-casting of Kepler’s Law of Areas in Proposition I, Book I, in Newton’s Principia, represents a curve that is also an infinite-sided polygon, and a situation where a continuously acting force is re-conceptualized as a series of impulses that deflect the course of something moving in a trajectory. The diagram must thus be read in two ways, as a finite and as an infinitesimal configuration (see Figure 8.3). Here is the accompanying demonstration:

We trace an arbitrarily small arc of a circle ₃AF, with center ₃B, whose radius is the string ₃A₃B. We then pull on the string ₃BF at F, directly, in other words along its own direction towards ₄A, so that from position ₃BF it moves to ₄B₄A. Supposing that we have proceeded from the points ₁B and ₂B in the same fashion as from ₃B, the trace will have described a polygon ₁B₂B₃B and so forth, whose sides always fall on [semper incident in] the string. From this stage on, as the arc ₃AF is indefinitely diminished and finally allowed to vanish (which is produced in the continuous tractional motion of our trace, where the lateral displacement of the string is continuous but always unassignable [inassignabilis]), it is clear that the polygon is transformed into a curve having the string as its tangent.³²

Figure 8.3. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 139

²⁹ M. S. V, 242.
³⁰ The evolute of a given curve is the locus of centers of curvature of that curve. It is also the envelope of normals to the curve; the normal to a curve is the line perpendicular to its tangent, and the envelope is a curve or surface that touches every member of a family of lines or curves (in this case, the family of normals).
³¹ M. S. V, 294–301.
Up to the last sentence, we can read the diagram as the icon of a finite configuration; in the last sentence, where the diagram becomes truly dynamical in its meaning, we are required to read it as the symbol of an infinitesimalistic configuration, a symbol that nonetheless reliably exhibits the structure of the item represented. (A polynomial is also a symbol that reliably exhibits the structure of the item it represents.) After Leibniz invents the dx and ∫ notation, his extended algebra can no longer represent mathematical items in an ambiguous way that moves among the finite, infinitesimal, and infinitary; thus, he must employ diagrams to do this kind of bridging for him. In the foregoing argument, and in many others like it, we find Leibniz exploiting the productive ambiguity of diagrams that link the finite and the infinitesimal in order to link the geometrical and dynamical aspects of the problem.
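The passage from polygon to curve in the tractrix demonstration can be imitated numerically. The following Python sketch is a simplified variant rather than Leibniz’s own construction: at each step the free end of the string advances a small distance along a straight line, and the dragged object is pulled along the string until the string is taut again, so that every side of the resulting polygon lies along the string. The function names `drag_polygon` and `tractrix_x`, the unit string length, and the choice of the x-axis as the line of pulling are assumptions made for the illustration; the closed form used for comparison is the standard modern equation of the tractrix.

```python
import math


def drag_polygon(a=1.0, pull_distance=2.0, steps=20_000):
    """Final position of an object dragged by a string of length a.

    The object starts at (0, a) and the puller at (0, 0). The puller
    advances along the x-axis in equal small steps; after each step the
    object slides along the string, toward the puller, until the
    string has length a again.
    """
    qx, qy = 0.0, a
    h = pull_distance / steps
    for i in range(1, steps + 1):
        px = i * h                      # new position of the puller
        dx, dy = qx - px, qy            # vector from puller to object
        dist = math.hypot(dx, dy)
        qx = px + a * dx / dist         # rescale that vector to length a
        qy = a * dy / dist
    return qx, qy


def tractrix_x(y, a=1.0):
    """Abscissa of the tractrix through (0, a) at height y, 0 < y <= a."""
    root = math.sqrt(a * a - y * y)
    return a * math.log((a + root) / y) - root


if __name__ == "__main__":
    for n in (20, 200, 20_000):
        x, y = drag_polygon(steps=n)
        # The gap between polygon and curve shrinks as the steps shrink.
        print(n, abs(x - tractrix_x(y)))
```

With few steps the output is visibly a polygon; as the steps are made smaller its vertices approach the curve, which is the finite counterpart of reading the same diagram as an infinitesimal configuration.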
³² M. S. V, 296.

8.5. The Catenary or La Chainette

In the ‘Tentamen Anagogicum,’ Leibniz discusses his understanding of variational problems, fundamental to physics since all states of equilibrium and all motion occurring in nature are distinguished by certain minimal properties; his new calculus is designed to express such problems and the things they concern. The catenary is one such object; indeed, for Leibniz its most important property is the way it expresses an extremum, or, as Leibniz puts it in the ‘Tentamen Anagogicum,’ the way it exhibits a determination by final causes that exist as conditions of intelligibility for nature. And indeed the catenary, and its surface of rotation the catenoid (which is a minimal surface, along with the helicoid), are found throughout nature; their study in various contexts is pursued by physicists, chemists, and biologists.³³ The differential equation, as Leibniz and the Bernoullis discussed it, expresses the ‘mechanical’ conditions which give rise to the curve: in modernized terms, they are dy/dx = ws/H, where ws is the weight of s feet of chain at w pounds per foot, and H is the horizontal tension pulling on the cable. Bernoulli’s differential equation, in similar terms, sets z dy = a dx, where z is a curved line, a section of the catenary proportional to the weight W, and a is an appropriate constant. It can be rewritten as dy = a dx/√(x² − a²). Bernoulli solves the differential equation by reducing the problem to the quadrature of a hyperbola, which at the same time explains why the catenary can be used to calculate logarithms.³⁴ The solution to the differential equation proves to be a curve of fundamental importance in purely mathematical terms, the hyperbolic function y = a cosh(x/a) or simply y = cosh x if a is chosen equal to 1.

In ‘De linea in quam flexile,’ Leibniz exhibits his solution to the differential equation in different, geometrical terms; he announces: ‘Here is a Geometrical construction of the curve, without the aid of any thread or chain, and without presupposing any quadrature.’³⁵ That is, he acknowledges various means for defining the catenary, including the physico-mechanical means of hanging a chain and the novel means of writing a differential equation; but in order to explain the nature of the catenary he gives a geometrical construction of it (see Figure 8.4). The point-wise construction of the catenary makes use of an auxiliary curve which is labeled by points 3ξ, 2ξ, 1ξ, A (origin), 1(ξ), 2(ξ), 3(ξ) ... This auxiliary curve, which associates an arithmetical progression with a geometrical progression, is constructed as a series of mean proportionals, starting from a pair of selected segments taken as standing in a given ratio D : K; it is the exponential curve. Having constructed eˣ, Leibniz then constructs every point y of the catenary curve to be ½(eˣ + e⁻ˣ) or cosh x. ‘From here, taking ON and O(N) as equal, we raise on N and (N) the segments NC and (N)(C) equal to half the sum of Nξ and (N)(ξ), then C and (C) will be points of the catenary FCA(C)L, of which we can then determine geometrically as many points as we wish.’³⁶ Leibniz then shows that this curve has the physical features it is supposed to have (its center of gravity hangs lower than any other like configuration) as well as the interesting properties that the straight line OB is equal to the curved segment of the catenary AC, and the rectangle OAR is equal to the curved area AONCA. He also shows how to find the center of gravity of any segment of the catenary and any area under the curve delimited by various straight lines, and how to compute the area and volume of solids engendered by its rotation. It also turns out to be the evolute of the tractrix, another transcendental curve of great interest to Leibniz; thus it is intimately related to the hyperbola, the logarithmic and exponential functions, the hyperbolic cosine and sine functions, and the tractrix; and, of course, to the catenoid and so also to other minimal surfaces.

Figure 8.4. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 121

An important difference between Descartes and Leibniz here is that Leibniz regards the mechanical genesis of these curves not as detracting from their intelligibility, but as constituting a further condition of intelligibility for them. As new analogies are discovered between one domain and another, new conditions of intelligibility are required to account for the intelligibility of the hybrids that arise as new correlations are forged.

³³ ‘Tentamen anagogicum,’ (ca. 1696), P. S. VII, 270–9.
³⁴ Opera Johannis Bernoulli, ed. G. Cramer, (Geneva, 1742), III, 494.
³⁵ A. E. June 1691, M. S. V, 243–7; Parmentier, Naissance du calcul différentiel, 193.
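The point-wise rule for the catenary, and the way it agrees with the differential equation, can be checked in a few lines of Python. The sketch takes a = 1 and treats the auxiliary curve as the exponential function; it then tests three things numerically: that half the sum of the two exponential ordinates is cosh x, that the slope of the constructed curve equals its arc length from the vertex (the modernized condition dy/dx = ws/H with H/w = 1), and that the area under the curve equals that same arc length. Reading the equalities involving OB and the rectangle OAR as these two statements is an assumption of the sketch, and the function names are hypothetical.

```python
import math


def catenary_point(x):
    """Leibniz's rule: half the sum of the ordinates of e^x and e^(-x)."""
    return 0.5 * (math.exp(x) + math.exp(-x))


def arc_length(x_end, steps=20_000):
    """Length of the inscribed polygon from x = 0 to x = x_end."""
    h = x_end / steps
    total, prev = 0.0, catenary_point(0.0)
    for i in range(1, steps + 1):
        cur = catenary_point(i * h)
        total += math.hypot(h, cur - prev)
        prev = cur
    return total


def area_under(x_end, steps=20_000):
    """Midpoint-rule area under the curve from x = 0 to x = x_end."""
    h = x_end / steps
    return sum(h * catenary_point((i + 0.5) * h) for i in range(steps))


def slope(x, eps=1e-6):
    """Central-difference estimate of dy/dx on the constructed curve."""
    return (catenary_point(x + eps) - catenary_point(x - eps)) / (2 * eps)


if __name__ == "__main__":
    x = 1.0
    print("ordinate  ", catenary_point(x), math.cosh(x))
    print("arc length", arc_length(x), math.sinh(x))
    print("area      ", area_under(x), math.sinh(x))
    print("slope     ", slope(x), math.sinh(x))
```

That one function, sinh x, turns up as slope, arc length, and area is a small instance of the way the different representations of the catenary confirm one another.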
The analytic search for conditions of intelligibility of things that are given as unified yet problematic (like the catenary) is clearly quite different from the search for a small, fixed set of axioms in an axiomatization. The catenary is intelligible because of the way in which it exhibits logarithmic relations among numbers; and embodies the function that we call cosh, from whose shape we can ‘read off’ its rational relation to both the exponential function and the hyperbola; and expresses an equilibrium state in nature; and displays a kind of duality with the tractrix, and whatever deep and interesting aspect we discover next. Generally, we can say that the things of mathematics, especially the items that are fundamental because they are canonical, become more meaningful with time as they find new uses and contexts. Thus the conditions of their intelligibility may expand, often in surprising ways. When the differential equation of the catenary is ‘fitted out’ with a geometrical curve or an equilibrium state in rational mechanics (to use Cartwright’s term), the combination of mathematical representations allows us both to solve problems and to refer successfully, that is, to discover new truths.

³⁶ Parmentier, Naissance du calcul différentiel, 194.
PART IV
Geometry and Twentieth Century Topology
9 Geometry, Algebra, and Topology

In this chapter, I return to the notion of mathematical experience, as it is treated by Jules Vuillemin in his book La Philosophie de l’algèbre, in order to bring the discussion of mathematical rationality out of the seventeenth century and into the twentieth.¹ I want to show that the patterns of argument I have described in the last three chapters also hold for the structuralist mathematics of the twentieth century, and for mathematical logic and set theory. Vuillemin was one of the great ‘analytic’ philosophers of France, and for many decades a friend and admirer of W. V. O. Quine, but his writings were always thoroughly informed by the history of mathematics and science, and the history of philosophy. Methodologically, he was at odds with Quine, because Quine treated history as a (serious) hobby that was disjunct from his philosophical projects; yet they shared a number of Kantian assumptions, and each greatly admired the work of the other. In this chapter, I hope to use Vuillemin’s writings as a middle term to explain my approach in a manner that is not inimical to my colleagues who were raised in the logical positivist tradition.
9.1. Vuillemin on the Relation of Mathematics and Philosophy

In the Introduction to La Philosophie de l’algèbre, Vuillemin distinguishes practical philosophy from theoretical philosophy, and then observes that theoretical philosophy must be rigorously distinguished from the psychology and history of the sciences. History and psychology study the way knowledge presents itself in the development of individual or collective experience: What prejudices or apparent but false evidence did the investigator have to set aside? What indeterminate correspondences did the investigator have to explore, and what new modes of expression or notation had to be devised? By contrast, he contends, theoretical philosophy is only interested in the order inherent in the things themselves, in objective validity, not in the accidents of invention. It is also, he adds, committed to critical analysis that investigates the relation of pure knowledge to the faculty of thought, that is, which acts of thought make an object of knowledge possible, and the consequent assignment of limits to such knowledge. Here his debts to Kant on the one hand and to Frege on the other are clear. By now, it should be evident to the reader that I view modes of representation as constitutive of mathematical knowledge because of the way in which they reveal the order inherent in intelligible things, and that I view discovery and justification as two sides of the same coin of Leibnizian analysis.

Nevertheless, Vuillemin believes history enters into the considerations of theoretical philosophy via mathematics. ‘But a relation exists that is closer, although less apparent and certain, between pure mathematics and theoretical philosophy. The history of mathematics and philosophy, in tandem, shows that a revision of methods in the former always has repercussions in the latter.’² La Philosophie de l’algèbre is devoted to exhibiting the cogency and fruitfulness of the insight that pure mathematics and theoretical philosophy are closely allied, and that changes in the methods of the one have consequences for the other. Mathematics is used perennially in philosophy to criticize, reform, and define the methods of theoretical philosophy; as mathematics changes, it bears in a changing way on philosophy.

¹ Jules Vuillemin, La Philosophie de l’algèbre (Paris: Presses Universitaires de France, 1962).
Vuillemin traces the impact of Descartes’ analytic geometry and Leibniz’s infinitesimal calculus, of Lagrange’s method of resolvants, of Gauss’s Disquisitiones arithmeticae, of Galois’s work following upon that of Gauss and Abel, and finally the articulation of group theory in the work of Klein and Lie. The genealogy here traces the emergence of the notions of field on the one hand and group on the other, with the increasing versatility or ambiguity of algebraic forms vis-à-vis their applications, and the impact of this development on philosophers from Descartes through Kant to Frege and Husserl.

² Vuillemin, La Philosophie de l’algèbre, 4.
So history intervenes in theoretical philosophy in an essential, irrevocable way after all, via the history of mathematics. In his commitment to the philosophical importance of the history of mathematics, the history of philosophy, and indeed the history of logic, Jules Vuillemin presented a remarkable contrast to Quine. Yet Vuillemin typically returned from his excursions into history with a notebook containing only recurrent and invariant structures. Thus in his magisterial Nécessité ou contingence: L’aporie de Diodore et les systèmes philosophiques, he discovered the six (and only six) types of metaphysical system, a classification which could be used to contest the claims of Heidegger and Derrida to have discovered something new under the philosophical sun.³ And in Eléments de Poétique, he re-discovered Aristotle’s four-fold classification of types of tragedy, which subsume comedy as well, including the comedy of Molière, though Vuillemin himself argued that Shakespeare escaped the nets of the Aristotelian schema.⁴ From the study of history emerges a clearer view of rational structure that belongs more to Being than to Becoming. This is what interests Vuillemin in the work of Galois. Galois’s procedure of extending a field by adjoining the roots of certain polynomials (with coefficients in the original field but irreducible in that field) and thereby creating a new, nested series of fields (a procedure that leads step by step from, e.g., the field of the rational numbers to the field of the algebraic numbers) seemed to Vuillemin a good example of how a detour into the contingent and arbitrary (here, the choice of roots to adjoin and the order in which they are adjoined) may in the end succeed in revealing objective structure.
In this chapter, I return to the geometry of Euclid and Descartes in light of Vuillemin’s discussion of geometrical ‘intuition’ in section 1, and then of Kantian intuition in section 2; this discussion employs Vuillemin’s illuminating distinction between extrinsic and intrinsic intuition. In light of the distinction, I examine an important definition and a proof of De Rham’s theorem presented in a textbook on algebraic topology, as an example of what ‘intrinsic intuition’ looks like in recent mathematical practice, and link it to my account of formal experience, given in Chapter 2.

³ Nécessité ou contingence: L’aporie de Diodore et les systèmes philosophiques (Paris: Les Editions de Minuit, 1984/97). Part of this book was translated into English as Necessity or Contingency: The Master Argument (Stanford: CSLI Lecture Notes, 1996). I wrote an essay review of it in Metaphilosophy, 18/(2) (1987), 161–70.
⁴ Eléments de Poétique (Paris: Vrin, 1991).
9.2. Euclid’s Elements and Descartes’ Geometry Revisited

The impact of Descartes’ analytic geometry on mathematics and in turn on philosophy was certainly great. Vuillemin’s account of its significance in sections 19, 20, and 21 of chapter ii, ‘Le théorème de Gauss,’ draws attention to the many questions analytic geometry raised about the nature of space, which plays such a central role in mathematics, physics, and metaphysics. He begins by discussing the role of the line and circle in Greek geometry, the two canonical ‘constructing curves’: the Greeks always tied geometrical constructions to existence theorems. Though there was some debate in antiquity about whether the ruler and compass represent ideal things independent of our mode of access to them (Speusippus), or whether mathematical objects exist only to the extent that we can construct them by ruler and compass, so that the latter constitute our mode of access (Menaechmus), the Greeks delimited the realm of the rational for mathematics and gave expression to what Vuillemin aptly calls une métaphysique de la finitude.⁵

A price had to be paid however for this comfortable metaphysical grounding of mathematics in the ideal objects of the line and circle. There are both points on the line (like π) and curves (like the conchoid) that escape such means of construction; and the Greeks knew that the boundaries they drew for mathematics were too narrow, though of course not how radically narrow. Moreover, the canonical means of construction were thus in a sense extrinsic to geometry as it developed, and were obviously extrinsic to arithmetic, though the inherent finitude of the study of the natural and rational numbers made that exteriority hard to discern for the Greeks. Geometrical construction, Vuillemin concludes, fulfills two roles in the history of mathematics, one legitimate and one illegitimate.

The first is to play a ‘critical’ role that allows mathematicians to search for minimal means of proof, a search which is often mathematically fruitful and in any case consonant with a mathematician’s interest in formal elegance; and this accompanies the demand to admit into mathematics only entities which can be constructed. This is a coherent position even if it is in the short run exclusionary, for constructivism in the long run (from Menaechmus to Descartes to Kant to Brouwer, for example) is willing to revise what it takes as acceptable means of construction. But the second role leads, according to Vuillemin, to an incoherent position: it postulates an extrinsic foundation for existence in all the domains of pure mathematics, a dogmatic ‘intuition’ that must play, to use Kant’s terms, a constitutive role for knowledge, as if all of knowledge were homogeneous, and could be totalized or contained. Descartes’ analytic geometry unmasks the incoherence of this second role for ruler and compass (for line and circle taken as canonical means of construction) because it reveals how radically these means of construction fail to provide a foundation for geometry, and makes explicit how they fail to provide a foundation for the study of number. Moreover, algebra provides a new way to relate the realm of figure and the realm of number while honoring their distinctness, because of its explicitly equivocal notation that announces its own ambiguity as geometric diagrams and arithmetic numerals do not. As Vuillemin puts it, ‘With Descartes begins, as one might say, the synthesis of the line, whereas the Greeks only proposed to analyze it.’⁶ Analytic geometry reduces a curve to an algebraic relation among points of the line on the one hand, and among numerical coordinates on the other.

⁵ Vuillemin, La Philosophie de l’algèbre, 167.
Constructing the curve point-wise using the procedures outlined in Book I of the Geometry left holes in the curve, as Descartes knew very well; and even the tracing machines described in Book II as a supplement were not satisfactory in this regard: ‘it would require a denumerable infinity of instruments.’⁷ If one has the project of synthesizing the line rather than analyzing it, then the Greek ‘intuition’ of the line will no longer suffice or even seem to suffice, for one must reason about infinite sequences or infinite sets of points on the one hand and of numbers on the other: ‘the chimerical character of existence proofs dependent on constructions was unmasked.’⁸ The project of synthesizing the line also eventually leads to mathematical entities that count against constructivism, for example axioms of set theory like the Axiom of Choice, which make existence claims transcending all means of construction.

⁶ Vuillemin, La Philosophie de l’algèbre, 168.
⁷ Vuillemin, La Philosophie de l’algèbre, 168.
⁸ Vuillemin, La Philosophie de l’algèbre, 168.
The watershed created by Descartes’ Geometry is, of course, an ironic triumph, since, as I noted in Chapter 6, he himself shared a commitment to geometrical canons of construction and existence with the Greeks, and didn’t much exploit the increased expressiveness of his notation (he worked very little with higher algebraic curves and avoided transcendental curves) or its ambiguity (he showed little interest in the exploration of numerical patterns of the kind pursued by Wallis and Leibniz). All the same, Vuillemin concludes, the Geometry weighs in favor of a distinction endorsed by Felix Klein, Poincaré, and Vuillemin himself, between intrinsic and extrinsic intuition. Intrinsic intuition is the kind of familiarity with particular kinds of mathematical items to which mathematicians attest; it is accompanied by a know-how (to use Herbert Breger’s term) that is often tacit at first and then articulated (in papers and textbooks) as general rules and definitions; it is the result of what I have called the formal experience of mathematicians.
9.3. Kant’s Transcendental Aesthetic: Extrinsic and Intrinsic Intuition

Vuillemin observes that such intrinsic intuition has a psychological status and serves to discover, not to found, since it may as often as not mislead. Given my discussion of mathematical experience in Chapter 2, I must dispute this aspect of his claim. To avoid the imputation that intrinsic intuition is merely a matter of human psychology, one can point to the constellation of solved and unsolved problems that such items, thus ‘intuited,’ give rise to, as well as to the textbook presentations which follow upon the solution of those problems by an increasingly explicit know-how. That is, though the historical and psychological setting may be arbitrary and contingent, the intelligible items and their features under investigation are objective. Certain items give rise to certain kinds of problems (and not others) because of their peculiar integrity, the way in which they exist vis-à-vis their features and parts, and the way they lend themselves to representation; certain problems are solvable by certain kinds of means and procedures (and not others); and this is so independent of human wish, whim, or fancy. As Vuillemin himself knew very well, excursions into history may result in the discernment of timeless and necessary things.
Intrinsic intuition, the reward and fruit of formal experience, is specific to a certain mathematical domain or area of research, which concerns certain kinds of items, and moreover a licit mode of constructivism may accompany it, involving a search for the most economical and elegant means of construction for those sorts of problems and items. In other words, there is an irreducible multiplicity of ‘intrinsic intuitions’ and we may expect new varieties to arise as mathematics develops.⁹ By contrast, extrinsic intuition underwrites an incoherent intuitionism ‘which limits the procedures of reason by forcing it to collaborate with an alien faculty that limits it from the outside.’¹⁰ And this is precisely what Kant does in his doctrine of space, to which Vuillemin, for all his sympathy with Kant, very strenuously objects. Although I agree with his diagnosis of what is wrong with Kant’s use of extrinsic intuition, I disagree with the conclusions he draws from it. Vuillemin observes, ‘Kantian philosophy requires that the concepts of the understanding correspond to necessarily sensible [perceptual] intuitions, and thus that their pure form corresponds to the exteriority [external conditions] of space and time.’¹¹ This requirement imposed upon reason (that it subordinate itself to the pure forms of intuition that allegedly serve as a transcendental condition for perceptual intuition, which is passively received in our apprehension of the external world before the application of concepts) is illegitimate for mathematics, because the intuition of space may be completely alien to certain mathematical domains and will only obscure their investigation. In the case of pure theoretical knowledge, Kant’s theory introduces the taint of the merely empirical. The dogmatic assertion of this extrinsic intuition imposes an extrinsic method upon all of human knowledge, and this is a disaster.

Upon the altar of the transcendental aesthetic, ‘Kant had nevertheless sacrificed the whole theory of knowledge, as much for mathematics as philosophy.’¹² Vuillemin urges the distinctness of the sciences, their objects and methods, including the distinctness of mathematical domains. He writes, ‘the architectonic and systematic character of a science requires that it proceed in a purely local way, for if not, confusion obscures the exact limits of the sciences in their reciprocal relations, and destroys the whole edifice of knowledge.’¹³ In particular, the Kantian doctrine of transcendental aesthetic has obscured our philosophical understanding of the problem of the continuum, because it has led philosophers (Vuillemin mentions both Leibniz and Kant) to confuse intrinsic with extrinsic intuition. Basing his account of the continuum on the way in which the line seems to be indefinitely divided without ever arriving at ‘atoms,’ Leibniz concludes that a whole which cannot be reduced to elements must be ideal: mathematical space is an ideal construction that exhibits the possibility of partition. By contrast, physical and metaphysical space is just the order of the things that exist. Kant too understands space as ideal and a whole prior to all its parts, but as transcendentally ideal because it is the pure form of our perceptual intuition. Both these doctrines, Vuillemin observes, fail to distinguish the problem of the continuum and that of space, because they take the continuum to pertain equally to the intuition of space and to the concept of space. But the latter, he claims, belongs to analysis and the former to geometry. Analysis, and by implication analytic geometry, present us with items that cannot be ‘pictured,’ that have no immediate relation to perception. The investigation of the continuum in rigorous terms arises from the classification of point sets in late nineteenth century analysis and twentieth century set theory; and the intrinsic intuition that serves Euclidean geometry would not have been directly useful in that investigation. Vuillemin shows by various examples that a commitment to Euclidean intuition would have been a problem for analysis if it had held sway, for it would have prevented the articulation of certain problems, important conceptual distinctions, and items. Luckily, he observes, the mathematicians outstripped the philosophers, and went about their business in analysis taking their inspiration not from Kant but from Kronecker.

⁹ Vuillemin, La Philosophie de l’algèbre, 171–2.
¹⁰ Vuillemin, La Philosophie de l’algèbre, 172.
¹¹ Vuillemin, La Philosophie de l’algèbre, 172.
¹² Vuillemin, La Philosophie de l’algèbre, 173.
This triumph of intrinsic over extrinsic intuition in the self-understanding of mathematicians, Vuillemin argues, should also govern philosophy, which has misled itself in various ways. (He considers this a good example of how mathematical practice can and should serve as a model for philosophy.) For Euclidean geometry, the continuity of the line segment is a given, ‘a clear, distinct and existent thing’.¹⁴ However, analysis wants to construct the continuum out of numbers, specifically real numbers, a project that stands in problematic relation to wanting to construct the line out of points. Kantian dogmatism obscures the problematic analogy between the line and the continuum, and their ‘construction.’ This is an important problem for philosophy, which Vuillemin correctly identifies and brings to our attention; he and I differ in the way in which we understand the analogy.
Vuillemin goes on to charge the doctrine of extrinsic intuition with the following prejudices: ‘Logically, it reduces external relations in favor of internal relations and the category of substance. Theologically, this relation tempts us to reinstate the mysteries of being, its simplicity and interiority, which analysis tends to steer us away from.’¹⁵ Thus analysis can help philosophy learn to treat relations as external, and to do without the category of substance and the ‘mysteries of being.’ Here again he and I differ in our approach, because I would like to reformulate and revive Leibniz’s understanding of intelligible existence, and of relations as internal, precisely in order to formulate an account of knowledge that would do better justice to pure mathematics.
¹³ Vuillemin, La Philosophie de l’algèbre, 173.
¹⁴ Vuillemin, La Philosophie de l’algèbre, 183.
9.4. The First Pages of Singer and Thorpe
In this section and the next, I illustrate my position on the way in which the formal experience of modern geometers and topologists develops, using two examples drawn from the algebraic topology textbook Lecture Notes on Elementary Topology and Geometry by I. M. Singer and John Thorpe. This influential mid-twentieth century textbook, worked out in the classroom at MIT and Haverford College, was published in 1967.¹⁶ In the first few pages of the book, the authors introduce the fundamental notions of sets, real numbers, and the Euclidean plane. By using a second widely used textbook as a foil, I show that the authors must employ a consortium of modes of representation in order to fix the reference of items that will be centrally important to their exposition and to carry out an initial, indispensable analysis of those items.
At the beginning of Lecture Notes on Elementary Topology and Geometry, Singer and Thorpe introduce ‘Naive Set Theory’ in the first section of chapter 1, and ‘Topological Spaces’ in the second section. After defining ‘set,’ ‘belonging to,’ ‘subset,’ ‘set equality,’ ‘union,’ ‘intersection’ and ‘complement,’ they define the Cartesian product A × B of two sets A and B as the set of ordered pairs, A × B = [(a, b); a ∈ A, b ∈ B], and add that a relation between A and B is a subset R of A × B. They then illustrate this definition by an example.
Example. Let A = B = the set of real numbers. Then A × B is the plane. The order relation x < y is a relation between A and B. This relation is the shaded set of points in Fig. 1.1.¹⁷
Figure 1.1 is reproduced in Figure 9.1. Why did Singer and Thorpe add this diagram?
¹⁵ Vuillemin, La Philosophie de l’algèbre, 198–9.
¹⁶ I. M. Singer and J. A. Thorpe, Lecture Notes on Elementary Topology and Geometry (Glenview, IL: Scott, Foresman; repr. New York: Springer, 1967).
Figure 9.1. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 1.1
Although the choice of example and the addition of the diagram seem rather casual, I will argue that both are essential to the project of the textbook, and that the need for the diagram is philosophically important. There is a conceptual gap between the Cartesian product of the set of real numbers with itself and the plane (presumably they mean the Euclidean plane) that must be bridged in order for the exposition to proceed. The brisk identification is actually a conjurer’s sleight of hand, and Singer and Thorpe were surely aware of this; perhaps this is why they call their first section naive set theory. Conversely, Singer and Thorpe could not have proceeded by invoking the Euclidean plane and merely presenting a diagram of it. Analysis, and the topology that springs from it, require the articulation of the line by the real numbers and of the plane by ordered pairs of real numbers referred to a coordinate system of two orthogonal lines, the center that is their point of intersection, and a unit chosen by convention.
My philosophical claim is that the authors of a topology textbook must indicate in particular what is referred to, including canonical objects as well as more esoteric objects, and in general how those things will be investigated, including a repertoire of methods and results. In order to do this successfully, they must employ a variety of modes of representation whose rational relation is explained in natural language, for two reasons. First, referring to canonical objects and conducting a broad investigation of the conditions of intelligibility of a domain are two quite different functions of representation; one and the same mode of representation used univocally cannot fulfill both roles at once. Second, the canonical object invoked above, the real number line, is a hybrid; its very definition arises at the intersection of different mathematical enterprises with distinct traditions of representation that are transformed but retained in the hybrid.
¹⁷ Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 3.
The reconstruction of the Euclidean plane in terms of the Cartesian product of the set of real numbers with itself further complicates the issue. One and the same mode of representation used univocally cannot present the real number line, or the reconstructed plane. William Boothby, in An Introduction to Differentiable Manifolds and Riemannian Geometry, reminds us that the product of two (or, more generally, n) copies of the set of real numbers is just a set of ordered pairs (or, more generally, n-tuples) of real numbers.¹⁸ This means that $\mathbb{R}^2$ (or, more generally, $\mathbb{R}^n$) stands for a whole spectrum of possible mathematical items: vector spaces, metric spaces, topological spaces, and finally (but with some qualification) Euclidean space. Boothby comments, ‘We must usually decide from context which one is intended.’¹⁹
If we decide to treat $\mathbb{R}^n$ as an n-dimensional vector space over $\mathbb{R}$, we must recall that there are many possible n-dimensional vector spaces over $\mathbb{R}$. Though they are all isomorphic, the isomorphism depends on choices of bases in the spaces to be identified by isomorphism, and there is in general no natural or canonical isomorphism independent of these choices. There is, however, one n-dimensional vector space over $\mathbb{R}$ that has a distinguished or canonical basis: the vector space of n-tuples of real numbers with componentwise addition and scalar multiplication, $\mathbb{R}^n = V^n$, for which the n-tuples $e_1 = (1, 0, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ..., $e_n = (0, 0, \dots, 0, 1)$ provide a natural or canonical basis. This choice allows us to restrict what we mean by $\mathbb{R}^n = V^n$ as a vector space of dimension n over $\mathbb{R}$, but there is a further choice to be made. An abstract n-dimensional vector space over $\mathbb{R}$ is called Euclidean if it has a positive definite inner product defined on it. Once again, in general there is no natural or canonical way to choose such an inner product, but in the case of $\mathbb{R}^n = V^n$ a natural inner product is available: $(x, y) = \sum_{i=1}^{n} x_i y_i$ (where $x$ and $y$ are vectors and the $x_i$ and $y_i$ are the vector components). Its canonicity is tied to that of the canonical basis for $\mathbb{R}^n = V^n$ because relative to this inner product, the natural basis is orthonormal, so that $(e_i, e_j) = \delta_{ij}$.
This inner product on $\mathbb{R}^n = V^n$ (the vector space) can be used to define a metric on $\mathbb{R}^n$ (the Cartesian product of n copies of the set of real numbers) by first defining the norm of the vector $x$ by $\|x\| = ((x, x))^{1/2}$ and then defining a distance function in terms of it, $d(x, y) = \|x - y\|$. (Here $x$ means $(x_1, x_2, \dots, x_n)$ and $y$ means $(y_1, y_2, \dots, y_n)$, each a point in $\mathbb{R}^n$.) In particular, $d(x, 0) = \|x\|$, that is, the norm of the vector $x$ is just the distance of the point $x$ from the origin, and the definition of the distance of a point from the origin relates vectors (from $\mathbb{R}^n = V^n$ considered as a vector space) with points (from $\mathbb{R}^n$ considered as the Cartesian product of sets of real numbers), conferring on $\mathbb{R}^n$ the structure of a metric space. Boothby observes, ‘This notation is frequently useful even when we are dealing with $\mathbb{R}^n$ as a metric space and not using its vector space structure.’²⁰ Having once conferred this metric structure on $\mathbb{R}^n$, we can define ‘natural or canonical’ neighborhoods that are ‘open balls’ around each point of $\mathbb{R}^n$ and use them as the basis for a topology: then $\mathbb{R}^n$ is not only a metric space but a topological space.
Thus we see that depending on the context of use we can mean by $\mathbb{R}^n$ a product of sets, a vector space, a metric space or a topological space. Sometimes (as in the first pages of Singer and Thorpe) it is identified with Euclidean space, and indeed that ultimate identification seems to provide, retro-fittingly, the ‘natural or canonical’ choices of structure that distinguish $\mathbb{R}^n$ as discussed in the preceding paragraphs from all other possible products of sets of real numbers with super-added vector space, metric, and/or topological structure. Euclidean space is the ‘intended model’ for $\mathbb{R}^n$: in particular, the Euclidean line is the ‘intended model’ for $\mathbb{R}$, and the Euclidean plane is the ‘intended model’ for $\mathbb{R}^2$. But even after we have conferred the canonical vector space structure, metric structure and topological structure just discussed, the Cartesian product of sets of real numbers $\mathbb{R}^n$ still isn’t exactly Euclidean space. Boothby writes:
many texts refer to $\mathbb{R}^n$ with the metric $d(x, y)$ as Euclidean space. This identification is misleading in the same sense that it would be misleading to identify all n-dimensional vector spaces [over $\mathbb{R}$] with $\mathbb{R}^n$; moreover, unless clearly understood, it is an identification that can hamper clarification of the concept of manifold and the role of coordinates.²¹
¹⁸ W. Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry (New York: Academic Press, 1975).
¹⁹ Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry, 2.
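The chain of definitions described above, from the canonical inner product to the norm, the metric, and the open balls, can be made concrete in a few lines. The following is a minimal sketch rather than anything taken from Boothby; the function names (`inner`, `norm`, `distance`, `canonical_basis`, `in_open_ball`) are hypothetical and chosen only for illustration.

```python
import math


def inner(x, y):
    """Canonical inner product on R^n: the sum of componentwise products."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Norm induced by the inner product: ||x|| = ((x, x))^(1/2)."""
    return math.sqrt(inner(x, x))


def distance(x, y):
    """Metric induced by the norm: d(x, y) = ||x - y||."""
    return norm([xi - yi for xi, yi in zip(x, y)])


def canonical_basis(n):
    """The n-tuples e_1, ..., e_n (illustrative helper, not from the text)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def in_open_ball(point, center, radius):
    """Membership in the open ball of the given radius about center."""
    return distance(point, center) < radius
```

Each layer is defined only in terms of the previous one, which mirrors the order in which the structures are conferred on the bare set of n-tuples: the set itself carries none of them until a choice is made.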
At best, Boothby observes, we can say that $\mathbb{R}^2$ may be identified with $E^2$ (or, more generally, $\mathbb{R}^n$ may be identified with $E^n$) plus a coordinate system. And there is no ‘natural or canonical’ coordinate system: an arbitrary choice of coordinates is involved because there is no natural geometrically determined way to identify the two spaces. Having chosen and imposed an origin, a unit, and two mutually perpendicular axes on the Euclidean plane, we can define a one-to-one mapping from the Euclidean plane to $\mathbb{R}^2$ by $p \mapsto (x(p), y(p))$, the coordinates of $p$; the mapping is an isometry, preserving distances between points on the Euclidean plane and their images in $\mathbb{R}^2$. This mapping has limitations, however. We can find an analogue for lines on the Euclidean plane in a straightforward way: subsets of $\mathbb{R}^2$ consisting of the solutions of linear equations. Similarly, as Fermat and Descartes discovered, subsets of $\mathbb{R}^2$ consisting of the solutions of quadratic equations correspond in a straightforward way to circles and conic sections. However, some common and canonical entities on the Euclidean plane do not correspond to easily definable subsets of $\mathbb{R}^2$: the best example is the triangle and more generally plane polygons. One must use a cut-and-paste method that is formally un-illuminating. Moreover, the imposition of a coordinate system on the Euclidean plane generates ‘artifacts’ that are geometrically insignificant, the way that staining a section of a cell for an electron microscope slide generates biologically insignificant artifacts (even as it exhibits other, biologically significant items and properties).
Boothby’s example of this is especially interesting, because he shows that to define a geometrically significant thing (the angle between two lines) we must make an excursion into the geometrically insignificant, or, as Vuillemin would put it, the geometrically contingent. Given a coordinate system on the Euclidean plane, and the identification of a line with the graph of a linear equation, we define its slope $m$ and its y-intercept $b$ when we write: $L = \{(x, y) \mid y = mx + b\}$. The values of $m$ and $b$ are not geometrically significant, since they arise simply from the choice of coordinate system. However, given two lines thus represented, the value $(m_2 - m_1)/(1 + m_1 m_2)$ does have geometrical meaning: it is the tangent of the angle between the lines, and it is independent of the choice of coordinates.
In sum, the relation between the Euclidean plane and the Cartesian product of the set of real numbers with itself is an analogy, which may be made determinate by various kinds of representations in the service of solving various kinds of problems. The notion of isomorphism central to the semantic approach in the philosophy of science is here insufficient for two reasons.
²⁰ Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry, 3.
²¹ Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry, 4.
First, the choice of the most meaningful or best isomorphism cannot be explained without reference to the pragmatic context of problem solving: with respect to what mathematical intent are we representing the Euclidean plane? Second, we cannot explain the ‘natural or canonical’ status of the Euclidean plane in the sea of isomorphic (or homomorphic or homeomorphic or isometric) versions of the Cartesian product of the set of real numbers. Thus, Singer and Thorpe must add a diagram of the Cartesian plane to their definitions in the first section of the first chapter of their topology book. The symbolic and much too general notation $\mathbb{R}^2$ must be supplemented by an icon (in this case a picture of the plane nicely fitted out with a coordinate system and shading that indicates its indefinite/infinite extent), as well as natural language to bring the two kinds of representation into rational relation: ‘Let A = B = the set of real numbers. Then A × B is the plane.’ The same kind of argument could be made for the use of diagrams in Figure 1.2 of the second section (see Figure 9.2), where the definition of metric space precedes and motivates the definition of a topological space.²²
Figure 9.2. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 1.2
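Boothby’s example of the angle between two lines, discussed above, lends itself to a quick numerical check. The sketch below assumes that the change of coordinates is a rotation of the axes by an angle `theta`, under which a line of slope m acquires the slope tan(arctan m − θ); the function names and the sample values are hypothetical, chosen only to show that the slopes change while the quantity $(m_2 - m_1)/(1 + m_1 m_2)$ does not.

```python
import math


def tan_angle_between(m1, m2):
    """Tangent of the angle between two lines with slopes m1 and m2.

    Assumes 1 + m1*m2 != 0, i.e. the lines are not perpendicular.
    """
    return (m2 - m1) / (1 + m1 * m2)


def slope_after_rotation(m, theta):
    """Slope of the same line after the coordinate axes are rotated by theta.

    The line makes angle atan(m) with the old x-axis, hence atan(m) - theta
    with the new one. Assumes the line is not vertical in either system.
    """
    return math.tan(math.atan(m) - theta)


# Two lines in one coordinate system (illustrative values).
m1, m2 = 0.5, 2.0
theta = 0.3  # rotation of the axes, in radians

# The same two lines described in the rotated system.
n1 = slope_after_rotation(m1, theta)
n2 = slope_after_rotation(m2, theta)

before = tan_angle_between(m1, m2)
after = tan_angle_between(n1, n2)
```

The individual slopes `n1` and `n2` differ from `m1` and `m2`, yet `before` and `after` agree up to floating-point error: the contingent values carry the invariant one.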
The Euclidean plane and the Euclidean line play the role of canonical objects here, as they do throughout analysis and throughout differential geometry. For a function to be differentiated and integrated, and for a manifold to be tractable, it must be locally linearizable (which means that it stands locally in a special relation to Euclidean n-dimensional space), and then that condition must somehow be made global. Boothby, having completed his account of the gap between Euclidean geometry and $\mathbb{R}^n$, concludes, ‘We need to develop both the coordinate method and the coordinate-free method of approach. Thus we shall often seek ways of looking at manifolds and their geometry which do not involve coordinates, but will use coordinates as a useful computational device.’²³ This means that we cannot avoid the use of diagrams like Figure 9.1 and Figure 9.2, but it does not, I think, mean only that, because Boothby is speaking not just of the line and the plane, but of n-dimensional Euclidean geometry generally. It certainly follows from what he says about the pedagogical approach of his textbook that we must use a variety of modes of representation to do (in this case) differential geometry properly, to know what we are talking about and to find effective methods for solving problems about them. Boothby invokes the study of geometry by the Greeks and by high school students (who pose and solve problems about figures without benefit of coordinate systems), just at the point where he makes a dutiful but non-committal gesture towards Hilbert’s axiomatization of geometry:
It is very tricky and difficult to give a suitable definition of Euclidean space of any dimension, in the spirit of Euclid, that is, by giving axioms for (abstract) Euclidean space as one does for abstract vector spaces. This difficulty was certainly recognized for a very long time, and has interested many great mathematicians ... Careful axiomatic definition of Euclidean space is given by Hilbert. Since our use of Euclidean geometry is mainly to aid our intuition, we shall be content with assuming that the reader ‘knows’ this geometry from high school.²⁴
²² Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 6.
Boothby uses the term ‘intuition,’ but I urge the notion of formal experience (with its roots in Cavaillès and Vuillemin) that I have been developing in this book, which focuses on the strengths and limitations of various modes of representation, employed singly or in tandem, as they help us identify and analyze the things of mathematics.
²³ Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry, 5.
²⁴ Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry, 4.
9.5. De Rham’s Theorem
In chapter 4 of Singer and Thorpe, the authors introduce simplicial complexes; in chapter 5 they introduce smooth manifolds; and in chapter 6 they prove De Rham’s theorem, which links simplicial complexes and smooth manifolds in a significant way by means of group theory.²⁵ In this problem context, abstract algebra provides representations for topology that are effective when combined with other sorts of representations, including icons and natural language. What does ‘effective at problem-solving’ mean in this context? A family of problems that occurs naturally with respect to manifolds is how to understand their global structure in light of their local structure. One version of this is how to classify certain kinds of manifolds. Another version is how to carry out real or complex analysis on a manifold, how to define functions on it and integrate and differentiate them; and more generally how to understand what happens when real or complex analysis is transferred to manifolds, and what that transformation of analysis on flat spaces reveals about the geometry of the manifolds. Mathematicians often start by trying out new methods on well understood, canonical objects and then extending them to others, less well understood or harder to represent: for example, higher dimensional manifolds. In algebraic topology these methods include associating topological structures (and maps between them) with group structure (and maps between them), and infinitary items and groups with finite, combinatorial items and groups. At the end of this section, I turn to an illuminating argument given by Nancy Cartwright in order to develop my reflections on the proof of De Rham’s Theorem.
In chapters 4–6 of Singer and Thorpe, the sphere and the torus, among others, play the role of canonical objects.
They are first of all geometrical objects with icons that reflect certain conventions about how to represent three dimensions on a 2-dimensional page (see Figures 9.3 and 9.4).²⁶
Figure 9.3. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 6.4
Figure 9.4. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 5.18
As a variety, a given sphere or torus can be represented by a system of polynomial equations $f_1(x_1, \dots, x_n) = 0, \dots, f_n(x_1, \dots, x_n) = 0$, because it is the set of points on which the polynomials vanish. As a manifold, the sphere or torus is viewed as a certain kind of topological space. This representation presupposes that each one may be decomposed into a set of points $S$ and that we may choose some subset of all the subsets of $S$, $T \subset 2^S$, satisfying certain conditions: the empty set and $S$ itself are in $T$, all finite intersections of members of $T$ are again in $T$ and so are all arbitrary unions. Then we call $T$ a collection of ‘open sets’ on $S$ and the pair $(S, T)$ is a topological space. (To suppose that any geometrical entity may be treated as a set of points or elements, and that we can intelligibly locate its power set, requires further reflection and justification.) We can give any set what is called the discrete topology by stipulating that $T = 2^S$, the power set of $S$; this is why I said earlier that any set can be thought of as a topological space. A final note: what corresponds to an isomorphism in topology is a homeomorphism. A function $f : S \to T$ is continuous if the inverse images of open sets are open; $f$ is a homeomorphism if $f$ is a one–one correspondence and both $f$ and its inverse are continuous.
Our sphere and torus have now been represented by an icon, a system of polynomial equations, a highly infinitary (who knows how high?) set of points, and a topological space. There is more to come, but we need to make a digression to explain the definition of a topological space. In analytic geometry, we are used to knowing the distance between two points, and being able to assign a number to it: this means we are regarding our space as a kind of metric space $S$.
²⁵ Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 69–156. The exposition follows these chapters closely. My thanks go to I. M. Singer for granting me permission to make use of his textbook for philosophical purposes in this way; and to Joseph Mazur for his extensive comments on drafts of this chapter, and his deep insight into algebraic topology. I would also like to thank Jean-Jacques Szczeciniarz for his helpful comments on an earlier draft of this paper.
²⁶ Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 159 and 152.
A metric space is defined as a set $S$ together with a function $\rho : S \times S \to \mathbb{R}^+$ (the non-negative reals) such that for all $s_1$, $s_2$ and $s_3$ in $S$,
$\rho(s_1, s_2) = 0$ if and only if $s_1 = s_2$
$\rho(s_1, s_2) = \rho(s_2, s_1)$
$\rho(s_1, s_3) \le \rho(s_1, s_2) + \rho(s_2, s_3)$
This abstract definition of distance as a metric $\rho$ was first given by Fréchet in 1906, but despite its generalization of the notion of distance, it is too special in two respects. First, different metrics may end up furnishing the same notion of ‘openness’ or of neighborhoods for e.g. Euclidean space; thus mathematicians were led to formulate the definition of a topological space as a way of defining a more abstract structure that captures ‘openness,’ neighborhoods, limit points and continuity. (A closed set in topology is the complement of an open set; the closure of an open set is that set with all its limit points.) Second, some mathematical entities cannot be given a natural metric, but can be given a useful topological structure. The definition of a topological space raises the question whether a given topological space is rich (but not too rich: the discrete topology is uninformative) in open sets. If there are enough open sets to separate points, the space is called Hausdorff; if there are enough open sets to separate closed sets, the space is called normal. A space with a metric, $S$, can be turned into a topological space by taking as the open sets unions of balls, where a ball of radius $a$ ($a$ a real) about $s_0 \in S$ is defined as the set $[s \in S \mid \rho(s, s_0) < a]$. If the metric is the usual Pythagorean metric, the ball is the interior of a circle or sphere or n-sphere of radius $a$. All metric spaces turned into topological spaces this way are Hausdorff and normal.
Now we can turn to the representation of our sphere and torus as manifolds. A manifold is a Hausdorff space with additional structure, a set $\Phi$ of maps such that for each $s \in S$ there is a $\varphi_s \in \Phi$ that maps some open set containing $s$ homeomorphically into an open set in $\mathbb{R}^2$. Thinking of a surface as a manifold leads to thinking of it as articulated into cells or panes or the faces of a triangulation, the bits that are flat ‘enough’ to be mapped nicely to 2-dimensional Euclidean space, the plane. A manifold can be linearized locally: the problem then is to move from the local reduction to a global reduction, so as to extend some version of the nice properties that follow from linearization to the manifold as a whole. In order to get the maps to overlap with each other in a way that accommodates this extension, we must add further conditions governing what happens on the overlap, that is, on the nature of the maps in $\Phi$.
The manifold is called a $C^0$-manifold if the mappings overlap in a way that is continuous; $C^k$ if all partial derivatives of order $\le k$ exist and are continuous; $C^\infty$ if all partial derivatives of all orders exist and are continuous; and $C^\omega$ if it is real analytic. We represent our sphere and torus as $C^\infty$ manifolds, ‘smooth manifolds.’ Because smooth manifolds are so well-behaved, they have a tangent space (and a cotangent space) at every point and all the tangent spaces (and cotangent spaces) can be collected into tangent (and cotangent) bundles that allow us to define vector fields and differential 1-forms on the manifold. A construction called a Grassmann algebra or exterior algebra allows us to define the set of smooth k-forms on the manifold, and then the set of all smooth differential forms on the manifold. A smooth differential form $\omega$ on a smooth manifold $X$ is closed if $d\omega = 0$; it is exact if it is the differential of another form on $X$, that is, if $\omega = d\tau$ for some $\tau$. Every exact form is closed, since $d^2 = 0$.
We are now way out there in the universe of abstraction, but the path back to earth is surprisingly rapid; in the case at hand, it is really surprising because of the connections forged between the local properties and the global structure of the space $X$. The construction of homology and cohomology groups depends on a continuous map, a ‘boundary operator,’ in this case ‘$d$’, such that when the map is applied twice, the result is a ‘zero’ in the space. Call $Z^k(X, d)$ the vector space of closed k-forms on $X$, and $B^k(X, d)$ the space of exact k-forms on $X$; then $H^k(X, d)$ is defined as $Z^k(X, d)/B^k(X, d)$ and it is the kth De Rham cohomology group of $X$. Even though $Z^k(X, d)$ and $B^k(X, d)$ may be highly infinitary, the dimension of their quotient $H^k(X, d)$ is finite for a compact manifold and is called the kth Betti number of $X$. The dimension of the group $H^0(X, d)$, where $X$ is a smooth manifold, measures the number of connected components of the manifold and the dimension of the group $H^1(X, d)$ measures the number of ‘holes.’ Thus, the former group assigns the same number to both the sphere and the torus, and the latter group assigns different numbers to them.
Now we see the sphere or the torus represented by tangent and cotangent bundles, sets of smooth differential forms borrowed from analysis with a complicated algebraic structure (the Grassmann algebra), quotients of k-forms that may turn out to be groups of finite dimension, and, finally, integers that count components and holes of the original manifold. These representations work in tandem to pull up and away from the sphere and the torus into a highly abstract realm, which in turn precipitates us back down to the realm of whole numbers.
Though cohomology groups are defined in terms of the manifold structure of $X$, they are topological invariants: if two manifolds are homeomorphic, then they have isomorphic cohomology groups. Indeed these groups can be defined directly using only the topological structure of the manifold. Moreover, each smooth manifold $X$ (here the sphere or the torus) has been associated with a new algebraic invariant, the group $H^k(X, d)$, so that a smooth map between manifolds induces algebraic maps between these algebraic objects. Thus, difficult topological problems can be approached and sometimes solved by studying homology groups and crunching numbers.
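The distinction between closed and exact forms, and the way a non-trivial class in $H^1$ registers a ‘hole,’ can be illustrated numerically. The sketch below is not drawn from Singer and Thorpe and does not use the sphere or the torus: as an assumption made purely for simplicity, it takes the plane with the origin removed and the 1-form $\omega = (-y\,dx + x\,dy)/(x^2 + y^2)$. All function names are hypothetical. The form is closed (checked by finite differences), but its integral around a loop enclosing the hole is not zero, so it cannot be exact, since an exact form integrates to zero around every closed loop.

```python
import math


def P(x, y):
    """Coefficient of dx in omega = P dx + Q dy."""
    return -y / (x * x + y * y)


def Q(x, y):
    """Coefficient of dy in omega = P dx + Q dy."""
    return x / (x * x + y * y)


def d_omega_coefficient(x, y, h=1e-5):
    """Finite-difference estimate of dQ/dx - dP/dy, the coefficient of d(omega).

    A value near zero at a point indicates that omega is closed there.
    """
    dq_dx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dp_dy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dq_dx - dp_dy


def loop_integral(cx, cy, radius, steps=2000):
    """Integrate omega counterclockwise around a circle (midpoint rule)."""
    total = 0.0
    dt = 2 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t) * dt
        dy = radius * math.cos(t) * dt
        total += P(x, y) * dx + Q(x, y) * dy
    return total


around_hole = loop_integral(0.0, 0.0, 1.0)    # loop encloses the origin
away_from_hole = loop_integral(3.0, 0.0, 1.0)  # loop does not enclose it
```

Here `around_hole` approximates 2π while `away_from_hole` approximates 0: the local condition dω = 0 holds everywhere on the punctured plane, yet the global behavior of the integral detects the missing point.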
Another way of linearizing a manifold does so globally by bringing the whole thing into relation with a simplicial complex, though the ‘bringing into relation’ depends on the locally Euclidean structure, smoothness, and compactness of the manifold. A simplicial complex is most straightforwardly, though imprecisely, described as a concatenation of ‘simplices’ of different dimensions: points, line segments, triangles, tetrahedrons, and so forth. Two simplices that compose a simplicial complex must always intersect in a simplex; and all components of a simplex must also belong to the simplicial complex to which that simplex belongs. The ‘triangulation’ of a surface that is a smooth manifold, for example, is carried out by finding a homeomorphism from an appropriate simplicial complex to the manifold, whose restriction to each simplex is a nice map satisfying certain conditions. Every compact, smooth manifold can be smoothly triangulated. The obvious advantage of simplicial complexes is that they can be treated as a finite set of numbers: the number of their vertices, edges, faces, and so forth. They are combinatorial items.
Simplicial complexes, like manifolds, can also be represented by homology and cohomology groups, defined in terms of a boundary operator $\partial$ satisfying the condition $\partial^2 = 0$. Every simplicial complex $K$ can be used to define a group $C_l(K, G)$ ($G$ an arbitrary abelian group) using the l-dimensional (oriented) simplices of a simplicial complex in a certain construction called l-chains. The boundary operator $\partial$ maps (n + 1)-dimensional simplices to their n-dimensional boundaries in a certain way, and induces maps $C_{l+1}(K, G) \to C_l(K, G) \to C_{l-1}(K, G)$. (When $G$ is a field $F$, $C_l(K, F)$ is a vector space over $F$ whose dimension equals the number of l-simplices of $K$.)
The group $Z_l(K, G)$ consists of all the l-chains that are mapped to 0 by $\partial$ (called cycles); $B_l(K, G)$ consists of all the l-chains that are obtained by $\partial$ operating on some (l + 1)-chain (called boundaries); and $H_l(K, G)$ is the quotient group $Z_l(K, G)/B_l(K, G)$. It is the lth homology group of $K$ with coefficients in $G$. It turns out that the groups $H_l(K, G)$ depend only on the topology of $[K]$, where $[K]$ is the point set union of the open simplices of $K$, so that a homeomorphism from $[K]$ to $[L]$, where $L$ is another, suitable simplicial complex, induces appropriate isomorphisms between the associated homology groups. When $G = \mathbb{R}$ (the reals), the group $H_l(K, \mathbb{R})$ is also a vector space over $\mathbb{R}$ whose dimension is called the lth Betti number $\beta_l$ of $K$, and the Euler characteristic of $K$, $\chi(K)$, is equal to the sum $\sum_l (-1)^l \beta_l$, where $l$ ranges from 0 to the dimension of $K$.
From these stipulations, it follows that the Euler characteristic of $K$ also equals the sum $\sum_l (-1)^l \alpha_l$ (where $l$ ranges from 0 to the dimension of $K$) in which the $\alpha_l$ denote the number of l-simplices in $K$; that is, the Euler number of $K$ is equal to the number of vertices − the number of edges + the number of 2-faces − et cetera. This brings us down to earth again, to the sphere and the torus, though this time we only have to descend from the lower stratosphere of combinatorics. If $[K]$ is homeomorphic to a connected, compact, orientable 2-dimensional manifold, then $\beta_0 = 1$ and $\beta_2 = 1$, so that $\chi(K) = 2 - \beta_1$ or $\beta_1 = 2 - \chi(K)$. Passing via simplicial complexes, it can be shown that any such surface is homeomorphic to a sphere with a certain number of handles, and $\tfrac{1}{2}\beta_1$ ($\beta_1$ is always even) gives the number of handles. Clearly, a torus is homeomorphic to a sphere with one handle. In sum, the homology groups completely determine the homeomorphism class of connected, compact, orientable surfaces. However, Singer and Thorpe remark, for higher dimensional manifolds (and for surfaces that fail to be connected, compact, or orientable) the homology groups are not so informative.
One way to exhibit a homeomorphism between a smooth manifold $X$ and a simplicial complex $K$ is to find a map $h$ which, when restricted to each simplex of $K$, maps nicely to a smooth submanifold of $X$. The triple $(X, K, h)$ is called a ‘triangulation’ of $X$, and it can then be used to link the differential operator $d$ and smooth l-forms on the manifold to the boundary operator $\partial$ and oriented (l + 1)-simplices of the associated simplicial complex in a version of Stokes’ Theorem.²⁷ (To do this requires the technical trick of passing from the homology theory of simplicial complexes to its dual cohomology theory.)
²⁷ Stokes’ Theorem says that if $M$ is a smooth manifold of dimension $n$ and $\omega$ is a certain nicely behaved kind of differential $(n-1)$-form on $M$, and if $\partial M$ denotes the boundary of $M$ with its induced orientation, then $\int_M d\omega = \int_{\partial M} \omega$. Stokes’ Theorem can be considered as a generalization of the fundamental theorem of the calculus, which is the case where the manifold is an interval of $\mathbb{R}$, the real line.
This sets up the proof of De Rham’s Theorem, which asserts that if $(X, K, h)$ is a smoothly triangulated manifold, then the map from the $l$th cohomology group of $X$ to the $l$th cohomology group of $K$ is an isomorphism for each $l$ between 0 and the dimension of $X$. It is important to see the intentionality built into this
homeomorphism and the group isomorphism that goes along with it. The theorem depends on homology groups that isolate focal objects, helping to pinpoint the structure the mathematician is interested in, while at the same time forging the link between local information and global information. Moreover, the simplicial complexes are studied because they contribute to an understanding of manifolds, and not vice versa, even though De Rham’s Theorem is expressed in terms of an isomorphism. Because simplicial complexes are combinatorial items, the process of triangulation in concert with De Rham’s Theorem makes some aspects of some manifolds a matter of numerical computation. (This should remind us of the multiply reductive pattern of reasoning examined in Chapter 5 that took us from the study of molecules to the character tables.) I reproduce in Figures 9.5a, b, and c part of the set-up running from pages 161 to 164 for the full proof of De Rham’s Theorem in Singer and Thorpe; the proof itself continues on to page 173. On these pages the reader can see the conjunction of diagrams, and algebraic and analytic notation, embedded in the expository prose. Although the proof itself takes almost ten pages, the choice of objects and the representations chosen for those objects is what drives the proof. We should notice how hard the topologists had to work to arrive at the series of group isomorphisms demonstrated by De Rham’s Theorem. The analysis of well-understood canonical objects deploys a variety of representations, in which reductions link the representations, and novel constructions elaborate those linkages into groups and algebras. The constructive and reductive procedures in turn often invoke, or assume as accessible and well-understood, further canonical objects: the Euclidean plane, the real number line, the circle, the triangle. 
The process involves many different kinds of mappings, and shifts the focus of its interest from one set of related objects to another, depending on the success and interest of the mappings. Isomorphisms emerge from this process, but they are highly constrained when they do emerge, and what they mean depends on an understanding of the problems they are designed to solve, and the most important objects that figure in those problems.²⁸
²⁸ I realize that this exposition may be difficult to follow. I could not make it as transparent as the exposition of, for example, Galileo’s demonstration given in Chapter 1, without spending more pages on preliminaries than I have room for in this book. That would be the case for any non-trivial twentieth century theorem, I think. Since I must, for philosophical reasons, deal with modern mathematics, I can only choose one of the theorems I understand best and that suits my philosophical purposes, and then deal with it in a summary fashion while offering some pertinent definitions in the exposition and the glossary, and pointing the interested reader in the direction of the textbook from which I learned about it long ago. Joseph Mazur offered the following useful orientation:
Mathematicians can make a wonderfully surprising connection between combinatorial geometry of simplicial complexes and the differential picture of some vector field on the manifold. The simple one-dimensional example is the Fundamental Theorem of Calculus, where a function on the boundary of the interval (the vertices being the endpoints of the interval) corresponds to an integral (a 1-form) of a function over the interval. Green saw this and generalized it to functions on planes; Stokes probably saw what Green had done and generalized it to surfaces in 3-space; and De Rham just generalized all this to get his theorem ... Grothendieck, in the 1960s, built a whole branch of mathematics (a theory of sheaves generalizing algebraic varieties) out of this and used it in algebraic geometry to build a Galois theory for algebraic varieties (private correspondence).
Figure 9.5a. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 162
Figure 9.5b. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 163
Figure 9.5c. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 164
9.6. Nancy Cartwright on the Abstract and Concrete
When we try to speak truly about things, we ask language (broadly construed) to perform at least two different roles: to indicate what we are talking about, and to analyze it by discovering its conditions of intelligibility, its rational requisites, its reasons for being what it is. The traditional schema for a proposition used by logicians is ‘S is P,’ and in general the subject term pertains to the first role, and the predicate term to the second. What is represented is the subject, and the representation is the predicate. In the two examples drawn from Singer and Thorpe that I traced in the last two sections, we find the first role played both by icons and by standard notations that have become proper names of certain objects; importantly, these designations are always accompanied by natural language that explains them and relates them to more abstract levels in the discourse. Mathematicians don’t ask icons to do the work of designating mathematical things all by themselves. We find the second role played by a repertoire of representations that, to be most effective, act in tandem to carry out a Leibnizian analysis. Natural language then also helps to explain the relation between the different modes of representation, and the controlled ambiguity of a single mode when the relation between its disparate uses must be clarified. The important asymmetry between S and P in the assertion of a proposition has been covered over by twentieth century logic, which treats them both extensionally as sets. But language used for S typically plays the role of referring; language used for P typically plays the role of analyzing; the use of language is different in each case, and the assertion of a proposition juxtaposes them.
In The Dappled World: A Study of the Boundaries of Science, Nancy Cartwright’s meditation on why we must reflect more deeply on the relationship between the abstract and the concrete bears directly on this point.²⁹ She uses her argument to talk about episodes in modern physics, yet clearly what she says has a more general import for epistemology. She argues,
First, a concept that is abstract relative to another more concrete set of descriptions never applies unless one of the more concrete descriptions also applies. These are the descriptions that can be used to ‘fit out’ the abstract description on any given occasion. Second, satisfying the associated concrete description that applies on a particular occasion is what satisfying the abstract description consists in on that occasion.³⁰
²⁹ Cartwright, The Dappled World, ch. 2.
³⁰ Cartwright, The Dappled World, 49.
³¹ Cartwright, The Dappled World, 40.
In other words, abstract descriptions can only be used to say true things if they are combined with concrete descriptions that fix their reference in any given situation. This insight holds for mathematics as well as for physics. When abstract schemata are applied in mathematics, their successful application depends on the mathematician’s ability to find a useful concrete description for the occasion, which will mediate between complex mathematical reality and the general theory. This does not, however, entail that abstract concepts are no more than collections of concrete concepts. Cartwright argues, ‘The meaning of an abstract concept depends to a large extent on its relations to other equally abstract concepts and cannot be given exclusively in terms of the more concrete concepts that fit it out from occasion to occasion.’³¹ The more abstract description of the situation adds important information that cannot be ‘unpacked’ from any or even many of the concrete descriptions that might supplement it, or from our awareness of the thing or things successfully denoted. Nor is the relation between the abstract description and the concrete descriptions the same as the Aristotelian relation between genus and species, where the genus-concept is arrived at by subtracting content from the species-concept. Likewise, the more concrete descriptions have meanings of their own that are to a large extent independent of the meaning of any given abstract term they fall under; we cannot deduce the
content of the concrete descriptions by specifying a few parameters or by plugging in constants for variables in the abstract description. Here is another way of putting the insight. Abstract terms can only be used to say something true when they are combined with more concrete locutions in different situations that help us to fix their reference. And there is no complete ‘sum’ of such concrete locutions that would be equivalent to the original term. Conversely, as Leibniz argues, concrete terms can be used to say something true only when they are combined with more abstract locutions that express the conditions of intelligibility of the thing denoted, the formal causes that make the thing what it is and so make its resemblance to other things possible. And there is no complete ‘sum’ of the conditions of intelligibility of a thing.³² We cannot totalize concrete terms to produce an abstract term and we cannot totalize abstract terms to produce a concrete term that names a thing; and furthermore we cannot write meaningfully and truthfully without distinguishing as well as combining concrete and abstract terms. So we are left with an essential ambiguity that results from the logical slippage that must obtain between the concrete and the abstract. There is an inhomogeneity that cannot be abolished, which obtains between the abstract terms that exhibit and organize the intelligibility of things, and the concrete terms that exhibit how our understanding bears on things that exist in the many ways that things exist. We need to use language that both exhibits the ‘what’ of the discourse, and identifies the formal causes, the ‘why’ of the things investigated. To do so, we need to use different modes of representation in tandem, or to use the same mode of representation in different ways, to use it ambiguously. I have tried to show that the pursuit of knowledge about topological spaces, in the particular case of De Rham’s Theorem, proceeds in this way. 
Set theory, real analysis and abstract algebra provide modes of representation for topology that are productive when combined with other sorts of representations, including icons and natural language. And this is true for fundamental definitions that must be given at the beginning of any investigation of algebraic topology, as well as of particularly important theorems.
³² Grosholz and Yakira, Leibniz’s Science of the Rational, 56–72.
10 Logic and Topology
Logic is usually presented as a canon of argument, a codification of the rules of thought. It formalizes the ways in which we move from premises that present evidence to the conclusions they support, and exhibits the ways that are most reliable because they transmit truth, or perhaps high probability. Understood in this way, logic has no special subject matter; thinking can be thinking about anything. However, from the very beginning of the Western tradition of logic, it has been clear that some patterns of reasoning resist formulation in, for example, Aristotle’s syllogistic, which is a logic of terms: reasoning about propositions, relations, temporal succession, modalities, counterfactuals, and individuals, to name a few. Modern predicate logic, which begins with the work of Gottlob Frege taken over by Bertrand Russell and Alfred North Whitehead, is not one system of higher order logic but rather an open-ended family of theories. Each theory consists of axioms couched in logical and extralogical vocabulary, and ‘purely logical’ rules of inference to specify the licit pathways from axioms to theorems. Thus, reconstructions of arithmetic in terms of predicate logic have special terms like ‘0’ and ‘1’ and the successor operator S, as well as operations like ‘+’ and ‘×’. These special terms have an ambiguous status when considered as part of logic, since logic is not supposed to be tied to any special subject matter. Bertrand Russell claimed to show that the special vocabulary of arithmetic could in fact be rewritten in ‘purely logical’ terms.¹ Then, the reductionist might argue with Charles Chihara, arithmetic is also shown to have no subject matter.²
¹ Principia Mathematica (Cambridge: Cambridge University Press, 1910/63), vol. 1, part II, sec. A.
² C. Chihara, Ontology and the Vicious-Circle Principle (Ithaca: Cornell University Press, 1973); an even more emphatic structuralism is presented in A Structural Account of Mathematics (New York: Oxford University Press, 2004).
By contrast, I would argue that what emerges here is a second role for logic. It becomes a collection of not only rules of thought, but also representations for other mathematical objects and procedures. In the latter role it exhibits the deductive structure of special theories (in model theory) as well as the fine detail of definitions (in definability theory, which examines the logical complexity of different mathematical items). This added role, however, can lead to a philosophical misunderstanding, the logicist misreading of Leibniz. Russell and Couturat read Leibniz as if he claimed that a single ‘universal characteristic’ would prove to be the optimal representation of mathematical things. But Russell’s elaborate formulas for the natural numbers in I.II.A of Principia Mathematica are mathematically inert, ‘lifeless,’ to use Angus Macintyre’s adjective. Inter alia, they do not exhibit the most important feature of the natural numbers (most important to number theory at any rate), their unique decomposition into prime factors, any more than the stroke notation does, and for the same reason: they lack the economical compound periodicity of Arabic numerals. Different modes of representation in mathematics bring out different aspects of the items they aim to explain and precipitate with differing degrees of success and accuracy. Moreover, Russell’s definitions of 0, 1, and 2 and so forth as logical formulas also define each number as a set of sets (so the unit 1 is, for example, the set of all one-membered sets); there are two problems with this. The first is that the definition presupposes the availability of what it defines, that is, the unit; and the second is that the set of all one-membered sets is not a set. The mathematical success of Russell’s definitions, it seems to me, was negative; their incoherence led to deeper mathematical and philosophical reflection on the nature of sets.
Viewed in this representative role, first order predicate logic does in fact have an affinity with a special subject matter, depending on how we understand ‘aboutness’: that subject matter is sets of integers on the one hand, or recursively defined well-formed formulas on the other hand. First order predicate logic represents the universe of discourse of any of its special theories as if it were a set of discrete things with no pertinent internal structure; these things may then be collected into sets (represented by predicates) and then into increasingly ‘logically complex’ sets by projection and complementation using the quantifiers, recorded by the increasing complexity of the well-formed formulas. In sum, first
order predicate logic is not very good at representing the natural numbers themselves; it is better at representing sets of integers in one sense or well-formed formulas in another, and exhibiting something useful about those sets or well-formed formulas. Its representation of the natural numbers, we might say, is highly symbolic and not allied in any useful way with more iconic modes of representation of them; while its representation of sets of integers and well-formed formulas is relatively iconic. This reflection makes it seem like an irony that the key to the greatest meta-mathematical result of the last century was the representation of the well-formed formulas of first order predicate logic by the natural numbers, a representation whose success depends on the unique prime decomposition of the latter.
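The dependence of such a numbering on unique prime decomposition can be made concrete. The following Python sketch shows one textbook-style scheme, not a reconstruction of any particular historical coding: the symbol codes in `CODES` are hypothetical and chosen only for illustration, and `encode` and `decode` are illustrative names. A formula is sent to the product of the successive primes raised to the codes of its symbols, and it is recovered by factoring.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


# Hypothetical symbol codes, chosen for illustration only.
CODES = {"0": 1, "S": 2, "=": 3, "+": 4, "(": 5, ")": 6}
SYMBOLS = {code: symbol for symbol, code in CODES.items()}


def encode(formula):
    """Map a string of symbols to the product of p_i ** code(symbol_i)."""
    number = 1
    for p, symbol in zip(primes(), formula):
        number *= p ** CODES[symbol]
    return number


def decode(number):
    """Recover the string by reading off the exponent of each successive prime."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(SYMBOLS[exponent])
    return "".join(symbols)


if __name__ == "__main__":
    n = encode("0=0")
    print(n, decode(n))
```

Decoding is well defined only because each natural number has exactly one prime factorization; a notation for the numbers that hid this feature would give the coding nothing to work with.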
10.1. Penelope Maddy on Set Theory
There are at least two reasons why there cannot be a complete speech about mathematical things, any more than there can be a complete speech about human actions. (And this is no more a reproach to the reality of mathematical things than it is to the reality of who we human beings are and what we do.) One is that there are many different kinds of mathematical things, which give rise to different kinds of problems, methods, and systems. The other is that mathematical things are investigated by Leibnizian analysis, a historical process in which some things (or features of things) that were not yet foreseen are discovered, and others forgotten. Indeed, these two aspects of mathematics are closely related. For when we solve problems, we often do so by relating mathematical things to other things that are different from them, and yet structurally related in certain ways, as when we generalize to arrive at a method, or exploit new correlations. We make use of the internal articulation or differentiation of mathematical domains in order to investigate the intelligible unities of mathematics. To put it another way: just as there is a certain discontinuity between the conditions of solvability of a problem in mathematics and its solution (as Cavaillès noted), so there is a discontinuity between a thing and its conditions of intelligibility (as Plato noted). An analysis results in a speech, which both expresses, and fails to be the final word about, the thing it considers. But perhaps I have been too restrictive by discussing number theory, geometry and topology, and should rather consider the claims of a more
encompassing theory, like set theory. Penelope Maddy’s defense of set theory as a foundation for mathematics in the first section of Part I.2 of her Naturalism in Mathematics is especially instructive, for she renounces the traditional reductive claims of set theory, in favor of a defense that cites its usefulness as a resource in problem-solving.
The identification of the natural numbers with, say, the finite von Neumann ordinals is not claimed to reveal their true nature, but simply to provide a satisfactory set theoretic surrogate; thus, no problem arises from the observation that more than one satisfactory set theoretic surrogate can be found. Likewise, the identification is not understood as a prelude to the repudiation or elimination of the original mathematical objects; though they may well drift off into irrelevancy for the most part, no such strong ontological conclusion is drawn.³
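The remark that more than one set theoretic surrogate is available can be illustrated in a few lines. In the Python sketch below, `von_neumann` builds the finite von Neumann ordinals named in the quotation; `zermelo` builds a second, different encoding that the passage does not mention and that is introduced here purely as an assumed example of an alternative surrogate. Both function names are illustrative.

```python
def von_neumann(n):
    """Finite von Neumann ordinal: 0 is the empty set, n + 1 is n together with {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}
    return s


def zermelo(n):
    """Alternative encoding (an assumption for contrast): 0 is the empty set, n + 1 is {n}."""
    s = frozenset()
    for _ in range(n):
        s = frozenset({s})
    return s


if __name__ == "__main__":
    # The von Neumann surrogate for 3 has three members; the other has one.
    print(len(von_neumann(3)), len(zermelo(3)))
    # The two surrogates for 2 are different sets.
    print(von_neumann(2) == zermelo(2))
    # Only the von Neumann encoding makes "2 < 3" a matter of membership.
    print(von_neumann(2) in von_neumann(3))
```

The two encodings agree on 0 and 1 and diverge from 2 onward, which is one way of seeing why neither can be taken to reveal what the number 2 is.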
So the benefits of set theory are not ontological; Maddy also concedes that they are not epistemological: set theory is not provably consistent, and is not ‘more certain’ than the other branches of mathematics for which it provides foundations. By contrast, Maddy goes on to argue that the benefits of set theory are ‘mathematical rather than philosophical.’ Granting that the heterogeneity of mathematical domains is essential to research, she writes, ‘I think it cannot be denied that mathematicians from various branches of the subject (algebraists, analysts, number theorists, geometers) have different characteristic modes of thought, and that the subject would be crippled if this variety were somehow curtailed.’⁴ She also invokes the analogy with the sciences, as I have done:
Consider the claim that everything studied in natural science is physical; it doesn’t follow from this that botanists, geologists, and astronomers should all become physicists, should all restrict their methods to those characteristic of physics. Again, to say that all objects of mathematical study have set theoretic surrogates is not to say that they should all be studied using set theoretic methods.⁵
³ P. Maddy, Naturalism in Mathematics (Oxford: Clarendon Press, 1997), 23.
⁴ Maddy, Naturalism in Mathematics, 33.
⁵ Maddy, Naturalism in Mathematics, 34.
Thus, Maddy defends set theory by pointing out that it serves two functions. First, it helps us to generalize and systematize, like abstract algebras, logic, or category theory. Second, when objects from other domains are correlated with sets, we can solve formerly unsolved problems about
those objects. And these claims are true. But I think that Cavaillès would dispute, as would I, her inference from them:
The force of set theoretic foundations is to bring (surrogates for) all mathematical objects and (instantiations of) all mathematical structures into one arena, the universe of sets, which allows the relations and interactions between them to be clearly displayed and investigated. Furthermore, the set theoretic axioms developed in this process are so broad and fundamental that they do more than reproduce the existing mathematics; they have strong consequences for existing fields and produce a mathematical theory that is immensely fruitful in its own right. Finally, perhaps most fundamentally, this single, unified arena for mathematics provides a court of final appeal for questions of mathematical existence and proof: if you want to know if there is a mathematical object of a certain sort, you ask (ultimately) if there is a set theoretic surrogate of that sort; if you want to know if a given statement is provable or disprovable, you mean (ultimately), from the axioms of the theory of sets.⁶
Maddy’s admission that mathematics ‘would be crippled if this variety [the variety of its domains] were somehow curtailed’ does not sit well with her claim here that the translation of their objects and methods into a set theoretic idiom ‘allows the relations and interactions between them to be clearly displayed and investigated.’ In most cases, such a translation would paralyze research. In a few cases, correlating numbers with set theoretic entities is helpful for solving problems, but only when the number and its set theoretic analogue are both present in the problem-solving situation. In many cases, using set theory as a generalizing structure is very helpful, as when a number theorist considers the set of all the natural numbers; but this is not a translation of the situation into set theoretical terms; rather, it is the integration of set theory into a number theoretical situation. Number theory can be generalized and reorganized by the superposition of the algebra of arithmetic, various abstract algebras, topology, set theory, and category theory, to name a few; but in this line-up set theory has no special pride of place. The unifying and clarifying virtues of various generalizing structures have all had their champions: abstract algebras (Bourbaki), predicate logic in the context of model theory (Robinson), category theory (Lawvere). In this company, Maddy’s claim that set theory is ‘the ultimate court of appeal’ seems unsubstantiated.
⁶ Maddy, Naturalism in Mathematics, 26.
10.2. A Brief Reconsideration of Arithmetic
Iconic representations are supposed to resemble what they represent; this claim is usually taken to mean that they resemble things in a visual and spatial manner. I have argued that iconic representations in geometry, as in chemistry, are typically combined with symbolic representations and natural language instructions that explain the meaning of the icons, and to which the icons lend meaning, at the level of syntax, semantics, and pragmatics. In many cases, the iconic representation is indispensable. This is often, though not always, because shape is irreducible; in many important cases, the canonical representation of a mathematical entity is or involves a basic geometrical figure. At the same time, representations that are ‘faithful to’ the things they represent may often be quite symbolic, and the likenesses they manifest may not be inherently visual or spatial, though the representations are, and articulate likeness by visual or spatial means. Now I would like to raise the question, what might iconic representation mean in arithmetic? What does it mean to ‘picture’ a number? What does it mean to ‘picture’ a number system? At first it might seem as if the representation of numbers is never iconic, but this rather Kantian assumption is not justified, and I counter it by showing that there are degrees of iconicity in the representation of numbers. A natural number is either the unit or a multiplicity of units in one number. The representation of such a unified multiplicity is more iconic when the representation itself involves multiplicity: ////// is more iconic than ‘6’ or ‘six’, because ////// exhibits the multiplicity of the number 6; its multiplicity can be ‘read off’ the representation. In the case of six strokes, the unity of the number six is indicated (iconically) by the way the six strokes are grouped together and isolated from other things on the page.
Six strokes scattered around the page or the chapter would not be any kind of representation of the number 6, unless accompanied by special instructions saying how to find them and then to put them together (in thought or concretely). The putting together and isolating is an aspect of graphic representation that stands for the intelligible unity of the number: natural numbers are unified multiplicities that we use to count with. Thus even the representation ‘6’ is iconic in this respect: it represents the unity of the number six. As we saw in earlier chapters, the graphic space around
the representation of a molecule, as well as a schematic representation of its symmetries by a geometrical figure, indicates that we are investigating a unified whole. An even more iconic representation of the concurrent unity and multiplicity of a natural number is the following, which is the kind of representation used by Leibniz in various manuscript discussions of the foundations of arithmetic.⁷
[Figure not reproduced: the number 6 represented as a line divided into six units.]
⁷ E. Grosholz, ‘L’analogie dans la pensée mathématique de Leibniz’ in L’Actualité de Leibniz: Les deux labyrinthes, eds. D. Berlioz and F. Nef (Stuttgart: Steiner, 1999), Studia Leibnitiana, Supplementa 34, 511–22.
⁸ C. Parsons, ‘The Structuralist View of Mathematical Objects,’ The Philosophy of Mathematics, ed. W. D. Hart (Oxford: Oxford University Press, 1996), 272–309. I have also learned a great deal from Michael Resnik’s Mathematics as a Science of Patterns (Oxford: Oxford University Press, 1997). Indeed, I’ve developed my own position over the last twenty years in part by arguing against Donald Gillies’ rather naturalist empiricism and Michael Resnik’s structuralism, and in the meantime have come to value their friendship. The footnotes in this book don’t really do justice to my intellectual debt to both these philosophers.
This representation exhibits the unity of the number more strongly and positively than the set of strokes, because of the irreducible unity of the continuum represented by the line. It insists upon the unity of a number. Insofar as a natural number is a multiplicity of units taken together as one intelligible thing, iconic representations of a natural number must represent the distinctness of the component units by spatial side-by-side-ness while its unity may be represented strongly by the continuum (whose own unity is so strong) or more weakly by the spatial isolation of the units taken together. One thing inter alia that this example shows is that representation cannot be explained in terms of merely physical tokens, even in the very simplest cases. (Here I echo a point made by Charles Parsons in his essay ‘The Structuralist View of Mathematical Objects,’ though I of course do not espouse structuralism.)⁸ The Leibnizian representation just given is, in its iconic intent, not a representation of the number 6 by a merely physical line (which is not strictly continuous or even straight on either the macroscopic or microscopic levels), but by the continuum, an intelligible thing whose unity it invokes and requires. And the spatial side-by-side-ness that represents multiplicity is not just physical but also intelligible spatiality,
a condition of the difference of different things. Elsewhere, I have argued that this representation is a standing structural analogy designed to exhibit a condition of the intelligibility of a natural number.⁹ The two representations of 6 just given will not, however, properly represent the additive structure of the natural numbers. This is because each in its own way represents the natural numbers as an uninflected sequence, stretching out towards infinity:
/, //, ///, ////, /////, //////, ...
Because every natural number is distinguished from its predecessor in exactly the same way (by the spatial concatenation of a unit segment or of /), every fact of arithmetic (like 2 + 4 = 6) will have to be recorded separately: there will be no possibility of a finite arithmetical table. In the first case, the arithmetical ‘table’ would just be the line divided into units, stretching out to infinity, all over again. In the second case, it would be a stroke table that extends infinitely in both dimensions.
+     | /          //         ///        ////       /////      ...
/     | //         ///        ////       /////      //////
//    | ///        ////       /////      //////     ///////
///   | ////       /////      //////     ///////    ////////
////  | /////      //////     ///////    ////////   /////////
///// | //////     ///////    ////////   /////////  //////////
...
⁹ Grosholz, ‘L’analogie dans la pensée mathématique de Leibniz.’
The problem is that both these tables are unsurveyable and uninformative. In his book, Laws and Symmetry, Bas van Fraassen defends his semantic approach to philosophy of science by arguing that scientific investigations are centrally concerned with models, and that considerations of syntax aren’t meaningful apart from semantics. The construction of models, he argues at length, depends on the selection of significant features of a system
and associated symmetries or periodicities, that is, a group of transformations that leave that feature invariant.¹⁰ I have argued earlier in this book that such semantic considerations must be supplemented by pragmatic considerations arising from the problem-solving context of use; models qua representations shade into nomological machines qua interventions, and paper tools play an important role in this middle ground. The situation here can be explained in these terms. Arabic numerals, considered as a representation of the natural numbers, introduce periodicities based on 10, $10^2$, $10^3$, and so forth, which are compounded or superimposed by means of horizontal and vertical conventions of bookkeeping. Thus we add:
123 + 237 360 Imagine if along a wall, for example, we recorded 123 strokes, and then recorded another 237 strokes, and then looked at a ‘big enough’ table like that given above to see the result of adding 123 + 237. It would be highly determinate in one way (we could see all the strokes) and highly indeterminate in another way (what systematic sense can be made of it?). My point is that the notation of Arabic numerals has a conceptual function that is not merely syntactic, but semantic and pragmatic as well: it creates a model of the natural numbers that precipitates a nomological machine, furnishing a 10 × 10 table for arithmetic, which summarizes most humanly relevant arithmetic facts of addition in 100 entries. It also exhibits, in virtue of its compounded 10n periodicity, patterns that form the basis for problems concerning natural numbers, and their solutions. This is one reason why notation is so important in mathematics, because of its role in creating models and precipitating nomological machines. Philosophers of mathematics have not clearly recognized this role perhaps because they have been so fixed on symbolic representation and so inattentive to iconic representation, and the iconic (and indexical) aspects of symbolic representations. This has also led them to posit an artificial ¹⁰ Van Fraassen, Laws and Symmetry, ch. 10 and 11.
disjunction between syntax, semantics, and pragmatics. Iconic representations need not be pictures of things with shape, though of course they often are; they may also be representations that exhibit the orderliness that makes something what it is. Arabic numerals, by exhibiting a certain multiply-periodic structure in the natural numbers, construct a model of them that allows for the articulation and solution of problems relating to the operation of addition. Here the role of syntax, semantics, and pragmatics cannot be disentangled; arithmetic is a domain in which form has content. Note that this doesn’t mean that other notations aren’t possible for the natural numbers, or that they can be identified with any one ‘best’ notation, or that there is a set of all possible notations, or that the notations exhaust what the natural numbers are. The ‘ruler’ representation does not allow for the representation of multiplication at all. The stroke representation will represent the factorization of a natural number if we introduce conventions of vertical as well as horizontal grouping, but that introduction itself precipitates periodicities in the internal structure of the number: Is // repeated? Is /// repeated? How often? This convention iconically represents something about the multiplicative structure of a number, since multiplication is the iteration of addition.

4    // //
5
6    // // //
     /// ///
7
8    // // // //
     //// ////
9    /// /// ///
But what would a multiplication table look like in the stroke notation? Like the arithmetic table, it would go on and on, with an infinite number of entries, determinate but uninformative.

[Table: the beginning of a multiplication table in stroke notation, with rows and columns headed /, //, ///, ////, and continuing indefinitely.]

Note that the entries

// // //

and

/// ///
are distinct, and there is no structural connection made between them. By contrast, Arabic numerals furnish a 10 × 10 table for multiplication (where 2 × 3 is immediately 3 × 2, because they are both 6), and enhance the investigation of that most important feature of the natural numbers, the possibility of expressing each one as a unique product of primes. This is, of course, the insight on which much of number theory turns. Moreover, a great deal of modern number theory also consists of discerning and then imposing further symmetries or periodicities that hold of the natural numbers as well as further number systems in which they may be embedded. The notation makes possible procedures that allow for the investigation of the natural numbers, in terms of difficult but answerable problems. Another way to put my point is this: the natural numbers, represented by e. g. the stroke notation, are a highly determinate but infinitary domain, and insofar as they are infinitary, they are intractable. No problems can be posed with respect to them. This doesn’t mean that they aren’t intelligible; they are potentially intelligible in many ways, but we have to render them intelligible in at least a few ways in order to pose and solve problems about them. Our mathematical notations help us to do this; through the discernment and articulation of symmetries, they constitute models that are finitary and nomological machines that help us to solve problems. By saying this, I hope to avoid both structuralism on the one hand, and logicism as a kind of dogmatic Platonism on the other. I want to avoid an account that says that the natural numbers are empty placeholders, as well as an account that says that everything that is true about the natural numbers is always already true, in some big theory in the sky. 
The natural numbers are so determinate that whatever comes to be true of them will be necessarily true of them, but there is no sum total of all the true things that can be said of them. Different notations reveal different aspects of things. A similar argument could be made concerning the continua of geometry, whose infinitary determinateness is made finitary and tractable by geometrical figures, and the diagrams, lettering, and instructions of geometers that record them.
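The contrast drawn above between the finitary addition table of Arabic numerals and the unstructured stroke notation can be made concrete in a short sketch. It is only an illustration under stated assumptions: the names `ADD_TABLE`, `add_arabic`, and `add_strokes` are hypothetical, and the column-by-column procedure is the ordinary schoolbook algorithm, not anything specific to the argument of this chapter.

```python
# Illustrative sketch only; all names here are hypothetical.

# The 10 x 10 addition table: 100 entries, one for every pair of digits.
ADD_TABLE = {(a, b): a + b for a in range(10) for b in range(10)}


def add_arabic(x: str, y: str) -> str:
    """Add two decimal numerals column by column, using only ADD_TABLE and a carry."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    carry = 0
    digits = []
    # Work from the rightmost column (units) to the leftmost.
    for dx, dy in zip(reversed(x), reversed(y)):
        total = ADD_TABLE[(int(dx), int(dy))] + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


def add_strokes(x: str, y: str) -> str:
    """Add two stroke numerals by concatenation: determinate, but without structure."""
    return x + y


print(add_arabic("123", "237"))                  # 360
print(len(add_strokes("/" * 123, "/" * 237)))    # 360
```

Both procedures give the same number, but only the first works from a finite table; the second simply produces a longer row of strokes that still has to be counted.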
10.3. The Application of Logic to General Topology

These observations explain a peculiar feature of the fate of mathematical logic in the twentieth century. Very few mathematicians were interested in Russell’s project of rewriting the objects of other areas of mathematics in the notation of mathematical logic. However, they had some interest in the results of mathematical logic as a new domain with its own peculiar objects, and in the study of various mathematical domains in terms of patterns of inference, because different kinds of items leave their traces on the patterns of inference about them. (So logic, at least pursued as model theory, isn’t so pure after all.) Frege’s and Russell’s notation for predicate logic was developed in relation to concepts, where a concept is conceived extensionally as the set of discrete items falling under the concept; and in relation to propositions, where what might count as a proposition is formalized in terms of recursively defined well-formed formulas. This notation, not surprisingly, thus helped to precipitate set theory on the one hand and the theory of recursive functions on the other. It also proved strikingly unilluminating when applied to other mathematical domains that were already well supplied with their own models and modes of representation: number theory, geometry, topology, analysis, and so forth. In what follows, I will give examples that illustrate both sides of this particular coin: the initially rather fruitless application of logic (in this case, the propositional logic that underlies predicate logic) to general topology; and the use of recursion theory to classify the hierarchy of sets in topological contexts. In mathematics, syntax, semantics, and pragmatics (formal languages, the models that construct the determinate yet infinitary as something finite and thinkable, and the problem-solving strategies that develop languages and models) are so intertwined that the fate of mathematical logic is not surprising.
The discipline of model theory plays a special role within mathematical logic, however, to which I will return at the end. Early twentieth-century mathematical logic, under the influence of Hilbert’s foundational program, was preoccupied with first order theories and computability. Mathematicians approached problems in the field of logic through the methods of recursion theory and the study of finitistic, combinatorial methods. This approach was quite fruitful, as the results
of Löwenheim, Skolem, and Gödel show. However, it is also true that between Gödel’s results in the 1930s and Cohen’s results in the 1960s, the development of mathematical logic was slow. While the restriction of methods of proof to elementary finitistic ones may have provided clarification of certain aspects of logic, in other respects it introduced artificial complications and limited the development of logic.¹¹ Because in mathematics the modes of representation that play the role of syntax also precipitate models and problem-solving strategies, it is not surprising that mathematical logic, despite its claims to universality, was in fact tied to a certain domain of mathematical items. However, another way of viewing logic revitalized the field in the work of Tarski and Stone. The set of all well-formed formulas of a formalized theory in predicate logic can be viewed as an algebra. When the formulas have been ‘modded out’ (organized symmetrically or periodically) into appropriate equivalence classes, they become one of a variety of lattices: Boolean, pseudo-Boolean, quasi-Boolean, topological Boolean, and so forth. This alternative interpretation thus views the formulas of logical calculi as mappings into certain lattices, which is a generalization of the truth table method, moving from two truth values, to an arbitrary finite number of truth values, to an infinite number of them. Topology enters here, because the open sets of a topological space form a (usually infinite) lattice to which formulas can be mapped; thus a topological space might be considered as an interpretation, a lattice of truth values, for logic. The approach of Frege and Russell, so fastened on arithmetic, showed no obvious rapprochement between logic and topology, but this algebraic approach made such a rapprochement seem promising, with topology providing a kind of concrete semantics for various logics.
And indeed it did eventually prove fruitful, shedding new light on previous results in mathematical logic, though its usefulness for topology is not so clear. The completeness of predicate logic was proved to be equivalent to Stone’s Representation Theorem on the representation of Boolean algebras. Gödel’s Completeness Theorem for predicate logic was proved equivalent to a modification of Stone’s theorem, as well as a result of the Baire Theorem on sets of the first category in topological spaces.¹²

¹¹ H. Rasiowa and R. Sikorski complain about this finitist tendency in logic in the preface to The Mathematics of Metamathematics (Warsaw: Polish Scientific Publishers, 1963).
¹² Rasiowa and Sikorski, The Mathematics of Metamathematics, 1–6.

Moreover, the interpretation of logic in terms of algebra, lattice theory and
topology, because of the way in which it diverged from the arithmetization of logic, led to the posing and solution of different sorts of problems. They included the results of Scott and Solovay on Boolean-valued models, which played an important role in their re-writing of Cohen’s proof of the independence of the Continuum Hypothesis.¹³ Generally speaking, the interpretation of logic by topology was at first rather sterile, because the application of logic was inappropriate—it had not been designed for the study of topological spaces. However, eventually the attempt to bring logic into rational relation with topology precipitated new kinds of models and problems, and moreover precipitated new forms of logic, as well as giving new life or prominence to items, problems, and logical forms earlier regarded as peripheral. During the 1920s and 1930s, Alfred Tarski used logic as a means for representing and investigating the fine structure of topological spaces in general topology. The crucial insight governing his work is that propositional logic (and its variants), Boolean algebras (and their variants) and topological spaces can all be construed as lattices. Propositional logic, of course, underlies predicate logic. The propositional calculus can be considered a Boolean algebra; the ordinary method of truth tables yields homomorphisms which map the well-formed formulas to the Boolean algebra consisting of two elements, 0 and 1. A formula is a tautology if it is mapped to 1 by every such homomorphism. Appropriate homomorphisms can be found which map the propositional calculus in this way to any Boolean algebra. In other words, any Boolean algebra, even an infinite one, can serve as a system of truth values for the propositional calculus. The open sets of a topological space form a lattice, in fact, a complete lattice. So too do the closed sets, and the collection of closed and open sets. 
Thus, a topological space whose lattice of open sets forms a Boolean algebra can serve as a system of truth values for the propositional calculus. The answer to the question ‘for which topological spaces does the lattice of open sets form a Boolean algebra?’ depends on the way in which the Boolean operations are correlated with topological operations.

¹³ Dana Scott, ‘A proof of the independence of the continuum hypothesis,’ Mathematical Systems Theory, 1 (1967), 89–111.

In an early paper written in 1937, Tarski correlates X ∨ Y with the ordinary set theoretic sum of open sets, X ∧ Y with the ordinary set
theoretic product, and ∼X with the complement of the closure of X, so that operations on open sets again produce open sets.¹⁴ Unfortunately, in this instance only isolated topological spaces emerge as truth value systems for the propositional calculus. Tarski observes that this interpretation of the propositional calculus is a ‘quite trivial and in fact a general set-theoretical (not especially topological) interpretation ... ; every set S can in fact be made into an isolated topological space by putting $\overline{X} = X$ for every X ⊆ S.’ This construction is equivalent to giving a set the discrete topology, which is topologically uninformative. Disappointed with this result, Tarski suggests that one might adjust the terms of the correlation, placing restrictions on the collection of open sets. M. H. Stone, a functional analyst with interests similar to Tarski’s, lays out a general theory in which ∧, ∨, and ∼ correspond to ordinary set-theoretic multiplication, addition, and complementation; the clopen sets of any topological space under these operations form a Boolean algebra.¹⁵ Thus Stone focuses attention on topological spaces with a basis of clopen sets. These spaces are of genuine interest to both the topologist and the logician; known as Boolean spaces, they are totally disconnected, compact Hausdorff spaces. The best-known example is the Cantor space, which can be constructed by giving the two-member set {0, 1} the discrete topology, and then giving the Cartesian product of denumerably many copies of it the product topology. It can also be thought of as the collection $2^N$ of all characteristic functions of subsets of the set N, the natural numbers; this is a metric space whose metric induces a topology equivalent to the product topology. The Cantor set had been identified long before Stone’s researches as the following construction.
Begin with the unit interval and remove the interval (1/3, 2/3), that is, the middle third; then remove the middle thirds (1/9, 2/9) and (7/9, 8/9) from the two remaining intervals; then the middle third from each of the four remaining intervals; and so forth. When this procedure is carried out denumerably many times, the Cantor set is the set of points that remains.

¹⁴ ‘Sentential Calculus and Topology,’ reprinted in Logic, Semantics and Metamathematics: Papers from 1923 to 1938 (Oxford: Clarendon Press, 1956 / Indianapolis: Hackett, 1983), 421–54.
¹⁵ ‘The Theory of Representation for Boolean Algebras,’ Transactions of the American Mathematical Society, 40 (1936), 37–111; and ‘Applications of the Theory of Boolean Rings to General Topology,’ Transactions of the American Mathematical Society, 41 (1937), 375–481; see further discussion of this work in P. R. Halmos, Boolean Algebras, mimeographed lecture notes, 1959 (held in the University of Chicago library).

(A topological space is a Cantor
space if it is homeomorphic to the Cantor set.) Up until the debut of Stone’s researches, it had been regarded as an isolated curiosity; Stone shifted it to a position of much greater interest. One reason why the Cantor space is important is that it is both compact and the product of discrete spaces; in mathematics, compactness and discreteness rarely arrive in the same package. The Cantor space is moreover dual to Euclidean space. The main result in Stone’s first article is the Representation Theorem. It asserts a correspondence, and in fact a duality, between any Boolean algebra and some Boolean space. This theorem would not have arisen from either Boolean algebra or topology alone, but refers to and extends both domains. The correspondence posited by Stone’s Representation Theorem between Boolean algebras and topological spaces is the deepest so far, for it selects out a class of topological spaces, Boolean spaces, which have an intrinsic interest for topology, analysis, and logic. Moreover, from it a wealth of special correspondences follow, so that knowledge of the fine structure of a Boolean space yields information about the fine structure of a Boolean algebra or ring, and vice versa. For example, a Boolean algebra is atomic if and only if the isolated points are dense in its dual space; finite if and only if it is the dual of a discrete space; countable if and only if the dual space is metrizable. This duality between spaces and rings has been generalized in a variety of ways and plays an important role in contemporary algebraic geometry. The usefulness of this correlation is also apparent when it is applied to complete Boolean algebras, where every subset has both a supremum and an infimum. The duals of such algebras are called complete Boolean spaces; from the topological point of view, they seem at first pathological, since the closure of every open set is open, and the interior of every closed set is closed. 
It might be questioned whether any such spaces exist. However, Stone’s Representation Theorem shows that there are ‘many’ of them, since each complete Boolean algebra corresponds to one of them, and every Boolean algebra can be completed. What might have been constructed with difficulty and only in isolated cases in point set topology has here been generated systematically, abundantly, and simply. As noted above, these spaces, and indeed Stone’s work in general, play a central role in Scott and Solovay’s version of Cohen’s proof of the independence of the Continuum Hypothesis from the other axioms of set theory. Stone’s
work also underlies other important results: alternate proofs of various fundamental metatheorems concerning predicate, modal, and intuitionistic logics; Łoś’s concept of an ultraproduct of models; cylindric and polyadic algebras; and others. Note that these results are important for mathematical logic rather than for topology.¹⁶ Significantly, Stone was dissatisfied with his results, for he had hoped to encompass many more kinds of topological spaces, and find out more about their fine structure by his methods. In his 1937 paper, he wrote,

Plainly we are engaged here in building a general abstract theory, and must accordingly be occupied to a considerable extent with the elaboration of technical devices. Such preoccupation appears the more unavoidable, because the known instances of our theory are so special, and so isolated that they throw little light upon the domain which we wished to investigate.¹⁷
Stone’s work is interesting to us here not only for its results but also for its limitations. In the end, only a restricted portion of general topology was amenable to direct correlation with propositional logic. Stone had hoped to find more global representations, encompassing many more kinds of topological spaces and yielding significant information about their fine structure; in this he was disappointed. The research program that followed upon Stone’s founding papers of 1936 and 1937 attempted the interpretation of logic by topology by using stronger and non-classical logics: predicate logic, modal logics, logics with set and function variables, logics with generalized quantifiers, and intuitionistic logic.
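Tarski’s correlation of ∨ with union and of ∼X with the complement of the closure of X, described above, can be tried out on very small finite spaces. The following is a minimal sketch under stated assumptions: the function names and the two example spaces (the discrete topology on a two-point set and the Sierpiński topology on the same set) are chosen purely for illustration and are not taken from Tarski’s paper. It uses the standard fact that the complement of the closure of X equals the interior of the complement of X, and it checks only whether X ∨ ∼X always receives the top truth value, the whole space; it does not reproduce Tarski’s general argument.

```python
from itertools import product

# Illustrative sketch only; names and example spaces are hypothetical.


def powerset(space):
    """All subsets of a finite set, as frozensets (the discrete topology)."""
    items = sorted(space)
    return [frozenset(x for x, keep in zip(items, mask) if keep)
            for mask in product([False, True], repeat=len(items))]


def interior(subset, opens):
    """Largest open set contained in `subset`: the union of the opens inside it."""
    result = frozenset()
    for u in opens:
        if u <= subset:
            result |= u
    return result


def tarski_not(x, space, opens):
    """~X: the complement of the closure of X, i.e. the interior of the complement."""
    return interior(space - x, opens)


def excluded_middle_holds(space, opens):
    """Does X v ~X take the top value (the whole space) for every open set X?"""
    return all((x | tarski_not(x, space, opens)) == space for x in opens)


SPACE = frozenset({0, 1})
DISCRETE = powerset(SPACE)
SIERPINSKI = [frozenset(), frozenset({1}), SPACE]

print(excluded_middle_holds(SPACE, DISCRETE))    # True
print(excluded_middle_holds(SPACE, SIERPINSKI))  # False
```

In the Sierpiński example the open set {1} has the whole space as its closure, so ∼{1} is empty and {1} ∨ ∼{1} falls short of the top value; in the discrete example every open set is also closed and the classical law is recovered.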
10.4. Logical Hierarchies and the Borel Hierarchy

The second case I want to discuss is the succession of attempts to find a precise analogy between hierarchies of formulas of predicate logic, representing recursive sets and their complements and projections, and the Borel hierarchy, a topological structure developed by Borel, Baire, and Lebesgue at the end of the nineteenth century.

¹⁶ See J. L. Bell, Boolean-valued Models and Independence Proofs in Set Theory (Oxford: Clarendon Press, 1977); and H. Rasiowa, An Algebraic Approach to Non-classical Logics (Amsterdam: North Holland, 1974).
¹⁷ Stone, ‘Applications of the Theory of Boolean Rings to General Topology.’

Borel sets are a family R
of subsets of any given topological space, which includes the open sets and countable unions and intersections built up from them. Recursion theory serves as the link between the logical and the topological elements. Mostowski, Kleene, and Addison proposed ever more refined versions of this correlation, each a strengthening and correction of those preceding. The recursive hierarchy begins with the ‘arithmetical hierarchy,’ which begins with recursive relations of n-tuples of integers, represented by formulas of predicate logic consisting of predicate letters. Recursive relations are those for which one can compute in a finite number of steps whether or not an n-tuple of integers belongs to it. This level is nicely represented by first order formulas consisting of predicate letters, since there are denumerably many recursive relations, so that denumerably many predicate letters can name them all. The next level, represented by predicate letters preceded by a finite number of existential quantifiers, consists of the recursively enumerable (r.e.) relations, projections along the nth coordinates of recursive relations. The hierarchy continues via successive applications of projection and complementation to relations at lower levels, represented by adding blocks of existential and universal quantifiers before the predicate letters (since ∀ =∼ ∃ ∼ and ‘∼’ represents complementation). Thus increasing complexity in the formulas mirrors increasing complexity in the sets of integers. In this case, one employs a two-sorted logic with set or function variables as well as individual variables, only the latter being quantified over. The ‘analytical hierarchy,’ which extends the arithmetical, allows projection along function coordinates as well, and is represented by second-order logic. The Borel hierarchy is a family R of subsets of any given topological space, beginning with the open sets and continuing with the construction of countable unions and intersections. 
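The step from recursive to recursively enumerable relations described above, projection along a coordinate or, equivalently, prefixing an existential quantifier, can be sketched as an unbounded search. The example below is an assumption-laden illustration: the relation `reaches_one_within` (based on the familiar Collatz iteration) and the helper `search_witness` are hypothetical names chosen here, and the optional step budget exists only so that the sketch can be run safely.

```python
from itertools import count
from typing import Callable, Optional

# Illustrative sketch only; the relation and helper names are hypothetical.


def reaches_one_within(x: int, y: int) -> bool:
    """A decidable relation R(x, y): the Collatz iteration from x hits 1 in at most y steps."""
    for _ in range(y):
        if x == 1:
            return True
        x = x // 2 if x % 2 == 0 else 3 * x + 1
    return x == 1


def search_witness(relation: Callable[[int, int], bool], x: int,
                   max_steps: Optional[int] = None) -> Optional[int]:
    """Semi-decide the projection {x : there is a y with relation(x, y)}.

    Returns the least witness y if one is found. Returns None only when the
    optional step budget runs out, which says nothing about membership.
    """
    for y in count():
        if max_steps is not None and y >= max_steps:
            return None
        if relation(x, y):
            return y


print(search_witness(reaches_one_within, 6))                # 8
print(search_witness(reaches_one_within, 0, max_steps=50))  # None
```

Each individual test `relation(x, y)` terminates, which is what makes the relation recursive; the search over all y terminates only when a witness exists, which is the asymmetry that separates the second level of the hierarchy from the first.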
Ordinal numbers α classify Borel sets into levels $R_\alpha$. The family of all open sets is level $R_0$; for α = λ + n > 0, λ a limit ordinal, $R_\alpha$ is the family of all sets of the form

$$\bigcup_{k=1}^{\infty} X_k \quad \text{or} \quad \bigcap_{k=1}^{\infty} X_k$$

according as n is even or odd, and the sets $X_k$ belong to earlier levels. The Borel hierarchy continues upwards into the hierarchy of projective sets, whose first stage consists of Borel sets of finite and transfinite levels, and whose second stage consists of ‘analytic sets’. In his paper ‘A Symmetric Form of Gödel’s Theorem,’ Kleene remarks that Mostowski in 1946 had proposed an analogy between recursive sets and
Borel sets, r.e. sets and analytic sets, and in general between the arithmetical hierarchy of recursion theory, and the Borel and projective hierarchy of topology.¹⁸ The analogy is apparently thoroughgoing, since adjacent levels of both the arithmetical hierarchy and the topological hierarchy are related by a projection. For example, if a set is r.e., then there is a recursive set such that the first is a projection of the second. Similarly, the sets of the projective hierarchy are obtained by the operation of ‘generalized projection’ from the Borel sets. However, Kleene asserts that the analogy is imperfect. A theorem of Lusin states that two disjoint analytic sets can always be separated by a Borel set, but two disjoint r.e. sets may not always be separated by a recursive set. The correlation therefore had to be completely revised, so that the arithmetical hierarchy of recursion theory corresponded only to the finite Borel hierarchy, and the analytical hierarchy of recursion theory corresponded to the projective hierarchy of descriptive set theory. Kleene’s analogy is also unsatisfactory, however, for he was primarily interested in classifying definable sets of natural numbers. The difficulty is that there is a kind of incompatibility between the objects of recursion theory, the integers, and the canonical objects of topology, sets of real or complex numbers. (The Borel hierarchy is usually constructed upon a complete metric space like the interval of reals between 0 and 1.) Since these items are so different (discrete, incomplete, and ‘first order’ in the former case but connected, complete, and ‘second order’ in the latter), some kind of mediating thing is required that admits an interesting topology, retains certain features of the reals, and exhibits recursive structure. Some adjustments in the recursive and topological hierarchies are also necessary. The following versions of the analogy were developed along these lines by John W.
Addison in his doctoral dissertation and a paper that appeared in 1959.¹⁹ The first of these compromise items is, not surprisingly, the Cantor space.

¹⁸ S. C. Kleene, ‘A Symmetric Form of Gödel’s Theorem,’ Indagationes Mathematicae, 12 (1950), 244–66; A. Mostowski, ‘On definable sets of positive integers,’ Fundamenta Mathematicae, 34 (1946), 81–112.
¹⁹ J. W. Addison, On Some Points in the Theory of Recursive Functions, Ph.D. Thesis, Madison: University of Wisconsin, 1954; and ‘Separation Principles in the Hierarchies of Classical and Effective Descriptive Set Theory,’ Fundamenta Mathematicae, 46 (1959), 123–35.

The topological side of the analogy is the effective finite Borel hierarchy, constructed upon the Cantor space, where unions and intersections are
limited to being recursive. Here the Borel hierarchy is altered to accord better with its recursive analogue. The first level of this hierarchy can then be thought of as the clopen sets that form a basis for the Cantor space, and the second level as the open sets. The recursive side is the arithmetical recursive hierarchy of sets of sets of integers, which requires set variables as well as individual variables for its expression. This analogy is successful to the extent that any set of sets of integers which occurs at the nth level of the recursive hierarchy will also occur at the nth level of the topological hierarchy; but it is limited, because sets which occur at the nth level of the topological hierarchy may not occur at the nth level of the recursive hierarchy. A second mediating item is the Baire space, $\omega^\omega$, formed by giving the product of N copies of N the product topology. Although complete and with a basis of clopen sets, it is not compact; it can be identified with the irrational real numbers between but not including 0 and 1. The points in the product space can be thought of as functions from N to N, and the basis of clopen sets as composed of all such functions whose values at a finite number of integers are specified. The clopen sets are recursive in the sense that they can be specified by a finite amount of information. In this case, the topological side of the analogy is the Borel hierarchy constructed on the Baire space, called the finite hierarchy of Borel sets on the irrationals. The recursive side is the arithmetical hierarchy of sets of functions of integers, which requires function variables as well as individual variables for its expression. This analogy is limited by the fact that classes that occur at the first level of the recursive hierarchy occur at all levels of the Borel hierarchy.
However, though it does not properly capture many second order topological properties, this analogy does yield effective versions of a number of theorems in consequence of the final and most successful version of the analogy, set forth in the table in Figure 10.1. The recursive side of this last analogy consists of the arithmetical hierarchy of sets of functions of integers, not just recursive, but recursive in a real. (A function recursive in a set S can be computed by a Turing machine with an oracle, a black box that can tell what integers are in S.) The topological side consists again of the Borel hierarchy on the Baire space. This time the analogy is extremely thoroughgoing; it gives classes for all levels that occur at the same level in both the recursive and topological hierarchies. It can be extended to include the analytical hierarchy as
Figure 10.1. Grosholz, ‘Two Episodes in the Unification of Logic and Topology,’ Figure 1
well, so that the analogy in its fullest form looks as it is represented in the table given in Figure 10.1. Addison’s last analogy provides the finest correlation of structure between the two hierarchies and thus leads to deep results like the common validity of Lusin’s Separation Theorem and the Interpolation Theorem for a generalized version of predicate logic ($L_{\omega_1\omega}$) for both the recursive and the topological structures. More generally, over the years these analogies developed into a comprehensive theory which yields in a unified manner both the classical topological results and the theorems of the recursion theorists, and which serves as an important stage in the broader development of descriptive set theory. Addison’s work modified both sides of the original analogy by assimilating one to the other. This process of assimilation produces and provides a context for hybrids. Addison makes use of the Baire space, a topological space that mimics the reals but has recursive structure; other topological items modified to have recursive structure, like the effective Borel hierarchy; and recursive structures modified to approximate the complexity of the continuum, like the hierarchy of sets of functions recursive in a real. By means of his correlation, Addison produces models to which neither predicate logic nor topology alone would have given rise. Given that the integers are the canonical items of recursion theory, and the reals those of topology, Addison’s analogy would have foundered without hybrid items to work on. The integers can be given no suitable topology in this context, for if the clopen sets are to be the recursive sets, then the topology will be just the discrete topology, since all finite sets are recursive. But then not just all countable unions of recursive sets, but all sets, will be open and correspondingly r.e., and the analogy fails.
The reals with their natural topology of open balls are also recalcitrant: since they are connected, they won’t have any clopen sets beside the empty set and the whole space. The work of Mostowski and Kleene in this episode shows that the iconic dimension of symbolic notation need not be figural. Some mathematical things cannot be used very well to ‘fit out’ (to use Nancy Cartwright’s phrase) certain abstract structures expressed in a certain notation because, as in the case of topological spaces here, they are not well described by it. This means that the notation is designed for another kind of thing,
that is, the notation, in this case logical notation, is not as subject matter-free as it pretends to be. First order predicate logic is ‘designed for’ sets of natural numbers, or, more generally, sets of integers. The attempt to bring logic into rational relation with topology requires the mediation of novel hybrids. It also leaves its mark on logic, as the last section will show. A further iconic aspect of the logical symbolism employed here is evident in my summary table (see Figure 10.1). The horizontal correspondences exhibit the correspondences between models (the Borel hierarchy erected on the Baire space, the arithmetical and analytical hierarchy of sets of functions recursive in a real) iconically. The vertical articulation is also meaningful, and allows for the rational relation of the finite and the infinite, as well as increasing degrees of infinitary complexity. It exhibits the mathematicians’ well-founded belief that we can move from the finite to the transfinite and moreover articulate the transfinite, iconically, giving a ‘place’ to mathematical entities and processes that only 150 years ago were off the page. In doing so, the table asserts a structural analogy among the finite, the transfinite, and the differentiated levels of the transfinite. The relation of the first to the second level of the arithmetical hierarchy is like the relation of the first to the second level of the analytical hierarchy, for example, and the similitude holds on both the logical and the topological sides.
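The hybrid character of the Baire space discussed in this section, a topological space whose basic clopen sets are each fixed by a finite amount of information, can be illustrated with a small sketch. Everything named here is an assumption made for the purpose of illustration: a point of the space is modelled as a Python function from natural numbers to natural numbers, and a basic clopen set as a finite dictionary of prescribed values.

```python
from typing import Callable, Dict, Optional

# Illustrative sketch only; the type aliases and function names are hypothetical.

Point = Callable[[int], int]   # a point of the Baire space: a function N -> N
Condition = Dict[int, int]     # finitely many prescribed values n -> f(n)


def in_basic_set(condition: Condition, f: Point) -> bool:
    """Decide whether f lies in the basic clopen set given by `condition`.

    Only finitely many values of f are consulted, one per prescribed argument.
    """
    return all(f(n) == value for n, value in condition.items())


def intersect(c1: Condition, c2: Condition) -> Optional[Condition]:
    """Intersect two basic clopen sets.

    The result is again basic (the merged condition), or None when the two
    conditions disagree at some argument, so that the intersection is empty.
    """
    for n in c1.keys() & c2.keys():
        if c1[n] != c2[n]:
            return None
    return {**c1, **c2}


identity: Point = lambda n: n
constant_three: Point = lambda n: 3

condition = {0: 0, 2: 2}
print(in_basic_set(condition, identity))        # True
print(in_basic_set(condition, constant_three))  # False
print(intersect({0: 0}, {0: 1}))                # None
print(intersect({0: 0}, {5: 7}))                # {0: 0, 5: 7}
```

Membership in a basic set is settled after finitely many evaluations of the function, which is one way of reading the remark that the clopen sets are recursive; membership in sets higher in the Borel hierarchy generally cannot be settled in this way.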
10.5. Model Theory and Topological Logics
The original goal of Tarski and Stone was to find a representation by means of which logic could express the features of many different kinds of topological spaces, not just one special structure (like, e.g., the Baire space). I now turn to a different strategy for representing topological spaces, a model-theoretic approach that uses what are called topological logics. They are various modifications of predicate logic that inhabit a middle ground between first and second order predicate logic; one of them even has a generalized quantifier Q which, as the prefix of a formula, indicates that the set of individuals satisfying that formula is, in the cases given below, open.
Model theory is the study of the relation between a formal theory (most often, a first order theory) and its models, in the hope of exhibiting something useful to mathematicians who work in the target field indicated by the models. At this point in time, topological model theory has mostly been forgotten, perhaps because it never did find important applications in topology. (More specific developments have brought about some rapprochement between logic and analytic geometry.)²⁰ Model theorists tend to restrict their attention to first order theories because they are well-behaved enough to be governed by a series of important theorems, and, ironically, because they admit a wealth of non-standard models. (Abraham Robinson’s book Non-standard Analysis devotes a chapter to general topology, and another to topological groups.)²¹ Precisely because first order theories are not categorical, they give the model theorist material to work with in the comparative study of the models of a theory. First order theories of topology cannot capture the centrally important items and features of topology categorically, that is, up to isomorphism, but they do fall under the aegis of the important meta-theorems that define logical ‘good behavior,’ that is, the formal properties of logical systems that make them tractable to study. These include Completeness, Compactness, Löwenheim-Skolem, Interpolation, and Ultraproducts. The Completeness Theorem states that any consistent set of formulas is satisfiable, that is, has a model. Equivalently, if a formula is satisfied in every model of a theory, then it is a logical consequence of the axioms of that theory. The Compactness Theorem is an immediate consequence of the Completeness Theorem because deductive proofs are finite: it states that a set of sentences has a model if and only if every finite subset has a model. However, precisely because first order theories are compact, they cannot represent the topological notion of compactness.
²⁰ See L. P. D. van den Dries, Tame Topology and O-minimal Structures (Cambridge: Cambridge University Press, 1998). A consequence of Abraham Robinson’s early work on model theory was the definition of o-minimality, a notion arising in logic that seems to correspond well to Grothendieck’s notion of a ‘tame’ or well-behaved topology, where the investigation of canonical geometrical items takes priority.
²¹ Non-standard Analysis (Amsterdam: North Holland, 1966; repr. Princeton: Princeton University Press, 1996).
The compactness of first order logic implies the existence of an ultrapower (which I define just below), and we can use an ultrapower construction to build a noncompact extension of any model which is elementarily equivalent to it,
that is, which satisfies exactly the same sentences. Thus no set of sentences composing a first order theory can capture the topological notion of compactness. The Löwenheim-Skolem Theorem shows that a first order theory cannot capture the cardinality of models. The Craig Interpolation Theorem says that when X is a logical consequence of Y, a third sentence Z, using only the non-logical vocabulary common to X and Y, can be interpolated so that X is a logical consequence of Z and Z is a logical consequence of Y. These theorems are used in the construction of model-completions. The Ultraproducts Theorem involves the construction of new models by forming certain product spaces of collections of models. A filter D over a set I is a subset of the power set of I that contains I itself, every finite intersection of elements of D, and every Z such that X ⊂ Z ⊂ I for some X ∈ D. An ultrafilter is a maximal proper filter. Consider the Cartesian product of sets Ai indexed by I; it is the collection of all functions f mapping each i ∈ I to an element of Ai. Given an ultrafilter D over I, we identify two such functions whenever the set of indices on which they agree belongs to D; the resulting quotient of the Cartesian product is the ultraproduct of the Ai modulo D, and when all the Ai are the same set A, it is called the ultrapower of A modulo D. The Ultraproducts Theorem shows that when we construct the ultrapower of a model, we obtain a new model which is elementarily equivalent to the original, and in which the original can be embedded in a simple way.²² The limits of the expressive powers of first order theories with respect to topology are forbidding. Thus, model theorists may address the task of representing topology by using logics whose structures are called weak models: a weak model pairs a first order model with a collection q of subsets of the domain A of the model. When q is a topology, the pair is called a topological model.
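The definitions of filter and ultrafilter given above can be checked mechanically on a small finite index set. The following is a minimal sketch in Python, not part of the original text; the function names (`powerset`, `is_filter`, `is_ultrafilter`, `principal_filter`) are hypothetical and chosen only for illustration. It relies on the standard fact that a proper filter D over I is maximal exactly when, for every X ⊂ I, precisely one of X and I − X belongs to D.

```python
from itertools import combinations


def powerset(universe):
    """Return all subsets of `universe` as a list of frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, universe):
    """Check the three filter conditions for a family of subsets of `universe`."""
    universe = frozenset(universe)
    family = {frozenset(s) for s in family}
    # (1) The index set I itself belongs to D.
    if universe not in family:
        return False
    # (2) D is closed under pairwise (hence all finite) intersections.
    if any((x & y) not in family for x in family for y in family):
        return False
    # (3) D is closed under supersets: X in D and X ⊂ Z ⊂ I imply Z in D.
    return all(z in family
               for x in family
               for z in powerset(universe) if x <= z)


def is_ultrafilter(family, universe):
    """A proper filter is maximal iff it contains exactly one of X, I - X for each X."""
    universe = frozenset(universe)
    family = {frozenset(s) for s in family}
    if not is_filter(family, universe) or frozenset() in family:
        return False
    return all((x in family) != ((universe - x) in family)
               for x in powerset(universe))


def principal_filter(generator, universe):
    """All subsets of `universe` that contain `generator` (hypothetical helper)."""
    generator = frozenset(generator)
    return {s for s in powerset(universe) if generator <= s}


I = {0, 1, 2}
print(is_filter(principal_filter({0, 1}, I), I))       # filter generated by {0, 1}
print(is_ultrafilter(principal_filter({0}, I), I))     # generated by a single point
print(is_ultrafilter(principal_filter({0, 1}, I), I))  # proper, but not maximal
```

On a finite index set every ultrafilter is principal, that is, generated by a single point, so the sketch only illustrates the definitions; the ultrapowers that yield genuinely new, elementarily equivalent models require a non-principal ultrafilter over an infinite index set.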
²² For more detail, see C. C. Chang and H. J. Keisler, Model Theory (Amsterdam: North Holland, 1973/77; Elsevier, 1990).
The notation in which these theories are expressed includes a generalized quantifier Q, satisfying certain conditions so that when q is a topology, the set of individuals represented by a formula preceded by the quantifier Q is an open set. Although there is no direct relation of inclusion between first and second order logic, we can say that weak logics are intermediate, because weak logics are more expressive and yet not all the good behavior of first order logic has been sacrificed. Research into these languages and their models mostly exhibits the inevitable trade-off between expressiveness and good behavior. L(Q) is
governed by Completeness, Löwenheim-Skolem, Compactness and an Ultraproducts Theorem, but does not have Interpolation or Beth definability. It can express T1 separation and discreteness, but not T0, Hausdorff, T3 or T4, or notions like ‘compactness,’ ‘the interior of a definable set is not empty,’ and ‘open in the product topology.’ L(Qn) has Completeness, Löwenheim-Skolem, Compactness, and an r.e. set of valid formulas, though again Interpolation and Beth definability fail. Its expressive power is greater: Hausdorff is definable, though not T3 or T4, nor the notion ‘compact.’²³ A pair of related but stronger languages are L(I) and L(In), languages formed by adding an ‘interior operator’ I to predicate logic, where (Ix) ϕ defines the interior of the set of points defined by ϕ. This logic has Completeness, Löwenheim-Skolem, Compactness, and an r.e. set of valid formulas; because it is monotonic, it has Interpolation and Beth definability. It captures the notion of Hausdorff, but it does not capture T3 and T4 separation. It can be extended to L(In), in a way analogous to the extension of L(Q) to L(Qn). This logic has Completeness, Löwenheim-Skolem, Compactness, and an r.e. set of valid formulas, Interpolation, and Beth definability, and can express Hausdorff separation.²⁴ A final logic worth mentioning from this series is Ltop, based on a two-sorted first order language with both individual and set variables, but quantification only over the former. It has in addition a topology on the domain of the individuals, and a membership relation ∈. This language is expressive enough to define even T3 separation, but T4 still escapes its scope, as does topological compactness. It has Completeness, Löwenheim-Skolem, Compactness, and an r.e.
set of valid formulas, Interpolation, and an Isomorphic Ultrapowers Theorem.²⁵ None of these logics can express topological compactness, and it was suggested that a sheaf-theoretic, category-theoretic representation might be more successful. A more recent project, the development of ‘geometric logic’ by Steven Vickers, has a close relationship to topology: the class of models for a propositional geometric theory is automatically a topological space. The definition of geometric logics is, however, rather ‘weird,’ to use its creator’s word. It makes a hard distinction between formulas and axioms. A logical formula is restricted in the connectives it can use to express conjunction, disjunction, equality and existential quantification; and an axiom for a geometric theory expresses relationships between formulas in the form ‘for all x, y, and z, ... (formula 1 → formula 2)’. The missing connectives can be introduced, without nesting, in axioms. Geometric logic also allows infinite disjunctions in formulas. However, the main applications of this logic lie in theoretical computer science. Topology proves ultimately quite resistant to representation by predicate logic.²⁶ None of these languages seems to have proved fruitful in the work of topologists. The philosophical interest of this case study lies partly in this negative fact, and partly in the constitutive ambiguity of the various logics that have been investigated. On the one hand, the formal language considered symbolically is ‘about’ topology; on the other hand, the formal language considered iconically represents itself as a mathematical system, which does or doesn’t exemplify the great meta-theorems of predicate logic.
²³ J. Sgro, ‘Completeness Theorems for Continuous Functions and Product Topologies,’ Israel Journal of Mathematics, 25 (1976), 249–72; see also ‘Completeness Theorems for Topological Models,’ Annals of Mathematical Logic, 11 (1977), 173–93.
²⁴ C. C. Chang, Modal Model Theory, Lecture Notes #337 (Berlin: Springer-Verlag, 1973).
²⁵ S. Garavaglia, ‘Model Theory of Topological Structures,’ Annals of Mathematical Logic, 14 (1978), 13–37.
²⁶ See, for example, Topology via Logic, Cambridge Tracts in Theoretical Computer Science 5 (Cambridge: Cambridge University Press, 1988). I would like to thank Angus Macintyre and one of the readers for Oxford University Press for detailed comments on this chapter that were very helpful.
10.6. Coda
Logic is one of the pillars of philosophy, but it is not the only one. I have often defended its rights in my own department. However, if we think about the project of logic in light of the case studies in this book, a fundamental conflict within that project emerges, summed up by the examples given in this chapter. Logic is the study of the rules of thought; it thus wishes to represent how we think independent of what we happen to be thinking about. Its modes of representation strive to be completely general and thus symbolic, so that it will not mistakenly offer the peculiar features of some subject matter or another as universal; its modes of representation must flee the iconic. The laws of logic should hold for thinking about anything, no matter what it is or where it is or when it
shows up in history. Logic must then also formalize what we call subject and predicate, and then formalize the link between subject and predicate, and further between premises and conclusion; this means that it assumes a kind of abstracted homogeneity in the objects of its study reflected in its single, univocal idiom. These features of logic stem from its enterprise, to catalogue the laws of thought. As any philosophical logician knows, these features of logic make it difficult for logic to accommodate reasonings that involve modalities (contingency, necessity, possibility), or that are affected by the passage of time, or that concern individuals, or that require iconic displays, or that link heterogeneous acts of thought or speech or kinds of things. Either logic must be altered, and bear some of the marks of the peculiar objects or processes reasoning strives to cover, splitting into a family of heterogeneous idioms; or it must announce its limits. The point of this chapter, and my whole book, is to show that mathematicians typically reason about individuals as well as abstractions, refer successfully to specific things, link heterogeneous items, and exploit and construct ambiguity, in problem contexts that cannot escape geometry, modalities and historicity. Because they do this successfully all the time, it is no wonder that logic all by itself cannot express mathematics. When Paul Benacerraf limits the articulation of mathematical truth to logic and then complains that the ability of mathematicians to refer has been lost, it is no wonder; it is also no wonder that number theorists and geometers have not borrowed the language of logic to do their work. Mathematicians must and do employ a variety of modes of representation in tandem and in superposition, using natural language to explain what cannot be formalized: the relation of the idioms to each other, to the reader and to the regulative object. 
Moreover, as logic enters mathematics and becomes more and more mathematical, its iconic aspects (which in fact it has always had even when denying them) and useful ambiguities become more important, and it splits up into a family of no longer homogeneous idioms. This is one reason why the Aristotelian theory of the syllogism, and Stoic or propositional logic, being more purely logical and thus also more constrained, must persist alongside modern mathematical logic.
List of Illustrations
1.1. Galileo, Discorsi, Third Day, Uniform Motion, Theorem I, Proposition I
1.2. Galileo, Discorsi, Third Day, Uniform Motion, Theorem IV, Proposition IV
1.3. Galileo, Discorsi, Third Day, Naturally Accelerated Motion, Theorem I, Proposition I
1.4. Galileo, Discorsi, Third Day, Naturally Accelerated Motion, Theorem II, Proposition II
1.5. Galileo, Discorsi, Fourth Day, The Motion of Projectiles, Theorem I, Proposition I
3.1. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Diagrams 1–4.
Reprinted with kind permission from A. Hamilton and from Angewandte Chemie, International English Edition, 36 (23): 2680–3, A. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Wiley-VCH Verlag GmbH & Co KG, 1997.
3.2. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Figure 1.
Reprinted with kind permission from A. Hamilton and from Angewandte Chemie, International English Edition, 36 (23): 2680–3, A. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Wiley-VCH Verlag GmbH & Co KG, 1997.
3.3. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Figure 2.
Reprinted with kind permission from A. Hamilton and from Angewandte Chemie, International English Edition, 36 (23): 2680–3, A. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Wiley-VCH Verlag GmbH & Co KG, 1997.
3.4. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Figure 4.
Reprinted with kind permission from A. Hamilton and from Angewandte Chemie, International English Edition, 36 (23): 2680–3, A. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Wiley-VCH Verlag GmbH & Co KG, 1997.
3.5. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Figure 3.
Reprinted with kind permission from A. Hamilton and from Angewandte Chemie, International English Edition, 36 (23): 2680–3, A. Hamilton et al., ‘A Calixarene with Four Peptide Loops,’ Wiley-VCH Verlag GmbH & Co KG, 1997.
4.1. McClintock, ‘Chromosome Organization and Genic Expression,’ Figure 8.
Reprinted with kind permission from Cold Spring Harbor Symposia on Quantitative Biology, 16, 13–47, B. McClintock, ‘Chromosome Organization and Genic Expression,’ Cold Spring Harbor Laboratory Press, 1951.
4.2. McClintock, ‘Chromosome Organization and Genic Expression,’ Photographs 10 and 12, p. 23.
Reprinted with kind permission from Cold Spring Harbor Symposia on Quantitative Biology, 16, 13–47, B. McClintock, ‘Chromosome Organization and Genic Expression,’ Cold Spring Harbor Laboratory Press, 1951.
4.3. McClintock, ‘Induction of Instability at Selected Loci in Maize,’ Tables 4 and 5.
Reprinted with kind permission from Genetics, 38, 579–99, B. McClintock, ‘Induction of Instability at Selected Loci in Maize,’ The Genetics Society of America, 1953.
4.4. Fedoroff and Brown, ‘The Nucleotide Sequence of the Repeating Unit in the Oocyte 5S Ribosomal DNA of Xenopus laevis,’ Figure 1.
Reprinted with kind permission from N. Fedoroff and from Cold Spring Harbor Symposia on Quantitative Biology, 42, 1195–200, N. Fedoroff and D. Brown, ‘The Nucleotide Sequence of the Repeating Unit in the Oocyte 5S Ribosomal DNA of Xenopus laevis,’ Cold Spring Harbor Laboratory Press, 1977.
4.5. Fedoroff and Brown, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-Rich Spacer,’ Figure 11.
Reprinted with kind permission from N. Fedoroff and from Cell, 13, N. Fedoroff and D. Brown, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-Rich Spacer,’ 701–16, with permission from Elsevier, 1978.
4.6. Sanger and Coulson, ‘A Rapid Method for Determining Sequences in DNA,’ Plate 1.
Reprinted from Journal of Molecular Biology, 94, F. Sanger and R. Coulson, ‘A Rapid Method for Determining Sequences in DNA by Primed Synthesis with DNA Polymerase,’ 441–8, with permission from Elsevier, 1975.
4.7. Fedoroff and Brown, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-Rich Spacer,’ Figure 4.
Reprinted with kind permission from N. Fedoroff and from Cell, 13, 701–16, N. Fedoroff and D. Brown, ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-Rich Spacer,’ Elsevier, 1978.
4.8a. N. Fedoroff et al., ‘Epigenetic Regulation of the Maize Spm Transposon,’ Figure 1.
Reprinted with kind permission from N. Fedoroff and from BioEssays, 17 (4), 291–7, N. Fedoroff et al., ‘Epigenetic Regulation of the Maize Spm Transposon,’ John Wiley & Sons, 1994.
4.8b. N. Fedoroff et al., ‘Epigenetic Regulation of the Maize Spm Transposon,’ Figure 2.
Reprinted with kind permission from N. Fedoroff and from BioEssays, 17 (4), 291–7, N. Fedoroff et al., ‘Epigenetic Regulation of the Maize Spm Transposon,’ John Wiley & Sons, 1994.
5.1. F. A. Cotton, Chemical Applications of Group Theory, Figure 3.2.
Reprinted with kind permission from F. A. Cotton and from John Wiley & Sons, Chemical Applications of Group Theory, 3rd edn, F. A. Cotton, 25, John Wiley & Sons, 1990.
5.2. F. Brescia et al., Fundamentals of Chemistry: A Modern Introduction, Figure 8.20.
Reprinted with kind permission from Elsevier and from Physical Review, Fundamentals of Chemistry: A Modern Introduction, F. Brescia et al., Elsevier, 184, 1966; originally printed in Physical Review, 37, 1416, H. E. White, ‘Pictorial Representations of the Electron Cloud for Hydrogen-Like Atoms,’ American Physical Society, 1931.
5.3. F. Brescia et al., Fundamentals of Chemistry: A Modern Introduction, Figure 11.27.
Reprinted with kind permission from Elsevier, Fundamentals of Chemistry: A Modern Introduction, F. Brescia et al., 289, Elsevier, 1966.
5.4a. F. A. Cotton, Chemical Applications of Group Theory, 146.
Reprinted with kind permission from F. A. Cotton and from John Wiley & Sons, Chemical Applications of Group Theory, 3rd edn, F. A. Cotton, 146, John Wiley & Sons, 1990.
5.4b. F. A. Cotton, Chemical Applications of Group Theory, 147.
Reprinted with kind permission from F. A. Cotton and from John Wiley & Sons, Chemical Applications of Group Theory, 3rd edn, F. A. Cotton, 147, John Wiley & Sons, 1990.
5.5. F. A. Cotton, Chemical Applications of Group Theory, unnumbered figure, 158.
Reprinted with kind permission from F. A. Cotton and from John Wiley & Sons, Chemical Applications of Group Theory, 3rd edn, F. A. Cotton, 158, John Wiley & Sons, 1990.
6.1. Descartes, Geometria, 2
6.2. Descartes, Geometria, 6
6.3. Bos, ‘On the Representation of Curves in Descartes’ Géométrie,’ 299.
Reprinted with kind permission from Archive for History of Exact Sciences, 24, 295–338, H. J. M. Bos, ‘On the Representation of Curves in Descartes’ Géométrie,’ Springer Science and Business Media, 1981.
6.4. Descartes, Geometria, 26
6.5. Descartes, Geometria, 20
6.6. Descartes, Geometria, 22
6.7. Descartes, Geometria, 28
6.8. Descartes, Geometria, 36
7.1. Newton, Principia, Book I, Section II, Proposition I, Theorem I
7.2. Newton, Principia, Book I, Section II, Proposition VI, Theorem V
7.3. Newton, Principia, Book I, Section III, Proposition XI, Theorem VI
8.1. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 119.
Reprinted with kind permission from Georg Olms Verlag, Mathematische Schriften, G. W. Leibniz, V, ed. C. I. Gerhardt, Georg Olms Verlag, Hildesheim, 1962.
8.2. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 120.
Reprinted with kind permission from Georg Olms Verlag, Mathematische Schriften, G. W. Leibniz, V, ed. C. I. Gerhardt, Georg Olms Verlag, Hildesheim, 1962.
8.3. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 139.
Reprinted with kind permission from Georg Olms Verlag, Mathematische Schriften, G. W. Leibniz, V, ed. C. I. Gerhardt, Georg Olms Verlag, Hildesheim, 1962.
8.4. Leibniz, Mathematische Schriften, V, ed. Gerhardt, Figure 121.
Reprinted with kind permission from Georg Olms Verlag, Mathematische Schriften, G. W. Leibniz, V, ed. C. I. Gerhardt, Georg Olms Verlag, Hildesheim, 1962.
9.1. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 1.1.
Reprinted with kind permission from I. M. Singer and from Springer Science and Business Media, Lecture Notes on Elementary Topology and Geometry, I. M. Singer and J. A. Thorpe, 3, Springer Science and Business Media, 1967.
9.2. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 1.2.
Reprinted with kind permission from I. M. Singer and from Springer Science and Business Media, Lecture Notes on Elementary Topology and Geometry, I. M. Singer and J. A. Thorpe, 6, Springer Science and Business Media, 1967.
9.3. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 6.4.
Reprinted with kind permission from I. M. Singer and from Springer Science and Business Media, Lecture Notes on Elementary Topology and Geometry, I. M. Singer and J. A. Thorpe, 159, Springer Science and Business Media, 1967.
9.4. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, Figure 5.18.
Reprinted with kind permission from I. M. Singer and from Springer Science and Business Media, Lecture Notes on Elementary Topology and Geometry, I. M. Singer and J. A. Thorpe, 152, Springer Science and Business Media, 1967.
9.5a. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 162.
Reprinted with kind permission from I. M. Singer and from Springer Science and Business Media, Lecture Notes on Elementary Topology and Geometry, I. M. Singer and J. A. Thorpe, 162, Springer Science and Business Media, 1967.
9.5b. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 163.
Reprinted with kind permission from I. M. Singer and from Springer Science and Business Media, Lecture Notes on Elementary Topology and Geometry, I. M. Singer and J. A. Thorpe, 163, Springer Science and Business Media, 1967.
9.5c. Singer and Thorpe, Lecture Notes on Elementary Topology and Geometry, 164.
Reprinted with kind permission from I. M. Singer and from Springer Science and Business Media, Lecture Notes on Elementary Topology and Geometry, I. M. Singer and J. A. Thorpe, 164, Springer Science and Business Media, 1967.
10.1. Grosholz, ‘Two Episodes in the Unification of Logic and Topology,’ Figure 1.
Reprinted with kind permission from E. Grosholz, and from British Journal for the Philosophy of Science, 36, 147–57, ‘Two Episodes in the Unification of Logic and Topology,’ 153, Emily R. Grosholz.
Note
Although every effort has been made to establish copyright and contact copyright holders prior to printing, this has not always been possible. The publishers would be pleased to rectify any omissions or errors brought to their notice at the earliest opportunity.
Glossary
Basis: A set B ⊂ 2^S is a basis for a topology T on S if T can be recovered from B by taking all possible unions of subcollections from B.
Boolean algebra: A partially-ordered set of elements with a unit and zero element; binary operations ∧ and ∨ satisfying the idempotent, commutative, associative and distributive laws; and a unary operation ∼ which obeys the complementary, dualization and involution laws. Every Boolean algebra can also be treated as a Boolean ring.
Clopen set: A set which is both closed and open.
Closed set: A set in S (see Topological space) which is the complement of an open set.
Closure of a set: For any X ⊂ S, the set itself together with all its limit points.
Compact: A topological space is compact iff for every collection ℱ of closed sets, if F1 ∩ ... ∩ Fk ≠ Ø for each finite subcollection {F1, ..., Fk} ⊂ ℱ, then ⋂_{F∈ℱ} F ≠ Ø.
Component: A maximal connected subset of a topological space.
Connected: A topological space is connected if it is not the union of any two disjoint non-empty open sets in its topology as a subspace.
Discrete topology: A topological space has the discrete topology when every subset of S is open; every point is then a component.
Interior of a set: A point x is interior to a set S if S is a neighborhood of x. The set of all points interior to S is called the interior of S.
Isolated space: A topological space where every set in T is equal to its own closure.
Lattice: A lattice is a partially ordered set in which any two elements have a least upper bound and a greatest lower bound.
Limit point: A limit point of a set X ⊂ S is a point x such that for each open set Y in T with x ∈ Y, (Y − {x}) ∩ X ≠ Ø.
Metric space: A metric space is a set S together with a function f: S × S → (non-negative reals), such that for x1, x2, x3 ∈ S, (1) f(x1, x2) = 0 iff x1 = x2; (2) f(x1, x2) = f(x2, x1); and (3) f(x1, x3) ≤ f(x1, x2) + f(x2, x3).
Neighborhood: A neighborhood of A ⊂ S is any set containing an open set containing A.
Open set: The elements of T ⊂ 2^S (see Topological space) are called open sets.
Partially ordered system: A set S with a binary relation ≤, which satisfies the reflexive, anti-symmetric and transitive laws.
Power set: 2^S is the power set of S, the set of all its subsets.
Product topology: Let {Sw}w∈W be a collection of topological spaces. Let P = ∏_{w∈W} Sw, the Cartesian product of the sets. A basis for the product topology on P consists of products of open sets, one from each factor Sw, with all but finitely many of these being the whole space Sw.
Recursive in a set S: A function recursive in a set S (for instance, a real number considered as a set of integers) can be computed by a Turing machine with an ‘oracle’, a black box which can tell what integers are in S.
Regular open set: An open set which coincides with the interior of its own closure.
Separation Properties: Topological spaces in point-set topology are classified according to their separation properties.
T0: Has enough open sets to provide, for any two points of the space, an open set which contains one but not the other.
T1: Can separate points by two not necessarily disjoint open sets.
T2 (Hausdorff): Can separate two points by disjoint open sets.
T3 (Regular): Can separate a closed set and a point not in it by disjoint open sets.
T4 (Normal): Has enough open sets to separate closed sets by disjoint open sets.
Topological space: A set S together with a collection T of subsets (T ⊂ 2^S) such that T contains the empty set, S itself, and all finite intersections and arbitrary unions of members of T. T is called a topology on S, and the pair (S, T) is the topological space.
Totally disconnected: A topological space is totally disconnected when any two distinct points lie in disjoint components.
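Several of the glossary definitions (topological space, clopen set, connected, and the separation properties T0, T1 and T2) can be tested directly on finite examples. The following Python sketch is offered only as an illustration under that finiteness assumption; the function names are hypothetical and do not come from the text. For a finite collection T, closure under pairwise unions and intersections is equivalent to closure under arbitrary unions and finite intersections, which is what the check uses.

```python
from itertools import combinations


def _opens(T):
    """Normalize a collection of open sets to a set of frozensets."""
    return {frozenset(u) for u in T}


def is_topology(S, T):
    """Check the glossary axioms for a topology T on a finite set S."""
    S, T = frozenset(S), _opens(T)
    if frozenset() not in T or S not in T:
        return False
    # On a finite collection, pairwise closure gives all unions and intersections.
    return all((u & v) in T and (u | v) in T for u in T for v in T)


def clopen_sets(S, T):
    """Open sets whose complement in S is also open."""
    S, T = frozenset(S), _opens(T)
    return {u for u in T if (S - u) in T}


def is_connected(S, T):
    """Connected iff the only clopen sets are the empty set and S itself."""
    return clopen_sets(S, T) <= {frozenset(), frozenset(S)}


def is_t0(S, T):
    """Some open set contains exactly one of any two distinct points."""
    T = _opens(T)
    return all(any((x in u) != (y in u) for u in T)
               for x, y in combinations(list(S), 2))


def is_t1(S, T):
    """Each of two distinct points has an open set excluding the other."""
    T = _opens(T)
    return all(any(x in u and y not in u for u in T)
               and any(y in u and x not in u for u in T)
               for x, y in combinations(list(S), 2))


def is_hausdorff(S, T):
    """Two distinct points lie in disjoint open sets."""
    T = _opens(T)
    return all(any(x in u and y in v and not (u & v) for u in T for v in T)
               for x, y in combinations(list(S), 2))


S = {0, 1}
sierpinski = [set(), {1}, {0, 1}]      # T0 but not T1; connected
discrete = [set(), {0}, {1}, {0, 1}]   # every subset open; Hausdorff

print(is_topology(S, sierpinski), is_t0(S, sierpinski), is_t1(S, sierpinski))
print(is_connected(S, sierpinski), is_connected(S, discrete))
print(is_hausdorff(S, discrete), len(clopen_sets(S, discrete)))
```

The two-point examples show the pattern discussed in the chapter in miniature: the connected Sierpinski space has no clopen sets other than the empty set and the whole space, while in the discrete space every subset is clopen.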
Bibliography
Books
Addison, J. W. On Some Points in the Theory of Recursive Functions, PhD thesis (Madison: University of Wisconsin, 1954).
Alexandroff, P. and Hopf, H. Topologie (Berlin: Springer, 1935).
Atkins, P. W. The Second Law (New York: Scientific American, 1984).
Atkins, P. W. Molecules (New York: Scientific American, 1987).
Austin, J. L. How to Do Things with Words, ed. J. O. Urmson and M. Sbisà (Oxford: Oxford University Press, 1976).
Bell, J. L. Boolean-valued Models and Independence Proofs in Set Theory (Oxford: Clarendon Press, 1977).
Berlioz, D. and Nef, F. (eds.) L’Actualité de Leibniz: Les deux labyrinthes, Studia Leibnitiana Supplementa 34 (Stuttgart: Steiner, 1999).
Bernoulli, J. Opera Johannis Bernoulli, ed. G. Cramer (Geneva, 1742).
Birkhoff, G. and MacLane, S. A Survey of Modern Algebra (New York: MacMillan, 1953).
Bhushan, N. and Rosenfeld, S. (eds.) Of Minds and Molecules: New Philosophical Perspectives on Chemistry (New York: Oxford University Press, 2000).
Blay, M. Les raisons de l’infini: Du monde clos à l’univers mathématique (Paris: Gallimard, 1993); English trans. by M. B. DeBevoise, Reasoning with the Infinite: From the Closed World to the Mathematical Universe (Chicago: University of Chicago Press, 1998).
Boothby, W. An Introduction to Differentiable Manifolds and Riemannian Geometry (New York: Academic Press, 1975).
Bos, H. J. M. Redefining geometrical exactness: Descartes’ transformation of the early modern concept of construction (Frankfurt: Springer-Verlag, 2001).
Boyer, C. A History of Mathematics (Princeton: Princeton University Press, 1968/85).
Brescia, F. et al. Fundamentals of Chemistry: A Modern Introduction (New York: Academic Press, 1966).
Campos, D. G. The Discovery of Mathematical Probability Theory: A Case Study in the Logic of Mathematical Inquiry. Philosophy PhD thesis (State College, PA: The Pennsylvania State University, 2005).
Carnap, R. Der Logische Aufbau der Welt (Hamburg: Meiner, 1928). English trans. by B. Rolf and A. George, The Logical Structure of the World and Pseudoproblems in Philosophy (Berkeley: University of California Press, 1967/69).
Cartwright, N. The Dappled World (Cambridge: Cambridge University Press, 1999).
Cassou-Noguès, P. De l’expérience mathématique (Paris: Vrin, 2001).
Cellucci, C. Le ragioni della logica (Rome: Editori Laterza, 1998).
Cellucci, C. Filosofia e matematica (Rome: Editori Laterza, 2002).
Cellucci, C. and Gillies, D. (eds.) Mathematical Reasoning and Heuristics (London: Kings College Publications, 2005), 25–47.
Charnay, J.-P. (ed.) Lazare Carnot ou le savant-citoyen (Paris: Presses de l’Université de la Sorbonne, 1990).
Chemla, K. and Shuchun, G. (tr.) Les neuf chapitres: Le classique mathématique de la Chine ancienne et ses commentaires, with a preface by G. Lloyd (Paris: Dunod, 2004).
Chihara, C. Ontology and the Vicious-Circle Principle (Ithaca: Cornell University Press, 1973).
Chihara, C. A Structural Account of Mathematics (New York: Oxford University Press, 2004).
Cotton, F. A. Chemical Applications of Group Theory, 3rd edn (New York: John Wiley and Sons, 1990).
Dagognet, F. Tableaux et langages de la chimie (Paris: Editions du Seuil, 1969).
Dagognet, F. Ecriture et iconographie (Paris: Vrin, 1973).
De Gandt, F. Force and Geometry in Newton’s Principia (Princeton: Princeton University Press, 1995).
Descartes, R. Discours de la méthode pour bien conduire sa raison et chercher la vérité dans les sciences (Leiden, 1637).
Descartes, R. Oeuvres de Descartes, eds. C. Adam and P. Tannery (Paris: Vrin, 1897–1913/1964–74).
Duchesneau, F. Leibniz et la méthode de la science (Paris: Presses Universitaires de France, 1993).
Duchesneau, F. La dynamique de Leibniz (Paris: Vrin, 1994).
Euclid Elements, ed. T. L. Heath (New York: Dover, 1956), 3 vols.
Fedoroff, N. and Botstein, D. (eds.) The Dynamic Genome: Barbara McClintock’s Ideas in the Century of Genetics (Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 1992).
Fichant, M. La réforme de la dynamique (Paris: Vrin, 1994).
Fichant, M. Science et métaphysique dans Descartes et Leibniz (Paris: Presses Universitaires de France, 1998).
Fréchet, M. Les espaces abstraits et leur théorie considérée comme introduction à l’analyse générale (Paris: Gauthier-Villars, 1928).
Galilei, G. Dialogues Concerning Two New Sciences, tr. H. Crew and A. de Salvio (New York: Dover, 1914/54). Giaquinto, M. The Search for Certainty: A Philosophical Account of Foundations of Mathematics (Oxford: Clarendon Press, 2002). Visual Thinking in Mathematics: An Epistemological Study (Oxford: Oxford University Press, forthcoming). Gillies, D. (ed.) Revolutions in Mathematics (Oxford: Clarendon Press, 1992). Philosophy of Science in the Twentieth Century (Oxford: Blackwell, 1993). Granger, G.-G. Formal Thought and the Sciences of Man (Dordrecht: Reidel, 1983). Gray, J. Ideas of Space: Euclidean, Non-Euclidean, and Relativistic (Oxford: Clarendon Press, 1989). Grosholz, E. Cartesian Method and the Problem of Reduction (Oxford: Clarendon Press, 1991). Yakira, E. Leibniz’s Science of the Rational, Studia Leibnitiana, Sonderheft 26 (Stuttgart: Steiner, 1998). Breger, H. (eds.) The Growth of Mathematical Knowledge (Dordrecht: Kluwer, 2000). Gueroult, M. Descartes selon l’ordre des raisons (Paris: Aubier, 1968). Hacking, I. Representing and Intervening (Cambridge: Cambridge University Press, 1983). Hallett, M. and Majer, U. David Hilbert’s Lectures on the Foundations of Geometry, 1891–1902 (New York: Springer, 2004). Hart, W. D. The Philosophy of Mathematics (New York: Oxford University Press, 1996). Herstein, I. N. Topics in Algebra (Waltham, MA, and Toronto: Xerox College Publishing/Ginn and Company, 1964). Hilbert, D. Grundlagen der Geometrie (Teubner 2002); English trans. Foundations of Geometry (Chicago: Open Court, 1971). Cohn-Vossen, S. Anschauliche Geometrie (Berlin: Springer, 1996); English trans. Geometry and the Imagination (Providence: American Mathematical Society, reprint edition, 1999). Hintikka, J. Inquiry as Inquiry: A Logic of Scientific Discovery (Dordrecht: Kluwer, 1999). Remes, U. The Method of Analysis: Its Geometrical Origin and its General Significance (Stuttgart: Springer, 1974). Hoffmann, R. 
The Same and Not the Same (New York: Columbia University Press, 1995). Torrence, V. Chemistry Imagined (Washington, DC: Smithsonian Institution Press, 1993).
Hume, D. Treatise of Human Nature, ed. L. A. Selby-Bigge (Oxford: Clarendon Press, 1888/1978). Imbert, C. Pour une histoire de la logique (Paris: Presses Universitaires de France, 1999). James, G. and Liebeck, M. Representations and Characters of Groups (Cambridge: Cambridge University Press, 1993). Joergensen, K. and Mancosu, P. (eds.) Visualization, Explanation and Reasoning Styles in Mathematics (Dordrecht: Kluwer, 2003). Joesten, M. D. et al. World of Chemistry (Philadelphia: Saunders, 1991). Keller, E. F. Reflections on Gender and Science (New Haven: Yale University Press, 1996). Keller, L. Levels of Selection in Evolution (Princeton: Princeton University Press, 1999). Kitcher, P. The Nature of Mathematical Knowledge (New York: Oxford University Press, 1983). Klein, U. (ed.) Tools and Modes of Representation in the Laboratory Sciences (Dordrecht: Kluwer, 2001). Experiments, Models, Paper Tools: Cultures of Organic Chemistry in the Nineteenth Century (Stanford, CA: Stanford University Press, 2003). Koyré, A. Etudes galiléennes (Paris: Hermann, 1939). Kuhn, T. The Structure of Scientific Revolutions (Chicago: University of Chicago Press, 1996). Kuratowski, C. Topologie I (Warsaw/Lvov, 1933). Lefschetz, S. L’analysis situs et la géométrie algébrique (Paris: Gauthier-Villars, 1924). Géométrie sur les surfaces et les variétés algébriques (Paris: Gauthier-Villars, 1929). Leibniz, G. W. Opuscules et fragments inédits de Leibniz, ed. L. Couturat (Paris, 1903; repr. Hildesheim: Georg Olms, 1961). Mathematischen Schriften, ed. C. I. Gerhardt, 7 vols. (Berlin: A. Asher/Halle: H. W. Schmidt, 1848–63; repr. Hildesheim: Georg Olms, 1962). Die Philosophischen Schriften, ed. C. I. Gerhardt, 7 vols. (Berlin: Weidmann, 1875–90; repr. Hildesheim: Georg Olms, 1978). New Essays on Human Understanding, tr. P. Remnant and J. Bennett (Cambridge: Cambridge University Press, 1982). 
La naissance du calcul différentiel, 26 articles des Acta Eruditorum, tr. and ed. M. Parmentier (Paris: Vrin, 1989). G. W. Leibniz’s Mathematische Schriften, Geometrie—Zahlentheorie—Algebra 1672–1676 (Series VII, Vol. 1), eds. E. Knobloch and W. Contro (Berlin: Akademie Verlag, 1990). De quadratura arithmetica circuli ellipseos et hyperbolae cujus corollarium est trigonometria sine tabulis, ed. E. Knobloch (Göttingen: Vandenhoeck & Ruprecht, 1993); French trans. with commentary, by M. Parmentier (Paris: Vrin, 2004).
Létoublon, F., Bréchet, Y., and Jarry, P. (eds.) Méchanique des signes et langages des sciences (Grenoble: Publications de la MSH-Alpes, 2003). Locke, J. An Essay Concerning Human Understanding, ed. A. D. Woozley (New York: New American Library, 1974). Maddy, P. Realism in Mathematics (Oxford: Oxford University Press, 1990). Naturalism in Mathematics (Oxford: Oxford University Press, 1997). Mainzer, K. Symmetries of Nature: A Handbook for Philosophy of Nature and Science (Berlin and New York: Walter de Gruyter, 1996). McClintock, B. The Discovery and Characterization of Transposable Elements: The Collected Papers of Barbara McClintock (New York: Garland Publishers, 1987). Mendelsohn, E. (ed.) Transformation and Tradition in the Sciences (Cambridge: Cambridge University Press, 1984). Mercer, C. Leibniz’s Metaphysics, Its Origins and Development (Cambridge: Cambridge University Press, 1998). Nagel, E. The Structure of Science (New York: Harcourt, Brace & World, 1961). Newton, I. Mathematical Principles of Natural Philosophy and his System of the World, tr. A. Motte, ed. F. Cajori (Berkeley: University of California Press, 1934). Panza, M. Isaac Newton (Paris: Les Belles Lettres, 2003). Parkinson, G. H. R. (ed.) Studia Leibnitiana, Thematic Issue: Towards the Discourse on Metaphysics, 33(1) (2001), 4–18. Peano, G. Formulario mathematico (Turin: C. Guadagnini, 1894; repr. Roma: Edizioni Cremonese, 1960). Peirce, C. S. The Essential Peirce, Vol. 2 (Indianapolis: Indiana University Press, 1998). Perkins, F. Leibniz and China, A Commerce of Light (Cambridge: Cambridge University Press, 2004). Poincaré, H. La valeur de la science (Paris: Flammarion, 1970). Rashed, R. (ed.) Sciences à l’époque de la révolution (Paris: Librairie A. Blanchard, 1988). (ed.) Mathématiques et philosophie de l’antiquité à l’âge classique: Hommage à Jules Vuillemin (Paris: Editions du Centre National de la Recherche Scientifique, 1991). Pellegrin, P. (ed.) 
Philosophie des mathématiques et théorie de la connaissance: L’Oeuvre de Jules Vuillemin (Paris: Albert Blanchard, 2005). Rasiowa, H. An Algebraic Approach to Non-classical Logics (Amsterdam: North Holland, 1974). Sikorski, R. The Mathematics of Metamathematics (Warsaw: Polish Scientific Publishers, 1963).
Reid, D. Figures of Thought (London: Routledge, 1994). Resnik, M. Mathematics as a Science of Patterns (Oxford: Oxford University Press, 1997). Russell, B. and Whitehead, A. N. Principia Mathematica (Cambridge: Cambridge University Press, 1910/63). Rutherford, D. Leibniz and the Rational Order of Nature (Cambridge: Cambridge University Press, 1998). Sarat, A. and Ewick, P. (eds.) Studies in Law, Politics, and Society (Amsterdam: Elsevier Science, 2002). Sasaki, C. Descartes’s Mathematical Thought (Dordrecht: Kluwer, 2003). Schaffner, K. Discovery and Explanation in Biology and Medicine (Chicago: University of Chicago Press, 1994). Shapiro, J. (ed.) Mobile Genetic Elements (Orlando, FL, and London: Academic Press, Inc., 1983). Sinaceur, H. Jean Cavaillès, Philosophie mathématique (Paris: Presses Universitaires de France, 1994). Singer, I. M. and Thorpe, J. A. Lecture Notes on Elementary Topology and Geometry (Glenview, IL: Scott, Foresman & Co., 1967; repr. New York: Steiner). Smith, D. E. and Latham, M. L. (tr.) The Geometry of René Descartes (New York: Dover, 1954). Tarski, A. Logic, Semantics and Metamathematics: Papers from 1923 to 1938 (Oxford: Clarendon Press, 1956; repr. Indianapolis: Hackett, 1983). Van Den Dries, L. P. D. Tame Topology and O-minimal Structures (Cambridge: Cambridge University Press, 1998). Van Fraassen, B. Laws and Symmetry (New York: Oxford University Press, 1990). The Empirical Stance (New Haven, CT: Yale University Press, 2004). Van Praag, P. (ed.) Aspects de la dualité en mathématiques, Cahiers du Centre de Logique, vol. 12 (Université Catholique de Louvain, 2003). Veblen, O. Foundations of Differential Geometry (Cambridge: Cambridge University Press, 1932). Vuillemin, J. La Philosophie de l’algèbre (Paris: Presses Universitaires de France, 1962). Nécessité ou contingence: L’aporie de Diodore et les systèmes philosophiques (Paris: Les Editions de Minuit, 1984/97). English trans. 
(in part), Necessity or Contingency: The Master Argument (Stanford: CSLI Lecture Notes, 1996). Eléments de poétique (Paris: Vrin, 1991). Watson, J. D. Molecular Biology of the Gene (New York and Amsterdam: W. A. Benjamin, Inc., 1965).
Wilson, C. Leibniz’s Metaphysics: A Historical and Comparative Study (Princeton: Princeton University Press, 1989). Wittgenstein, L. Remarks on the Foundations of Mathematics (Cambridge, MA: MIT Press, 1983). Wood, K. Troubling Play: Meaning and Entity in Plato’s Parmenides (Albany: State University of New York, 2006). Yakira, E. Contrainte, nécessité, choix (Zurich: Editions du Grand Midi, 1989). Articles Addison, J. W. ‘Separation Principles in the Hierarchies of Classical and Effective Descriptive Set Theory,’ Fundamenta Mathematicae, 46 (1959), 123–35. Benacerraf, P. ‘Mathematical Truth,’ Journal of Philosophy, 70 (19) (Nov. 1973), 661–79. Bishop, R. C. ‘Patching Physics and Chemistry Together,’ Philosophy of Science (Fall 2006). Bos, H. J. M. ‘Differentials, higher-order differentials, and the derivative in the Leibnizian calculus,’ Archive for History of Exact Sciences, 14 (1974/75), 1–90. ‘On the Representation of Curves in Descartes’ Géométrie,’ Archive for History of Exact Sciences, 24 (1981), 298–302. Breger, H. ‘Weyl, Leibniz, und das Kontinuum,’ Studia Leibnitiana Supplementa, 26 (1986), 316–30. ‘Tacit Knowledge and Mathematical Progress,’ The Growth of Mathematical Knowledge, eds. E. Grosholz and H. Breger (Dordrecht: Kluwer, 2000), 221–30. Brown, D. and Fedoroff, N. ‘The Nucleotide Sequence of the Repeating Unit in the Oocyte 5S Ribosomal DNA of Xenopus laevis,’ Cold Spring Harbor Symposium on Quantitative Biology, 42 (1977), 1195–200. ‘The Nucleotide Sequence of Oocyte 5S DNA in Xenopus laevis. I. The AT-rich Spacer,’ Cell, 13 (1978), 701–15. Cassou-Noguès, P. ‘Signs, Figures, and Time: Cavaillès on ‘Intuition’ in Mathematics,’ Theoria, 55 (2006), 89–104. Cavaillès, J. ‘Réflexions sur le fondement des mathématiques,’ Travaux du IXe Congrès international de philosophie, t. VI/no. 535 (Paris: Hermann, 1937), 136–9. ‘La pensée mathématique,’ Bulletin de la Société Française de Philosophie, 40(1) (1946), 1–39. Cellucci, C. 
‘The Growth of Mathematical Knowledge: An Open World View,’ The Growth of Mathematical Knowledge, ed. E. Grosholz and H. Breger (Dordrecht: Kluwer, 2000), 153–76. ‘The Nature of Mathematical Explanation,’ forthcoming. Chemla, K. ‘Lazare Carnot et la généralité en géométrie. Variations sur le théorème dit de Menelaus,’ Revue d’histoire des mathématiques, 4 (1998), 163–90.
Chemla, K. ‘Remarques sur les recherches géométriques de Lazare Carnot,’ Lazare Carnot ou le savant-citoyen, ed. Jean-Pierre Charnay (Paris: Presses de l’Université de la Sorbonne, 1990), 525–41. ‘Le rôle joué par la sphère dans la maturation de l’idée de dualité au début du XIXe siècle. Les Articles de Gergonne entre 1811 et 1827,’ Actes de la quatrième université d’été d’histoire des mathématiques, Lille, 1990 (Lille: IREM, 1994), 57–72. Pahaut, S. ‘Préhistoires de la dualité: explorations algébriques en trigonométrie sphérique, 1753–1825,’ Sciences à l’époque de la révolution, ed. Roshdi Rashed (Paris: Librairie A. Blanchard, 1988), 149–200. ‘Histoire ou préhistoire de la dualité. Relecture des triangles sphériques avec et après Euler,’ in Aspects de la dualité en mathématiques, ed. Paul Van Praag, Cahiers du Centre de Logique, vol. 12 (Université Catholique de Louvain, 2003), 9–25. Darden, L. and Maull, N. ‘Interfield Theories,’ Philosophy of Science, 44 (1977), 43–64. Dewar, M. J. S. and Gleicher, G. J. ‘Communication,’ Journal of the American Chemical Society, 87 (1965), 3255. Emerson, R. A. ‘The Frequency of Somatic Mutation in Variegated Pericarp in Maize,’ Genetics, 14 (1929), 488–511. Fedoroff, N. ‘Structure of Deletion Derivatives of a Recombinant Plasmid Containing the Transposable Element Tn9 in the Spacer Sequence of Xenopus laevis 5S DNA,’ Cold Spring Harbor Symposium on Quantitative Biology, 43 (1979), 1287–92. ‘Controlling Elements in Maize,’ Mobile Genetic Elements, ed. J. Shapiro (Orlando, FL, and London: Academic Press, Inc., 1983), 1–63. ‘Transposable Genetic Elements in Maize,’ Scientific American, 250 (1984), 84–98. ‘Marcus Rhoades and Transposition,’ Genetics, 150 (1998), 957–61. et al. ‘Molecular Studies on Mutations at the Shrunken Locus in Maize Caused by the Controlling Element Ds,’ Journal of Molecular and Applied Genetics, 2 (1983), 11–29. 
‘Isolation of the Transposable Maize Controlling Elements Ac and Ds,’ Cell, 35 (Nov. 1983), 235–42. ‘The Nucleotide Sequence of the Maize Controlling Element Activator,’ Cell, 37 (1984), 635–43. ‘Epigenetic Regulation of the Maize Spm Transposon,’ Bioessays, 17 (1994), 291–7. Garavaglia, S. ‘Model Theory of Topological Structures,’ Annals of Mathematical Logic, 14 (1978), 13–37.
Giaquinto, M. ‘From Symmetry Perception to Basic Geometry,’ Visualization, Explanation and Reasoning Styles in Mathematics, ed. K. Joergensen and P. Mancosu (Dordrecht: Kluwer, 2003), 31–56. ‘Mathematical Activity,’ Visualization, Explanation and Reasoning Styles in Mathematics, ed. K. Joergensen and P. Mancosu (Dordrecht: Kluwer, 2003), 75–88. Gray, J. ‘The Nineteenth-century Revolution in Mathematical Ontology,’ Revolutions in Mathematics, ed. D. Gillies (Oxford: Clarendon Press, 1992), 226–48. Grosholz, E. ‘Wittgenstein and the Correlation of Arithmetic and Logic,’ Ratio, 23(1) (1981), 33–45. ‘Some Uses of Proportion in Newton’s Principia, Book I,’ Studies in History and Philosophy of Science, 18(2) (1987), 209–20. ‘Descartes and Galileo: The Quantification of Time and Force,’ Mathématiques et philosophie de l’antiquité à l’âge classique: Hommage à Jules Vuillemin, ed. Roshdi Rashed (Paris: Editions du Centre National de la Recherche Scientifique, 1991), 197–215. ‘Was Leibniz a Mathematical Revolutionary?’ Revolutions in Mathematics, ed. D. Gillies (Oxford: Clarendon Press, 1992), 117–33. ‘Plato and Leibniz against the Materialists,’ Journal of the History of Ideas, 57(2) (1996), 255–76. ‘L’analogie dans la pensée mathématique de Leibniz,’ in L’Actualité de Leibniz: Les deux labyrinthes, ed. D. Berlioz and F. Nef, Studia Leibnitiana, Supplementa 34 (Stuttgart: Steiner, 1999), 511–22. ‘The Partial Unification of Domains, Hybrids, and the Growth of Mathematical Knowledge,’ in The Growth of Mathematical Knowledge, ed. E. Grosholz and H. Breger (Dordrecht: Kluwer, 2000), 81–91. ‘Theomorphic Expression in Leibniz’s Discourse on Metaphysics,’ Studia Leibnitiana, Thematic Issue: Towards the Discourse on Metaphysics, ed. G. H. R. Parkinson, 33(1) (2001), 4–18. ‘The Cartesian Revolution. La Géométrie. Understanding Descartes: Reception of the Géométrie.’ In Italian. Storia della scienza, ed. S. Petruccioli, 10 vols. 
(Rome: Istituto della Enciclopedia Italiana), vol. V (2002), 440–52. ‘Jules Vuillemin’s La Philosophie de l’algèbre: The Philosophical Uses of Mathematics,’ Philosophie des mathématiques et théorie de la connaissance: L’Oeuvre de Jules Vuillemin, ed. R. Rashed and P. Pellegrin (Paris: Albert Blanchard, 2005), 253–70. Hoffmann, R. ‘How Symbolic and Iconic Languages Bridge the Two Worlds of the Chemist: A Case Study from Contemporary Bioorganic Chemistry,’ Of Minds and Molecules: New Philosophical Perspectives on Chemistry, ed. N. Bhushan and Stuart Rosenfeld (New York: Oxford University Press, 2000), 230–47.
Hamilton, A. et al. ‘A Calixarene with Four Peptide Loops: An Antibody Mimic for Recognition of Protein Surfaces,’ Angewandte Chemie, International English Edition, 36 (23): 2680–3. Hempel, C. and Oppenheim, P. ‘Studies in the Logic of Explanation,’ Philosophy of Science, 15 (1948), 491–9. Hendry, R. F. ‘Mathematics, Representation, and Molecular Structure,’ Tools and Modes of Representation in the Laboratory Sciences, ed. U. Klein (Dordrecht: Kluwer, 2001), 221–36. Hintikka, J. ‘Analyzing (and Synthesizing) Analysis,’ (forthcoming). Hoffmann, R. ‘Nearly Circular Reasoning,’ American Scientist, 76 (1988), 182–5. Laszlo, P. ‘Representation in Chemistry,’ Angewandte Chemie, International English Edition, 30 (1991), 1–16. Shaik, S., and Hiberty, P. C. ‘A Conversation on VB or MO Theory: A Never-Ending Rivalry?’ Accounts of Chemical Research, 36 (10) (Oct. 2003), 750–6. Jacquette, D. ‘Intentionality and the Myth of Pure Syntax,’ Protosoziologie, 6 (1994), 76–89, 331–3. Keirns, C. ‘Seeing Patterns: Models, Visual Evidence, and Pictorial Communication in the Work of Barbara McClintock,’ Journal of the History of Biology, 32 (1999), 163–96. Kemp, C. ‘Law’s Inertia: Custom in Logic and Experience,’ Studies in Law, Politics, and Society, eds. A. Sarat and P. Ewick (Amsterdam: Elsevier Science, 2002), 135–49. Kleene, S. C. ‘A Symmetric Form of Gödel’s Theorem,’ Indagationes Mathematicae, 12 (1950), 244–66. Kvasz, L. ‘Changes of Language in the Development of Mathematics,’ Philosophia Mathematica, 8 (2000), 47–83. ‘Similarities and Differences between the Development of Geometry and the Development of Algebra,’ Mathematical Reasoning and Heuristics, eds. C. Cellucci and D. Gillies (London: Kings College Publications, 2005), 25–47. Laszlo, P. ‘Chromatographie,’ Trésor. Dictionnaire des sciences (Paris: Flammarion, 1997). ‘Chemical Analysis as Dematerialization,’ Hyle, 4 (1998), 29–38. Leibniz, G. W. 
‘De la Methode de l’Universalité,’ (1674), C, 97–122; Bodemann V, 10, f, 11–24. ‘De vera proportione circuli ad quadratum circumscriptum in numeris rationalibus expressa,’ Acta Eruditorum (Feb. 1682), M. S. V, 118–22. ‘Réplique à l’abbé D.C. sous forme de lettre à Bayle,’ Nouvelles de la République des Lettres (Feb. 1687), P.S. III, 45.
Leibniz, G. W. ‘De linea isochrona in qua grave sine acceleratione descendit et de controversia cum Dn. Abbate D. C.,’ Acta Eruditorum (April 1689), M. S. V, 234–7. ‘Analysis des Problems der isochronischen Kurve,’ (1689?), M. S. V, 241–3. ‘De linea in quam flexile,’ Acta Eruditorum (June 1691), M. S. V, 243–7. ‘Supplementum geometriae dimensoriae seu generalissima omnium tetragonismorum effectio per motum: similiterque multiplex constructio lineae ex data tangentium conditione,’ Acta Eruditorum (Sept. 1693), M. S. V, 294–301. ‘Tentamen anagogicum,’ (c.1696), P. S. VII, 270–9. ‘Epistola ad V. Cl. Christianum Wolfium, Professorem Matheseos Halensem, circa Scientiam Infiniti,’ Acta Eruditorum, Supplementa (1713), t. V, section 6; M. S. V, 382–7. Maxam, A. M. and Gilbert, W. ‘A New Method for Sequencing DNA,’ Proceedings of the National Academy of Sciences USA, 74 (1977), 560–4. McClintock, B. ‘Maize Genetics,’ Carnegie Institution of Washington Year Book, 45 (1946), 176–86. ‘Cytogenetic Studies of Maize and Neurospora,’ Carnegie Institution of Washington Year Book, 46 (1947), 146–52. ‘Mutable Loci in Maize,’ Carnegie Institution of Washington Year Book, 47 (1948), 155–69. ‘Mutable Loci in Maize,’ Carnegie Institution of Washington Year Book, 48 (1949), 142–54. ‘The Origin and Behavior of Mutable Loci in Maize,’ Proceedings of the National Academy of Sciences USA, 36 (6) (1950), 344–55. ‘Chromosome Organization and Genic Expression,’ Genes and Mutations, Cold Spring Harbor Symposia on Quantitative Biology, 16 (1951/52), 13–47. ‘Induction of Instability at Selected Loci in Maize,’ Genetics, 38 (1953), 579–99. ‘The Control of Gene Action in Maize,’ Genetic Control of Differentiation, Brookhaven Symposia on Biology, 18 (1965), 162–84. Mostowski, A. ‘On Definable Sets of Positive Integers,’ Fundamenta Mathematicae, 34 (1946), 81–112. Mulliken, R. S. and Parr, R. G. 
‘LCAO Molecular Orbital Computation of Resonance Energies of Benzene and Butadiene, with General Analysis of Theoretical Versus Thermochemical Resonance Energies,’ The Journal of Chemical Physics, 19(10) (Oct. 1951), 1271–8. et al. ‘Hyperconjugation,’ Journal of the American Chemical Society, 63 (1941), 41–56.
Nickles, T. ‘Two Concepts of Inter-Theoretic Reduction,’ Journal of Philosophy, 70 (1973), 181–201. Oliveri, G. ‘Mathematics as a Quasi-Empirical Science,’ Foundations of Science, 11 (2006), 41–79. Parr, R. G. et al. ‘Communication,’ Journal of Chemical Physics, 18(12) (Dec. 1950), 1561–3. Parsons, C. ‘The Structuralist View of Mathematical Objects,’ The Philosophy of Mathematics, ed. W. D. Hart (New York: Oxford University Press, 1996), 272–309. Peirce, C. S. ‘On the Algebra of Logic: A Contribution to the Philosophy of Notation,’ The American Journal of Mathematics, 7(2) (1885), 180–202; repr. in Collected Papers of Charles Sanders Peirce, eds. C. Hartshorne and P. Weiss (Cambridge: Harvard University Press, 1931–), vol. 3, 359–403. ‘Pragmatism as the Logic of Abduction,’ The Essential Peirce, 2 (Indianapolis: Indiana University Press, 1998), 226–41. Person, W. B. et al. ‘Communication,’ Journal of the American Chemical Society, 74 (1952), 3437. Reeves, P. et al. ‘Communication,’ Journal of the American Chemical Society, 91 (1969), 5888–90. Rhoades, M. ‘Effect of the Dt Gene on the Mutability of the a1 Allele in Maize,’ Genetics, 23 (1938), 377–97. Ribet, K. A. ‘On Modular Representations of Gal(Q̄/Q) Arising from Modular Forms,’ Inventiones Mathematicae, 100 (1990), 431–76. ‘From the Taniyama-Shimura conjecture to Fermat’s Last Theorem,’ Annales de la Faculté des Sciences de Toulouse—Mathématiques, 11 (1990), 116–39. Robadey, A. ‘Exploration d’un mode d’écriture de la généralité: L’Article de Poincaré sur les lignes géodésiques des surfaces convexes (1905),’ Revue d’histoire des mathématiques, 10 (2004), 257–318. Rohrlich, F. ‘Pluralistic Ontology and Theory Reduction in the Physical Sciences,’ British Journal for Philosophy of Science, 39 (1988), 295–312. Roothaan, C. C. J. and Parr, R. G. ‘Calculations of the Lower Excited Levels of Benzene,’ Journal of Chemical Physics, 17 (July 1949), 1001. Sanger, F. 
and Coulson, A. R. ‘A Rapid Method for Determining Sequences in DNA by Primed Synthesis with DNA Polymerase,’ Journal of Molecular Biology, 94 (1975), 441–8. Scerri, E. ‘Has Chemistry Been at least Approximately Reduced to Quantum Mechanics?’ Philosophy of Science Association, I (1994), 160–70. Schaffner, K. ‘Approaches to Reduction,’ Philosophy of Science, 34 (1967), 137–47.
Schaffner, K. ‘The Peripherality of Reductionism in the Development of Molecular Biology,’ Journal of the History of Biology, 7 (1974), 111–29. Scott, D. ‘A Proof of the Independence of the Continuum Hypothesis,’ Mathematical Systems Theory, 1 (1967), 89–111. Sgro, J. ‘Completeness Theorems for Continuous Functions and Product Topologies,’ Israel Journal of Mathematics, 25 (1976), 249–72. ‘Completeness Theorems for Topological Models,’ Annals of Mathematical Logic, 11 (1977), 173–93. Shaw, R. et al. ‘Spontaneous Mutational Effects on Reproductive Traits of Arabidopsis thaliana,’ Genetics, 155 (2000), 369–78. Stone, W. H. ‘The Theory of Representation for Boolean Algebras,’ Transactions of the American Mathematical Society, 40 (1936), 37–111. ‘Applications of the Theory of Boolean Rings to General Topology,’ Transactions of the American Mathematical Society, 41 (1937), 375–481. Sylla, E. ‘Compounding Ratios,’ Transformation and Tradition in the Sciences, ed. E. Mendelsohn (Cambridge: Cambridge University Press, 1984), 11–43. Tarski, A. ‘Sentential Calculus and Topology,’ repr. in Logic, Semantics and Metamathematics: Papers from 1923 to 1938 (Oxford: Clarendon Press, 1956/Indianapolis: Hackett, 1983), 421–54. Vuillemin, J. ‘La question de savoir s’il existe des réalités mathématiques a-t-elle un sens?’ Philosophia Scientiae, 2(2) (1997), 275–312. Weininger, S. ‘Contemplating the Finger: Visuality and the Semiotics of Chemistry,’ Hyle, 4(1) (1998), 3–25. Wiles, A. ‘Modular Elliptic Curves and Fermat’s Last Theorem,’ Annals of Mathematics, 141(3) (1995), 443–551. Zeidler, P. and Sobczynska, D. ‘The Idea of Realism in the New Experimentalism and the Problem of the Existence of Theoretical Entities in Chemistry,’ Foundations of Science, 4 (1995), 517–53.
Index abduction 41–4, 53, 130 Abel, N. 228 abstraction 30, 34, 44–6, 50–3, 96, 117, 189, 247, 254–6, 284 Addison, J. W. 274–8 Alexandroff, P. 46 algebraic geometry 46, 67, 76 analysis (mathematical field) 52, 215, 228, 235, 241, 243, 247, 268, 272, 280 n. analysis (search for conditions of intelligibility) 3–16, 22, 33–4, 56, 59, 68–70, 97, 126–30, 137, 140, 145, 151–5, 166, 174, 186, 201–3, 217–19, 237, 250, 256 of arguments 44–6 of propositions 40–4 of terms 34–40 analytic geometry 42, 165–83, 215, 228, 230–1, 234, 245, 280 antibody mimic 68, 70–4, 82–9 Apollonius 14, 16 Arabic numerals 258, 265–7 Archimedes 38, 196 Arents, J. 141 n. Ariew, R. 183 n. Aristotle 159, 161, 207, 210, 229, 257 Atkins, P. W. 74 n. Austin, J. L. 66 Axiom of Choice 231 Ayer, A. J. 57, 194 Bacon, R. 7 Baire, R. L. 274 Baire space 276, 278–9 Baire Theorem 270 Barrow, I. 193 Bayle, P. 205 Bell, J. L. 273 n. Benacerraf, P. 65, 284 benzene 67, 74, 77–8, 80, 130, 132, 139–56 Bernoulli, J. 52, 215, 221
Berthelot, M. 27 Berzelian formulas 22, 24, 26–31, 67, 70, 76, 154 Berzelius, J. 26–9 Beth Definability 282 Betti groups 46 Betti numbers 247–8 biochemistry 96, 102, 112–3 bioorganic chemistry 63, 71–2 Birkhoff, G. 132 Bishop, R. C. 26, 138, 141, 143 n., 150 n. Blay, M. 193 n. Boolean algebra 269–73 Boolean space 271–2 Boothby, W. 237–42 Borel hierarchy 274–9 Borel sets 274–6 Borel, É. 274 Bos, H. J. M. 173, 174 n., 175 n., 177 n., 180 n., 193 n. Boullay, P. 29 Bourbaki, N. 44, 46, 133, 261 Boyer, C. 45 Brahe, T. 194–5 Bréchet, Y. 71 n. Breger, H. 40 n., 47, 49 n., 50–2, 95, 204 n., 207, 232 Brescia, F. 141 n., 142, 145 n., 146 Brouwer, L. 25, 189, 231 Brown, D. 113–15, 117, 119 Brunschvicg, L. 189–90 Byers, D. 125 n. Calama, M. C. 71, 89 calixarene 71–7, 81–9 Campanus, J. 7 Campos, D. 42–4 canonical object 3, 20, 31, 39, 40, 45–6, 58, 63–4, 136–7, 139–40, 191, 216, 224, 237–42, 243, 250, 275 Cantor set 271–2 Cantor space 271–2, 276 Cantor, G. 189
Carnap, R. 4, 16–20, 22–4, 43–4, 64–5 Carnot, L. 45 Cartesian parabola 178–83 Cartwright, N. 17, 21, 31, 65, 70, 95, 102, 118, 224, 243, 254–5, 279 Cassirer, E. 97 Cassou-Noguès, P. 50 n. catenary 186, 202, 217, 219, 221–4 Cavaillès, J. 25, 33, 189, 190, 242, 259, 261 Cavalieri, B. 210 Cellucci, C. 40–2, 49 n., 79 n., 95 center of force 27, 194–203 Chaleff, D. 100 n., 120 n. Chang, C. C. 281 n., 282 n. Chasles, M. 45 chemical reaction 26, 28–30, 74, 81, 89, 117 Chemla, K. 31, 44, 45, 174 Chihara, C. 257 Cohen, P. 269–70, 273 Cohn-Vossen, S. 39 cohomology 247–50 combinatorial space 25, 32, 50, 58, 64 Compactness Theorem 280 Completeness Theorem 270, 280–1 n., 282 conic section 168, 176, 182, 211, 212, 240 continuum 35, 64, 98, 205, 207, 208, 234–5, 263, 278 Continuum Hypothesis 270, 273 Copernicus, N. 194 Correns, C. 111 Costabel, P. 204 n. Cotton, F. A. 132–40, 143–50, 155–6 Coulson, A. R. 117, 118 Coulson, C. 150 n. Couper, A. 27 Couturat, L. 48, 53, 189, 204, 207, 258 Craig, D. P. 151 Craig’s Interpolation Theorem 278, 281, 282 Crick, F. 105 cytochrome 84–9 Dagognet, F. 27 Dalton, J. 26, 28 Dante 155 Darden, L. 96, 101, 102 n., 125 Darmo, E. 125 n.
De Gandt, F. 27, 192 De Rham cohomology 247, 251 De Rham, G. 254 n. De Rham’s Theorem 229, 243–55, 256 DeBevoise, M. B. 193 n. Dedekind, R. 189 deduction 40, 160 Derrida, J. 229 Descartes, R. 9 n., 24, 25, 33–6, 48, 51, 54–6, 133, 159–83, 190, 194, 198, 208, 211, 215–16, 223, 228–32, 240 DeVries, H. 111 Dewar, M. J. S. 155 n. differential equations 43, 63, 67, 140, 141, 192, 193, 202, 206, 221–2, 224 differential geometry 39, 46, 242 Dries, L. P. D. 280 n. Drosophila 30, 105, 111, 121 Duchesneau, F. 204 n. Dumas, J. 29, 30 Einstein, A. 92–4, 101 Emerson, R. A. 103, 104, 111 empiricism 17, 47, 57, 159, 187, 263 n. Erlanger Programm 132 Euclid 14, 16, 25, 36–8, 44, 54–6, 133, 159, 169, 173, 183, 229, 242 Euclid’s Elements 36, 38, 44, 54, 160, 230 Eudoxus 159 Euler characteristic 248–9 Euler number 249 Euler, L. 45, 52, 215 experience 4, 24–5, 33–4, 40, 47–60, 76, 77, 164, 188, 190, 214, 227–35, 242 formal 47, 50, 52–60, 229–35, 242 explanation 5, 20, 21 n., 25, 27, 33, 40–4, 47, 91–2, 96, 101, 104–11, 140, 150 n., 155–6, 174, 185–6, 192, 194, 202 Fedoroff, N. 97–100, 104, 110 n., 113–15, 117, 119–25 Fermat, P. 240 Fermat’s Last Theorem 67, 134 n. Fichant, M. 204 n. Fontan, M. 183 n. formalism 19, 25, 42, 52–3, 57, 163 formalization 20, 23, 25, 31, 32, 51, 64, 80, 95, 162, 186
Franklin, R. 105 Fréchet, M. 46, 245 free fall 3, 5, 8–16, 197, 203 Frege, G. 18, 49, 97, 228, 257, 268, 269 Fundamental Theorem of Calculus 249 n., 254 n. Galilean invariance 94 Galileo 3–16, 19, 24, 194, 197, 198, 250 n. Galois, É. 228, 229, 254 Garavaglia, S. 283 n. Garber, D. 183 n., 204 n. Garver, E. 183 n. Gauss, C. F. 228, 230 generalization 32, 34, 4–6, 50–2, 56, 88–9, 171–7, 177–83, 194–5, 245–6, 249, 269 genetics 91, 102–13, 119–25 Gergonne, J. 45 Gerhardt, C. I. 205, 217, 218, 220, 222 Geyer, C. 111 n. Geyer, P. 111 n. Giaquinto, M. 42 Gilbert, W. 117 Gillies, D. 56 n., 57, 79 n., 187, 215 n., 263 Glauert, A. 99 n., 107 n. Gleicher, G. J. 155 n. Gödel numbers 25 Gödel, K. 49, 66, 128, 130, 269, 270, 275 Goeppert-Mayer, M. 151 Goldenbaum, U. 204 n. Goodman, N. 27, 28 Golius, J. 173 Grandi, G. 205 Granger, G. G. 79, 80, 207 Grassmann algebra 246, 247 Gray, J. 56 Green, G. 254 n. Gregory of St. Vincent 210, 211 Grimaux, E. 27 Grosholz, R. G. 111 n. Grosholz, E. D. 111 n. Grothendieck, A. 254 n. group theory 126–30, 131–9, 145–50, 228, 243 Gueroult, M. 165 n., 166 n., 204 n. Guldin, P. 210 Hacking, I. 17, 21, 30, 64–66
Halmos, P. R. 271 n. Hamilton, A. 71–89 Hamiltonian operator 138, 140, 143 Hamuro, Y. 71–89 Hausdorff space 246 Hausdorff, F. 39, 246, 271, 282 Hegel, G. 189–91 Heitler, W. 140 Hempel, C. 17, 20, 43, 91 n., 92, 96 Hendry, R. 17, 20–3, 26, 31, 96–8, 150 n., 191 Henery, J. 155 n. Herstein, I. N. 132 Hiberty, P. C. 156 n. Hilbert, D. 38, 39, 42, 242, 269 Hintikka, J. 41, 42 Hobbes, T. 204 n. Hoff, J. 27 Hoffmann, R. 26, 71–90, 134 n., 156 n. homeomorphism 39, 245, 248–50 homology 46, 247–50 Hopf, H. 46 Hückel Approximation 144–7 Hückel, E. 140, 150, 151 Hückel’s Rule 156 Hudde, J. 215 Hume, D. 4, 33, 52, 56–9 Hund, F. 140 Husserl, E. 228 Huygens, C. 193–4, 215–16 hybrid 49, 120, 165, 200, 216, 223, 237, 278, 279 icon 3–4, 14, 16, 22–4, 24–32, 39, 49, 51, 54–56, 69, 76–82, 87–90, 97–101, 103–11, 113–25, 126–128, 130, 132, 137, 140, 155, 163–4, 165, 169–171, 181–3, 186, 192–203, 218–21, 240–2, 243–53, 254, 256, 259, 262–8, 279, 283–5 Imbert, C. 49 incommensurability 31, 65 Incompleteness Theorem 49, 128 induction 43, 53, 104, 105 n., 108 n., 109 infinitesimal calculus 48, 204, 207, 214 inorganic chemistry 30 intelligibility 24, 33, 34, 36, 39, 47, 49, 51, 127, 130, 161, 194, 195, 198, 202, 203, 215, 216, 221, 223, 224, 237, 254, 256, 259, 264 Interpolation Theorem 278
intervention 64–7, 154, 159
intuition 25, 38–9, 41, 50 n., 53, 56, 161–4, 166, 185, 189, 194, 207, 229, 231–35, 242, 273
intuitionism 25, 161, 233
inverse square law 186, 190
isochrone 215, 217, 218
isomorphism 21–3, 38, 91, 95, 96, 191, 238, 240, 245, 248–50, 280
Jacob, F. 112, 113
Jacquette, D. 194 n.
James, G. 129 n., 131 n., 132
Jarry, P. 71
Joesten, M. D. 72 n., 74 n.
Johnston, D. O. 72 n.
June, A. E. 222 n.
Kant, I. 22, 25, 33, 35, 38, 47, 97, 185, 187, 194, 227–9, 231–235, 262
Kaulbach, F. 207
Keirns, C. 100, 101, 125 n.
Keisler, H. J. 281 n.
Kekulé structures 147, 151–3, 155
Kekulé, F. 78, 154
Keller, E. F. 99 n.
Keller, L. 101 n.
Kemp, C. 57–60
Kepler, J. 194–6, 198, 220
Kitcher, P. 186–9, 192
Kleene, S. 274, 275, 280
Klein, F. 132, 154, 228, 232
Klein, U. 4, 17, 20 n., 22–31, 65, 66, 96 n., 97 n.
Knobloch, E. 204 n., 215 n., 216 n.
Koyré, A. 9
Kronecker, L. 234
Kuhn, T. 31, 65, 89, 187
Kuratowski, C. 46
Kvasz, L. 79 n.
Lagrange, J. 52, 228
Lakatos, I. 187
language game 50
Laszlo, P. 78 n.
Latour, B. 31
Laugwitz, D. 51
Laurent, A. 29, 30
Lavoisier, A. 84
law of areas 184–5, 194–6, 198
Lawvere, F. 261
Lebesgue, H. 274
Lefschetz, S. 46
Leibniz, G. 4, 28, 33–5, 47–9, 51–9, 68, 95, 98, 101, 111 n., 129 n., 159, 160, 184, 186, 193, 194, 202, 204–23, 228, 232, 234, 235, 254–56, 258, 259, 263, 264
Létoublon, F. 71 n.
L’Hôpital, G. 215
Li, W. 67 n.
Lie, S. 228
Liebeck, M. 129 n., 131 n., 132
linear algebra 63, 129
Locke, J. 33, 52–60, 159–64, 194
logical positivism 20
logicism 48, 189, 204, 207, 267
London, F. 140
Lorentz invariance 94
Łos, K. 273
Lowenheim, L. 269
Lowenheim-Skolem Theorem 280–2
Lusin, N. 275
Lusin’s Separation Theorem 278
Macintyre, A. 183 n., 258, 283 n.
MacLane, S. 132
Maddy, P. 187, 259–61
Mainzer, K. 131 n., 132 n.
Maize 98–100, 103–11, 120–5
Maull, N. 96, 101, 102 n., 125
Mauvais, J. 100 n., 120 n.
Maxam, A. M. 117
Mazur, J. 243 n., 254 n.
McClintock, B. 97–113, 118, 119–25, 184
Meislich, H. 141 n.
Menaechmus 230, 231
Mendel, G. 111
Mercator, N. 211
Mercer, D. 204 n.
Messing, J. 100 n., 120 n.
model theory 130, 279–83
models 19–32, 38, 63–7, 68, 78, 202–3, 262–8
molecular biology 43, 91, 97–102, 113–25
Molière (Poquelin, J.) 229
Monod, J. 112, 113
Morrison, M. 17, 21
Mostowski, A. 274–5, 279
Mulliken symbols 139, 145
Mulliken, R. 139, 151–5
Nachtomy, O. 111 n.
Nagel, E. 91–3
National Academy of Sciences 104 n., 105, 117 n.
Nemorarius, J. 7
Netterville, J. T. 72 n.
Newton 3, 14, 24, 27, 92–5, 184–203, 220
Newtonian mechanics 92–5, 101, 215
Nickles, T. 92, 93
Noether, E. 189
nomological machine 65, 90, 265, 267
Oppenheim, P. 91 n.
Oresme, N. 9
Oxford Calculators 14
Panofsky, E. 97
Panza, M. 174 n., 193 n.
paper tools 22 n., 25–31, 64, 66, 90, 113, 265
Pappian loci 179, 180, 183
Pappus 34, 173, 174
Pappus’ problem 171–83, 190
Parallel Postulate 37
Park, H.-S. 71, 89
Parmentier, M. 206 n., 215 n., 216, 222 n., 223 n.
Parr, R. 140, 151–5
Parsons, C. 263
Pauli exclusion principle 140
Pauling, L. 21, 78, 140
Peano, G. 186
Peirce, C. S. 4, 33, 41–3, 53, 97, 130, 189, 191
Pell, J. 211
performative utterance 66
Perkins, F. 214
Perrault, C. 219
Person, W. B. 156 n.
Pettit, R. 155 n.
physical chemistry 112
Pickering, A. 30
Pimentel, G. C. 156 n.
Pitzer, K. S. 156 n.
Plato 47, 101, 116, 129, 187, 259, 267
Poincaré, H. 45, 46 n., 55, 56, 180, 232
Polhman, R. 100 n., 120 n.
Poncelet, J. 45
pragmatic philosophy of science 4, 5, 19, 21, 23–4, 31, 42, 59, 150, 240, 262, 265–6, 268
predicate logic 25, 66, 91, 93, 126, 128, 130, 159, 257–9, 261, 268–70, 274, 278–80, 282, 283
Principle of Continuity 204, 205–7, 209, 210, 213
Principle of Contradiction 204
Principle of Perfection 204, 213
Principle of Sufficient Reason 204
productive ambiguity 3, 25, 32, 44, 70, 213, 221
projectile motion 3, 5, 14–16, 19, 24
proportion 3–8, 13–15, 26, 41, 55, 78, 159, 160, 163, 168, 169, 171, 174, 178, 192, 195–8, 200–3, 206, 216, 218, 222, 223
Ptolemy 7
Pythagorean metric 246
Pythagorean Theorem 25, 36, 37, 55, 160–3, 170, 171
Quine, W. V. O. 227, 229
Raina, R. 120 n.
Rasiowa, H. 269 n., 270 n., 273 n.
ratio 6–8, 11–13, 55, 94, 159, 163, 164, 166, 171, 173, 192, 197, 200, 201, 203, 206, 212, 216, 219, 223
rationality 4, 5, 13, 25, 32, 41, 51, 59, 64–7, 126, 227
realism 48, 64, 66, 77 n., 81, 187
reduction
Reed, D. 183 n.
Reeves, P. 155 n.
Remes, U. 41
representation 4, 5, 10, 13, 14, 19–25, 27, 31, 32, 39–45, 47–51, 54–6, 58, 64–70, 73, 76, 78–80, 82, 87, 88, 96–9, 101, 102, 116, 117, 123, 127–45, 147, 152, 154, 159, 163, 164, 173–5, 181–3, 186, 191, 193, 204, 213–15, 224, 228, 232, 235, 237, 240–3, 245–7, 250, 254, 256, 258–9, 262–6, 268–73, 279, 283, 284
representation theory 127–9, 131–4, 139
Rhoades, M. 103, 104, 111
Ribet, K. 67, 134 n.
Rieke, C. 153, 154
Riemannian structure 244
Riemann integral 253
Robadey, A. 45, 46 n.
Robinson, A. 261, 280
Rohrlich, F. 92–5, 101
Roothaan, C. 151 n.
Ross, I. 151
Russell, B. 17–18, 24, 48–9, 53, 189, 204, 207, 257–9, 268–9
Rutherford, D. 204 n.
Sanger, F. 117, 118
Sasaki, C. 168 n.
Scerri, E. 76 n.
Schaffner, K. 21–4, 92, 93, 102
Schlappi, M. 120 n.
Schooten, F. 193, 215
Schrödinger equation 131, 141–7, 143 n.
Scott, D. 270, 273
Semantic philosophy of science 4, 5, 19–23, 31, 66, 95, 191, 240, 262, 264–6, 268, 269, 271
set theory 227, 234–42, 256, 259–61
Sgro, J. 282 n.
Shaik, S. 156 n.
Shapere, D. 96
Shapiro, J. 99, 100, 121
Shaw, F. 111 n.
Shaw, R. G. 111 n., 125 n.
Shuchun, G. 44
Shure, M. 100 n., 120 n.
Sikorski, R. 269, 270 n.
simplicial complex 243, 248–54
Sinaceur, H. 49, 190 n.
Singer, I. M. 39 n., 235–7, 240–56
Sklar, A. L. 151
Skolem, T. 269
Slater, J. 140, 150, 153
Sluse, R. 215
Smigelskis, D. 183 n.
smooth manifold 243–56
Sobczynska, D. 77 n.
Solovay, R. 270, 273
Speusippus 230
squaring of the circle 36, 168, 216
Stevin, S. 166
Stokes’ Theorem 249, 253
Stokes, G. 254
Stone, W. H. 269, 270–3, 279
Stone’s Representation Theorem 270, 272–3
Sylla, E. 7 n., 201
symbol 3, 4, 14, 16, 18–9, 22–3, 25, 27–9, 39, 48, 49, 51, 54–6, 69, 76, 78–82, 87–90, 97, 98, 101, 107, 116, 120, 124, 125–8, 130, 132, 133, 137, 139, 140, 145, 163–5, 169–71, 181, 184, 186, 192, 202, 208, 216, 219, 221, 241, 259, 262, 265, 279, 283–4
symbolic logic 18
syntactic philosophy of science 4, 5, 19–21, 23, 31, 68, 79, 95, 191, 265
System of the World 198, 202
Szczeciniarz, J. J. 174 n., 243 n.
Tables of Elements 116
Tarski, A. 269–71, 279
Theon 7
Thorpe, J. A. 39 n., 235–7, 240–56
topology 39, 44, 46, 49, 141, 227, 229, 235–7, 239, 241, 243–6, 248, 251–3, 256–7, 259, 261, 268–73, 275, 276, 278–83
Torrence, V. 73 n.
Torricelli, E. 3
Toulmin, S. 187
tractrix 186, 215, 217, 219–21, 223
Transcendental Aesthetic 97, 232–5
transcendental curve 59, 180, 186, 204, 213, 215–24
transposition of genes 97–100, 102, 104, 105, 107, 111–13, 120–22
Turing machine 278
Turk, A. 141 n.
Ultraproducts Theorem 280–2
Van der Waals, J. D. 72
Van Fraassen, B. 17, 21–4, 57, 102, 264–5
Veblen, O. 46
Vickers, S. 283
Vienna School 4
Viète, F. 166, 167
Von Neumann ordinals 260
Vuillemin, J. 10 n., 33 n., 48 n., 227–35, 240, 242
Wallace, D. 232
Wallis, J. 216
Watson, J. D. 105, 111–13, 117, 121
Weininger, S. 78 n.
Wessler, S. 100 n., 120 n.
Weyl, H. 207
White, H. E. 141 n.
Whitehead, A. 17, 18, 257
Wiles, A. 67, 134 n.
Wilson, C. 204 n.
Wittgenstein, L. 50
Wolff, C. 205
Wölfflin, H. 97
Wood, J. L. 72 n.
Wood, K. 116 n.
Wurtz, A. 27
Xenopus laevis 113–19
Yakira, E. 35 n., 204 n., 256 n.
Zeidler, P. 77 n.