One More Reference on Self-Reference
K. Dosen
There is a plethora of papers, not to mention whole books, about self-reference in the logical and paramathematical literature of our century. Some of them purport to establish important mathematical facts, though of a kind that leaves most of the so-called working mathematicians rather impassive. If a typical working mathematician shows an interest in this phenomenon, it is more on the recreational side, in a Gardner mood. This is why in The Mathematical Intelligencer the place occupied by self-reference is disproportionate to the importance an ordinary reader of this journal would attach to it in a working mood. This is also why in a paper on self-reference one is likely to find in The Mathematical Intelligencer one can hardly expect to learn much. One just gets the thrill of something amusing, but also whimsical, confused, and a little bit unnerving: intriguing but not worth pursuing seriously, something basically hollow. One abandons such papers without enlightenment, without any sense of achievement. Like every amusement, in excessive quantities they may lead to boredom.

In this respect, [1] is no exception. However, its author seems to pretend that it outdoes papers published up to now in journals like The Mathematical Intelligencer, because in it self-reference is not limited to isolated paradoxical sentences, but the paper is pervaded by it. By moving from local to global self-reference, [1] may have achieved the dubious merit of being globally hollow. Perhaps its small size saves it from being globally boring.

An unquestionable advantage of [1] is that reading it does not require the physical effort most mathematicians inflict upon their readers with such insouciance. The paper has only one reference, given to the reader together with the paper. So there is no need to rise from your chair and search through your badly-ordered shelves, or dig into your worse-ordered xerox copies, or write memos to be taken on your next walk to the library. And, even if this searching, digging, or walking proves successful, one is likely to find that the reference provided by the author is hardly useful if one does not consult references provided by the authors referred to (typically, some of these will lead into a mist of preprints, dissertations, unpublished papers, unwritten papers, papers in exotic journals, papers in Slavic idioms). Can anyone show, without dirty tricks like the Axiom of Choice, that the set of references can be well-ordered? Anyway, with [1] we are at least spared the physical effort, though the well-ordering business is in as bad shape as ever.
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York
This slight advantage of [1] can hardly compensate for the rest of what we find, or, rather, fail to find, in it. We shall now concentrate on the result the author attempts to prove in [1] and show that it leads to absurdity. In Theorem 1 of [1] it is asserted that a certain theorem is false. We shall now refute this assertion:

THEOREM 1. Theorem 1 of [1] is false.
Proof: Let p be an abbreviation for Theorem 1 of [1]. Then the equivalence numbered (1) in the proof of p amounts to:

    p if and only if not p.    (1)
To verify this equivalence the reader must consult reference [1] of [1]. If we suppose p, from (1) we obtain not p. Because p also implies itself, we get the absurdity p and not p. Hence, we may conclude not p. Q.E.D.

This reasoning is self-evidently correct. Only a substructural logician who rejects Gentzen's structural rule of contraction (a creature by whom the pages of The Mathematical Intelligencer have not yet been visited, but likely to charge soon in his linear or similarly garish costume) would deny us the right to consider a premise used twice as a single premise, which is what we do when, with the help of (1), we infer that p implies an absurdity. This expedient for evading paradox has been known since Curry's work on the combinator W. But should we go so far as to accuse the author of [1] of substructurality?

Note that we use in our proof only the left-to-right direction of (1). One would suppose that the author of [1] mentions the other direction too in order to infer p from the conclusion not p he has reached, but for some reason he refrains from doing that.

Because Theorem 1 of [1] is false, its proof must have gone wrong somewhere, and the belief in the self-evident correctness of his reasoning which the author of [1] expresses just after the proof of the theorem is unwarranted. The desperate expedient of switching to an extravagant logic, of which we suspected him (we hope unjustly), is just a symptom of his malaise.
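For readers who want the step from (1) to "not p" spelled out, here is a minimal sketch in Lean 4. It is an illustration only, not part of the proof in [1]; the name `not_p` is hypothetical, and `p` is treated as an arbitrary proposition standing in for Theorem 1 of [1].

```lean
-- Assumption: p is any proposition; h is the equivalence numbered (1).
-- Only the left-to-right direction (h.mp) is used.
theorem not_p (p : Prop) (h : p ↔ ¬p) : ¬p :=
  fun hp => (h.mp hp) hp
```

The hypothesis `hp` appears twice in the body, once to obtain `¬p` and once to contradict it. That double use of a single premise is what the structural rule of contraction licenses.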
The fact that The Mathematical Intelligencer has accepted a paper containing a demonstrable falsehood is bad enough, but in this respect The Mathematical Intelligencer is not worse than most mathematical journals. What a serious mathematical journal should refuse to print is the rhetorical part of [1], where the author makes euphuistic remarks about the literature on self-reference and expresses misgivings about its recreational aspects. The referee of a serious journal would not fail to note that the set of mathematical references, which the author finds so unwieldy when he complains about the labours of reference-hunting, is after all finite and can certainly be well-ordered (all the more so, if one applies judiciously not Choice but Personal Choice, viz., the principle that one may ignore as many references as one wishes). This referee should also be able to suggest a good number of references to supplement, or perhaps supplant, the author's cherished single reference. Although what he tries to exhibit is very well known from other sources, in a remarkably conceited way, the author's only reference is a self-reference.

His paper should probably have been rejected because it is nothing but a hollow extravaganza: the injury of falsehood coupled with the insult of pomposity and egotism. The fact that in the penultimate paragraph of [1] the author himself is led to judge these defects severely can hardly excuse him. For if his judgment is true, the defects are real, and if his judgment is false, this is itself a defect. He would do better to follow a famous authority on self-reference whose words from the First Epistle to the Corinthians (4:3) he quotes at the end of his last sentence: "I judge not my own self."
Reference

1. K. Dosen, One more reference on self-reference, The Mathematical Intelligencer 14, No. 4 (1992), 4-5.
Mathematical Institute P.O. Box 367 11001 Belgrade, Yugoslavia
The Opinion column offers mathematicians the opportunity to write about any issue of interest to the international mathematical community. Disagreement and controversy are welcome. An Opinion should be submitted to the Editor-in-Chief, Chandler Davis.
Mathematics and History
W. S. Anglin

I begin with a caricature (indeed, a travesty) of the typical history of mathematics textbook. This caricature will draw attention to some of the philosophical presuppositions which often underlie these books. I shall then discuss alternatives to these presuppositions.

Mathematics epitomises Reason. It began in Egypt and Mesopotamia. However, it really began in Greece, because that is where pure mathematics began, and pure mathematics is better than applied mathematics, because pure Reason is better than impure Reason.

The greatest Greek mathematicians were Eudoxus, Apollonius, Archimedes, and Hypatia. Hypatia did very little compared to Archimedes, but Hypatia was the only woman who did mathematics, so we can be sure that her true greatness was hidden by male chauvinism. In spite of their preference for geometry, and their rejection of motion in mathematics, the Greeks were enormously wonderful.

Unfortunately, superstition and ignorance made a comeback when Cyril, the Christian bishop, had Hypatia murdered (in 415 AD), and, for one thousand years, no one in Western Europe did any mathematics.

Meanwhile, the Arabs were developing algebra. Although he could not prove the Theorem of Pythagoras for non-isosceles right triangles, Al-Khwarizmi was a magnificent algebraist. He once found two solutions to a quadratic equation, and he used three different values for pi.

In the sixteenth century, Europe rebelled against the Church, and Reason (and happiness) returned. Taking up where Hypatia had left off, Newton and Leibniz invented calculus, and introduced motion into mathematics. The odd thing was that Newton and Leibniz worked independently. Therefore both of them deserve the praise and glory for creating calculus.
In the nineteenth century, Reason really came into her own. Before that, calculus was not rigorous (partly because it dealt with motion). Today, however, calculus is the most rigorous, most reasonable thing possible.

Unfortunately, we cannot tell you much about contemporary mathematics because it would take all our time and energy just to find out ourselves what is happening. We do know, however, that it is wonderful.

Then there was the extraordinary mathematician X. He was born in Ruritania, and all properly patriotic Ruritanians are justly proud of him. One of X's parents died when X was very young, and X showed marvelous mathematical abilities at age three. X's remaining parent wanted X to be a plumber, but X persevered in his mathematical researches until he was broke and unemployed. Nobody offered X a university position, because no one was able to understand his proof of the central theorem that there is no four by four magic square whose entries are the squares of the first sixteen positive integers.

Thus ends the caricature. It is meant to raise some questions about the nature of the history of mathematics, and, in what follows, we deal with ten of these questions, and suggest some new ways of writing a history of mathematics.
Should the historian write as though mathematics were always a good thing?

Often a person writes a history of mathematics because he or she loves mathematics. William Dunham, for one, does not hide his enthusiasm. His Journey through Genius begins by telling us that Bertrand Russell's desire to know more mathematics was strong enough to prevent him from committing suicide [4, p. v]. Dunham then compares the great theorems to the plays of Shakespeare and the paintings of Van Gogh. Here is beauty well worth living for!

A history of mathematics written by someone who hated the subject would make dreary reading. Beside the various advantages of an enthusiastic author, however, we must place one of the drawbacks. Just as a deeply pro-Catholic writer is going to have trouble giving an unbiased account of Luther, so a deeply pro-mathematics historian is going to have trouble giving an unbiased account of, say, the shortage, in a certain century, of university positions for mathematicians. The pro-Catholic author will tend to make martyrs out of Catholics who suffered at the hands of Protestants, and the pro-mathematics author will tend to glamorise or pity mathematicians who could not find jobs.

According to Aristotle, humans are characterized by rationality. With this as an assumption, it is not hard to construct an argument for the conclusion that any advance in Reason is a good thing for humanity. From this conclusion, moreover, it is not far to Neo-Platonic religions in which Reason is a god, and mathematical activity is liturgy.

This is not the place to settle the relative importance and goodness of mathematics, but it is the place to note an example where the exaltation of mathematics often introduces bias into history of mathematics textbooks. This is the case of Hypatia, a fifth-century woman mathematician who was killed by a street mob in Alexandria. Hypatia was a pagan, and some of the people in the mob were "Christians." An anti-mathematics historian might describe Hypatia's death as the removal of a proud reactionary who stood in the way of the new Christian society. The pro-mathematics historian, however, invariably beatifies Hypatia, lamenting her death as a sign of the decline of Reason in the West.

Whether or not one would dispute the pro-mathematics interpretation of her death, I think one would want the student of the history of mathematics to realise that a value judgement lurks behind it. Hypatia was a martyr for science, but she was also a benighted polytheist and an ivory-tower elitist. Believers in Reason feel certain that Hypatia's (rather unoriginal) work in Diophantine equations was more important than the Alexandrians' need to reject paganism, but this feeling is not an objective fact about the fifth century.

Those who place a high value on mathematics are tempted to judge a culture in terms of the number of theorems it proved. This leads to an unrealistic view of cultures, such as the ancient Chinese culture, which did not produce many theorems, but which did produce some excellent poetry and philosophy. A properly balanced history of mathematics textbook would recognise the importance of many forms of intellectual activity, including the pro-monotheistic theological investigations of fifth-century Alexandria.
Should a history of mathematics revolve around individuals and their private lives?

In Boyer and Merzbach's A History of Mathematics, ten out of twenty-eight chapters are named after individual mathematicians. In An Introduction to the History of Mathematics (5th ed.), Howard Eves thinks it a good idea to include pictures of Pythagoras and Archimedes, even though we do not know what they really looked like. In the Preface of Journey through Genius, Dunham asserts that an understanding of the private lives of mathematicians "can only enhance an appreciation of their work." In the Preface of The History of Mathematics, Burton [3] writes,

    Considerable prominence has been assigned to the lives of the people responsible for progress in the mathematical enterprise. In emphasizing the biographical element, I can say only that there is no sphere in which individuals count for more than [in] the intellectual life.

Most histories of mathematics are individualist (as opposed to, say, Marxist). The historian arranges the material in terms of individuals, and goes to much trouble to ensure that the right individual gets the right amount of praise for the right theorem. Individualist historians tend to be thrown off balance when a theorem does not have a unique, nameable first discoverer.

One should note, however, that there are alternatives. There is no reason why one could not write a history of mathematics entirely from a communitarian point of view. Instead of singling out the individual person responsible for the theorem, one could single out the technological capacities or social needs which were responsible for it. Instead of glorifying the lucky person who happened to be the first to get the proof, one could glorify the ethical ideals of the community which led it to educate people in such a way that this proof was inevitably found.

Individualist historians of mathematics will tell you an anecdote about Euclid (see p. 10, bottom of first column), even though they know that there is practically no factual basis for these anecdotes. They will not tell you whether the Elements is an expression of the upper-class elitism we can detect in Plato's dialogues, and they will not address the issue of whether the purity and precision of Euclid's proofs proclaim a contempt for ordinary manual work (this being performed by the slaves).

My object here is not to defend a particular theory about the social context of the Elements, or the nature of history. I merely wish to submit that, in many cases, it may be more illuminating to relate a piece of mathematics to its social environment than to a fictitious anecdote about the private life of the author of that piece of mathematics.
Should a history of mathematics be organised in terms of nations or races?

D. E. Smith divides the first volume of his History of Mathematics into 10 chronologically determined chapters. These chapters are subdivided into 67 sections, and 32 of these sections are named after nations (e.g., Egypt, England). Smith thus follows the customary practice of organising the history of mathematics by nationality. In his Preface Smith asserts [10],

    While it is evident that no race or country has any monopoly of genius, and while the limits of successive countries are only artificial boundaries with no significance in the creation of the masterpieces of science, nevertheless linguistic and racial influences tend to develop tastes in mathematics as they do in art and in letters.

He continues,

    In this treatment of the subject an attempt has been made to seek out the causes of the advance or the retardation of mathematics in different countries and with different races.

One example of Smith's nationalistic analysis occurs at the beginning of the "Germany" section in the chapter on the sixteenth century:

    The mathematics of Germany was Gothic, unpolished, but virile; the mathematics of France was Renaissance, polished, but generally weak.

I leave it to the reader to decide what, if anything, that means.

There are two reasons for asking historians to stop organising histories of mathematics along nationalistic or racial lines. The first is ethical. Patriotism and racism lead to war. It is not good for readers to have their patriotic blood heated up over the fact that, say, the Germans beat the "weaker" French to the discovery of non-Euclidean geometry.
The second reason for asking historians to stop describing mathematical activity in terms of nations is that mathematics is a universal enterprise, with intellectuals from every nation pursuing a common goal, a goal which transcends political and genetic boundaries. Admittedly, some mathematical research (e.g., in cryptanalysis) is tied to the military ambitions of particular nations, and, in such cases, it may be historically important if one country, rather than another, is the first discoverer of some theorem; however, most mathematics is not like this. On the contrary, typical mathematical research furnishes one of the best examples of international cooperation.

Smith's treatment of Fibonacci (1170-1250) is interesting in this connection. Smith's nationalistic analysis leads him to remark that the work of Fibonacci (an Italian) was beyond the competence of any professor in Paris. What needs to be told is quite different, however. The scholars of Medieval Europe were not tied to their particular countries. They spoke a common language (Latin), and they travelled all over the continent. At that time, the sharp social boundaries were not national or racial, but economic or religious. The point which Smith's nationalism obscures is that Fibonacci was the best mathematician, not just in Italy or France, but in the whole of the European scholarly community.
How should the historian tackle the scarcity of women mathematicians?

Men and women are equal intellectually. Apparent differences between male and female mathematical ability are due to social factors such as cultural systems in which men take all the educational opportunities for themselves. However, for whatever reason, the fact remains that, prior to 1900, there were fewer than a dozen famous women mathematicians. What should the historian do about this fact? There are at least three things to be done, and one thing to be avoided.

The thing to be avoided is a false exaggeration of the role of women in mathematics. No doubt for the best of reasons, David Burton exaggerates the importance of certain women mathematicians. In The History of Mathematics, he tells us that Theano, the wife of Pythagoras, was

    ... a remarkably able mathematician, who not only inspired him during the latter years of his life, but continued to promulgate his system of thought after his death [3, p. 98].

There is no evidence at all for these assertions. We know that Theano was herself a Pythagorean, but we have no testimony about her mathematical abilities, or about her influence on Pythagoras. Indeed, we do not know whether Pythagoras himself was a remarkably able mathematician. (As far as the Theorem of Pythagoras is concerned, we do not know whether Pythagoras was the first to prove it, and we do know that he was not the first to discover it.)

Burton also exaggerates the importance of Hypatia. He assures us that, "with the death of Hypatia, the long and glorious history of Greek mathematics was at an end." This is going too far. The last first-rate Greek mathematician was Pappus, who lived a century before Hypatia. The last second-rate Greek mathematician was Proclus, who died seventy years after Hypatia. Depending on one's standards of glory, the "long and glorious history of Greek mathematics" ended with either Pappus or Proclus, but not with Hypatia.

Attempts to play up the role of women in mathematics distort history and patronise women. However, if the historian is not to falsify the importance of the few women mathematicians, is there some other way of coping with the ugly results of male chauvinism? There are at least three ways.

First, the historian can write in such a way that the sex of the individual is not emphasised. Instead of "John Kepler," one can have "J. Kepler." Instead of "he," one can have "the mathematician."

Second, the historian can challenge the veneration that histories of mathematics usually accord to their subject. Perhaps the reason there are, even today, so few women in pure mathematics is that women realise that careers in pure mathematics are stressful, low-paying, and socially useless. Only a man would be stupid enough to want such a career. In describing Gauss's choice of mathematics over philology, the historian might write, "this unhappy choice robbed the world of some important work in linguistics, and can only be attributed to the fact that Gauss was not a woman."

Finally, the historian can abjure the individualism so prevalent in histories of mathematics. Why does the praise go only to the person who actually solves the mathematical problem? Could he have solved it had he never had a mother? Could he have solved it without the support of his wife or lover? The cultural output of a society depends on the effort and creativity of the whole population, and it is inaccurate to attribute that output to the few individuals who merely add the finishing touches.
Should the history of mathematics be told in terms of chronological periods?

W. W. Rouse Ball organises A Short Account of the History of Mathematics (Dover Publications, New York) in terms of three chronological segments: the Greek period (600 BC-641 AD), the medieval and Renaissance period (641-1637), and the modern period. (The Arabs captured Alexandria in 641, and Descartes published his work on analytic geometry in 1637.) A similar chronological division is found in many histories of mathematics, although some accord a period to the Egyptians and Mesopotamians, and some split the Renaissance period off from the Middle Ages.

Indeed, some authors make a very sharp division between medieval mathematics and mathematics of other periods. For some authors, nothing happened in mathematics between 641 and, say, 1545. Hollingdale [8, p. 92] entitles his short chapter on medieval mathematics "The Long Interlude."

Pro-medievalists, on the other hand, argue that the Renaissance and modern periods would have been impossible without the advances of the Middle Ages. In A History of Mathematics, Boyer and Merzbach [2] give numerous examples of key medieval accomplishments. Some of these are the following:

• The Chinese developed "Horner's method," making it possible to compute cube roots of arbitrary numbers, and thus giving Cardano a numerical basis for his otherwise unsupported assumption that all numbers have cube roots.
• Indian mathematicians developed an efficient long division algorithm, and introduced negative numbers.
• Fibonacci developed sophisticated techniques for handling Diophantine equations. He was able to find integer solutions to the system [6]

    x + y + z + x^2 = w^2,
    w^2 + y^2 = u^2,
    u^2 + z^2 = v^2.

• Whereas ancient Greek mathematicians could not cope with the infinite, the medievals quite regularly investigated it, and laid the foundations for calculus. Oresme, for example, summed the series 1/4 + 2/4^2 + 3/4^3 + ... and gave the first proof of the divergence of the harmonic series. As another example, Albert of Saxony anticipated Galileo's discovery that the members of an infinite set can be put in one-to-one correspondence with the members of a proper subset of it [5, pp. 355-6].

A thousand years from now (if the world has not ended), historians may give the year 1950 as the end of the "European calculus period" and the beginning of the "international computer period." Hopefully, students will not think that, in 1950, the computer suddenly took over.

It is natural to divide history into chronological periods, but one must not oversimplify. One must be careful not to lead one's reader into concluding that the Greeks were the first real mathematicians, the medievals were utterly ignorant, and we moderns are just perfect.
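Two of the medieval results listed above are easy to check by machine. The following is a minimal sketch, not taken from the article: the function names are hypothetical, the Fibonacci system is read as x + y + z + x^2 = w^2, w^2 + y^2 = u^2, u^2 + z^2 = v^2, and the triple (35, 144, 360) is a test value supplied here rather than quoted from the text. The closed form 4/9 for Oresme's series follows from the standard identity that the sum of n·r^n is r/(1 - r)^2 for |r| < 1.

```python
from fractions import Fraction
from math import isqrt


def oresme_partial_sum(terms):
    """Exact partial sum of 1/4 + 2/4^2 + 3/4^3 + ... with the given number of terms."""
    return sum(Fraction(n, 4 ** n) for n in range(1, terms + 1))


def is_square(n):
    """True if the non-negative integer n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n


def satisfies_system(x, y, z):
    """Check that x+y+z+x^2, then that plus y^2, then that plus z^2, are all squares."""
    first = x + y + z + x * x   # should equal w^2
    second = first + y * y      # should equal u^2
    third = second + z * z      # should equal v^2
    return all(is_square(value) for value in (first, second, third))


if __name__ == "__main__":
    # The partial sums approach 4/9; the gap after 40 terms is tiny.
    print(float(Fraction(4, 9) - oresme_partial_sum(40)))
    # Assumed test triple, not quoted from the article.
    print(satisfies_system(35, 144, 360))
```

Exact rational arithmetic is used for the series so that the comparison with 4/9 is not blurred by floating-point rounding.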
What is the relation between pure mathematics and calculating devices?

By "calculating devices," I mean algorithms for calculating, as well as machines used to implement them. In this sense, calculating devices include the abacus, Horner's method, decimal arithmetic procedures, slide rules, computer networks, and pocket calculators.

Even in pure mathematics, progress is often tied to calculation. Often the reason a problem cannot be solved is that the computational equipment is too primitive. Historians of mathematics sometimes neglect this practical feature of mathematical development, and it may be worthwhile to list some ways in which calculating devices have proved essential.

• Lacking any equivalent of infinite decimal expansions, the ancient Greeks had to use geometry to give a rigorous treatment of irrationals. They understood numbers as line segments or distances. Since there are no negative line segments or distances, this approach prevented them from discovering negative numbers.
• In creating calculus, it was necessary to compare large numbers of functions, areas, and slopes. This required the calculation of many tables of values, a task that would not have been feasible without the decimal system, and the use of logarithms.
• Much contemporary Number Theory is done by means of frequent computer interaction. It is interesting that, before the high-speed computer, we knew of only 12 perfect numbers; now we know of 31.
• It is only thanks to the computer that the Four-Colour Theorem was finally proved.

There are other examples, and it is odd if a historian of mathematics neglects this aspect of the subject. Burton's book (The History of Mathematics), for example, is incomplete because, apart from some brief remarks on Pascal's calculating machine (built in 1642), Burton says nothing about computers. The word "computer" does not appear in the index of his book.
Should mathematics be portrayed as transcendent?

Some historians exalt the purity, rigour, or perfection of mathematics, and they present the pure mathematician as an otherworldly mystic. For example, in Mathematics and Its History, John Stillwell repeats the anecdote about Euclid's reply to a student who had asked about the job market for Mathematics graduates: Euclid called his slave and said "Give him a coin if he must profit from what he learns" ([11, p. 25]). The idea is that the pure mathematician is above money.

Historians of mathematics sometimes take this anecdote to heart. If the famous mathematician is poor and unemployed, they glamorise his spirituality, but if he is shrewd and rich, they are embarrassed to mention it. We hear about poor, young Abel, but not about rich, old Gauss [5, p. 111]. If the mathematician devotes himself to some ultra-otiose topic, the historians long to reveal the details, but if the mathematician is a sensible sort, working in actuarial science, then the historians ignore him. The only financial mathematics ever mentioned consists of a few quaint medieval inheritance problems.

The fondness of some historians for the transcendent shows up in their treatment of kinematics. According to Plato, the transcendent world is changeless, and mathematics related to motion is not real mathematics. Historians who agree with Plato present geometry rather than astronomy as the heart of Greek mathematics (although the Greeks themselves may not have seen it this way), and they explain that Weierstrass et al. did calculus a great service by purging it of the concept of motion.
Should the historian idolise rigour?

The two extreme positions on rigour can be described as follows. The first is that rigour is the essence of mathematics. If it is not rigorous, it is not real mathematics, and it does not belong in a history of mathematics, unless, perhaps, it can be presented as a confused first approximation to some real mathematics. This position is upheld by Hollingdale in his Makers of Mathematics, when he quotes G. H. Hardy:

    The Greeks were the first mathematicians who are still "real" to us today. Oriental [i.e., ancient Egyptian and Mesopotamian] mathematics may be an interesting curiosity, but Greek mathematics is the real thing [8, p. 12].

For Hollingdale, as for Ball, it is the Greek period which is really the first period of mathematics.

The other extreme position is that rigour is the fossilization of true mathematics. Mathematics progresses, not via deduction, but via experimental science and artistic insight. Mathematics is not a careful march down a well-cleared highway, but a journey into a strange wilderness, where the explorers often get lost. Rigour should be a signal to the historian that the maps have been made, and the real explorers have gone elsewhere.
Historians who hold one of these extreme positions might help their readers by warning them of it. It can be confusing to read that "mathematics began with the Greeks" if one is unaware of the value judgement concerning rigour which lurks behind this proclamation.

Is the history of mathematics an epic or a comedy?

Historians of mathematics often present the progress of their subject as a march of knowledge against evil ignorance. Every discovery is an important component in the overall victory of Reason. Historians of this ilk seldom tell jokes about mathematical progress.

Yet the history of mathematics is quite humorous. Mathematicians are reputed to be intelligent and rational, but their work is not infrequently mixed with error, confusion, or superstition. It would be easy to write a history of mathematics that was both accurate and funny. Following are some examples.

In 1799, Gauss published a proof of the Fundamental Theorem of Algebra. He prefaced this proof with an aculeate criticism of the previous proofs of the same theorem, showing that they were all wrong. Some earlier mathematicians, for example, had taken to calling the Fundamental Theorem "D'Alembert's Theorem," because they mistakenly thought D'Alembert had proved it in 1746. Evidently, these earlier mathematicians were better at praising D'Alembert's work than understanding it. (Ha! Ha!) The funny part, however, is that Gauss's own proof was flawed [11, pp. 195-200].

The early workers in calculus reasoned by analogy. They took principles which apply in finite cases, and applied them to examples involving the infinite. George Berkeley (1685-1753) laughed at them:

    He who can digest a second or third fluxion, a second or third difference, need not, methinks, be squeamish about any point in Divinity [2, p. 465].

In the nineteenth century, calculus was put on a rigorous footing, and mathematicians congratulated themselves that they were, at last, above Berkeley's jibes.
The funny part is that, in 1966, Robinson showed that the calculations with infinitesimals which Berkeley had ridiculed were basically sound. Euclid intended to give a rigorous treatment of geometry but he goofed in his very first demonstration. He forgot to add an axiom to ensure that the circumferences of two overlapping circles do not merely pass through each other without touching, but actually meet in a point. In 1899, Hilbert presented a rigorous reworking of Euclid, but he made the very same mistake. It was only in the French translation of Hilbert's book that he belatedly added the necessary axiom ([7], p. 25). In 1950, R. Kershner coauthored a book which noted that many published theorems are later defeated by
counter-examples. Kershner did not know that he himself was going to provide an example of this. In 1968, Kershner claimed he had established a complete list of all possible ways of tiling a plane with congruent, convex pentagons. He published his result in a prestigious journal. Sure enough, in 1975, someone produced a drawing of a tiling Kershner had completely missed. The funny part, however, is that the expert on the subject turned out to be a San Diego housewife, Marjorie Rice, who had never been to a university, but who found several new tilings that had eluded Kershner ([9], pp. 140-166).
There are many other examples: Cauchy's misadventures with uniform convergence, Kronecker's pigheaded denial of infinite set theory, Lambert's failure to realise he had discovered non-Euclidean geometry, and so on. Usually these examples are presented as minor oversights, which are later corrected by the inevitable increase in rigour. However, presented as the ludicrous lapses they really were, these oversights would make entertaining instruction, and give a key insight into the nature of mathematical activity.

How should a history of mathematics relate to religion?
Is there conflict between Reason and religion? The "positivist" philosopher Auguste Comte (1798-1857) thought so, and certain historians of mathematics agree with him. For example, Burton writes, ". . . a new movement developed in Alexandria, and also in many other parts of the empire, which was to accelerate the demise of Greek learning. This was the development of Christianity" [3, pp. 234-236].
In a similar vein, E. T. Bell is unhappy that Pascal spent time doing philosophy of religion. In Men of Mathematics, Bell writes, "we shall consider Pascal primarily as a highly gifted mathematician who let his masochistic proclivities for self-torturing and profitless speculations on the sectarian controversies of his day degrade him to what would now be called a religious neurotic" ([1], p. 73). Eves follows suit. In An Introduction to the History of Mathematics (5th edition), he describes Pascal as a "might-have-been" and a "religious neurotic." More recently, Hollingdale has joined the assault. In Makers of Mathematics, we read that Pascal's "outstanding intellectual powers were exercised mainly on sterile theological speculations occasioned by the sectarian religious controversies of his day" ([8], p. 151).

Believe it or not, this is the same Pascal who is the subject of a flattering chapter in volume IV of Frederick Copleston's much acclaimed History of Western Philosophy. It seems that Pascal's "profitless speculations" are ranked by present-day theologians and philosophers as being among the very greatest works in their subjects.

Fortunately, not every historian of mathematics holds the rather fanatical view that any person who is interested in religion is at best wasting his time and at worst insane. D. E. Smith, for example, compares Fibonacci to his contemporary St. Francis, praising both of them for "bringing new light into the souls of men" ([10], p. 217). To give another example, Boyer and Merzbach note that the speculations of Scholastics, such as St. Thomas Aquinas, helped lead to the Cantorian theory of the infinite [2, p. 294].

For the theist, Reason is not a god, mathematics is not a way to salvation, and mathematicians are no holier than anyone else. A theist would not write a history of mathematics which gave all the credit to human beings and none to God. Nonetheless, a theist can welcome mathematics as a divine gift, just as a mathematician can welcome natural theology as a gift of Reason.

Historians of mathematics might consider the possibility that the alleged conflict between Reason and religion is a myth. At the very least, they should acknowledge that if a historian injects a personal antireligious hostility into a history book, then that history book ceases to be objective and begins to be unpleasantly tendentious.
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4, 1992
Acknowledgment I thank the Social Sciences and Humanities Research Council of Canada for its financial support during 1989-91.
References
1. Bell, E. T., Men of Mathematics, New York: Simon & Schuster (1937).
2. Boyer, C. B., and U. C. Merzbach, A History of Mathematics, 2nd edition, New York: John Wiley (1989).
3. Burton, D. M., The History of Mathematics, Dubuque: Wm. C. Brown (1988).
4. Dunham, W., Journey through Genius, New York: John Wiley (1990).
5. Ebbinghaus, H.-D., et al., Numbers, trans. H. L. S. Orde, New York: Springer-Verlag (1990).
6. Fibonacci, The Book of Squares, trans. L. E. Sigler, Boston: Academic Press (1987), 107.
7. Hilbert, D., Foundations of Geometry, La Salle: Open Court (1950), 25.
8. Hollingdale, S., Makers of Mathematics, London: Penguin Books (1989).
9. Schattschneider, D., "In Praise of Amateurs," The Mathematical Gardner (D. A. Klarner, ed.), Boston: Prindle, Weber & Schmidt (1981).
10. Smith, D. E., History of Mathematics, vol. I, New York: Dover (1958).
11. Stillwell, J., Mathematics and Its History, New York: Springer-Verlag (1989).
Department of Mathematics and Statistics McGill University Montreal, H3A 2K6 Quebec, Canada
Integrability in Mathematics and Theoretical Physics: Solitons
S. P. Novikov
Mathematical problems cannot always be solved. Sometimes they turn out to be too hard, and their solutions stretch out over decades and even centuries; sometimes they prove to be insoluble "in principle." Alas, we mathematicians are not like those Soviet giants of the thirties who could raise hundreds of calves in one year from two cows, or mine a mountain of coal overnight. Even thousands of years are not long enough to achieve the trisection of the angle and the quadrature of the circle with compass and straightedge; to express the roots of fifth-degree algebraic equations in radicals; to give algorithmic solutions of the problems of group theory and Diophantine equations; to invent an axiom system for arithmetic leaving no statement undecidable within the system ...
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York

Generic Laws and Special Cases

Sir Isaac Newton in his famous "Second letter to Oldenburg" (1676) gave an anagram, i.e., a short encrypted sentence, stating what he regarded as the principal discoveries made in the work. His claim, put in modern terms, was to have found a universal method of solving algebraic and differential equations, by expressing the desired solution as a power series and evaluating its coefficients by substituting it into the equation. Actually, aside from this simple procedure and its analogs, today there is still no universal theoretical approach. As for a numerical solution on a computer, nobody solves equations this way. No significant quantitative or qualitative information about the behavior of solutions can be obtained in this way, except for an important special case: when the solution you are studying is close to a known solution found in advance by other methods, or when the equation you are studying is close in an appropriate sense to one that is exactly solvable. But to carry this method through, you will need the "exact solution" really exactly, at the level of formulas and algebraic or analytic identities, not merely numerical approximations. We will see examples of this in what follows.

In classical physics of the nineteenth and early twentieth centuries, the fundamental laws of nature were expressed by differential equations. In contemporary quantum theory, the picture is more complicated, but differential equations have not lost their significance. Many special problems are described by differential equations in an approximation depending organically on the particular concrete situation. One sometimes also encounters more complicated types of equation (difference equations, integral equations, equations with delay, etc.). What differential equations do we need, and which of them do we know how to solve? What can be said in general about the properties of their solutions? How have views of this changed in the last three centuries?

The first important and difficult problem beginning the era of differential equations was solved by Newton: the problem of the motion of two point masses with masses m_1 and m_2 under the action of mutual gravitational attraction. (Let us skip over the question of the role of the apple, and also the question whether Newton wrote down the equations all by himself or with help from another great English physicist, Robert Hooke.) Newton solved the 2-body problem exactly and derived Kepler's laws. To be sure, the 2-body problem is merely an approximate model. In real life, massive bodies are not point masses; the effect of other planets is not so small; and the stability of the full model is dubious. For more than 150 years many scientists doubted the validity of Newton's gravitational theory of the Solar System. Yet it is valid, even though we are still unable to establish strong stability mathematically for values of the parameters differing as much as our real Solar System does from the exactly solved Newton 2-body model of the Earth and Sun. This example already gives us an idea of the basic structure of the approach of mathematical physics: to represent the system under study in one sense or another as a small perturbation of an exactly solvable model.
Theoretical physicists proceed in this way even when the perturbation is not at all small. What else can they do? There are practically no alternatives. Direct numerical solution is often impossible, and anyway its results may be hard to use theoretically.

The simplest models where all the difficulties can be brought out are what are called autonomous dynamical systems in some "phase space," whose points correspond to states of the system and are given by coordinates (y) = (y_1, ..., y_n). The dynamical system is written

$$\frac{dy_j(t)}{dt} = f_j\bigl(y_1(t), \ldots, y_n(t)\bigr), \qquad j = 1, \ldots, n. \tag{1}$$
It often happens that the phase space has some nontrivial topology, for example in the description of rotations of a body; the coordinates (y) may be only "local" or may have some singularities (as do, for example, polar and spherical coordinates). In Newtonian mechanics, giving the coordinates means specifying positions and velocities of all particles. In many more special problems, giving the coordinates means instead specifying values of a minimal set of parameters needed to characterize the state of the physical system to the degree of approximation relevant to the problem, where this set of parameters may change over time ("dynamical variables").

Systems that can be considered closed in the given approximation have the property that no state loses or gains energy. They are described by so-called conservative (Lagrangian or Hamiltonian) equations. In dissipative systems, frictional forces are added to the equations of motion, so that energy decreases. In more general systems, the right-hand member of (1) may be more complicated, representing dissipation, outside energy sources, etc.

Such terms as "energy" (and the now forgotten "vis viva") were borrowed at the end of the seventeenth century from the occult sciences. They came to denote in theoretical physics precisely defined mathematical quantities, the so-called integrals of the motion. A general integral of the motion in our case is any function F(y) of the coordinate values (the state of the system) such that its value does not change over time as the system evolves according to (1): dF(y(t))/dt = 0, i.e., F(y(t)) is constant. In a typical dynamical system taken at random, there will not be any single-valued continuous integrals of the motion at all. In an autonomous Hamiltonian system, there will be in general only one, the energy. In a Hamiltonian system with a group of symmetries, there are more integrals: as many as there are generators in the group of symmetries.
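The definition of an integral of the motion can be tried out numerically. The following is only a sketch under stated assumptions: it takes planar motion in a 1/r potential with unit mass and unit coupling, integrates system (1) with a classical fourth-order Runge-Kutta step, and records how far a function F(y(t)) drifts from its initial value. The names `rhs`, `rk4_step`, `max_drift` and the initial state `Y0` are hypothetical choices made for this illustration, not taken from the article.

```python
import math

def rhs(y):
    """Right-hand side f(y) of dy/dt = f(y): planar motion in a 1/r potential
    (assumed units: unit mass, unit coupling constant)."""
    px, py, vx, vy = y
    r3 = (px * px + py * py) ** 1.5
    return (vx, vy, -px / r3, -py / r3)

def rk4_step(y, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = rhs(y)."""
    def shifted(base, k, h):
        return tuple(b + h * ki for b, ki in zip(base, k))
    k1 = rhs(y)
    k2 = rhs(shifted(y, k1, dt / 2))
    k3 = rhs(shifted(y, k2, dt / 2))
    k4 = rhs(shifted(y, k3, dt))
    return tuple(
        yi + dt / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )

def energy(y):
    """Kinetic plus potential energy: an integral of the motion."""
    px, py, vx, vy = y
    return 0.5 * (vx * vx + vy * vy) - 1.0 / math.hypot(px, py)

def angular_momentum(y):
    """Angular momentum about the origin: the integral tied to rotations."""
    px, py, vx, vy = y
    return px * vy - py * vx

def max_drift(F, y0, dt=1e-3, steps=10_000):
    """Largest |F(y(t)) - F(y(0))| seen along the numerically integrated orbit."""
    y, f0, worst = y0, F(y0), 0.0
    for _ in range(steps):
        y = rk4_step(y, dt)
        worst = max(worst, abs(F(y) - f0))
    return worst

# Hypothetical initial state (x, y, vx, vy) on a bound, elliptical orbit.
Y0 = (1.0, 0.0, 0.0, 0.8)

if __name__ == "__main__":
    print("energy drift:          ", max_drift(energy, Y0))
    print("angular momentum drift:", max_drift(angular_momentum, Y0))
    print("x-coordinate drift:    ", max_drift(lambda y: y[0], Y0))
```

Energy and angular momentum stay constant up to the small error of the integrator, while an arbitrary function of the state, such as the x-coordinate, changes by an amount of order one over the same orbit.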
For example, the principle that the laws of physics are invariant under all motions of space (translation of time, rotation and translation of space) gives rise to these integrals: energy, momentum (coming from translations), and angular momentum (coming from rotations). One uses these integrals to eliminate some of the coordinates. Thus, from the group of translations, one gets rid of the coordinates and the velocity of the center of mass; the case of rotations is more complicated. It results from the noncommutativity of the group of rotations that one cannot eliminate all three Euler angles and their rates of change relative to the center of mass; only four variables of relative motion can be eliminated, rather than six as was done for the group of translations. So there appears in the theory of Hamiltonian systems an algebraic peculiarity, this sensitivity to the algebraic structure of the group of symmetries: in this case, the group of motions of the 3-dimensional Euclidean space R^3 in which we live. (This was the view of the classical mechanics of the seventeenth through the nineteenth centuries. The point of view of the contemporary theory of elementary particles and gravitation is more complicated, and still further complications are in the offing, but here let us not get out of R^3.)

Anyway, we have no hope of finding more general integrals of motion, other than total energy, total momentum, and total angular momentum, for the problem of the motion of n point masses. For n = 2, the 2-body problem, these are enough to integrate the equations completely; for n > 2, the problem of three or more bodies, they are not. At the end of the nineteenth century, it was even rigorously proved (by Poincaré and others) that in the 3-body problem there are no analytic integrals of the motion except those mentioned already, arising from the group of symmetries of the laws of nature.

At this stage, a prevailing attitude developed toward the problem of integrability, which can be briefly summed up this way: If there is no symmetry, then there are no integrals, at least "as a rule," in sensible problems encountered in natural science. It is necessary to study arbitrary systems of the greatest generality: that is what can describe reality.

Let me give a little commentary on this attitude, in three points, of which the first two are mathematical ("What can we do?") and the third philosophical ("What do we believe?").

(1) Typical, generic systems certainly should be studied. Unquestionably, they approximately describe many particular cases. But how should they be studied? Already in the 1920s and 1930s, the language of set theory and fractional dimension (fractals) was worked out, and in the 1950s, the concepts of probability in dynamical systems; in the last 30 years, practically no new mathematical concepts and terminology have been added. But how could these things be perceived in mathematical models of actual physical systems? Aside from a few ingenious, specially concocted models in pure mathematics, this did not become possible until the use of modern computers. Ingenuity and analysis are ineffective here, except for systems close to integrable ones. The author is aware of very few examples where analytic methods allowed visualizing stochastic behavior of orbits of an autonomous dynamical system, let alone of general, arbitrarily chosen systems. One very restricted but curious class of such systems I encountered in the early 1970s in the investigation of spatially homogeneous solutions of Einstein's gravitational equations, what are called homogeneous cosmological models. A group of well-known physicists (E. Lifshits, I. Khalatnikov, and their student V. Belinskii) found analytically a very special stochastic régime. But a single exception only confirms the rule. Without computers you cannot visualize randomness in real systems (in "general position").

(2) Integrable systems and systems close to integrable ones can be studied in much greater detail using and at the same time adding to the arsenal of general algebraic and analytic methods together with computer technology. Computers are not used only for straightforward finding of orbits, but join with deep mathematics to bring out an incomparably more diverse array of beautiful, intellectually satisfying aspects of the problem.

The general qualitative theory of Hamiltonian systems close to integrable ones was founded by Poincaré. Between 1955 and 1965, Soviet mathematicians, A. N. Kolmogorov and his student V. I. Arnold, and the Western mathematician J. Moser created the remarkable KAM theory, revealing the answer to the question of the behavior of orbits. It must be noted, however, that the theorems of this theory require a degree of closeness to an integrable system which is not found, for example, in the real Solar System, in the problem of motion of a sputnik, etc. (what are called the small parameters, measuring this degree of closeness, have values in reality many orders of magnitude greater than required by the KAM theorems). Nevertheless, the qualitative picture of behavior of orbits is made clear by the KAM theory. Algebraic and functional structures based on it are of decisive significance in theoretical analytic constructions depending on closeness to an integrable system. These properties of systems are ignored by the general qualitative KAM theory, but they take primary roles in the ideas and methods I will be talking about later.

(3) Physicists and mathematical philosophers of science for the most part do not believe that the laws of nature are to be expressed by arbitrarily chosen, general equations. Most of them somehow believe "de facto" in a higher reason. For example, among spherically symmetric potentials the Newtonian potential is distinguished; it is the one which in the 2-body problem admits many periodic motions, those followed (approximately) by the planets revolving about the Sun. This is a specific property just of gravitational forces inversely proportional to the square of the distance. Any perturbation or alteration of the analytic form of the force destroys the periodicity of the motion, and the famous drift of perigee appears. In Newton's law there is hidden an extra symmetry not entailed by any obvious group of symmetries of space-time. But it is exactly this relationship that permits the Solar System to endure! Furthermore, this same hidden symmetry survives in quantum mechanics, and (because electrical and gravitational forces of attraction follow the same law) it determines to first approximation the quantum spectrum of a charged particle and Mendeleev's table.

But is this important now, at the end of the twentieth century? Are we not at the end of the period of development of the mathematical apparatus of natural sciences based on exactly solvable supersymmetric models, going back to Fermat (derivation of the refraction of light from the "divine" variational principle of minimal time) and Newton (solution of the 2-body problem from the laws of mechanics)?
In classical mechanics that period could be considered closed probably by the mid-nineteenth century; in the more modern branches of physics, by the mid-twentieth century. The theory of perturbations, i.e., the study of systems close to exactly solvable systems, remained the basic method in quantum mechanics as well. How about the yet-undiscovered laws governing the transformation of elementary particles, or the evolution of the cosmos in early or later stages at very large size scales? What should we expect: that they will be supersymmetric or arbitrarily chosen? Probably the former; but it would be hard to know in advance just which sort of supersymmetry it will be. The higher reason is not predictable just from our present knowledge and concepts; we can only make partial predictions. (Lucky for us!)

Solitons and Scattering

The repertory of admittedly important integrable systems was not added to for a long time. As already noted, rules have exceptions. In the nineteenth century some special systems of mechanics and geometry were found to have further integrals of the motion in the absence of any evident symmetry: Jacobi, Neumann, Weierstrass, Clebsch, Kovalevski, Chaplygin, Steklov, etc. Plainly, some okay people were involved in this! Especially popular in the 1880s was the integrable case of the Kovalevski top in a gravitational field with special values of the inertia tensor. The mathematical investigation of these cases was very beautiful. It relied on deep use of the then-recent theory of Riemann surfaces (its most nontrivial analytic aspects), which until then had not appeared in mathematical physics. With the exception of the simplest case of elliptic functions, it did not appear in physics after that either, for almost 100 years! Back in the nineteenth century it seemed that these and similar special cases had nothing to distinguish them physically, and consequently everyone outside of a small circle in classical mechanics forgot about them. They did not have at that time any serious influence on the development of physics.

Keep in mind that the difference between integrable and general Hamiltonian systems appears primarily in quantitative and qualitative characteristics of the orbits of the motion, as long as nothing is known about the system but the shapes of its orbits.

In the integrable case, the motion can be described (under the very general and easily verifiable hypothesis of "compactness" of the phase surface of constant energy integral) by so-called quasi-periodic functions. The Fourier spectrum reduces to a finite number of basic (perhaps incommensurable) frequencies and their integral linear combinations. The spectrum of a generic function is completely different. This is indeed the way to tell the difference between generic and integrable systems experimentally, on the computer, though here one may be misled by computational errors, especially with the computing capabilities which existed 20-40 years ago (and even today it is often very easy to go wrong).

In the early 1950s, the famous quantum physicist (and designer of nuclear reactors) Enrico Fermi, together with the mathematicians Pasta and Ulam, carried out a numerical computer experiment, following the motion of a discrete set of points on the line, a variant of the discretized 1-dimensional string. The aim was to understand how the stochastic equidistribution of energy among degrees of freedom proceeds in the presence of a nonlinear term and discretization of the equation. The answer was unexpected: nothing stochastic was observed; the motion appeared quasiperiodic up to the limits attainable by the best computer available in the United States (run by the best people). Specialists confronted with this did not immediately give a clear-cut response because such computations could not be considered absolutely convincing, and it was the first example of its kind. Other examples of integrability came up in the theory of nonlinear waves.

This is a most curious story and is worth recalling. The English (or Scottish) gentleman, applied mathematician, and naval engineer John Scott Russell, as he recounts in a work dated 1844, was observing in 1834 the motion of a barge in a narrow canal, while riding on horseback rather fast. The barge for some reason suddenly stopped. The mass of water moving in front naturally did not stop. It continued to move forward by itself and shaped itself into a sort of mound of water, about 30 feet long and 1-1.5 feet high, moving along the canal without noticeable change of shape at a speed of about 8-9 miles an hour. Russell rode after this object (called much later a "soliton") for some time on his horse. It is believed today that this was the canal near the university building in Edinburgh, Scotland. (It is amusing that attempts to reproduce the phenomenon experimentally in the 1980s in that same canal were unsuccessful. The soliton failed to appear when the barge stopped.)

I remark in passing that Russell throughout his scientific teaching and engineering career never had a regular secure long-term job (unlike our scientists), though he once had a chance to build a large ship, and another time he was recommended to the mathematics faculty of Edinburgh University by Hamilton (but did not get the appointment). This information was provided in an article by R. Bullow [1].

Figure 1. Soliton.

Anyway, the well-known scientists Boussinesq (1872) and Rayleigh (1876) found theoretically that a soliton should appear in shallow water and found its analytic form. Then in 1895 the Korteweg-deVries
(KdV) equation was derived, and from that basic equation of shallow-water theory came the already known soliton along with a new object, the cnoidal wave.

Let us continue the story of the fate of these discoveries, leaving out the history of the rigorous derivation of shallow-water solitons from the exact equations of hydrodynamics. The discoveries that concern us came in the 1960s, not from that source but from plasma physics. As a result of the work of the applied mathematicians M. Kruskal and N. Zabusky (United States) and specialists in plasma physics in the USSR and the United States (Sagdeev, Gardner, Morikawa), it became clear that solitons and the KdV equation can approximately describe the evolution of waves in the most diverse media, under the most simple and crude hypotheses of nonlinearity and dispersion, if dissipation is negligibly small. This equation and its simplest solutions display a certain universality analogous to linear equations like D'Alembert's. Its traveling-wave solutions u(x - ct), already known in the nineteenth century, occur in the form of solitons (Fig. 1),

$$u = -\frac{2K^2}{\operatorname{ch}^2\bigl(K(x - 4K^2 t)\bigr)}, \qquad c = 4K^2,$$

or combs (Fig. 2): periodic cnoidal waves,

$$u = 2\wp(x - ct + i\omega' \mid g_2, g_3) - \frac{c}{6}.$$

(Here $\wp$ denotes the Weierstrass $\wp$-function [2].) Figures 2a to 2c illustrate the decay of a cnoidal wave to solitons as the parameters $g_2$, $g_3$ tend to the degenerate point $g_2^3 = 27 g_3^2$.

Figure 2. Cnoidal waves (panels a, b, c).

The maximal height of a soliton is 2K^2 = c/2, so it is proportional to the phase velocity c. In 1965, Kruskal and Zabusky numerically investigated the evolution of the waveform when the initial situation is the sum of two
solitons with speeds c_1 < c_2, with the center of the faster soliton starting farther to the left:

$$u = -2\,\frac{\partial^2}{\partial x^2} \ln \det A,$$

where A is a 2 × 2 matrix whose entries depend on K_1, K_2, x, and t [matrix entries illegible in the source].

Figures 3-6. Interaction of two solitons.
How would they evolve? The result turned out most interesting (Figs. 3 to 6). The left soliton caught up with the right, as expected (it did have higher speed), and they flowed together. But quite unexpectedly, after some time had elapsed the same solitons could again be made out, retaining their individuality completely. The faster soliton as it were had jumped over the other (merging with it in the course of the jump), exchanging with it a certain phase-shift. Regarding a soliton as something like a particle, one could say there had been an elastic scattering event, with individuality of particles conserved, though during it they overlapped, producing a complicated waveform. Such behavior is the opposite of stochastic! Kruskal as early as 1965 found a series of other conservation laws (integrals of the motion) for the KdV equation, which somehow makes it clearer how the appearance of the initial shape can be remembered. Bear in mind that soliton solutions (i.e., isolated traveling waves, perhaps with internal structure) occur for a wide class of systems, some of them stochastic or nonconservative, etc., but that the behavior that has been described for a pair of solitons as in KdV theory is a sign of some internal degeneracy, some hidden symmetry. Two years later the famous 1967 paper of Gardner, Green, Kruskal, and Miura (GGKM) gave, in a certain sense, an exact construction of the general solution of the KdV equation in the soliton case, w h e n u(x,t) falls off with prescribed speed for large Ixi (evolution in the class of rapidly decreasing functions). The method was striking. It depended on the use of results of quantum scattering theory, both the direct and the inverse problems, which had been solved some time earlier by the Soviet mathematicians I. M. Gelfand, B. M. Levitan, and V. A. Marchenko for other purposes. Quantum theory came into the GGKM procedure as a mathematical trick! A year later the American mathematician P. 
Lax made sense out of the algebra underlying the procedure (see inset); the connection between the GGKM procedure and the language of Hamiltonian mechanics was established in 1971 by the Soviets L. D. Faddeev and V. E. Zakharov, and also by Gardner; Zakharov together with A. B. Shabat in the USSR, as well as Lamb in the United States, found in 1971 a number of new physically important integrable systems in which the Lax representation and the GGKM procedure work for rapidly decreasing waves "of soliton-like type." Beginning in 1973, systems admitting a Lax representation were found in profusion. Among such systems there are now some which are 2-dimensional (but not 3 - d i m e n s i o n a l as yet, it seems---at least no important ones). In none of them would you say there is any evident symmetry in a traditional sense. Among the 2-dimensional integrable systems with hidden symmetry appeared a known equation, which had been introduced by Soviet plasma
physicists, B. Kadomtsev and his student V. Petviashvili, in the study of transverse stability of a 1-dimensional effect described by the KdV equation. This equation, called the KP equation, has a very beautiful mathematical theory. All in all, integrable systems are numerous among physically important universal systems describing processes in the first nonlinear approximation, in one and sometimes in two spatial dimensions. And they all have in common a hidden algebra, some analog of the so-called Lax representation. I will try (below) to give an idea what that is. The point for now is that this representation associates with the nonlinear equation a certain auxiliary linear operator L, analogous to the energy operator or Hamiltonian in quantum mechanics, knowledge of whose spectrum can help in solving the nonlinear system. For the KdV equation, L is a scalar Schrödinger operator with potential u(x,t), where t is a parameter. In the above-mentioned rapidly decreasing case, the spectrum is described by scattering theory for L. I do not want here to get into the details of the mathematics of scattering theory for 1-dimensional Schrödinger or Dirac operators or whatever. What is important here is that the coefficient functions of these fundamental quantum mechanical operators describe states of our nonlinear system in soliton theory. In particular, for the Korteweg-deVries equation, we have the Schrödinger operator
$$ L = -\frac{d^2}{dx^2} + u(x,t), $$
acting on an arbitrary function ψ(x) by L[ψ(x)] = -ψ'' + u(x,t)ψ.
The notion of scattering makes sense when the coefficient function ("potential") u(x,t) is, for every value of t and for |x| → ∞, rapidly decreasing in some good sense. Therefore, sufficiently far away (mathematicians say, for x → ∞ or x → -∞), the eigenfunctions of the Schrödinger operator, i.e., the solutions of -ψ'' + uψ = λψ, hardly differ from ordinary exponentials exp(ikx), or more precisely from linear combinations of exp(ikx) and exp(-ikx) (the eigenvalue is positive, being the square of the wave number: λ = k², k ∈ R). The comparison of the asymptotics of the same eigenfunction for x → ∞ and for x → -∞ constitutes what mathematicians call "scattering." This connection of scattering theory with exponentials exp(±ikx) makes the method of solving the KdV equation by the GGKM setup in a sense analogous to the well-known Fourier method. Remember that that method, discovered by Bernoulli and D'Alembert in the eighteenth century in connection with oscillations of a string, became later (beginning with Fourier) a universal method of solving linear differential equations with constant coefficients. Exactly this class of integrable systems has remained the basic mathematical framework of twentieth-century theoretical physics. Numerous works have followed the idea of the GGKM method (or method of inverse scattering problems, whose Russian acronym is MOZR) as a nonlinear analog of the Fourier method [3].
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4, 1992
New Exact Integrals
However, in the modern theory of solitons this point of view is seen to be incomplete. It too is bound by the "soliton-like" character of the waveform u(x,t), namely, the property of being rapidly decreasing in |x|. This point of view fails for the simplest and most important class of periodic waves u(x,t), those having a period T such that u(x + T, t) = u(x,t) for every value of the time t. There are other classes of functions, important in the theory of nonlinear oscillations and quantum field theory, for which the KdV equation is still not solved. Nor was any bridge found to join the "nonlinear Fourier method" with the famous integrable cases of classical mechanics, whose complicated and obscure development I have already recalled. There ought to be in anything new some kernel, some essential part, which harks back to something old and "well-forgotten," that is, forgotten because at the time it could not be understood (even by its authors)!
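For the rapidly decreasing case, the comparison of asymptotics at x → -∞ and x → ∞ can be checked on one concrete potential. The sketch below is only an illustration under stated assumptions: the potential u(x) = -2/cosh²(x), the closed-form eigenfunction, and all function names are chosen here and do not come from the article. It verifies numerically that the function solves -ψ'' + uψ = k²ψ and that it is a pure multiple of exp(ikx) on both sides, so that the ratio of the two coefficients (a transmission coefficient) has modulus 1.

```python
import cmath
import math


def potential(x):
    # Illustrative rapidly decreasing potential (an assumption, not from the article).
    return -2.0 / math.cosh(x) ** 2


def psi(x, k):
    # Closed-form solution of -psi'' + u*psi = k^2 * psi for the potential above.
    return (math.tanh(x) - 1j * k) * cmath.exp(1j * k * x)


def eigen_residual(x, k, h=1e-4):
    # |-psi'' + u*psi - k^2*psi| with psi'' from a central second difference.
    second = (psi(x + h, k) - 2.0 * psi(x, k) + psi(x - h, k)) / h ** 2
    return abs(-second + potential(x) * psi(x, k) - k ** 2 * psi(x, k))


def transmission(k, far=20.0):
    # Coefficient of exp(ikx) far to the right divided by the one far to the left.
    right = psi(far, k) / cmath.exp(1j * k * far)
    left = psi(-far, k) / cmath.exp(-1j * k * far)
    return right / left


if __name__ == "__main__":
    k = 1.3
    print("residual of the eigenvalue equation:", eigen_residual(0.7, k))
    print("transmission coefficient:", transmission(k))
    print("its modulus:", abs(transmission(k)))
```

For this particular potential no exp(-ikx) component appears on either side, which is why the modulus of the ratio equals 1; a generic rapidly decreasing potential would also produce a reflected wave.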
Then what about the periodic case? Well, the cnoidal wave, a solution of the KdV equation known back in the nineteenth century, is periodic in x. The subject of periodic (and quasi-periodic) solutions of the KdV equation turned out to be complicated mathematically. A successful approach was found in my work in 1974 and developed by me together with my students Dubrovin and Krichever, and by others: Leningrad colleagues, especially Its and Matveev, and many Western mathematicians including P. Lax, H. McKean, and P. Van Moerbeke. It starts from the same algebra, but here the mathematical objects which come up are typical of solid-state quantum mechanics: the Schrödinger operator with a periodic potential, the Bloch waves describing quantum states of electrons in crystals. In contrast to the scattering theory associated with rapidly decreasing potentials, the inverse problems of quantum mechanics had never been solved in the periodic case prior to soliton theory. Their solution turned out to be inextricably linked with soliton theory, with the KdV equation; both theories were thereby enriched. The mathematical technique for the periodic problem unified new results in the nonlinear KdV equation with a deeper understanding of spectral theory of 1-dimensional periodic structures (crystals) based on the theory of Riemann surfaces and the related computations with special functions. This marked the appearance of a bridge joining modern integrable systems of the theory of nonlinear waves (soliton theory) with the integrable cases of the top from the nineteenth century which I recalled earlier, citing S. Kovalevski and others. Riemann surfaces finally entered the toolbox of mathematical physics, after a lapse of 100 years. All this, of course, applies not only to the KdV equation but also to all the other integrable systems of soliton theory discovered in the 1970s and 1980s in the USSR, the United States, England, Italy, France, Japan, etc.
The best known of them are the so-called 1-dimensional "nonlinear Schrödinger equation"; the famous "sine-Gordon" equation, which already had come up in Lobachevskii geometry (negative curvature) in the nineteenth century and reappeared in the theory of superconductivity (the Josephson effect) and other physical problems in the 1960s and 1970s; discrete systems (the so-called Toda chains, and the discrete analog of KdV); and many, many others (Zakharov, Shabat, Ablowitz, Kaup, Newell, Segur, Hirota, Hénon, Flaschka, Manakov, Moser, Calogero, Degasperis, etc.). Let me try to explain for the example of the autonomous dynamical system (1) how we are led to the new integrals, spectra of operators, and Riemann surfaces, though it will still be hard to see why these simple ideas have such deep consequences. Assume that we are given, on the phase space with
coordinates (y) = (y_1, ..., y_n), two matrix-valued functions depending on an additional parameter λ: L(y, λ), A(y, λ). These matrix functions must be chosen such that the equation of the dynamical system
$$ \frac{dy_j}{dt} = f_j(y_1, \dots, y_n) \tag{1} $$
is completely equivalent to a matrix Heisenberg equation of quantum mechanics
$$ \frac{dL(y(t),\lambda)}{dt} = LA - AL = [L,A] \tag{2} $$
(in soliton theory we then say that (1) admits a Lax representation). The important thing is that this relationship occurs often. It is what determines the new type of hidden symmetry. Let us look at the properties of this equation. (a) The eigenvalues of L are integrals of the system (1); for we see at once that they are independent of t if L satisfies the matrix equation (2). This is a situation familiar from quantum mechanics. (b) If L depends upon the numerical parameter λ, then its eigenvalues μ_j as functions of λ are multivalued functions with branch points, and indeed they define, in the complex domain, a Riemann surface consisting of the roots of the characteristic equation for the eigenvalues of L(λ): det(L(λ) - μE) = 0. Here the determinant of L - μE has been set equal to zero. It is a polynomial in two variables λ, μ, and the solutions for μ are being considered as functions of λ.
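Property (a) can be made explicit by a short computation. The following is a standard argument written out as a sketch; the use of traces of powers of L is a choice made here for convenience and is not spelled out in the text. Differentiating tr L^k along a solution of the matrix equation dL/dt = [L,A] and using the cyclic property of the trace gives

```latex
\frac{d}{dt}\operatorname{tr} L^{k}
  = \sum_{i=0}^{k-1}\operatorname{tr}\!\bigl(L^{i}\,\dot{L}\,L^{k-1-i}\bigr)
  = k\,\operatorname{tr}\!\bigl(L^{k-1}[L,A]\bigr)
  = k\,\bigl(\operatorname{tr}(L^{k}A)-\operatorname{tr}(L^{k-1}AL)\bigr)
  = 0 ,
\qquad k = 1, 2, \dots
```

Since the coefficients of the characteristic polynomial of L are polynomials in the traces tr L^k, they too are independent of t, and hence so are the eigenvalues.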
Example 1.
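Let us take the simplest 2-by-2 example. Here n = 3, y_1 = u, y_2 = u_x, y_3 = u_xx. If we take matrix functions of the form
$$ L = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad A = \begin{pmatrix} 0 & 1 \\ u - \lambda & 0 \end{pmatrix}, $$
$$ a = u_x = -d, \qquad b = 2u + 4\lambda, \qquad c = -u_{xx} + 2u^2 + 2\lambda u - 4\lambda^2, $$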
then the Heisenberg-Lax equation L' = [L,A] reduces to a differential equation for the function u(x) (check it!). A solution of this equation is (up to a constant) the Weierstrass elliptic function. This function u(x - ct) is also a cnoidal wave of the KdV equation (the soliton being a degenerate special case). The determinant of L - μE is a quadratic trinomial in μ with coefficients depending upon λ. Its zeroes describe the simplest Riemann surface for a 2-valued function:
$$ \mu = \pm\sqrt{R(\lambda)}, \qquad R(\lambda) = a^2 + bc = -\det L(\lambda). $$
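The "check it!" above can also be done numerically. The sketch below assumes a traceless L with entries a, b, c, d = -a as listed in the example, the matrix A with rows (0, 1) and (u - λ, 0), and the particular function u(x) = 2/x², which solves u''' = 6uu' and is a degenerate limit of the Weierstrass-type solution. The sample function and all names are assumptions made for this illustration. It compares a finite-difference derivative of L with the commutator [L,A] and confirms that R(λ) = a² + bc does not depend on x.

```python
def u(x):
    # Sample solution of u''' = 6*u*u' (an assumption chosen for illustration).
    return 2.0 / x ** 2


def u_x(x):
    return -4.0 / x ** 3


def u_xx(x):
    return 12.0 / x ** 4


def lax_L(x, lam):
    # Entries a, b, c, d = -a as in the example.
    a = u_x(x)
    b = 2.0 * u(x) + 4.0 * lam
    c = -u_xx(x) + 2.0 * u(x) ** 2 + 2.0 * lam * u(x) - 4.0 * lam ** 2
    return ((a, b), (c, -a))


def lax_A(x, lam):
    return ((0.0, 1.0), (u(x) - lam, 0.0))


def matmul(P, Q):
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def commutator(P, Q):
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return tuple(tuple(PQ[i][j] - QP[i][j] for j in range(2)) for i in range(2))


def lax_residual(x, lam, h=1e-5):
    # Largest entry of dL/dx - [L, A], with dL/dx from central differences.
    Lp, Lm = lax_L(x + h, lam), lax_L(x - h, lam)
    C = commutator(lax_L(x, lam), lax_A(x, lam))
    return max(
        abs((Lp[i][j] - Lm[i][j]) / (2.0 * h) - C[i][j])
        for i in range(2)
        for j in range(2)
    )


def R(x, lam):
    # R = a^2 + b*c = -det L; it should be the same at every x.
    (a, b), (c, _) = lax_L(x, lam)
    return a * a + b * c


if __name__ == "__main__":
    lam = 0.3
    print("max |L' - [L,A]| at x = 1.5:", lax_residual(1.5, lam))
    print("R at x = 1.5 and x = 2.7:", R(1.5, lam), R(2.7, lam))
```

For this sample u the x-dependent terms of a² + bc cancel and R(λ) reduces to -16λ³, so the three finite branch points coincide at λ = 0; a genuinely elliptic solution would give three distinct zeros.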
The branch points of the Riemann surface are at the zeros of the polynomial R(λ) and at λ = ∞. There are three in all. The coefficients of R(λ) are integrals of the differential equation; the solution of the equation is in terms of the well-known Weierstrass elliptic function associated with the Riemann surface. This very Riemann surface turns out to have deep significance in the theory of the quantum Schrödinger operator with periodic potential u(x) which determines the electron spectrum. Even this example gives a methodological indication; things are much the same in more complicated situations, where the Riemann surfaces and their special functions are more complicated. This, then, is how the higher symmetry of dynamical systems arises, though it was not until the mid-1970s that this was understood. I will not try to describe the many deep investigations in this field in the last 15 years. It is perhaps worth repeating, however, that the mutual relationship
exactly solvable problems of quantum mechanics ↔ Riemann surfaces ↔ integrable nonlinear systems
has paid off not only for nonlinear systems but also reciprocally for quantum mechanics and theory of Riemann surfaces. In some interesting models of the solid state, the quantum-mechanical Schrödinger operator (operator of energy) is not at all arbitrary as in other cases, but is specially constructed from the variational principle of "minimal free energy." It can be related by the GGKM-Lax approach to an equation of KdV type, which plays the role of mathematical symmetry of the quantum problem. This approach made it possible to solve the Peierls-Fröhlich model from the 1930s, and it became clear that it described a number of contemporary experiments, in particular the phenomenon of "charge waves" in 1-dimensional (quasi-1-dimensional) substances. This research was done by physicists together with the mathematician Krichever at the Landau Institute of the Soviet Academy, about 10 years ago (I. Dzyaloshinskii, S. Brazovskii, and others). In this important example, the situation is a sort of dual or inverse to that of GGKM: It is the quantum problem that has physical content; the Schrödinger equation describes the motion of real electrons, whereas the nonlinear solitons of the KdV-like system are only a "symmetry group" of the Peierls-Fröhlich model which allow it to be exactly solved. In recent years, these ideas of soliton theory have
spread to quantum field theory and statistical physics, where curious exactly solvable models are also occasionally found. Hidden algebraic mechanisms of integrability are inherited both from soliton theory and from certain other ideas from within quantum theory, which had its own history of exactly solvable models originating in famous works of the 1930s and 1940s by H. Bethe, L. Onsager, and other leading physicists. The remarkable and in some sense exactly solvable theory of quantum strings has been a major concern of theoretical physicists for some years now. They expected "everything in the world" from them; no doubt they will at least get something useful. Strings remain a focus of attention. The ideas involved are not infrequently borrowed from soliton theory, but I cannot get into that because it would require a whole separate article. Let me just say that the theory of Riemann surfaces plays a fundamental role both in the theory of strings and in the so-called conformal field theory (CFT), growing out of the work of A. Polyakov and his colleagues A. Zamolodchikov, G. Belavin, and V. Knizhnik, in the early 1980s at the Landau Institute [4]. What moral might I draw for the reader? I do not know if I have succeeded in convincing anybody of anything; but for myself, reflecting on my experiences in science and those of friends and colleagues I worked with or learned from, I am led to this general conclusion: The finding of new exactly solvable models has been centrally involved in the evolution of theoretical physics and mathematics; its role has waxed and waned. It certainly seems that over the last 20 years, after an appreciable break, completely new solvable models came to a place of honor in our fields; and yet it may be that now there will follow an appreciable period of consolidation, centered on the development and application of this body of ideas, without the introduction of essentially new basic models.
References
1. R. Bullough, Solitons (M. Lakshmanan, ed.), New York: Springer-Verlag (1987).
2. Handbook of Mathematical Functions (M. Abramowitz and I. Stegun, eds.), New York: Dover Publications, Inc. (1965).
3. For example, S. Novikov, V. Manakov, L. Pitaevskii, and V. Zakharov, Theory of Solitons, New York: Plenum (1984).
4. See Physics and Mathematics of Strings (dedicated to the memory of V. Knizhnik) (Brink, Polyakov, and Friedan, eds.), Singapore: World Scientific (1990).
Steklov Mathematical Institute
ul. Vavilova 42
Moscow, 117966 Russia
Johann Bernoulli (1667-1748): His Ten Turbulent Years in Groningen
Gerard Sierksma
Introduction
Groningen is the capital of the northeastern province of the Netherlands. In the latter half of the seventeenth century there was animosity between the City Councillors and the Provincial Legislators because of religious differences, which resulted in hardships at the University of Groningen between 1648 and 1717. The number of professors had dwindled to about five and the Centenary passed unnoticed. Chairs were left vacant, the chair of mathematics as early as 1669. It is remarkable that the renowned mathematician Johann Bernoulli should have come to Groningen at this juncture and that he stuck it out for a decade, from 1695 to 1705. In spite of the heated polemics in which Bernoulli was involved, his stay in Groningen had a considerable influence on the development and dissemination of the "new method."
Leibniz's Nova Methodus
In October 1684 Leibniz published in Acta Eruditorum his outstanding article "Nova Methodus Pro Maximis et Minimis," wherein the foundations were laid for infinitesimal calculus. Leibniz was deliberately obscure in his account, undoubtedly trying to keep ahead of competitors and to be recognized as the true inventor of the new method. He was well aware of the fact that, in England, Newton had developed similar ideas. The Swiss brothers Jakob and Johann Bernoulli in Basel were the first to appreciate the great significance of the new method. Jakob was professor of mathematics in Basel, a position he had gained very much against the wishes of his father, who had intended him for the clergy. The father's plans for Johann, Jakob's junior by 13 years, did not materialize either. He was supposed to prepare for a future career in the thriving family business in spices established in Basel. In 1682, when he was fifteen, Johann was, in fact, engaged in the spice trade. After about a year, however, he had grown tired of it. In his autobiography he claims, ". . . ich ware . . . von Gott dem Herrn zu etwas anders destinirt." This destiny was to be mathematics, although he first studied medicine in Basel. Meanwhile, his brother Jakob had initiated him in the art of mathematics, which he mastered so successfully that after two years he was a match for Jakob, whose passion for mathematics he now shared. He writes, "darzu [i.e., to mathematics] ich ein sonderbahre lust bey mir verspühret." The brothers decided to write to Leibniz, then professor in Hannover, and ask him to elucidate the obscurities in his article. It so happened that Leibniz was on one of his many journeys, and the letter reached him only two years later. Johann and Jakob, impatient after such a long time, decided to make one more concerted effort to fathom the article. They succeeded and became supporters of Leibniz. Their conversion marks the first major advance of the infinitesimal calculus.
[Figure: Johann Bernoulli]
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York
Monads
Before going further into the career of infinitesimal calculus or nova methodus, it may be profitable to dwell for a moment on its relation to Leibniz's so-called monadism. The concepts "infinitesimally small," "infinitesimally large," "infinitesimally many," and "infinitesimally often" are essential. A series of Johann Bernoulli's lecture notes opens with a number of postulates in support of his theory of infinitesimal calculus. Two instances may suffice here: First, a quantity increased or decreased by an infinitesimally small quantity is neither increased nor decreased. Second, every curve is made from an infinite number of straight parts, each infinitesimally small. Now Leibniz wrestled with the following problem: he asked himself what would eventually remain after splitting up a given piece of matter an infinite number of times, assuming that any quantity of matter is capable of being divided in this way. Evidently, the answer to the question should be: something not capable of further division. However, this "something" cannot be material in the Cartesian sense, but it is most definitely related to it, in the way that "force" is related to matter. Leibniz calls the noncomposite or indivisible unit a "monad." It is precisely the infinite number of infinitesimally small parts (or monads) that form the unique entity.
Calculus
The above idea is the basis for integral calculus. The surface between a part of a curve and the horizontal axis in a plane surface is, in general, not the total of a finite number of rectangles determined by the curve, but the total or integral of an infinite number of "rectangles" of infinitesimally small width. Integral calculus, therefore, is all about determining surface areas and volumes, not by approximation, but by the unique determination by means of the summation or integration of an infinite number of infinitesimally small parts. The reverse process occurs in differential calculus, in which infinitesimally small parts, differentials, are determined. In a plane with coordinates x and y the relationship between dy and dx is determined, in which y is a given function of x. At that time, Leibniz and Bernoulli did not treat dy/dx as the limit of Δy/Δx for Δx approaching zero.
[Figure: Bernoulli's letter in Dutch.]
Infinitesimally Small and Large
The fact that 0/0 is meaningful caused a stir. In 1693 Bernoulli challenged his friend Pierre Varignon to compute an expression of the form 0/0. Of course, Varignon was puzzled and did not know, in 1701, what Leibniz meant by differential and monad. However, he did attempt an interpretation, and his letter to Leibniz of 28th November 1701 contains the following passage: . . . by differential or infinitesimally small you understand a very small but constant and fixed quantity, such as those
describing the proportion of the earth and the skies, or of a grain of sand and the earth; however, for me the infinitesimally small or the differential of a quantity denotes that in which it is unbounded. Infinite or undefined I call anything unbounded; infinitesimally or indefinitely small (in proportion to a given quantity) I call that in which it is unbounded. I have concluded from the above that in the calculus the expressions: infinite, indefinite, unbounded in size, larger than any given quantity, undefinably large on the one hand, and the terms infinitesimally or indefinitely small, smaller than any given quantity, undefinably small on the other hand, are exactly synonymous. I beg your opinion on this matter in order to stop those who are opposed to this calculus, and abuse your name and deceive the uninformed . . .
Leibniz replied on 2 February 1702 in the following vein:
I do not precisely recall the expressions I used, but it was my intention to prove that there is no need to make mathematical analysis dependent on metaphysical contradictions; no need, therefore, to agree that there are in the material world lines that are infinitesimally thin in the strictest sense of the word compared with our everyday lines. Hoping to make myself generally understood I attempted to avoid these fine points and contented myself with explicating that which is infinite by means of that which is incommensurable, i.e., pointing to quantities that are incommensurably greater or smaller than those known to us. For in this way, it is possible to create as many degrees of incommensurable quantities as desired, provided that an incommensurably smaller element is left out in the calculation and determination of an incommensurably greater one. A particle of magnetic material going through glass, e.g., is not commensurable to a grain of sand, which is incommensurable to the globe, which, in turn is incommensurable again to the firmament . . .
The modern definition of the concept of "differential" cannot be given before a strict definition of the concept of "limit" has been found. This, however, was to take many a year.
Bernoulli and Religion
Referring to his earliest years, Johann Bernoulli writes in his autobiography,
I was born in Basel on July 27 of the year 1667, the tenth child of my father Nicolai Bernoulli and my mother Marguerite Schönauer, who spared no trouble or expense to give me a proper education in both morals and religion. If I have not taken full advantage of it they are not to blame, but I am.
Johann came from a devout Calvinist family. During the Eighty Years' War his grandfather had fled from Antwerp to Basel in order to escape from the Inquisition and the Duke of Alva. Johann remained a fervent believer for the rest of his life, although he was to be accused of grievous heresies in Groningen, as we shall see. Johann Bernoulli's autobiography is full of pious outpourings. A few instances are worth quoting. One evening in 1690, when riding along the Lake of Geneva, he fell into a ravine, horse and all. In his thankfulness to God for keeping him and his horse from serious harm he wrote: ". . . wie ich dan meinen Schöpfer . . . alle Tag in meinen Gebett hertzinniglich dancke . . . lobe und preyse." In 1697 he commented on the birth of his second child, a daughter: ". . . morgens umb 10 Uhren hatt der liebe Gott meine geliebte Hausfrau ihrer Leibsburdi entbunden und uns mit einem Töchterlin erfreuet." His joy was short-lived. After six weeks the child died: ". . . dies liebe Kind ist uns widrumb zu unsereran grössten Schmerzen durch den zeitlichen Tod hingerafft worden. Gott verleyhe uns allen eine selige Nachfolg." Johann made the most of a serious illness in Groningen and writes that ". . . the physicians and others had not much hope for my recovery because the rumour had already gone through the city that I was dead. I myself had already accommodated to my blissful death and had Gott meinem Schöpfer umb eine gnädige Auflösing angeflehet . . ."
De l'Hôpital's Plagiarism
When Bernoulli fell, rider and horse, into the ravine, he was on his way to Paris, where he was to stay from 1691 to 1692. In Paris Bernoulli gained academic recognition. He mixed with the circle of Malebranche the philosopher, and of the mathematicians the Marquis De l'Hôpital and Pierre Varignon. His correspondence with De l'Hôpital and Varignon is comprised of two hefty volumes. From 1693 to his death Leibniz, whose admiration for Bernoulli became greater and greater, received as many as 283 letters which still survive. De l'Hôpital was very much interested in the infinitesimal calculus and studied under Johann Bernoulli twice a week. Back in Basel, Bernoulli took up the study of medicine at first, but he kept in touch with De l'Hôpital by letter and gave a sort of correspondence course in advanced calculus, at a substantial fee (50% of a professor's salary). Meanwhile De l'Hôpital wrote his famous book Analyse des Infiniment Petits, published in 1696. Bernoulli felt outraged when he noticed that De l'Hôpital had not mentioned his name anywhere in the book except for the preface, in which it said, "And then I am obliged to the gentlemen Bernoulli for their many bright ideas; particularly to the younger Mr. Bernoulli, who is now a professor in Groningen." Bernoulli employed every possible means to convince the world of his true authorship. But the damage had been done: nobody believed him and to this very day the theorem which does in actual fact belong to Bernoulli is called the Theorem of De l'Hôpital. It was finally proved in 1922 that Bernoulli was right and that De l'Hôpital had stolen his ideas. The facts are that in the Basel University Library there is a copy of Bernoulli's lecture notes which, considering the mistakes occurring in it, must certainly date from the period of Bernoulli's correspondence with De l'Hôpital. No doubt, Bernoulli's nephew Nicolaus made this copy to prepare himself in Basel for his mathematical studies with his uncle in Groningen. De l'Hôpital's book is virtually identical to the lecture notes, except for a few remarkable differences. There are, for instance, a number of mistakes in the notes that do not appear in the book. One of these is that the integral of (1/x)dx is taken to be infinity instead of log x, a mistake which De l'Hôpital avoided in his book.
Bernoulli Employed by De l'Hôpital
As a rule, Bernoulli was not averse to a strong fight against any form of real or supposed injustice suffered by himself. It is all the more remarkable that he waited eight years, in fact until De l'Hôpital's death in 1704, before openly accusing De l'Hôpital of plagiarism. In the meantime, he had informed Leibniz and Huygens of the state of affairs. Both gentlemen seem to have credited him. Not so Flores de Boer, a professor from Groningen, who wrote in his inaugural speech of 1896 that "it is now almost universally held that this was one of Bernoulli's not infrequent fibs." Why had Bernoulli delayed the disclosure for such a long time? In France hardly anybody would have believed Bernoulli, first because De l'Hôpital was a mathematician of great distinction and second because Bernoulli's unrivaled ambition was common knowledge there. But there was more to it than that. In 1694, Johann Bernoulli and De l'Hôpital had reached an agreement. De l'Hôpital was to pay an annual sum of 300 francs (which was raised some time afterwards). In exchange De l'Hôpital demanded:
1. That Bernoulli was to solve all mathematical problems submitted by De l'Hôpital.
2. That Bernoulli should not disclose his findings to anybody but De l'Hôpital.
3. That Bernoulli should not pass on to Varignon or others copies of his writings done for De l'Hôpital.
To all intents and purposes, De l'Hôpital had hired Bernoulli. However, the second term of the agreement meant that Bernoulli could hardly publish his own work and had actually sold his discoveries to De l'Hôpital. Money must have counted for much, and the privilege, for a young man, of working for such an eminent scientist may also have induced Bernoulli to enter into this agreement with De l'Hôpital. At the same time, the situation must have frustrated Bernoulli's ambitions and he must have felt deeply uncomfortable. It is only natural that he could not simply resign himself to De l'Hôpital's claiming all the credit and underhandedly passing on to Huygens in Holland solutions of problems which had been solved by Bernoulli.
His Appointment in Groningen
Despite Bernoulli's predicament his incidental work was increasingly well received, and it gradually dawned on De l'Hôpital that it would not do "to harness a lion to his cart," as he said himself. Therefore, he supported Huygens's attempts to recruit Bernoulli for Groningen, where the chair in mathematics was vacant. At the provincial Convention of 4th March 1693 the city delegates insisted that three professors should be appointed without delay: a theologian, a lawyer, and a mathematician. In April 1695 the Rector Magnificus or Vice Chancellor, Johannes Braun, wrote the letter of invitation, in which he offered a generous salary of 1,000 to 1,200 Dutch guilders. In the letter Braun referred to brother Jakob Bernoulli's visit to Groningen. Jakob had stayed with Braun on that occasion and the learned discourse had made a profound impression on Braun. Realizing that Bernoulli was also a medical man, Braun not only offered a handsome salary, but also offered Bernoulli the possibility of occupying his time with "Praxis der Medizin." This included the opportunity of working in the botanical garden. The following passage from Acta Senatus Academia of 1702 shows that Bernoulli's position as medical man was just as important as his mathematical status:
. . . that the Senatus Academia would depute two gentlemen from their midst to have the dead body of Meyd opened, if need be . . . The gentlemen Lammers and Bernoulli were subsequently commissioned to make Inspectionem Ocularum Cadaveris, in order to establish whether there were any traces of violent death . . .
It is clear that Bernoulli, together with the professor of medicine, Lammers, was sent out to conduct a postmortem on the body of Meyd. Bernoulli, 27 years old, married and holding a doctor's degree, rather liked the idea of a professorship in Groningen. In Basel, where he was the town architect, there was little chance of a professorship, as his elder brother Jakob held the chair in mathematics. He would have preferred to work near Leibniz, and the opportunity to go there offered itself, but he opted for Groningen after all, because, as he writes:
ich den Herren Curatoren der Universität zu Groningen mein Wort albereyt von mir gegeben, und meine Hardes und Bagage zu Groningen albereyt angelangt waren. [I had already given the administration of the University my word, and my goods had already arrived in Groningen.]
Leibniz was highly pleased with the prospect of Bernoulli propagating calculus in Holland, "where there are so many mathematicians."
In December 1695 Bernoulli wrote to Leibniz from Groningen: "I cannot possibly imagine why you wrote about the numerous mathematicians supposed to be living here. I have not come across one single mathematician, not even a mediocre one." Leibniz was amazed and asked who could be reading all the books that were written in the vernacular. But Groningen was not the center of the Netherlands. Bernoulli himself was quite prepared to go to Groningen, but his wife and his father-in-law objected. On 27 January 1695 their first child, Nicolai, had been born, and it is hardly surprising that his wife did not fancy making such a long journey with a baby. But Johann eventually managed to persuade his wife, and on 1 September 1695 they set out on their long journey, which took them right through the German armies. They sailed down the Rhine to Nijmegen, continued from there to Utrecht by coach, and then to Amsterdam by boat. In Amsterdam they went on board ship again to cross the Suydersee to Wonnen (Zwolle), and then by canal boat to Groningen, where the family arrived on 22 October.
In Groningen
In 1695 the Bernoulli family managed to find a house in Oude Boteringestraat, but later they moved to Corenrijp on the east side of Grote Markt. On November 28 the official inauguration took place: the new professor was conducted by the Vice Chancellor to the choir of the University Chapel, accompanied by the sound of trumpets. Bernoulli delivered his inaugural speech entitled "In Laudem Matheseos," which was widely acclaimed. His lectures enjoyed great popularity. But appreciation and antagonism grew simultaneously. Life in Groningen was made difficult for Bernoulli on two counts. He was denounced for both his theological and his scientific views, and according to Jonckbloet both the clergy and theologians "decried him for corrupting the young." What was the matter?
Witch Hunt
In a disputatio, where students upheld their (undergraduate) dissertations, it was not unusual for professors to put out controversial feelers, and apparently Bernoulli had commented on the continuous metabolism of the human body. The gloves were off and a regular attack was launched by the theologian Paulus Hulsius. Bernoulli, a confirmed member of the Reformed Church, was charged with denying the resurrection of the body. "All manner of heresies are attributed to him, alien to his mind, so that there is great uproar in town," is Jonckbloet's account.
Foul Mockery
The Groningen student Petrus Venhuysen caused even more furor. In defense, Bernoulli wrote a letter to the Governors of the University, entitled: "Brief account of the wicked accusation, shameless scorn, and foul satirical mockery poured forth upon the undersigned by student Petrus Venhuysen." Bernoulli demanded that the student's writing be banned and that all copies still with the author and the publisher be confiscated, "as is customary with libel." Bernoulli concluded: it will be essential that they be collected, the sooner the better, as they are daily distributed and will be read by students and in the coffee houses and then sent outside the Town and the Province. The letter is now in the Public Record Office in Groningen and is the one and only extant piece which Bernoulli wrote in (rather shaky) Dutch.
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4, 1992
What exactly was the student's case and how did Bernoulli vindicate himself? He did not answer the charge of heresy at all. He simply listed all the allegations made by the student and went on to give vent to his indignation at a student who was not only slow but also presumptuous enough to argue with a world-famous professor as with an equal. Bernoulli again: . . . I would not have minded so much if he had not been one of the worst students, an utter ignoramus, not known, respected, or believed by any man of learning, and he is certainly not in a position to blacken an honest man's name and honour, let alone a professor known throughout the learned world, and distract the young from their fine studies. . . . He adds: But he has deliberately upheld his libel, scorn, and mockery in public as a Disputatio Theologia; and some ignorant people have turned against me, to such a degree that I was inveighed against from the pulpit in a most unchristian manner.
Body and Soul
The student had presented eight "Theses" allegedly denied by Bernoulli. In the four theses mentioned here it is to be noted that the equality of body and soul is emphasized, entirely in accordance with Calvinistic doctrine:
• That man is composed of a double substance, to wit, a rational soul and a mechanical body.
• That between the two widely differing substances there is a natural and extremely intimate connection.
• That sinners damned for sins committed in the body and in the soul, will be punished both in body and soul.
• That, in order to avoid God's wrath and obtain his true mercy, we are to purify ourselves of all blemishes in body and soul.
In brief, Bernoulli was accused of Cartesianism pure and simple. Descartes, it will be remembered, regarded the body as an extension, res extensa, of the soul, res cogitans, with only the soul, spirit, or thought being of eternal value. Bernoulli was deeply hurt: Should I gainsay these theses, he were right to expose me as a wicked heretic; but the opposite was true: all my life I have professed my Reformed Christian belief, which I still do. . . . he would have me pass for an unorthodox believer, a very heretic; indeed, very wickedly he seeks to make me an abomination to the world, and to expose me to the vengeance of both the powers that be and the common people. He wickedly accuses me of denying Christ's suffering in the body, exclaiming: Deplorandus error novus! O deplorable new error! Venhuysen also accused Bernoulli of "opposing Calvinist faith" and "depriving the believers of their comfort in Christ's passion." Bernoulli retorted, "This strange interpreter had better be aware that he harbours dubious thoughts, which should be removed through the belly and the necessary." Bernoulli continued in the same vein for 12 manuscript pages, and referred in passing to one Doornkamp, a student from Utrecht who had been fined 6,000 guilders for a similar allegation. Things never reached that stage for the Groningen student and there is no evidence that his writings were confiscated.
Philosophia Experimentalis
At the same time there was opposition in the field of science. Bernoulli had introduced the subject of Philosophia Experimentalis in Groningen. This involved the interpretation of physical phenomena by means of experiments. Utterly novel for many, it was, of course, unacceptable to most Groningen scientists. The new experimental philosophy was objectionable to scientists of the Cartesian persuasion and Calvinists alike. The Cartesians naturally highlighted "reason," and held the view that res extensa, the world of sensory perception, is of minor importance; the Calvinists attempted to fathom God's underlying plan by scrupulously analyzing natural phenomena. Interpretations of these natural phenomena alone would be incompatible with either. Then, Johann's brother Jakob Bernoulli had found that comets move in an orbit around the sun and are therefore visible at regular intervals. Consequently, the appearance of a comet could not very well be a divine message. Jakob hastened to add, as a sop, that of course the length of the tail was positively God's hand, and he said that "God will always be able to find a comet from which He may cause a tail to grow, in order to reveal his wrath."
The University Chapel Desecrated
Opposition to Philosophia Experimentalis grew fiercer, as Bernoulli conducted his experiments in the choir of the University Chapel. This was nothing out of the ordinary since anatomy lessons, for which citizens were recruited, took place in the choir too. However, both ministers and theologians clamored against the desecration of the chapel by the experiments. Bernoulli replied, "You are mad and full of envy." Then there was public outcry over money donated by the (Provincial) States for his experiments, but Bernoulli just shrugged and replied: All those who drivel about God's temple being desecrated by the experiments recently conducted by me in a most decorous manner in the Choir of the University Chapel, are either plainly unsound of mind, or must outrageously show their prejudice and spite against me and my work.
Those who are disgruntled at the generosity shown by the illustrious and mighty Governors in granting me the sum of money for the purchase of experimental instruments--they do not deserve to benefit from it and should be deemed the worst and dullest of misanthropists. Nowhere is God's power and wisdom more evident than in the study of his works, and none is better equipped for this study than the philosopher and mathematician, who tries to fathom both the nature and character of God's works. They are much to be ridiculed who scoff at philosophy and mathematics pretending the latter are of no advantage in matters of the greatest importance. In one of the stained windows in the great Hall of the present University main building Johann Bernoulli is represented together with his son Daniel, born in Groningen, with the words: "Nowhere is God's power and wisdom more evident than in the study of his works."
Not Ready
The availability of instruments was certainly not something Bernoulli owed to the Provincial Legislators. At the Provincial Convention of May 1697 the City tabled a proposal to purchase new instruments totalling 200 guilders. The Provincial delegates were "not ready" for this, however. Bernoulli had the City Councillors on his side and managed to acquire mathematical and astronomical instruments for no less than 1,200 guilders. The voice of the City Councillors appeared to carry more weight than the Provincial Legislators. What was at the bottom of the controversy between City and Province? This controversy had also been responsible for the paralysis in the policies pursued inside the University and had caused a steady decline of the University since 1648. The Provincial Legislators' "not ready" had frustrated many decisions. Country clergy and theologians regarded the University professors as modernists who turned the University into a hotbed of pernicious Cartesianism. The other factor was Groningen's position of commercial centre, which meant that all country products had to be traded through the City, which reaped part of the profit. Ossemarkt and Vismarkt (two market squares) date from this period. In money matters in particular, the Province's "not ready" sounds like a protest against the City's economic dominance.
At Odds with Brother Jakob
Johann Bernoulli's stay in Groningen roughly coincides with his disagreement with his brother Jakob. Ironically, their animosity has been most productive for the development of calculus. In December 1695 Acta Eruditorum published an article by Jakob Bernoulli containing a fierce attack on Johann and reproaching him sarcastically with arrogating to himself results that were rightfully Jakob's. Johann then submitted a range of problems in the hope that the latter had bitten off more than he could chew, but Jakob solved them all. Jakob replied with a few problems and promised that Johann could win a prize of 50 Thalers for the correct solutions. Within a few hours (a mere three minutes in a later statement) Johann decided he had found the solutions. Jakob, however, rejected them as incorrect. Johann naturally maintained that his solutions were correct and accused his brother of "wanting to cheat him and wrong the poor, for whom he had intended the prize." He also proposed to turn to Leibniz as a referee. This was accepted by Jakob, but he also wanted to add Newton and De l'Hôpital to the jury. However, Leibniz did not fancy the idea. The case was never to be settled by arbitration, and Johann never got his 50 Thalers. After this incident Johann continued to pour forth most unsavoury invective on Jakob. It was not until 1705, some years after his brother's death, that he admitted in passing to have made an occasional misjudgment in the fight against his brother.
The Brachystochrone Problem
Demonstration model of a cycloid (University Museum Groningen).
His judgment was far from wrong in the solution of the well-known Brachystochrone Problem which Johann Bernoulli submitted to the world of mathematicians in 1696. The problem had been formulated before by Galileo in 1638: "Let A and B be points in a vertical plane. Determine the path AMB down which a movable point M, as a consequence of its weight, will move from A to B in the shortest possible time." Johann observed that it was not the straight line from A to B, and promised to offer the solution before the end of the year if nobody else had come up with it. Leibniz advised him, however, to put off the deadline to Easter 1697, the reason being that Acta Eruditorum, the journal which had published the Problem, appeared outside Germany with considerable delay: this would handicap foreign mathematicians. Eighteen months later, in May 1697, Johann published the solution in Acta Eruditorum. Leibniz added a short note and remarked that "De l'Hôpital, Huygens (had he lived), Hudde, if he were still engaged in scientific work [Hudde had become mayor of Amsterdam], and Newton, if he had taken the trouble" could have solved the problem. The solution of the Brachystochrone Problem is a cycloid, a curve traced by a fixed point on a wheel rolling along a line. Johann remarked in his article that "the reader will be amazed to learn . . . . that the cycloid . . . . is the much-disputed Brachystochrone." This solution of the problem by means of calculus made a profound impression and was a major triumph for the nova methodus.
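Johann's observation that the straight line from A to B is not the fastest path can be checked with a few lines of arithmetic. The following is only an illustrative sketch under stated assumptions: a frictionless point starting at rest at A = (0, 0), depth measured downward, g = 9.81 m/s^2, and a cycloid generated by a wheel of radius R, parametrized as x = R(t - sin t), y = R(1 - cos t). Along that cycloid the speed is 2*sqrt(gR)*sin(t/2) and the arc element is 2R*sin(t/2) dt, so the descent time up to parameter t is t*sqrt(R/g). The function names are hypothetical and do not come from the article.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2


def cycloid_endpoint(radius, theta):
    """Point reached on the cycloid x = R(t - sin t), y = R(1 - cos t) at t = theta (y points down)."""
    return radius * (theta - math.sin(theta)), radius * (1.0 - math.cos(theta))


def time_on_cycloid(radius, theta, g=G):
    """Descent time from rest along the cycloid: dt = sqrt(R/g) * d(theta)."""
    return theta * math.sqrt(radius / g)


def time_on_line(x, y, g=G):
    """Descent time from rest along the straight chute from (0, 0) to (x, y), with y > 0 downward."""
    length = math.hypot(x, y)
    acceleration = g * y / length          # component of gravity along the chute
    return math.sqrt(2.0 * length / acceleration)


# Half an arch of a cycloid with R = 1 m ends at B = (pi, 2).
x_b, y_b = cycloid_endpoint(1.0, math.pi)
t_cycloid = time_on_cycloid(1.0, math.pi)
t_line = time_on_line(x_b, y_b)
print(f"B = ({x_b:.3f}, {y_b:.3f})")
print(f"cycloid: {t_cycloid:.3f} s, straight line: {t_line:.3f} s")
```

For this particular B the comparison reduces to pi versus sqrt(pi^2 + 4) (both divided by sqrt(g)), so the cycloid wins; the sketch compares only these two paths and does not by itself show that the cycloid beats every other curve.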
Johann Bernoulli's Merits
It must be pointed out that Bernoulli's greatest significance is that, together with his brother Jakob, he expanded calculus into a mathematical theory and solved a variety of problems by means of it. Leibniz writes somewhere that calculus owes as much to the brothers Johann and Jakob Bernoulli as to himself. The lion's share of this tribute should go to Johann. To him we owe the full integration of all rational functions, and differentiating and integrating exponential functions. He was the first to apply the separation of variables in a differential equation. This greater share may be ascribed to Johann's longer active career but also to his greater brilliance. Jakob Bernoulli was more thorough and exhaustive, but Johann was more inventive; for all his brilliance he lacked precision so that he did occasionally blunder, which cannot be said of his brother.
Back to Basel
Early in 1705 the Bernoulli family received a letter from Basel telling them that Johann's father-in-law was pining for his daughter and grandchildren. Under this familial pressure they returned to Basel on 18 August 1705. The company of parents and four children also included nephew Nicolaus, who had studied mathematics in Groningen. The University of Utrecht made a strong effort to lure the travelling Bernoulli to Utrecht. In fact, the Vice Chancellor and professor of rhetoric Pieter Burman chased him all the way to Frankfurt to persuade him. To no avail. Johann wanted to return to Basel where, his brother being dead, the vacant chair in mathematics awaited him. Bernoulli's father-in-law was delighted at the return of the children; he died three years later, in 1708. Johann Bernoulli was an active mathematician until the day of his death, and was wedded to Basel. He declined many a lucrative post including Leiden and, once more, Utrecht. In 1717 he received a very attractive offer from Groningen again, and he was assured that his previous ample income was to be raised periodically. He wrote, "I must have given great satisfaction in my first time" (in Groningen). Johann Bernoulli died on 1 January 1748. He had earned himself the reputation of the Archimedes of his age, and that title is also mentioned on his tombstone. Bernoulli was a fellow of the Academies of Science of Paris, Berlin, London, St. Petersburg, and Bologna. His numerous writings were published in the Journal des Savants, in Acta Eruditorum, and in the works of the Academies whose fellow he had been. In his own lifetime his Opera Omnia were published in four volumes by Gabriel Cramer. His correspondence with De l'Hôpital, Varignon, Leibniz, and Euler is currently being edited for the "Bernoulli Edition" in Basel led by Dr. Fritz Nagel. The opening pages of the Opera Omnia show Bernoulli's portrait with a motto written by Voltaire: His mind saw truth His heart knew justice He was the glory of Switzerland And of all mankind.
Bibliography
1. Acta Senatus Academia 1671-1718. 1702: contains, among others, the passage on the autopsy.
2. E. T. Bell, The Development of Mathematics. New York: McGraw-Hill (1940).
3. Jakob Bernoulli: Reisebuch, Stambuch. Unpublished translation of Jakob's diaries. The diaries are in the Basel University Library.
4. Johann Bernoulli (1702). "Brief account of the wicked accusation, shameless scorn and foul satirical mockery poured forth upon the undersigned by the student Petrus Venhuijsen, in his disputatio de Unione Anima cum corpore, held on 15 February 1702" (Dutch). This letter to the 'Edel Moog. Heeren Curatoren der Provincialen Academie' (the High and Mighty Governors of the Provincial Academy) is now in the Public Record Office in Groningen. In the same file is a letter of the Rector Magnificus, Bernoulli's friend Johannes Braunius, concerning the Venhuysen-Bernoulli controversy.
5. F. de Boer, De Familie Bernoulli in de Geschiedenis der Wiskunde. J. B. Wolters, Groningen (1846).
6. G. A. Enschedé, Oratio de Joh. Bernoulli, eximio mathematico (1852). Oration delivered on his retirement as vice-chancellor. Enschedé was Professor of Mathematics in Groningen from 1843 to 1881.
7. J. O. Fleckenstein, Johann und Jakob Bernoulli. Basel: Verlag Birkhäuser (1949).
8. H. H. Goldstine, A history of the Calculus of Variations from the 17th through the 19th Century. New York: Springer-Verlag (1980).
9. J. E. Hofmann, Johannis Bernoulli, Opera Omnia. Hildesheim: Georg Olms Verlagsbuchhandlung (1968).
10. A. Hooper, Makers of Mathematics. New York: Random House (1948), 325-350.
11. W. J. A. Jonckbloet, Gedenkboek der Hoogeschool te Groningen ter gelegenheid van haar vijfde halve eeuwfeest (1864).
12. J. Mahrenholz, Anekdoten aus dem Leben Deutscher Mathematiker, Leipzig und Berlin: Verlag und Druck von B. G. Teubner (1936).
13. H. Meschkowski, Problemgeschichte der Mathematik II. Zürich: Bibliographisches Institut (1986).
14. F. Sassen, 350 Jaren wijsgerig Onderwijs te Groningen. In: Gronings Universiteitsblad, 15e jaargang nr. III (1965).
15. P. Schafheitlin, Johannes (I) Bernoullii Lectiones de calculo Differentialium. Verhandlungen der Naturforschenden Gesellschaft in Basel, Band 1. Birkhäuser Verlag Basel. Vol. 34 (1922), 1-32. An edition of a manuscript, believed to be lost, on calculus by Nicolaus Bernoulli, Johann's nephew. The manuscript is in the Basel University Library.
16. Johannes Bernoulli, "Der Selbstbiographie von Johannes Bernoulli I." In: Gedenkbuch der Familie Bernoulli, 1622-1922. Basel: Verlag Von Helbing und Lichtenhahn.
17. D. Speiser, Der Briefwechsel von Johann I Bernoulli. Band 2. Der Briefwechsel mit Pierre Varignon, Erster Teil: 1692-1702. Basel: Birkhäuser Verlag (1988).
18. O. Spiess, Johannes Bernoulli und seine Söhne, Atlantis (1940), 663-669.
19. O. Spiess, Der Briefwechsel von Johann Bernoulli, Herausgegeben von der Naturforschenden Gesellschaft in Basel, Band 1. Birkhäuser Verlag Basel (1955).
20. J. A. Vollgraff, De Kromme van Johann Bernoulli volgens Christiaan Huygens en anderen, of Zijn en Worden in de Wiskunde en het Leven. De Tijdstroom Lochem (1945).
21. R. Wolf, Biographien zur Kulturgeschichte der Schweiz; Johannes Bernoulli von Basel, Verlag Von Drell (1859). Contains the French autobiography of Johann Bernoulli.

Department of Econometrics
University of Groningen
P.O. Box 800
9700 AV Groningen, The Netherlands
Karen V. H. Parshall*
Li Shanlan (1811-1882) and Chinese Traditional Mathematics
Jean-Claude Martzloff
Li Shanlan, one of the last representatives of Chinese traditional mathematics, passed away 110 years ago on 9 December 1882. At the time of Li's death, Imperial China had begun its slow demise, overwhelmed by innumerable natural and human disasters. Less than 40 years later, mathematicians of Republican China had progressively joined the international mathematical community and had begun to practice mathematics in what had become the international style. They used the same symbols, raised the same questions, and worked within the same paradigms. At the same time, Chinese traditional mathematics sank rapidly into oblivion and became a subject of mere historical interest. Unbelievable as it may seem, however, Chinese traditional mathematics later enjoyed a certain resurgence, due not to nostalgic historians but to professional mathematicians well aware of the intellectual standards of their discipline. In 1937, the Hungarian mathematician, György Szekeres, took refuge in Shanghai in the hope of escaping the rising Nazi threat. There, he happened to meet Zhang Yong (1911-1939), a young polyglot Chinese mathematician who was born in Aberdeen, Scotland and who had studied mathematics at Göttingen University in Germany. Influenced by Otto Neugebauer, the eminent historian of Babylonian mathematics and astronomy, the young Chinese mathematician took up Chinese traditional mathematics and reported his findings to his Hungarian friend. Szekeres was particularly impressed by the then unheard-of combinatorial formula
$$\sum_{j=0}^{k} \binom{k}{j}^{2} \binom{n+2k-j}{2k} = \binom{n+k}{k}^{2},$$

where the $\binom{n}{p}$ are the usual binomial coefficients

$$\binom{n}{p} = \frac{n(n-1)\cdots(n-p+1)}{p!}.$$

Zhang Yong had discovered this fact, deeply hidden and stated without proof, in the collected mathematical works of Li Shanlan. Entitled the Zeguxi zhai suanxue [Mathematics from the Zeguxi Studio], this book was published at Nanking in 1867. [Zeguxi was the name of Li Shanlan's studio and means "devoted to the imitation of the ancient (Chinese) tradition."]
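A formula of this kind is easy to test numerically even when a proof is elusive. The sketch below assumes the identity in the form in which it is usually quoted in the combinatorial literature, namely that the sum over j from 0 to k of C(k, j)^2 * C(n + 2k - j, 2k) equals C(n + k, k)^2; the function names are illustrative only, and a finite check of this sort is of course no substitute for a proof.

```python
from math import comb


def li_shanlan_lhs(n, k):
    """Left-hand side: sum over j of C(k, j)^2 * C(n + 2k - j, 2k)."""
    return sum(comb(k, j) ** 2 * comb(n + 2 * k - j, 2 * k) for j in range(k + 1))


def li_shanlan_rhs(n, k):
    """Right-hand side: C(n + k, k)^2."""
    return comb(n + k, k) ** 2


# Exhaustive check over a small grid of non-negative integers (exact integer arithmetic).
holds_on_grid = all(
    li_shanlan_lhs(n, k) == li_shanlan_rhs(n, k)
    for n in range(0, 30)
    for k in range(0, 15)
)
print("identity holds for 0 <= n < 30, 0 <= k < 15:", holds_on_grid)
```

For k = 1 the check can be done by hand: C(n+2, 2) + C(n+1, 2) = (n+1)^2, which matches C(n+1, 1)^2.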
* Column Editor's address: Departments of Mathematics and History, University of Virginia, Charlottesville, VA 22903 USA.
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York
Puzzled by the finding, Szekeres tried in vain to prove the recalcitrant formula. He related the problem to his Hungarian colleague, Paul Turán, who ultimately succeeded in supplying the missing proof. Turán's overly complicated demonstration was eventually published in Chinese in the journal, Kexue [Science], in November of 1939 [1]. It hinged on nothing less than properties of certain differential equations and Legendre polynomials, which, it can safely be assumed, did not belong to Li Shanlan's mathematical arsenal. Several easier proofs of Li Shanlan's formula appeared more than 10 years later (in 1954, 1955, and 1956) in the Hungarian Matematikai Lapok and, still later, in various journals as well as in mathematical books like John Riordan's Combinatorial Identities, where the question is settled in just a few lines of text [2]. In all, more than 15 proofs of the formula have been published to date. In these various sources, Li Shanlan's name appears in a bewildering variety of more or less fanciful spellings such as Shoo Le-Jen, Li Jen-Shu, and Li Zsen-Su. In fact, the variations of "Le" and "Jen-Shu" (or Shoo) correspond, respectively, to Li Shanlan's name (xing), Li, and to his alias (bie hao), Renshu. "Shanlan," on the other hand, is his "school name" (xiang ming). These variations aside, Li Shanlan is also known under other special names, but present-day historians of Chinese mathematics generally refer to him as "Li Shanlan" in an effort to avoid confusion.

The Life of Li Shanlan
Li Shanlan was born on 2 January 1811 in the city of Shanshi, Haining Prefecture, Zhejiang Province [3]. He belonged to a somewhat learned family, whose ancestors can be traced back to the end of the Southern Song dynasty (1279) and whose resources were sufficient to offer him a classical education.
Under the guidance of Chen Huan (1786-1863), a renowned scholar and disciple of the celebrated philologist Duan Yucai (1735-1815), Li was initiated into the subtleties not only of eight-legged essays but also of the Confucian classics, the age-old basis of the Chinese civil service recruitment examinations. In such a course of study, one based essentially on rote learning and the imitation of canonical literary models, no room remained for mathematics. Li Shanlan, however, happened to discover a copy of the Jiuzhang suanshu [Computational Rules in Nine Chapters] in his father's library and took a liking to this bible of Chinese traditional mathematics (or rather arithmetic), which dated from the Han dynasty (206 B.C.-A.D. 220). As one edifying anecdote would have it, "although Li Shanlan was then only eight years old, he was able to master the book on his own" [4]. Still, although contemporary historians of Chinese mathematics highly praise the venerable Han manual, Li Shanlan wrote later in life that he found it hardly worthy of attention.

Figure 1. Li Shanlan.

In a similar vein, it is reported that five years after studying the Jiuzhang suanshu, Li Shanlan also quickly mastered the beginning of Euclid's Elements. The first six books of this Greek classic had been translated into Chinese in 1607 from Clavius' Euclidis Elementorum (1574) by the Italian Jesuit and member of the China mission Matteo Ricci (1552-1610), as well as by Xu Guangqi (1562-1633), the successful proponent of the reform of Chinese astronomy along the lines of the European Ptolemaic and Tychonic models. In 1825, Li Shanlan competed in the prestigious triennial provincial examination held at Hangzhou, the capital of Zhejiang Province. There, the candidates spent three days and two nights in uninterrupted seclusion within the examination compound. Despite several attempts, Li Shanlan never passed this examination and so faced an uncertain future. This bitter experience did have at least one positive consequence, however. During the examination period, Hangzhou became a meeting place for famous scholars and candidates. With their help, Li Shanlan became increasingly aware of the ancient Chinese mathematical tradition. In particular, he eventually obtained rare books such as the Ceyuan haijing [Sea Mirror of (Inscribed and Circumscribed) Circle Computations] (1248). Written by Li Zhi (1192-1279), this text formed the very basis of Chinese medieval algebra, an algebra which dealt with negative numbers, numerical polynomials of high degree, and even rational functions. Somewhat later, Li Shanlan also obtained a handwritten copy of Zhu Shijie's Siyuan Yujian [Jade Mirror of the Four Primordialities (= Unknowns)]. Greatly impressed by the content of these books, Li Shanlan viewed their algebraical techniques as the secret key (mi yao) to solving all sorts of mathematical problems. He thus began to write his own mathematical treatises, all of which were greeted enthusiastically by the local "invisible college" of mathematicians. But sharing his passion for mathematics with adepts around Hangzhou did not suffice to ensure Li a decent life, and he was obliged to seek other means for actually earning a living.

Around 1845, Li managed to secure the position of tutor in the Lufei family. [Lufei (?-1790) was a scholar-official in charge of the famous Imperial Manuscript Library (Siku quanshu) of Emperor Qian Long.] Then, in 1852, he went to Shanghai, where he served for eight years as a co-translator of Western scientific works at the Protestant London Missionary Society. Strange as it may seem, however, Li knew no foreign language and never mastered one. As in the case of the learned foreigners from the Buddhist kingdoms of Central Asia, who had engaged some 1500 years earlier in Chinese translations of Buddhist sutras, the content of the originals was conveyed orally by the foreign member of a two- (sometimes more) person translation team, and the Chinese scholar recorded it in the appropriate literary style. The same method had also been used 250 years earlier, when Jesuit missionaries began to translate European religious and scientific manuals into Chinese. From 1852 to 1859, Li Shanlan became a sort of "jack-of-all-trades" translator, working often on several projects simultaneously. Each morning, he would tackle, say, Euclid's Elements, and he would spend the afternoon rendering some other, not necessarily mathematical, work into Chinese. His co-workers were the Englishmen Alexander Wylie (1815-1887), Joseph Edkins (1823-1905), and Alexander Williamson (1829-1890).
Admittedly, the names of these men have not entered the annals of the history of mathematics. Nevertheless Wylie, despite minimal training in mathematics, did author the first history of Chinese mathematics. Entitled Jottings on the Science of the Chinese, Arithmetic (1852), this work still retains its value as a first-hand account of the subject [6]. In it, Wylie presents succinctly, and for the first time in the West, the celebrated "Chinese remainder theorem." (In fact, the so-called "theorem" is actually an algorithm devised to solve general systems of simultaneous congruences of the first degree and not merely, as commonly held, systems with pairwise relatively prime moduli.) He also discusses the medieval Chinese "Horner's" method and many other topics worthy of interest. Wylie's text was subsequently translated into German and French. During his eight years in Shanghai, the prolific and
indefatigable Li Shanlan translated an incredible number of old and modern scientific treatises, among them:
• the last nine books of Euclid's Elements (translated neither from the Greek original nor from Clavius but apparently from Barrow's English text (1655) into an edition of 15 rather than 13 books);
• the Algebra of Augustus de Morgan (1806-1871) [7];
• the Elements of Analytical Geometry and of Differential and Integral Calculus written by Elias Loomis (1811-1889) [8];
• An Elementary Treatise on Mechanics by William Whewell (1794-1866) and the anonymous Conic Sections [9];
• the Outlines of Astronomy of John F. W. Herschel (1792-1871) [10];
• Botany by John Lindley (1799-1865) [11].
To crown it all, in 1868, Li Shanlan also commenced (but never finished) a translation of Newton's Principia. His co-translator on this project was John Fryer (1839-1928), a "secular missionary" of considerable sinological attainments, who was responsible for no fewer than 129 translations and who even continued to do translation work after assuming the Louis Agassiz Professorship of Oriental Languages and Literature at the University of California, Berkeley in 1896. (He held this post until his retirement in 1913.)

Following this sudden influx of translations, Chinese mathematicians started pondering the respective merits of the Western and Chinese mathematical traditions. Most of them judged European algebra to be less than novel, given its similarity, in content if not in form, to medieval Chinese algebra. Calculus was also considered far from original because Chinese mathematicians had already developed similarly spirited methods of their own. (These Chinese innovations originated from several developments in the theory of infinite series, which had been transmitted earlier and without proof to Chinese astronomers by the French Jesuit, Pierre Jartoux (1669-1720).)
Significantly, Wylie reported in his Jottings that Li Shanlan not only viewed his own theories on logarithms as "two thousand times easier than the methods used by the Europeans," but also thought that whereas "the Europeans can just calculate the numbers, yet they are ignorant of the principles" [12]! Clearly, the word "principles" here did not mean the same thing to Li and the Occidental mathematicians. Moreover, although Li Shanlan was certainly his day's greatest connoisseur of Euclid's Elements, he never commented on it and never used hypothetico-deductive reasoning. Still, many of the concepts (conics, various curves, mechanics, analytical geometry, etc.) were accepted at the expense of both the sinicization of notations and the invention of
a considerable number of mathematical neologisms. Although these notations have long since fallen out of use in China, a considerable amount of the technical terminology has, in fact, stood the test of time. In 1859, Li Shanlan left his post as translator in Shanghai to join the staff of the Governor of Jiangsu, Xu Youren (1800-1860), a renowned mathematician who was best known for his researches on logarithmic and trigonometric infinite series. Somewhat after Xu's death, around 1863, Li found a new position on the staff of Zeng Guofan (1811-1872), the famous Chinese reformer responsible for the suppression of the Taiping rebellion. It was Zeng who, in 1867, sponsored the publication of Li's collected mathematical works. In 1869, at the age of 58, Li Shanlan was appointed Instructor of Mathematics and Astronomy at the recently inaugurated Tongwenguan (College of Combined Learning) in Beijing. The opening of this college represented a hard-won victory for those Chinese reformers who believed that the introduction of mathematics, technology, and foreign languages into the Chinese educational system would not only serve to remedy Chinese decadence but also provide a way to probe the secrets behind Western power. Li remained at the Tongwenguan until his death in 1882. Under the supervision of W. A. P. Martin (1827-1916) of the American Presbyterian mission, the college offered two academic cycles, one in eight years and the other in five, with the latter cycle devoted exclusively to sciences as opposed to languages. The five-year program followed this official curriculum: first and second years: Chinese traditional mathematics, algebra, geometry, and trigonometry; third year: physics and chemistry; fourth year: calculus; and fifth year: law, astronomy, and geology.
In mathematics, Li Shanlan was free to adapt his lectures to the level of his students, thanks both to the fact that he had a limited number of youngsters in his charge during his career (only about 100, who were recruited at the age of 14) and to the fact that he was the college's sole instructor of mathematics. He thus directed his efforts toward what he called a "synthesis" of Chinese and Western mathematical methods. Basing his teaching on mathematical treatises of the Yuan dynasty, Li placed a stronger emphasis on algebra, and more generally on algorithmic processes, than on discursive geometrical reasoning. Moreover, during his spare time, he worked on prime numbers (a subject previously untouched in China) and independently rediscovered Fermat's little theorem.
Li Shanlan's Mathematics
Figure 2. Professor Li Shanlan and his mathematical class at the Tongwenguan.

Figure 3. A page of algebra from Li Shanlan's Suanxue keyi.

Figure 4. One of Li Shanlan's generalized "Pascal" triangles.

The mathematical content of Li Shanlan's Zeguxi zhai suanxue has been carefully analyzed by the Taiwanese mathematician and historian of Chinese mathematics, Wann-Sheng Horng, in an excellent doctoral dissertation [13]. There the interested reader will find extensive commentary as well as modern translations of relevant material on sinicized and reinterpreted conics, projective geometry, logarithms, calculus, and much more. Here, however, we shall just look briefly at the finite summation formulas, which constitute one of the most fascinating topics treated by Li in his work and which, as he states vigorously in his preface, fall specifically within the framework of the ancient Chinese mathematical tradition.

Li Shanlan starts from several clever generalizations of "Pascal's" triangle. He obtains these by filling the border of the so-called triangle (or even other cells as well) not with 1's but with other well-chosen sets of numbers while, at the same time, maintaining the ordinary Pascalian recurrence to fill the other cells. He next derives new summation formulas, seemingly at will, through extensive use of the obvious recurrence
$$\binom{k}{k} + \binom{k+1}{k} + \cdots + \binom{n}{k} = \binom{n+1}{k+1},$$

which he illustrates through numerous diagrams reminiscent of "figurate numbers." Making extensive use of algebraic computations in the medieval Chinese style, Li never gives proofs. Instead, he proceeds inductively by first stating what happens when a certain variable takes on the values 1, 2, 3, ... successively and then asserting the emergent pattern. In this way, he obtains formulas for the summation of binomial coefficients squared or raised to other powers, arrays of numbers with recurrence formulas for Eulerian numbers and Stirling numbers of the first kind, and, just to mention one more result, the equivalent of Worpitzky's formula for the summation of the kth powers of the first n integers using binomial coefficients [14]. The latter result appeared in print independently in the West in 1883, that is, 15 years after its publication in Li's works.

Of interest here, however, is not so much the obvious priority question but rather the existence of two enduring mathematical "traditions" based on two different conceptualizations of the nature of mathematics, the admissible modes of inference (logical or not), and the way in which mathematics should be written and taught. Li Shanlan largely based his mathematical work on models which did not tend to share the mathematical ideals of the Greeks but which were characterized instead by highly heuristic principles, proofs without words, pattern recognition, algebra and algorithms, and other components. His writings thus raise many fascinating questions about the differences between Eastern and Western views of rationality, while they provide a rich source of study not only for historians and mathematicians but also for sociologists of knowledge.

References
1. Zhang Yong, Duoji bilei shu zheng [Proofs and Commentaries of (Finite Summation Formulas) in the Duoji bilei], Kexue 23 (1939), 647-663. Paul Turán's proof appears on pp. 661-663 of Zhang's article. Here, Duoji bilei is the exact title of a book by Li Shanlan, the literal meaning of which is "Piles of Heaps Summed Analogically." Perhaps a more idiomatic translation of this would be "Finite Summation Formulas Derived by Analogical Reasoning."
2. Paul Turán, A kínai matematika történetének egy problémájáról [A Problem from Chinese Mathematics], Matematikai Lapok (1954), 1-6. See, also, the papers by Lajos Takács, János Surányi, Géza Huszár, and János Máté in Matematikai Lapok (1955), 27-29; (1955), 30-35; (1955), 36-38; and (1956), 112-113, respectively.
3. In Combinatorial Identities (New York: Wiley, 1968), John Riordan treats these ideas in a more modern setting.
4. On Li Shanlan's life, see, for example, Arthur W. Hummel, Eminent Chinese of the Ch'ing Period, Washington: n.p. (1943); reprint ed., Taipei: Ch'eng Wen Publishing Co. (1970), pp. 479-480; Wang Ping, Xifang lisuanxue zhi shuru [The Introduction of Western Astronomical and Mathematical Sciences into China], Taipei & Nankang Monograph Series No. 17, Taipei: Institute of Modern History, Academia Sinica, Republic of China (1966), pp. 144-182; and Wang Yusheng, Li Shanlan yanjiu [Researches on Li Shanlan], in Ming-Qing shuxue shi lunwen ji [Collected Papers in the History of Chinese Mathematics in the Ming and Qing Periods], Nanking: Jiaoyu chubanshe (1990), pp. 334-406.
5. Wang Ping, p. 144.
6. Alexander Wylie, Jottings on the Science of the Chinese, Arithmetic, North China Herald, Aug.-Nov. 1852, Nos. 108, 111, 112, 113, 116, 117, 119, 120, 121. Wylie's "jottings" have been reprinted often. For extensive references to these reprintings, see Joseph Needham, Science and Civilization in China, vol. 3, Cambridge: University Press (1959).
7. Augustus De Morgan, The Elements of Algebra Preliminary to the Differential Calculus, and Fit for the Higher Classes of Schools ..., London: J. Taylor (1835).
8. Elias Loomis, Elements of Analytical Geometry and of Differential and Integral Calculus, New York: Harper & Brothers (1851). In fact, according to the Dictionary of American Biography, Loomis's books were also translated into Arabic. See Allen Johnson, Dumas Malone, et al. (ed.), Dictionary of American Biography, 10 vols. and 8 suppls., New York: Charles Scribner's Sons (1927-1990), s. v. "Loomis, Elias," by David Eugene Smith.
9. William Whewell, An Elementary Treatise on Mechanics, Cambridge: J. Deighton & Sons (1819).
10. John F. W. Herschel, Outlines of Astronomy, London: Longman, Brown, Green, and Longmans (1849).
11. Probably from John Lindley, An Introduction to Botany, London: Longman, Rees, Orme, Brown, Green, and Longmans (1832).
12. Wylie, Jottings in Alexander Wylie, Chinese Researches, Taipei: Ch'eng-wen Publishing Co. (1966), 193. The exact quotation is: "Li Shen-lan [sic] ... who has recently published a small work called Tuy-soo-tan-yuan 'Discovery of the source of logarithms,' in which he details an entirely new method for their computation, based on geometrical formulas, which he says in his introduction is 'ten thousand times easier than the methods used by Europeans,' and that 'although they can just calculate the numbers, yet they [i.e., the Europeans] are ignorant of the principle.' "
13. Wann-Sheng Horng, "Li Shanlan, the impact of Western mathematics in China during the late 19th century," Ph.D. dissertation, The City University of New York, March, 1991.
14. J. Worpitzky, "Studien über die Bernoullischen und Eulerschen Zahlen," J. Reine Angew. Math. 94 (1883), 202-232.
C.N.R.S., U.A. 1063
Institut des Hautes Études Chinoises
52 rue du Cardinal-Lemoine
75231 Paris Cedex
France
My Collaboration with Julia Robinson
Yuri Matijasevich
The name of Julia Robinson cannot be separated from Hilbert's tenth problem. This is one of the 23 problems stated by David Hilbert in 1900. The section of his famous address [4] devoted to the tenth problem is so short that it can be cited here in full: 10. DETERMINATION OF THE SOLVABILITY OF A DIOPHANTINE EQUATION Given a Diophantine equation with any number of unknown quantities and with rational integral numerical coefficients: To devise a process according to which it can be determined by a finite number of operations whether the equation is solvable in rational integers.
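As a rough illustration of what Hilbert's "process" is being asked to do, the sketch below searches for an integer solution of a given polynomial equation inside boxes of growing size. It is only a naive search under stated assumptions: the function name `search_solution`, the cutoff `max_bound`, and the sample polynomials are hypothetical and do not come from the article. Such a search finds a solution whenever one exists within the cutoff, but returning nothing never proves that the equation is unsolvable, which is exactly the gap the tenth problem is about.

```python
from itertools import product


def search_solution(poly, num_unknowns, max_bound):
    """Search for integers (x_1, ..., x_m) with poly(x_1, ..., x_m) == 0.

    Boxes |x_i| <= bound are explored for bound = 0, 1, ..., max_bound.
    Returns the first solution found, or None if the cutoff is reached.
    A None result says nothing about solvability beyond the cutoff.
    """
    for bound in range(max_bound + 1):
        values = range(-bound, bound + 1)
        for candidate in product(values, repeat=num_unknowns):
            # Only test tuples on the surface of the current box;
            # interior tuples were already tested for smaller bounds.
            if max(abs(v) for v in candidate) != bound:
                continue
            if poly(*candidate) == 0:
                return candidate
    return None


# Hypothetical sample equations (not taken from the article).
pell_like = lambda x, y: x * x - 2 * y * y - 1   # x^2 - 2y^2 = 1
no_root = lambda x: x * x - 2                    # x^2 = 2 has no integer solution

found = search_solution(pell_like, 2, 10)
missing = search_solution(no_root, 1, 50)
```

Here `found` is some integer zero of the first polynomial, while `missing` is `None` only because the search gave up at the cutoff; a true decision procedure would have to justify the answer NO.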
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York

The tenth problem is the only one of the 23 problems that is (in today's terminology) a decision problem; i.e., a problem consisting of infinitely many individual problems each of which requires a definite answer: YES or NO. The heart of a decision problem is the requirement to find a single method that will give an answer to any individual subproblem. Since Diophantus's time, number-theorists have found solutions for a large number of Diophantine equations and also have established the unsolvability of a large number of other equations. Unfortunately, for different classes of equations and even for different individual equations, it was necessary to invent different specific methods. In the tenth problem, Hilbert asks for a universal method for deciding the solvability of Diophantine equations.

A decision problem can be solved in a positive or in a negative sense, by discovering a proper algorithm or by showing that none exists. The general mathematical notion of algorithm was developed by A. Church, K. Gödel, A. Turing, E. Post, and other logicians only 30 years later, but, in his lecture [4], Hilbert foresaw the possibility of negative solutions to some mathematical problems.

I have to start the story of my collaboration with Julia Robinson by telling about my own involvement in the study of Hilbert's tenth problem. I heard about it for the first time at the end of 1965 when I was a sophomore in the Department of Mathematics and Mechanics of Leningrad State University. At that time I had already obtained my first results concerning Post's canonical systems, and I asked my scientific adviser, Sergei Maslov (see [3]), what to do next. He answered: "Try to prove the algorithmic unsolvability of Diophantine equations. This problem is known as Hilbert's tenth problem, but that does not matter to you."--"But I haven't learned any proof of the unsolvability of any decision problem."--"That also does not matter. Unsolvability is nowadays usually proved by reducing a problem already known to be unsolvable to the problem whose unsolvability one needs to establish, and you understand the technique of reduction well enough."--"What should I read in advance?"--"Well, there are some papers by American mathematicians about Hilbert's tenth problem, but you need not
study them."--"Why not?"--"So far the Americans have not succeeded, so their approach is most likely inadequate."

Maslov was not unique in underestimating the role of the previous work on Hilbert's tenth problem. One of these papers was by Martin Davis, Hilary Putnam, and Julia Robinson [2], and even the reviewer of it for Mathematical Reviews stated:

These results are superficially related to Hilbert's tenth Problem on (ordinary, i.e., non-exponential) Diophantine equations. The proof of the authors' results, though very elegant, does not use recondite facts in the theory of numbers nor in the theory of r.e. [recursively enumerable] sets, and so it is likely that the present result is not closely connected with Hilbert's tenth Problem. Also it is not altogether plausible that all (ordinary) Diophantine problems are uniformly reducible to those in a fixed number of variables of fixed degree, which would be the case if all r.e. sets were Diophantine.

The reviewer's skepticism arose because the authors of [2] had considered not ordinary Diophantine equations (i.e., equations of the form

$$P(x_1, x_2, \ldots, x_m) = 0, \tag{1}$$

where $P$ is a polynomial with integer coefficients) but a wider class of so-called exponential Diophantine equations. These are equations of the form

$$E_1(x_1, x_2, \ldots, x_m) = E_2(x_1, x_2, \ldots, x_m), \tag{2}$$

where $E_1$ and $E_2$ are expressions constructed from $x_1, x_2, \ldots, x_m$ and particular natural numbers by addition, multiplication, and exponentiation. (In contrast to the formulation of the problem as given by Hilbert, we assume that all the variables range over the natural numbers, but this is a minor technical alteration.)

Besides single equations, one can also consider parametric families of equations, either Diophantine or exponential Diophantine. Such a family

$$Q(a_1, \ldots, a_n, x_1, \ldots, x_m) = 0 \tag{3}$$

determines a relation between the parameters $a_1, \ldots, a_n$ which holds if and only if the equation has a solution in the remaining variables, called unknowns. Relations that can be defined in this way are called Diophantine or exponential Diophantine according to the equation used. Similarly, a set $\mathfrak{M}$ of $n$-tuples of natural numbers is called (exponential) Diophantine if the relation "to belong to $\mathfrak{M}$" is (exponential) Diophantine. Also a function is called (exponential) Diophantine if its graph is so.

Thus, in 1965 I did not encounter even the name of Julia Robinson. Instead of suggesting that I first study her pioneer works, Maslov proposed that I try to prove the unsolvability of so-called word equations (or equations in a free semigroup) because they can be reduced to Diophantine equations. Today we know that this approach was misleading, because in 1977 Gennadii Makanin found a decision procedure for word equations.

I started my investigations on Hilbert's tenth problem by showing that a broader class of word equations with additional conditions on the lengths of words is also reducible to Diophantine equations. In 1968, I published three notes on this subject. I failed to prove the algorithmic unsolvability of such extended word equations (this is still an open problem), so I then proceeded to read "the papers by some American mathematicians" on Hilbert's tenth problem. (Sergei Adjan had initiated and edited translations into Russian of the most important papers on this subject; they were published in a single issue of Математика, a journal dedicated to translated papers.) After the paper by Davis, Putnam, and Robinson mentioned above, all that was needed to solve Hilbert's tenth problem in the negative sense was to show that exponentiation is Diophantine; i.e., to find a particular Diophantine equation

$$A(a, b, c, x_1, \ldots, x_m) = 0 \tag{4}$$

which for given values of the parameters $a$, $b$, and $c$ has a solution in $x_1, \ldots, x_m$ if and only if $a = b^c$. With the aid of such an equation, one can easily transform an arbitrary exponential Diophantine equation into an equivalent Diophantine equation with additional unknowns.

As it happens, this same problem had been tackled by Julia Robinson at the beginning of the 1950s. According to "The Autobiography of Julia Robinson," an article written by her sister Constance Reid [11], Julia Robinson's interest was originally stimulated by her teacher, Alfred Tarski, who suspected that even the set of all powers of 2 is not Diophantine. Julia Robinson, however, found a sufficient condition for the existence of a Diophantine representation (4) for exponentiation; namely, to construct such an $A$, it is sufficient to have an equation

$$B(a, b, x_1, \ldots, x_m) = 0 \tag{5}$$

which defines a relation $J(a,b)$ with the following properties: for any $a$ and $b$, $J(a,b)$ implies that $a < b^b$; for any $k$, there exist $a$ and $b$ such that $J(a,b)$ and $a > b^k$.

Julia Robinson called a relation $J$ with these two properties a relation of exponential growth; today such relations are also known as Julia Robinson predicates.

My first impression of the notion of a relation of exponential growth was "what an unnatural notion," but I soon realized its important role for Hilbert's tenth problem. I decided to organize a seminar on Hilbert's tenth problem. The first meeting where I gave a survey of known results was attended by five logicians and
five number-theorists, but then the numbers of participants decreased exponentially and soon I was left alone.

I was spending almost all my free time trying to find a Diophantine relation of exponential growth. There was nothing wrong when a sophomore tried to tackle a famous problem, but it looked ridiculous when I continued my attempts for years in vain. One professor began to laugh at me. Each time we met he would ask: "Have you proved the unsolvability of Hilbert's tenth problem? Not yet? But then you will not be able to graduate from the university!"

Nevertheless I did graduate in 1969. My thesis consisted of my two early works on Post canonical systems because I had not done anything better in the meantime. That same year I became a post-graduate student at the Steklov Institute of Mathematics of the Academy of Sciences of the USSR (Leningrad Branch, LOMI). Of course, the subject of my study could no longer be Hilbert's tenth problem.

One day in the autumn of 1969 some of my colleagues told me: "Rush to the library. In the recent issue of the Proceedings of the American Mathematical Society there is a new paper by Julia Robinson!" But I was firm in putting Hilbert's tenth problem aside. I told myself: "It's nice that Julia Robinson goes on with the problem, but I cannot waste my time on it any longer." So I did not rush to the library.

Somewhere in the Mathematical Heavens there must have been a God or Goddess of Mathematics who would not let me fail to read Julia Robinson's new paper [15]. Because of my early publications on the subject, I was considered a specialist on it, and so the paper was sent to me to review for Реферативный Журнал Математика, the Soviet counterpart of Mathematical Reviews. Thus, I was forced to read Julia Robinson's paper, and on December 11, I presented it to our logic seminar at LOMI. Hilbert's tenth problem captured me again.
I saw at once that Julia Robinson had a fresh and wonderful new idea. It was connected with the special form of Pell's equation

$$x^2 - (a^2 - 1)y^2 = 1. \tag{6}$$

Solutions $(\chi_0, \psi_0), (\chi_1, \psi_1), \ldots, (\chi_n, \psi_n), \ldots$ of this equation listed in the order of growth satisfy the recurrence relations

$$\chi_{n+1} = 2a\chi_n - \chi_{n-1}, \qquad \psi_{n+1} = 2a\psi_n - \psi_{n-1}. \tag{7}$$

It is easy to see that for any $m$ the sequences $\chi_0, \chi_1, \ldots$ and $\psi_0, \psi_1, \ldots$ are purely periodic modulo $m$ and hence so are their linear combinations. Further, it is easy to check by induction that the period of the sequence

$$\psi_0, \psi_1, \ldots, \psi_n, \ldots \pmod{a - 1} \tag{8}$$

is

$$0, 1, 2, \ldots, a - 2, \tag{9}$$

whereas the period of the sequence

$$\chi_0 - (a - 2)\psi_0,\ \chi_1 - (a - 2)\psi_1,\ \ldots,\ \chi_n - (a - 2)\psi_n,\ \ldots \pmod{4a - 5} \tag{10}$$

begins with

$$2^0, 2^1, 2^2, \ldots. \tag{11}$$

The main new idea of Julia Robinson was to synchronize the two sequences by imposing a condition $G(a)$ which would guarantee that

$$\text{the length of the period of (8) is a multiple of the length of the period of (10).} \tag{12}$$

If such a condition is Diophantine and is valid for infinitely many values of $a$, then one can easily show that the relation $a = 2^c$ is Diophantine. Julia Robinson, however, was unable to find such a $G$ and, even today, we have no direct method for finding one.

I liked the idea of synchronization very much and tried to implement it in a slightly different situation. When, in 1966, I had started my investigations on Hilbert's tenth problem, I had begun to use Fibonacci numbers and had discovered (for myself) the equation

$$x^2 - xy - y^2 = \pm 1 \tag{13}$$

which plays a role similar to that of the above Pell equation; namely, Fibonacci numbers $\varphi_n$ and only they are solutions of (13). The arithmetical properties of the sequences $\psi_n$ and $\varphi_n$ are very similar. In particular, the sequence

$$0, 1, 3, 8, 21, \ldots \tag{14}$$

of Fibonacci numbers with even indices satisfies the recurrence relation

$$\varphi_{2n+2} = 3\varphi_{2n} - \varphi_{2n-2} \tag{15}$$

similar to (7). This sequence grows like $[(3 + \sqrt{5})/2]^n$ and can be used instead of (11) for constructing a relation of exponential growth. The role of (10) can be played by the sequence

$$\psi_0, \psi_1, \ldots, \psi_n, \ldots \pmod{a - 3} \tag{16}$$

because it begins like (14). Moreover, for special values of $a$ the period can be determined explicitly; namely, if

$$a = \varphi_{2k} + \varphi_{2k+2}, \tag{17}$$

then the period of (16) is exactly

$$0, 1, 3, \ldots, \varphi_{2k}, -\varphi_{2k}, \ldots, -3, -1. \tag{18}$$
The simple structure of the period looked very promising. I was thinking intensively in this direction, even on the night of New Year's Eve of 1970, and contributed to the stories about absentminded mathematicians by leaving my uncle's home on New Year's Day wearing his coat. On the morning of January 3, I believed I had found a polynomial $B$ as in (5) but by the end of that day I had discovered a flaw in my work. But the next morning I managed to mend the construction.

What was to be done next? As a student I had had a bad experience when once I had claimed to have proved the unsolvability of Hilbert's tenth problem, but during my talk found a mistake. I did not want to repeat such an embarrassment, and something in my new proof seemed rather suspicious to me. I thought at first that I had just managed to implement Julia Robinson's idea in a slightly different situation; however, in her construction an essential role was played by a special equation that implied one variable was exponentially greater than another. My supposed proof did not need to use such an equation at all, and that was strange. Later I realized that my construction was a dual of Julia Robinson's. In fact, I had found a Diophantine condition $H(a)$ which implied that

$$\text{the length of the period of (16) is a multiple of the length of the period of (8).} \tag{19}$$
This $H$, however, could not play the role of Julia Robinson's $G$, which resulted in an essentially different construction.

I wrote out a detailed proof without finding any mistake and asked Sergei Maslov and Vladimir Lifshits to check it but not to say anything about it to anyone else. Earlier, I had planned to spend the winter holidays with my bride at a ski camp, so I left Leningrad before I got the verdict from Maslov and Lifshits. For a fortnight I was skiing, simplifying the proof, and writing the paper [6]. I tried to convey the impact of Julia Robinson's paper [15] on my work by a rather poetic Russian word, навеять, which seems to have no direct counterpart in English, and the later English translator used plain "suggested."

On my return to Leningrad I received confirmation that my proof was correct, and it was no longer secret. Several other mathematicians also checked the proof, including D. K. Faddeev and A. A. Markov, both of whom were famous for their ability to find errors.

On 29 January 1970 at LOMI I gave my first public lecture on the solution of Hilbert's tenth problem. Among my listeners was Grigorii Tseitin, who shortly afterward attended a conference in Novosibirsk. He took a copy of my manuscript along and asked my
permission to present the proof in Novosibirsk. (It was probably due to this talk that the English translation of [6] erroneously gives the Siberian Branch instead of the Leningrad Branch as my address.) Among those who heard Tseitin's talk in Novosibirsk was John McCarthy. In "The Autobiography" [11], Julia Robinson recalls that on his return to the United States McCarthy sent her his notes on the talk. This was how Julia Robinson learned of my example of a Diophantine relation of exponential growth. Later, at my request, she sent me a copy of McCarthy's notes. They consisted of only a few main equations and lemmas, and I believe that only a person like Julia, who had already spent a lot of time intensively thinking in the same direction, would have been able to reconstruct the whole proof from these notes as she did.

In fact, Julia herself was very near to completing the proof of the unsolvability of Hilbert's tenth problem. The question sometimes asked is why she did not (this question is also touched upon in [11]). In fact, several authors (see [7] for further references) showed that $\psi$'s can be used instead of $\varphi$'s for constructing a Diophantine relation of exponential growth. My shift from (12) to (19) redistributed the difficulty in the entire construction. The path from a Diophantine $H$ to a Diophantine relation of exponential growth is not as straightforward as the path from Julia Robinson's $G$ would have been. On the other hand, it turned out that to construct an $H$ is much easier than to construct a $G$. In [6], I used for this purpose a lemma stating that

$$\varphi_n^2 \mid \varphi_m \ \Longrightarrow\ \varphi_n \mid m. \tag{20}$$
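Property (20) is easy to test for small indices. The sketch below is a brute-force check under stated assumptions, not a proof and not the argument of [6]: the function names and the search limits are hypothetical, and the Fibonacci numbers are indexed so that the sequence starts 0, 1, 1, 2, ....

```python
def fibonacci(count):
    """First `count` Fibonacci numbers, starting 0, 1, 1, 2, ..."""
    fib = [0, 1]
    while len(fib) < count:
        fib.append(fib[-1] + fib[-2])
    return fib[:count]


def check_lemma(max_n, max_m):
    """Verify that fib[n]^2 | fib[m] implies fib[n] | m for 1 <= n <= max_n,
    1 <= m <= max_m. Returns the non-trivial pairs (n, m), those with fib[n] > 1,
    for which the hypothesis of (20) actually holds."""
    fib = fibonacci(max(max_n, max_m) + 1)
    witnesses = []
    for n in range(1, max_n + 1):
        for m in range(1, max_m + 1):
            if fib[m] % (fib[n] ** 2) == 0:
                assert m % fib[n] == 0      # conclusion of (20)
                if fib[n] > 1:
                    witnesses.append((n, m))
    return witnesses


witnesses = check_lemma(10, 200)
```

For example, the pair (3, 6) appears among the witnesses: the square of the third Fibonacci number is 4, it divides the sixth Fibonacci number 8, and indeed 2 divides 6.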
It is not difficult to prove this remarkable property of Fibonacci numbers after it has been stated, but it seems that this beautiful fact was not discovered until 1969. My original proof of (20) was based on a theorem proved by the Soviet mathematician Nikolai Vorob'ev in 1942 but published only in the third augmented edition of his popular book [18]. (So the translator of my paper [6] made a misleading error by changing in the references the year of publication of [18] from 1969 to 1964, the year of the second edition.) I studied the new edition of Vorob'ev's book in the summer of 1969 and that theorem attracted my attention at once. I did not deduce (20) at that time, but after I read Julia Robinson's paper [15] I immediately saw that Vorob'ev's theorem could be very useful.

Julia Robinson did not see the third edition of [18] until she received a copy from me in 1970. Who can tell what would have happened if Vorob'ev had included his theorem in the first edition of his book? Perhaps, Hilbert's tenth problem would have been "unsolved" a decade earlier!

The Diophantine definition of the relation of exponential growth in [6] had 14 unknowns. Later I was able to reduce the number of unknowns to 5. In October 1970, Julia sent me a letter with another definition
also in only 5 unknowns. Having examined this construction, I realized that she had used a different method for reducing the number of unknowns and we could combine our ideas to get a definition in just 3 unknowns!

This was the beginning of our collaboration. It was conducted almost entirely by correspondence. At that time there was no electronic mail anywhere and it took three weeks for a letter to cross the ocean. One of my letters was lost in the mail and I had to rewrite 11 pages (copying machines were not available to me). On the other hand, this situation had its own advantage: Today I have the pleasure of rereading a collection of letters written in Julia's hand. Citations from these letters are incorporated into this paper.

One of the corollaries of the negative solution of Hilbert's tenth problem (implausible to the reviewer for Mathematical Reviews) is that there is a constant $N$ such that, given a Diophantine equation with any number of parameters and in any number of unknowns, one can effectively transform this equation into another with the same parameters but in only $N$ unknowns such that both equations are solvable or unsolvable for the same values of the parameters. In my lecture at the Nice International Congress of Mathematicians in 1970, I reported that this $N$ could be taken equal to 200. This estimate was very rough. Julia and her husband, Raphael, were interested in getting a smaller value of $N$, and in the above-mentioned letter Julia wrote that they had obtained $N = 35$. Our new joint construction of a Diophantine relation of exponential growth with 3 (instead of 5) unknowns automatically reduced $N$ to 33. Julia commented: "I consider it in the range of 'practical' number theory, since Davenport once wrote a paper on cubic forms in 33 variables."

Julia sent me a detailed proof of this reduction, and it became the basis for our further work. We were exchanging letters and ideas and gradually reducing the value of $N$ further. In February 1971, I sent a new improvement that reduced $N$ to 26 and commented that now we could write equations in Latin characters without subscripts for unknowns. Julia called it "breaking the 'alphabetical' barrier."

In August 1971, I reported to the IV International Congress on Logic, Methodology and Philosophy of Science in Bucharest on our latest result: any Diophantine equation can be reduced to an equation in only 14 unknowns [7]. At that Congress Julia and I met for the first time. After the Congress I had the pleasure of meeting Julia and Raphael in my native city of Leningrad.

"With just 14 variables we ought to be able to know every variable personally and why it has to be there," Julia once wrote to me. However, in March 1972 the minimal number of unknowns unexpectedly jumped up to 15 when she found a mistake in my count of the number of variables! I would like to give the readers an idea of some of the techniques used for reducing the number of unknowns and to explain the nature of my mistake. Actually, we were constructing not a single equation but a system of equations in a small number of unknowns. (Clearly, a system $A = B = \cdots = D = 0$ can be compressed into a single equation $A^2 + B^2 + \cdots + D^2 = 0$.) Some of the equations used in our reduction were Pell equations:

$$x_1^2 - d_1 y_1^2 = 1, \qquad x_2^2 - d_2 y_2^2 = 1. \tag{21}$$

We can replace these two Diophantine equations by a single one:

$$\prod \left( x \pm \sqrt{1 + d_1 y_1^2} \pm (1 + d_1 y_1^2)\sqrt{1 + d_2 y_2^2} \right) = 0, \tag{22}$$

where the product is over all the four choices of signs $\pm$. In the remaining equations, we substitute $\sqrt{1 + d_1 y_1^2}$ for $x_1$, $\sqrt{1 + d_2 y_2^2}$ for $x_2$, and eliminate the square roots by squaring. Thus, we reduce the total number of unknowns by one by introducing $x$ but eliminating $x_1$ and $x_2$. It was in the count of variables, introduced and eliminated, that I made my error.

The situation was rather embarrassing for us because the result had been announced publicly. I tried to save the claimed result, but having no new ideas, I was unable to reduce the number of unknowns back to 14.

Soon I got a new letter from Julia. She tried to console me: "I think mistakes in reasoning are much worse than arithmetical ones which are sort of funny." But more important, she came up with new ideas and managed to reduce the number of unknowns to 14 again, thus saving the situation.

We discussed for some time the proper place for publishing our joint paper. I suggested the Soviet journal Известия. The idea of having a paper published in Russian was attractive to Julia. (Her paper [16] had been published in the USSR in English in spite of what is said in Mathematical Reviews.) On the other hand, she wanted to attract the attention of specialists in number theory to the essentially number-theoretical results obtained by logicians, so she suggested Acta Arithmetica. Finally, we decided that we had enough material for several papers and would publish our first joint paper in Russian in Известия and our second one somewhere else in English.

We found writing out a paper when we were half a world apart quite an ordeal. Later Julia wrote to me: "It seems to me that we had little trouble in collaborating mathematically on 4-week turn-around time but it is hopeless when it comes to writing the results up. Namely, by the time you could answer a question, it was no longer relevant." We decided that one of us would write the whole manuscript, which was then to be subject to the other's criticism. Because the first
paper was to be in Russian, I wrote the first draft (more than 60 typewritten pages) and sent it to Julia in autumn 1972. Of course, she found a number of misprints and small errors but, in general, she approved it. The reader, however, need not search the literature for a reference to this paper because that manuscript has never been published! In May 1973, I found "a mistake in reasoning." The mistake was the use of the incorrect implication

\[ a \equiv b \pmod{q} \implies \binom{a}{c} \equiv \binom{b}{c} \pmod{q}. \tag{23} \]
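A short brute-force check shows how easily such an implication fails. The sketch below assumes the implication in question is "a ≡ b (mod q) implies C(a, c) ≡ C(b, c) (mod q)", the form suggested by Julia's reply; the function name `implication_fails` and the search ranges are illustrative choices, not part of the original argument.

```python
from math import comb


def implication_fails(a: int, b: int, q: int, c: int) -> bool:
    """True when a ≡ b (mod q) holds but C(a, c) ≡ C(b, c) (mod q) does not."""
    return (a - b) % q == 0 and (comb(a, c) - comb(b, c)) % q != 0


# A small instance: 3 ≡ 1 (mod 2), yet C(3, 2) = 3 is odd while C(1, 2) = 0 is even.
print(implication_fails(3, 1, 2, 2))

# Collect further counterexamples over a small (arbitrary) search range.
counterexamples = [
    (a, b, q, c)
    for q in range(2, 5)
    for a in range(1, 8)
    for b in range(0, a)
    for c in range(1, 4)
    if implication_fails(a, b, q, c)
]
print(counterexamples[:5])
```

In the first instance q = a - b, so it also contradicts the congruence modulo a - b.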
The entire construction collapsed. I informed Julia and she replied:

I was completely flabbergasted by your letter of May 11. I wanted to crawl under a rock and hide from myself! Somehow I had never questioned that $\binom{a}{n} \equiv \binom{b}{n} \pmod{a - b}$. I usually know enough not to divide by zero. I had even mentioned (asserted) it to Raphael several times, and he had not objected. He said he would have said 'No' if I had asked if it were true. I guess I would have myself if I had asked!

Earlier, we had discussed a similar situation, and in 1971 Julia had written to me: "Almost all mathematical mistakes come about from not writing out proofs and especially making changes after the proof is written out." But that was not the case this time. The mistake was present from an early stage and was not detected either when one mathematician (myself) wrote out a detailed proof or when another mathematician (Julia) carefully read it.

Luckily, this time I was able to repair the proof on the spot. Julia wrote: "I am very glad you sent a way around the mistake at the same time you told me about it!" However, the manuscript had to be completely rewritten.

In 1973, the prominent Soviet mathematician A. A. Markov celebrated his seventieth birthday. His colleagues from the Computing Center of the Academy of Sciences of the USSR decided to publish a collection of papers in his honor. I was invited to contribute to the collection. I suggested a joint paper with Julia Robinson, and the editors agreed. Because of the imminent deadline, we had no time to discuss the manuscript. I just asked Julia to authorize me to write the paper and to send it to the editors without her approval. Later I would incorporate her suggestions on the proofsheets. She agreed.

So our first joint publication [9] appeared, and it was in Russian. The paper was a by-product of our main investigations on reduction of the number of unknowns in Diophantine equations. The first theorem stated that given a parametric Diophantine equation (3) we can effectively find polynomials with integer coefficients $P_1, D_1, Q_1, \dots, P_k, D_k, Q_k$ such that the Diophantine relation defined by (3) is also defined by the formula

\[ \exists x\, \exists y\ \mathop{\&}_{i=1}^{k}\ \exists z\, [P_i(a,x,y) < D_i(a,x,y)\,z < Q_i(a,x,y)]. \tag{24} \]

While k can be a particular large fixed number, each inequality involves only 3 unknowns. The second theorem states that we can also find polynomials F and W such that the same relation is defined by the formula

\[ \exists x\, \exists y\, \forall z\, [z \le F(a,x,y) \implies W(a,x,y,z) > 0]. \tag{25} \]

This formula also has only 3 quantifiers but the third is a (bounded) universal one. Such representations have a close connection to equations because the main technical result of [2] is a method for eliminating a single bounded quantifier at the cost of introducing several extra existential quantifiers and allowing exponentiation to come into the resulting purely existential formula.

One of Julia's requests in regard to this paper was that her first name should be given. She had good reason for that. I had been the Russian translator of one of the fundamental papers on automatic theorem-proving by John A. Robinson [12]. When the translation appeared in 1970 in a collection of important papers on that subject, Soviet readers saw the names of Дж. Робинсон as the author of a paper translated by Ю. Матиясевич and М. Дэвис as the author of another fundamental paper on automatic theorem-proving. In the minds of many, these three names were associated with the recent solution of Hilbert's tenth problem so a number of people got the idea that it was Julia Robinson who had invented the resolution principle, the main tool from [12]. To add to the confusion, John Robinson in his paper thanked George Robinson, whose name in Russian translation also becomes Дж. Робинсон.

As a student I had made "a mistake of the second kind": I did not identify J. Robinson, the author of a theorem in game theory, with J. Robinson, the author of important investigations on Hilbert's tenth problem. (In fact, Julia's significant paper [13] was her only publication on game theory.)

Julia's request was agreed to by the editors, and as a result our joint paper [9] is the only Russian publication where my first name is given in full.

This short paper was a by-product of our main investigation, which was still to be published. As it had been decided beforehand that our second publication should be in English, Julia wrote the new paper about the reduction of the number of unknowns. Now we were able to eliminate one more variable and so had only "a baker's dozen" of unknowns.
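As an illustration of how a single unknown can be traded away, consider the device described earlier, in which the two Pell equations $x_1^2 - d_1 y_1^2 = 1$ and $x_2^2 - d_2 y_2^2 = 1$ are replaced by one equation in a new unknown $x$, namely the product of $x \pm \sqrt{A} \pm A\sqrt{B}$ over the four sign choices, with $A = 1 + d_1 y_1^2$ and $B = 1 + d_2 y_2^2$. Pairing the factors gives the square-root-free form $(x^2 + A - A^2 B)^2 - 4x^2 A$. The sketch below evaluates that polynomial; the function name `pell_product` and the sample values are assumptions made for illustration only.

```python
def pell_product(x: int, d1: int, y1: int, d2: int, y2: int) -> int:
    """Expanded product of (x ± sqrt(A) ± A*sqrt(B)) over the four sign choices.

    A = 1 + d1*y1^2 and B = 1 + d2*y2^2. Pairing the factors gives
    ((x + s)^2 - t^2) * ((x - s)^2 - t^2) with s^2 = A and t^2 = A^2 * B,
    which equals (x^2 + A - A^2*B)^2 - 4*x^2*A, an integer polynomial.
    """
    A = 1 + d1 * y1 * y1
    B = 1 + d2 * y2 * y2
    return (x * x + A - A * A * B) ** 2 - 4 * x * x * A


# Sample Pell solutions: 3^2 - 2*2^2 = 1 and 2^2 - 3*1^2 = 1, so x1 = 3, x2 = 2.
# The roots in x are then ±x1 ± x1^2 * x2 = ±3 ± 18.
roots = [x for x in range(-30, 31) if pell_product(x, 2, 2, 3, 1) == 0]
print(roots)

# If 1 + d1*y1^2 is not a perfect square (here 1 + 2*1 = 3), no integer x works
# in the searched range.
print(any(pell_product(x, 2, 1, 3, 1) == 0 for x in range(-200, 201)))
```

The polynomial involves only x, d1, y1, d2, y2, so x1 and x2 no longer appear as unknowns.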
The second paper [10] was published in Acta Arithmetica. We had a special reason for this choice because the whole volume was dedicated to the memory of the prominent Soviet mathematician Yu. V. Linnik, whom we both had known personally. I was introduced to him soon after showing Hilbert's tenth problem to be unsolvable. Someone had told Linnik the news beginning with one of the corollaries: "Matijasevich can construct a polynomial with integer coefficients such that the set of all natural number values assumed by this polynomial for natural number values of the variables is exactly the set of all primes." "That's wonderful," Linnik replied. "Most likely we soon shall learn a lot of new things about primes." Then it was explained to him that the main result is in fact much more general: Such a polynomial can be constructed for every recursively enumerable set, i.e., a set the elements of which can be listed in some order by an algorithm. "It's a pity," Linnik said. "Most likely, we shall not learn anything new about the primes."

Since there was some interest in our forthcoming paper with the proof of a long-announced result becoming at last accessible to other researchers, numerous copies were circulated. We had exhausted our ideas but there was a chance that someone with a fresh view of the subject might improve our result. "Of course there is the possibility that someone will make a breakthrough and supersede our paper too," Julia wrote, "but we should think of that as being good for mathematics!" Raphael, on the other hand, believed that 13 unknowns would remain the best result for decades.

Actually, the record fell even before our paper appeared. The required "new idea" turned out, as so often happens, to be an old one that had been forgotten. In this case, it was the following nice result by E. E. Kummer: the greatest power of a prime $p$ which divides the binomial coefficient $\binom{a+b}{a}$ is $p^c$, where $c$ is the number of carries needed when adding $a$ and $b$ written to base $p$. This old result was rediscovered and reproved a number of times and I was lucky to learn it from the review of [17] in Реферативный Журнал Математика. Kummer's theorem turned out to be an extremely powerful tool for constructing Diophantine equations with special properties. (Julia once called it "a gold mine.") It would be too technical to explain all the applications, but one of them can be given here.

Let $p$ be a fixed prime and let $f$ be a map from $\{0, 1, \dots, p-1\}$ into itself such that $f(0) = 0$. Such an $f$ generates a function $F$ defined by

\[ F(a_n a_{n-1} \dots a_0) = f(a_n) f(a_{n-1}) \dots f(a_0), \tag{26} \]

where $a_n a_{n-1} \dots a_0$ is the number with digits $a_n, a_{n-1}, \dots, a_0$ to base $p$. Now we can easily prove that $F$ is an exponential Diophantine function.
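Before the formal characterization, both ingredients can be tried on small numbers. The sketch below checks Kummer's theorem by brute force and applies a digit map $f$ digit by digit in the sense of (26). The helper names (`carries`, `p_adic_valuation`, `apply_digitwise`) and the sample map are illustrative assumptions.

```python
from math import comb


def digits(n: int, p: int) -> list[int]:
    """Base-p digits of n, least significant digit first."""
    out = []
    while n:
        n, r = divmod(n, p)
        out.append(r)
    return out


def carries(a: int, b: int, p: int) -> int:
    """Number of carries produced when adding a and b written to base p."""
    count = carry = 0
    while a or b or carry:
        carry = 1 if (a % p) + (b % p) + carry >= p else 0
        count += carry
        a //= p
        b //= p
    return count


def p_adic_valuation(n: int, p: int) -> int:
    """Exponent of the greatest power of p dividing the positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def apply_digitwise(f: list[int], a: int, p: int) -> int:
    """F(a): replace every base-p digit d of a by f[d]; f[0] must be 0."""
    return sum(f[d] * p**i for i, d in enumerate(digits(a, p)))


# Kummer's theorem, checked on a small range of a, b for a few primes.
kummer_holds = all(
    p_adic_valuation(comb(a + b, a), p) == carries(a, b, p)
    for p in (2, 3, 5)
    for a in range(40)
    for b in range(40)
)
print(kummer_holds)

# Digit map on base 3 that swaps the digits 1 and 2: 5 = (12)_3 becomes (21)_3 = 7.
print(apply_digitwise([0, 2, 1], 5, 3))
```

The condition $f(0) = 0$ ensures that leading zeros do not change the value of $F$.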
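Namely, $b = F(a)$ if and only if there are natural numbers $c_0, \dots, c_{p-1}$, $d_0, \dots, d_{p-1}$, $k$, $s$, $u$, $w_0, \dots, w_{p-1}$, $v_0, \dots, v_{p-1}$ such that

\begin{align}
a &= 0 \cdot d_0 + 1 \cdot d_1 + \dots + (p-1) \cdot d_{p-1}, \tag{27} \\
b &= f(0) \cdot d_0 + f(1) \cdot d_1 + \dots + f(p-1) \cdot d_{p-1}, \tag{28} \\
s &= d_0 + d_1 + \dots + d_{p-1}, \tag{29} \\
s &= (p^{k+1} - 1)/(p - 1), \tag{30} \\
u &= 2^{s+1}, \tag{31} \\
(u + 1)^s &= w_i u^{d_i + 1} + c_i u^{d_i} + v_i, \tag{32} \\
v_i &< u^{d_i}, \tag{33} \\
c_i &< u, \tag{34} \\
p &\nmid c_i. \tag{35}
\end{align}

This system has a solution with

\[ d_i = \sum_{l=0}^{k} \delta_i(a_l)\, p^l, \tag{36} \]

where $\delta_i$ is the delta-function: $\delta_i(i) = 1$, otherwise $\delta_i(j) = 0$. In this solution

\begin{align}
w_i &= \sum_{k=d_i+1}^{s} \binom{s}{k} u^{k - d_i - 1}, \tag{37} \\
c_i &= \binom{s}{d_i}, \tag{38} \\
v_i &= \sum_{k=0}^{d_i - 1} \binom{s}{k} u^{k}, \tag{39}
\end{align}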
and for any given value of k that solution is in fact unique.

Kummer's theorem serves as a bridge between number theory and logic because it enables one to work with numbers as sequences of indefinite length consisting of symbols from a finite alphabet.

Application of Kummer's theorem to reducing the number of unknowns resulted in a real breakthrough and, in one jump, that number dropped from 13 to 9. I wrote out a sketch of the new construction and sent it to Julia. When we met for the second time in London, Ontario, during the V International Congress on Logic, Methodology and Philosophy of Science, she confirmed that the proof was correct, so I dared to present the result in my talk [8]. We hoped to be able to publish it as an addendum to our paper in Acta Arithmetica, but it turned out to be too late.

In 1974, the American Mathematical Society organized a symposium on "Mathematical Developments Arising From Hilbert's Problems" at DeKalb, Illinois. I was invited to speak about the tenth problem, but my participation in the meeting did not get the necessary approval in my country, so Julia became the speaker on the problem; however, she suggested that the paper for the Proceedings of the meeting be a joint one by Martin Davis and the two of us. Again we had the problem of an approaching deadline. So we first discussed by phone what topics each of us would cover. Of course, Julia and Martin had much more communication with each other than with me. The final difficult work of combining our three contributions into a coherent exposition [1] was done by Martin. I believe that this paper turned out to be one about which Julia had thought for a long time: a nontechnical introduction to many results obtained by logicians in connection with Hilbert's tenth problem.

Writing the paper for the Proceedings prevented me from immediately writing a paper about the new reduction to 9 unknowns (clearly it was my turn to write it up). Unfortunately, Julia firmly refused to be a coauthor. She wrote: "I do not want to be a joint author on the 9 unknowns paper--I have told everyone that it is your improvement and in fact I would feel silly to have my name on it. If I could make some contribution it would be different."

I am sure that without Julia's contribution to [10] and without her inspiration I would never have reduced N to 9. I was not inclined to publish the proof by myself, and so the result announced in [9] did not appear in print with a full proof for a long time. At last James P. Jones of the University of Calgary spent half a year in Berkeley, where Julia and Raphael lived. He studied my sketch and Julia's comments on it, and made the proof available to everybody in [5].

The photo accompanying this article was taken in Calgary at the end of 1982 when I spent three months in Canada collaborating with James as part of a scientific exchange program between the Steklov Institute of Mathematics and Queen's University at Kingston, Ontario. Julia at that time was very much occupied with her new duties as President of the American Mathematical Society and was not very active in mathematical research, but she visited Calgary on her way to a meeting of the Society. Martin also came to Calgary for a few days.

I conclude these reminiscences with yet another citation from Julia's letters with which I completely agree: "Actually I am very pleased that working together (thousands of miles apart) we are obviously making more progress than either one of us could alone."
References

1. Martin Davis, Yuri Matijasevich, and Julia Robinson, Hilbert's tenth problem. Diophantine equations: positive aspects of a negative solution, Proc. Symp. Pure Math. 28 (1976), 323-378.
2. Martin Davis, Hilary Putnam, and Julia Robinson, The decision problem for exponential Diophantine equations, Ann. Math. (2) 74 (1961), 425-436.
3. G. V. Davydov, Yu. V. Matijasevich, G. E. Mints, V. P. Orevkov, A. O. Slisenko, A. V. Sochilina and N. A. Shanin, "Sergei Yur'evich Maslov" (obituary), Russian Math. Surveys 39(2) (1984), 133-135 [translated from 39(236) (1984), 129-130].
4. David Hilbert, Mathematische Probleme. Vortrag, gehalten auf dem internationalen Mathematiker Kongress zu Paris 1900, Nachr. K. Ges. Wiss., Göttingen, Math.-Phys. Kl. (1900), 253-297.
5. James P. Jones, Universal diophantine equation, J. Symbolic Logic 47 (1982), 549-571.
6. АН СССР 191(2) (1970), 279-282 [translated in Soviet Math. Doklady 11(20) (1970), 354-357; correction 11(6) (1970), vi].
7. Yuri Matijasevich, On recursive unsolvability of Hilbert's tenth problem, Proceedings of Fourth International Congress on Logic, Methodology and Philosophy of Science, Bucharest, 1971, Amsterdam: North-Holland (1973), 89-110.
8. Yuri Matijasevich, Some purely mathematical results inspired by mathematical logic, Proceedings of Fifth International Congress on Logic, Methodology and Philosophy of Science, London, Ontario, 1975, Dordrecht: Reidel (1977), 121-127.
9. АН СССР (1974), 112-123.
10. Yuri Matijasevich and Julia Robinson, Reduction of an arbitrary Diophantine equation to one in 13 unknowns, Acta Arith. 27 (1975), 521-553.
11. Constance Reid, The autobiography of Julia Robinson, College Math. J. 17 (1986), 3-21.
12. John A. Robinson, A machine-oriented logic based on the resolution principle, J. Assoc. Comput. Mach. 12 (1965), 23-41 [translated in 7 (1970), 194-218].
13. Julia Robinson, An iterative method of solving a game, Ann. Math. (2) 54 (1951), 296-301.
14. Julia Robinson, Existential definability in arithmetic, Trans. Amer. Math. Soc. 72 (1952), 437-449.
15. Julia Robinson, Unsolvable Diophantine problems, Proc. Amer. Math. Soc. 22 (1969), 534-538.
16. Julia Robinson, Axioms for number theoretic functions, Selected Questions of Algebra and Logic (Collection Dedicated to the Memory of A. I. Mal'cev), Novosibirsk: Nauka (1973), 253-263; MR 48#8224.
17. D. Singmaster, Notes on binomial coefficients, J. London Math. Soc. 8 (1974), 545-548; (1975), 3A143.
18. N. N. Vorob'ev, Fibonacci Numbers, 2nd ed., Moscow: Nauka, 1964; 3rd ed., 1969.
Acknowledgment

I am grateful to Raphael Robinson, Constance Reid, and Martin Davis for their help in preparing this narration for print.

Steklov Institute of Mathematics
St. Petersburg Branch (LOMI)
27 Fontanka
St. Petersburg, 191011 Russia
Ian Stewart*

The catapult that Archimedes built, the gambling-houses that Descartes frequented in his dissolute youth, the field where Galois fought his duel, the bridge where Hamilton carved quaternions: not all of these monuments to mathematical history survive today, but the mathematician on vacation can still find many reminders of our subject's glorious and inglorious past: statues, plaques, graves, the café where the famous conjecture was made, the desk where the famous initials are scratched, birthplaces, houses, memorials.

Does your hometown have a mathematical tourist attraction? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks. Please send all submissions to the Mathematical Tourist Editor, Ian Stewart.
The Bernoullis in Basel David Speiser
Epitaph on the tombstone of Jacob I.
* Column Editor's address: Mathematics Institute, University of Warwick, Coventry, CV4 7AL England.
The Bernoulli family arrived in Basel in 1622, coming from Antwerp by way of Frankfurt am Main. Between ca. 1680 and 1800 eight members of the family were active in the city and abroad as mathematicians and physicists. Many buildings and monuments in Basel still testify to the Bernoullis' activities or bear witness to their lives. Among these are the houses in which some of them lived, public buildings where they taught, and their tombstones (which, however, have been moved from their original places). Local tradition had always preserved many of these testimonials; some years ago, Dr. F. Nagel (an editor of The Bernoulli Edition) started a systematic investigation of them. The Basel Tourist Office is planning a leaflet and a "Bernoulli Walk" across the city.

All we know about the house of Jacob I (who discovered the formula for the radius of curvature, the Bernoulli numbers, the fundamental theorem of probability theory, the elastic curve, etc.) is that it was on the Barfüsserplatz, close to the Barfüsserkirche (today the Museum of History). His tomb was in this church, but on the occasion of a restoration in 1843, Jacob's bones and the tombstone were transferred to the cloister next to the cathedral. The lower part of the tombstone shows (by mistake) an Archimedean spiral; this should have been a logarithmic spiral, which has the property of reproducing itself under many operations (e.g., it is its own evolute). For this reason, Jacob chose it as a symbol of resurrection: eadem mutata resurgo ("in the same way, I shall be resurrected after the transformation") is the device he had engraved around the spiral. Not far from the cathedral is the present Institute of Mathematics, and a bit lower on the Rheinsprung the old university, a big yellow building where many of the Bernoullis' lectures were delivered.
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York
Imberhof.
Zur Alten Treu and plaque.
Grosser Engelhof.
Photos on this page by R. Speiser and B. Speiser.
Of Johann I Bernoulli (hanging chain, brachystochrone, rule of Bernoulli-l'Hôpital, calculus of variations, hydraulics), more can still be seen: After his return from Groningen he lived in the house Zur alten Treu (To the Old Loyalty) on Nadelberg, a street still boasting many fine old houses. Johann's father had bought this house (once the home of Johannes Froben, the printer of Erasmus); and the son used it as his family home as well as for receiving his many guests, to whom he also gave private lectures there. Thus it was also the home of his three sons. About 100 meters below, on the charming Andreasplatz, lies the Imberhof, where Nicolaus I lived for a while.

On the same level with the Alte Treu, in the direction of the Peterskirche, are the Grosser Engelhof and the Kleiner Engelhof. The latter was bought by Daniel Bernoulli (hydrodynamics, theory of oscillations), who was a bachelor, and the former by Johann II Bernoulli, the father of Jakob II and Johann III. It is in this house that Johann's friend Maupertuis died in 1759, some time after his departure from Berlin. At that time, no activity of a Catholic priest was permitted in Basel, but an exception was granted, and a Capuchin monk was allowed to administer the last rites. Maupertuis was then buried, according to his last will, at Dornach, a Catholic village about 10 kilometers from Basel. Thirty-five years ago, Clifford Truesdell wrote in the Engelhof his introduction to vol. II/11 of Euler's Opera Omnia, a history of the theory of elasticity, in which the Bernoullis played such a prominent part.

In the Peterskirche, to the left of the main entrance, are the tombstones of Johann I, Nicolaus I, Daniel, and Johann II Bernoulli. The church faces the Petersplatz where the new university (built only in 1939) is located. From the charming diaries of the counts Teleki, who came to Basel around 1740 to study with Daniel Bernoulli, we know that Daniel and his brother frequently strolled there with their students. At the other end of the square is the Stachelschützenhaus (House of the Crossbowmen), where Daniel Bernoulli conducted experiments. According to Whittaker, these experiments, as described by the Telekis and his student Abel Socin, led Bernoulli to the first conjecture (in 1760) of the $1/r^2$ law of electric attraction and repulsion. Next to the Stachelschützenhaus is the University Library; today it houses the Basel office of the Bernoulli Edition, responsible for the edition of the complete works and correspondences of the Bernoullis and of Jacob Hermann.

Bromhübelweg 5
CH-4144 Arlesheim
Switzerland
Stachelschützenhaus (House of the Crossbowmen).
Representation Theory of Finite Groups: from Frobenius to Brauer* Charles W. Curtis
This article is dedicated to the memory of my friend and collaborator, Irving Reiner. The representation theory of finite groups began with the pioneering research of Frobenius, Burnside, and Schur at the turn of the century. Their work was inspired in part by two largely unrelated developments which occurred earlier in the 19th century. The first was the awareness of characters of finite abelian groups, and their application by some of the great 19th-century number theorists. The second was the emergence of the structure theory of finite groups, beginning with Galois's brief outline of the main ideas in the famous letter written on the eve of his death, and continuing with the work of Sylow and others, including Frobenius himself. My aim is to give an account of some of the early work, the problems considered, the conjectures made, and then to trace a few threads in the development of the mathematical ideas from their origins to their place in Brauer's theory of modular representations.
Characters of Finite Abelian Groups and 19th-Century Number Theory

A character of a finite abelian group A is a homomorphism from A into the multiplicative group of the field C of complex numbers, in other words, a function $\chi: A \to C^* = C - \{0\}$, which satisfies the condition:

\[ \chi(ab) = \chi(a)\chi(b) \quad \text{for all } a, b \text{ in } A. \]

The simplest examples, which occur in elementary number theory, involve the additive and multiplicative groups of the finite field $Z_p = Z/pZ$ of residue classes $\bar{a} = a + pZ$, for a prime $p$. Additive characters of $Z_p$ are characters of the additive group of $Z_p$, with the defining property that $\chi(\bar{a} + \bar{b}) = \chi(\bar{a})\chi(\bar{b})$ for all residue classes $\bar{a}$ and $\bar{b}$. These are obtained by taking powers of a pth root of unity, so $\chi(\bar{a}) = \omega^a$, where $\omega^p = 1$ in C. Multiplicative characters of $Z_p$ are characters of the multiplicative group of $Z_p$, and include Legendre's quadratic residue symbol $(a/p) = \pm 1$, for a nonzero residue class $\bar{a}$, with $(a/p) = 1$ if $x^2 \equiv a \pmod{p}$ has a solution and $-1$ if not.

* This paper is an expanded version of a Joint AMS-MAA Invited Address, presented at the Annual Meetings in Louisville, in January, 1990.

THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York
Gauss combined additive and multiplicative characters $\chi$ and $\psi$, respectively, to form certain sums of roots of unity (today called Gauss sums), which have the form

\[ g(\chi, \psi) = \sum \chi(\bar{t})\,\psi(\bar{t}), \quad \bar{t} \neq 0 \text{ in } Z_p. \]
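A small numerical sketch may make these objects concrete. It takes the additive character $\chi(\bar{t}) = \omega^t$ with $\omega = e^{2\pi i/p}$ and, as the multiplicative character, the Legendre symbol; both are particular choices made here for illustration, as are the function names. It checks the classical facts that this sum has absolute value $\sqrt{p}$ and that its square equals $(-1/p)\,p$.

```python
import cmath


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, computed by Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)  # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


def additive_character(t: int, p: int) -> complex:
    """chi(t) = omega^t with omega = exp(2*pi*i/p), a p-th root of unity."""
    return cmath.exp(2j * cmath.pi * t / p)


def gauss_sum(p: int) -> complex:
    """Sum of chi(t) * (t/p) over the nonzero residue classes t mod p."""
    return sum(additive_character(t, p) * legendre(t, p) for t in range(1, p))


for p in (3, 5, 7, 11, 13):
    g = gauss_sum(p)
    # |g|^2 should be p, and g^2 should be (-1/p) * p, up to rounding error.
    print(p, round(abs(g) ** 2, 9), legendre(p - 1, p) * p)
```

The same two helper functions also exhibit the defining properties of characters: `additive_character(a + b, p)` agrees with the product of the values at a and b, and `legendre` is multiplicative on nonzero residues.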
In the Disquisitiones Arithmeticae [23], he derived the polynomial equations satisfied by the expressions $g(\chi, \psi)$ in some special cases, using information about the number of solutions of congruences, such as $x^n + y^n \equiv 1 \pmod{p}$. In terms of what we know now, this appears to have been a case of putting the cart before the horse. In fact, Gauss sums have proved to be fundamental for obtaining formulas for the number of solutions of a wide class of polynomial congruences, and for the more general problem of counting the number of solutions of polynomial equations over finite fields. A nice account of these matters, with historical comments, can be found in the first part of Weil's paper on the number of solutions of equations over finite fields [43] (see also [27]).

Multiplicative characters were used by Dirichlet in his reinterpretation (see [11]) of some of Gauss's work on genera of binary quadratic forms, where the character-theoretic nature of the quadratic residue symbol $(a/p)$ was applied. He also used them in his definition of L-series, and in the proof, using L-series, that certain arithmetic progressions contain infinitely many primes [10].

Dedekind edited Dirichlet's lectures on number theory for publication, and added supplements containing material of his own. In view of the different ways characters had been applied in Dirichlet's work, he called attention to the general notion of characters of abelian groups in one of the supplements (see [11], page 345, footnote, and pages 611, 612). Weber had also become interested in abelian group characters, had published a paper on them, and gave a full account of them in his Lehrbuch der Algebra [41], including their construction using the factorization of abelian groups as direct products of cyclic groups.
The starting point of the representation theory of finite groups was Dedekind's work, apparently unpublished, on the factorization of the group determinant of a finite abelian group, and his suggestion, in a letter to Frobenius in 1896, that perhaps Frobenius might be interested in the same problem for general (not necessarily abelian) groups. Here is a statement of the problem.

Let $\{x_g\} = \{x_{g_1}, \dots, x_{g_n}\}$ be a set of n indeterminates over the field C of complex numbers, indexed by the elements $\{g_1, \dots, g_n\}$ of a finite group G of order n. Form the $n \times n$ matrix whose entry in the ith row and jth column is the indeterminate $x_{g_i g_j^{-1}}$. The group determinant of G is the determinant $\Theta = |x_{gh^{-1}}|$ of this matrix, and is a polynomial in the indeterminates $x_{g_i}$, with integer coefficients. Dedekind had proved the elegant result that, for a finite abelian group, the group determinant $\Theta$ factors over the complex numbers as a product of linear factors, whose coefficients are given by the different characters $\chi$ of the group:

\[ \Theta = \prod_{\chi} \left( \chi(g)\,x_g + \chi(g')\,x_{g'} + \cdots \right). \]

As he communicated to Frobenius, he had also investigated the factorization of $\Theta$ for nonabelian groups in some special cases, and had observed that, in the cases he had examined, $\Theta$ had irreducible factors of degree greater than one.

The factorization of $\Theta$ is not as special a problem as it appears. It is related to the problem of factoring the characteristic polynomial, in the regular representation, of an element of the group algebra $\sum x_g g$ with indeterminate coefficients $x_g$, into its irreducible factors. Exactly the same idea was used, with great success, by Killing and Cartan, and by Cartan and Molien, to obtain the structure of semisimple Lie algebras and associative algebras, over the field of complex numbers [26].
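Dedekind's factorization for abelian groups can be checked numerically in the simplest case. The sketch below takes G to be the cyclic group of order n written additively, so that the matrix entry in row i and column j is $x_{(i-j) \bmod n}$ and the characters are $\chi_k(g) = \omega^{kg}$ with $\omega = e^{2\pi i/n}$. The choice of group, the sample values substituted for the indeterminates, and all function names are assumptions made for illustration.

```python
import cmath
from fractions import Fraction


def determinant(matrix: list[list[int]]) -> Fraction:
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def group_determinant_cyclic(x: list[int]) -> Fraction:
    """Group determinant of Z/n: entry (i, j) is x[(i - j) mod n]."""
    n = len(x)
    return determinant([[x[(i - j) % n] for j in range(n)] for i in range(n)])


def product_of_linear_factors(x: list[int]) -> complex:
    """Product over the n characters chi_k of sum_g chi_k(g) * x[g]."""
    n = len(x)
    prod = 1
    for k in range(n):
        prod *= sum(cmath.exp(2j * cmath.pi * k * g / n) * x[g] for g in range(n))
    return prod


x = [2, 3, 5, 7]  # sample numerical values for the indeterminates
print(group_determinant_cyclic(x), product_of_linear_factors(x))
```

The two printed values agree up to floating-point rounding in the complex product.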
Frobenius's First Papers on Character Theory With Dedekind's letter as a spur, Ferdinand Georg Frobenius (1849-1917) burst onto the scene with three papers, published in 1896, in which he created the
Ferdinand Georg Frobenius.
theory of characters of finite groups, factored the group determinant for nonabelian groups, and established many of the results that have become standard in the subject. At this point in his career, he had assumed Kronecker's chair in Berlin, and was already widely known for his research on theta functions, determinants and bilinear forms, and the structure of finite groups, all of which contributed ideas he was able to use in his new venture.

His first task in "Über Gruppencharaktere" [16] was to define characters of nonabelian finite groups. The key to his approach was the study of the multiplicative relations satisfied by the conjugacy classes $\{C_1, \dots, C_s\}$ in a finite group G. From his previous work in finite group theory, he was well aware of the importance of counting the numbers of solutions of equations in a group G. His starting point was the consideration of the integers $\{h_{ijk}\}$, denoting the numbers of solutions of the equations $abc = 1$, with $a \in C_i$, $b \in C_j$, and $c \in C_k$. From them, he defined a new set of integers, $a_{ijk} = h_{i'jk}/h_i$, where $C_{i'} = C_i^{-1}$, and $h_i$ is the number of elements in the class $C_i$. He then made the crucial observation that the $a_{ijk}$ satisfy identities which imply that the bilinear multiplication defined on a vector space E over C with basis elements $\{e_1, \dots, e_s\}$ by the formulas

\[ e_j e_k = \sum_i a_{ijk}\, e_i \tag{1} \]

is associative and commutative; that is, we have

\[ e_i(e_j e_k) = (e_i e_j)e_k \quad \text{and} \quad e_i e_j = e_j e_i \]

for all i, j, k. This was a situation familiar to him, in view of "Über vertauschbare Matrizen" [15]. He summoned into play a theorem on what we now call the irreducible representations of commutative, semisimple algebras. The theorem asserts that, under a condition equivalent to semisimplicity of the algebra, there exist $s = \dim E$ linearly independent numerical solutions $(\rho_1, \dots, \rho_s)$ of the equations (1), so that $\rho_j \rho_k = \sum_i a_{ijk}\, \rho_i$. The condition is that $\det(p_{k\ell}) \neq 0$, where $(p_{k\ell})$ is the matrix with entries

\[ p_{k\ell} = \sum_{i,j} a_{ijk}\, a_{ji\ell}. \]

He proved it, in this case, by an ingenious direct argument based on properties of the class intersection numbers $\{h_{ijk}\}$. Special cases of the result had been obtained by Dedekind, Weierstrass, and Study ([6], [42], [39]), and the definitive theorem, with a new proof, was given by Frobenius himself, in "Über vertauschbare Matrizen" [15], the first paper in the 1896 series.

The characters $\chi = (\chi_1, \dots, \chi_s)$ of the finite group G were defined in terms of the solutions $\rho_j$ of the equations (1), by the formulas

\[ h_j \chi_j / f = \rho_j, \]

where $f$ is a proportionality factor, and $h_j = |C_j|$ as above. This is hardly an intuitively satisfying definition. Things become a little clearer if we realize, as Frobenius did, that the characters can be viewed as complex-valued class functions $\chi: G \to C$, constant on the conjugacy classes (this is what it means to be a class function), satisfying the relations

\[ h_j h_k\, \chi_j \chi_k = f \sum_i a_{ijk}\, h_i \chi_i, \tag{2} \]

where $\chi_j = \chi(x)$ for $x \in C_j$, and the constant $f$, called the degree of the character $\chi$, is $\chi(1)$, the value of $\chi$ at the identity element 1 of G. The algebra E, as Frobenius realized somewhat later ([18]), is isomorphic to the center of the group algebra of G, so that for abelian groups, the constants $a_{ijk}$ describe the multiplication in the group algebra, and the equations (2) are clear generalizations of the definition, given previously, of characters of abelian groups.

The first main results about characters were what are now called the orthogonality relations, for two characters $\chi$ and $\psi$, which assert that

\[ \sum_{g \in G} \chi(g)\,\psi(g^{-1}) = \begin{cases} |G| & \text{if } \chi = \psi, \\ 0 & \text{if } \chi \neq \psi. \end{cases} \]
(These involve the choice of the constant $f$ taken above.) By the theorem used to obtain the characters, the number of different characters and the number $s$ of conjugacy classes are the same, so the characters define an $s \times s$ matrix, called the character table of G, whose $(i,j)$th entry is the value of the $i$th character at an element in the $j$th conjugacy class. The orthogonality relations express the fact that, in a certain sense, the rows and columns of the character table are orthogonal. The question arises, what information about a finite group G is contained in its character table. Frobenius took up the problem himself, and it has fascinated group-theorists ever since. His first contribution to it followed easily from his approach to characters ([15], §4). Using the orthogonality relations, he deduced a formula for the class intersection numbers $h_{ijk}$ in terms of the character table, a result which later proved to be fundamental for applications of character theory to finite groups. Another interpretation of the orthogonality formulas is that the characters form an orthonormal basis for the vector space of class functions on G, with respect to the hermitian inner product defined by

$$(\zeta, \eta) = |G|^{-1} \sum_{g \in G} \zeta(g)\overline{\eta(g)}, \quad \text{for class functions } \zeta, \eta. \tag{3}$$
This makes it possible to do a kind of Fourier analysis in the vector space of class functions, in which the "Fourier coefficients" $a_\chi$ in the expansion of a class function $\zeta = \sum_\chi a_\chi \chi$ in terms of the characters are given by the inner products $a_\chi = (\zeta, \chi)$, for each character $\chi$. After establishing the foundations of character theory in the second 1896 paper, he turned, in the third, to the solution of the problem raised by Dedekind, about the factorization of the group determinant $\Theta = |x_{gh^{-1}}|$ of a finite group G. He settled the problem with a flourish, proving that $\Theta = \prod \Phi^{e}$, with $s$ irreducible factors $\Phi$, whose coefficients are given in terms of the $s$ different characters of G, and the really difficult result, which he called the fundamental theorem in the theory of the group determinant, that the degree of each irreducible factor $\Phi$ and the multiplicity with which it occurs in the factorization of $\Theta$ coincide, and are both equal to the degree $f$ of the corresponding character. He pointed out the consequence that, if $n$ is the order of the finite group G, then $n$ is the sum of the squares of the degrees of the characters:

$$n = \sum_\chi f_\chi^2, \quad \text{where } f_\chi = \deg \chi.$$
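The orthogonality relations, the Fourier expansion of a class function, and the formula for the order of the group can all be checked by hand on a small example. The following Python sketch does so for the symmetric group $S_3$. The character table of $S_3$ is a standard one supplied here for illustration and is not taken from the text; the names `class_sizes`, `character_table`, and `inner` are hypothetical.

```python
from fractions import Fraction

# Conjugacy classes of S3 (assumed example): identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]
order = sum(class_sizes)  # |G| = 6

# Rows: trivial, sign, and two-dimensional character (standard table for S3).
character_table = [
    [1, 1, 1],
    [1, -1, 1],
    [2, 0, -1],
]

def inner(zeta, eta):
    """Inner product (3), summed class by class.

    All values here are real, so complex conjugation is omitted.
    """
    total = sum(h * a * b for h, a, b in zip(class_sizes, zeta, eta))
    return Fraction(total, order)

# Orthonormality of the characters: the Gram matrix is the identity.
gram = [[inner(chi, psi) for psi in character_table] for chi in character_table]

# n is the sum of the squares of the degrees (values on the identity class).
degrees = [chi[0] for chi in character_table]
sum_of_squares = sum(f * f for f in degrees)

# Fourier coefficients of the class function "number of fixed points".
fixed_points = [3, 1, 0]
coefficients = [inner(fixed_points, chi) for chi in character_table]

print(gram)
print(sum_of_squares)
print(coefficients)
```

In this example the fixed-point class function expands as the trivial character plus the two-dimensional character, with the sign character not appearing.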
As we noted earlier, the definition (2) of characters of a finite group G does not have the same immediate relation to the structure of G enjoyed by the concept of characters of abelian groups. In the following year, 1897, he clarified the situation by introducing, for the first time, the concept of representation of a finite group. This he defined, as we do today, to be a homomorphism $T : G \to GL_d(\mathbb{C})$, where $GL_d(\mathbb{C})$ is the group of invertible $d \times d$ matrices over $\mathbb{C}$, and $d$ is called the degree of the representation, so we have

$$T(gh) = T(g)T(h), \quad \text{for all } g, h \in G.$$

For an abelian group, the characters, defined previously, are representations of degree one. In the general case, he defined two representations $T$ and $T' : G \to GL_{d'}(\mathbb{C})$ to be equivalent if they have the same degree, $d = d'$, and the representations $T$ and $T'$ are intertwined by a fixed invertible matrix $X$, so that $T(g)X = XT'(g)$, or $X^{-1}T(g)X = T'(g)$, for all $g \in G$; in other words, the representation $T'$ is obtained from $T$ by a change of basis in the underlying vector space. In particular, the matrices $T(g)$ and $T'(g)$ are similar, for $g \in G$, and therefore have the same numerical invariants associated with similarity: the same set of eigenvalues, the same characteristic polynomial, trace, and determinant. The important invariant for representation theory is the trace function,

$$\chi(g) = \operatorname{Trace} T(g), \quad g \in G,$$
which Frobenius called the character of the representation. The characters defined earlier by the formulas (2) turned out to be the trace functions of certain representations characterized by the irreducibility of polynomials analogous to the group determinant associated with them. False modesty was not a weakness of Frobenius. From the beginning of his research on the theory of characters, he was keenly aware of its potential importance for algebra and group theory. He was on to a good thing, and he knew it. Altogether, he published more than 20 papers between 1896 and 1907, extending the theory of characters and representations in various directions, and applying the results to finite group theory. One of the highlights among the papers published after 1896 was a deep analysis of the relation between characters of a group G and the characters of a subgroup H of G [19]. As he stated in the introduction, an understanding of this relationship is crucial for the practical computation of representations and characters, a statement as true now as it was then! One of the main ideas in the paper was the definition of the induced class function $\psi^G$, for a class function $\psi$ on a subgroup H of G, by the formula
the groups in the infinite families consisting of the projective unimodular groups $PSL_2(p)$, for odd primes $p$ (in [16], § …); the symmetric groups $S_n$ (in [20]); and the alternating groups $A_n$ (in [21]). The methods he developed for carrying out these computations involved the full range of his ideas on characters, combined with new techniques from combinatorics and algebra, far ahead of their time, which continue to have a strong influence on research in these areas. A comprehensive historical analysis of Frobenius's first papers on character theory, his correspondence with Dedekind, and other contemporary work in algebra and representation theory, was provided by T. Hawkins [24], [25], [26].
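Frobenius's induced class function and his Reciprocity Law lend themselves to a direct numerical check. The sketch below, in Python, induces the nontrivial character of a subgroup of order 2 up to $S_3$ and compares the two inner products in the Reciprocity Law. The choice of group, subgroup, and character is an assumption made for illustration, and all function names (`compose`, `induce`, `inner`, and so on) are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# G = S3, with a permutation p stored as the tuple (p(0), p(1), p(2)).
G = list(permutations(range(3)))

def compose(p, q):
    """Return the permutation p after q."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

# Irreducible characters of S3 (standard values, assumed for illustration).
irreducibles = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: sum(1 for i in range(3) if p[i] == i) - 1,
}

# H = {identity, transposition (0 1)} with its nontrivial character psi.
H = [(0, 1, 2), (1, 0, 2)]
psi = {(0, 1, 2): 1, (1, 0, 2): -1}

def induce(psi, H, G):
    """Induced class function: |H|^{-1} times the sum over x in G of psi-dot(x g x^{-1})."""
    def induced(g):
        # psi.get(..., 0) extends psi by zero outside H.
        total = sum(psi.get(compose(compose(x, g), inverse(x)), 0) for x in G)
        return Fraction(total, len(H))
    return induced

def inner(a, b, group):
    """Inner product (3) on a group; all values here are real, so no conjugation."""
    return Fraction(sum(a(g) * b(g) for g in group), len(group))

psi_G = induce(psi, H, G)
lhs = {name: inner(psi_G, chi, G) for name, chi in irreducibles.items()}
rhs = {name: inner(lambda h: psi[h], chi, H) for name, chi in irreducibles.items()}
print(lhs)
print(rhs)
```

Both dictionaries agree, and the nonnegative integer coefficients show that the induced class function is the sum of the sign character and the two-dimensional character in this case.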
Character Theory and the Structure of Finite Groups: William Burnside (1852-1927)
William Burnside.
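$$\psi^G(g) = |H|^{-1} \sum_{x \in G} \dot\psi(xgx^{-1}), \quad g \in G,$$

where $\dot\psi$ is the function on G defined by

$$\dot\psi(g) = \begin{cases} \psi(g) & \text{if } g \in H, \\ 0 & \text{if } g \notin H. \end{cases}$$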
He proved the fundamental result, now called the Frobenius Reciprocity Law, which states that

$$(\psi^G, \zeta)_G = (\psi, \zeta|_H)_H,$$

for class functions $\psi$ on H and $\zeta$ on G, respectively, where $(\ ,\ )_G$ and $(\ ,\ )_H$ are the inner products (3) on the vector spaces of class functions on G and H, and $\zeta|_H$ denotes the restriction of the class function $\zeta$ to H. Using the Fourier analysis for expansions of class functions in terms of characters, the Reciprocity Law implies that $\psi^G$ is the character of a representation of G if $\psi$ is the character of a representation of H, and gives the desired information about the relationship between characters of G and H. Frobenius relished computations, the more challenging the better, and rounded out this great series of papers with computations of the character tables of all
At about the same time that Frobenius's first papers on character theory appeared, Burnside published his treatise, Theory of Groups of Finite Order (1897). After graduating from Cambridge in 1875, Burnside had followed the Cambridge tradition in applied mathematics, with his research in hydrodynamics, until his appointment as Professor of Mathematics at Greenwich, in 1885. His work in group theory began with a paper on automorphic functions in 1892, and continued with research on discontinuous groups, and then finite groups, leading to his book [5]. At first he was not optimistic about the possible applications of representations to finite group theory. In the preface to the first edition of his book, in reply to the question of why he devoted considerable space to permutation groups, while groups of linear transformations were not referred to, he explained, "My answer to this question is that while, in the present state of our knowledge, many results in the pure theory are arrived at most readily by dealing with properties of substitution groups [i.e., groups of permutations], it would be difficult to find a result that could be most directly obtained by the consideration of groups of linear transformations." He was aware of Frobenius's work, however, and developed independently his own approach to representations and characters. It is interesting to speculate on how the work of each one influenced the other. They frequently referred to each other's work in their publications, but as far as I know, they never met, or corresponded extensively with each other. In the preface to the second edition (1911), he stated, " . . . the reason given in the original preface for omitting any account of it no longer holds good. In fact, it is more true to say that for further advances in the abstract theory one must look largely to the representation of a group by linear substitutions." Later (on p. 269, footnote), he described his indebtedness to the
work of Frobenius: "The theory of the representation of a group of finite order as a group of linear substitutions was largely, and the allied theory of group characteristics was entirely, originated by Prof. Frobenius." He then listed the papers of Frobenius discussed in the preceding section, and continued, "In this series of memoirs Prof. Frobenius's methods are, to a considerable extent, indirect; and the same is true of two memoirs, 'On the continuous group that is defined by any given group of finite order,' I and II, Proc. L. M. S. Vol. XXIX (1898) in which the author obtained independently the chief results of Prof. Frobenius's earlier memoirs." Frobenius expressed himself on the matter, in one of his letters to Dedekind, as follows ([25], page 242; see also [29]): "This is the same Herr Burnside who annoyed me several years ago by quickly rediscovering all the theorems I had published on the theory of groups, in the same order and without exception ...." One of Burnside's best-known achievements in group theory is the theorem, proved using character theory, that every finite group G whose order is divisible only by two primes is solvable: $|G| = p^a q^b$, for primes $p$, $q$, implies that G is solvable. The $p^a q^b$-theorem implies, among other things, that the order of a finite, simple, nonabelian group is divisible by at least three different prime numbers. Simple means having no nontrivial proper normal subgroups. Every finite group has a composition series whose factors are simple groups, so that, in a sense, simple groups are the building blocks of all finite groups. Burnside took a great interest in the classification of finite simple groups, a problem that dominated research in finite group theory until its solution in the 1980s. It was in this connection that he remarked, in note M of the second edition of his book, "There is in some respects a marked difference between groups of even and those of odd order."
He went on to discuss the possible existence of nonabelian simple groups of odd order, remarking that he had shown that the number of possible prime factors of a simple group of composite odd order is at least 7. He continued with the statement: "The contrast that these results shew between groups of odd and even order suggests inevitably that nonabelian simple groups of odd order do not exist." Further progress on this problem was a long time coming. A breakthrough came with M. Suzuki's proof [40] in 1957 that there are no simple groups of composite odd order having the property that the centralizers of all nonidentity elements are abelian. In his proof, he made heavy use of a subtle extension of Frobenius's work on induced characters, called the theory of exceptional characters. The next step was the theorem of Feit, Hall, and Thompson [13], that the same result held for groups with the property that centralizers of nonidentity elements are nilpotent.
The culmination of this line of research came in 1963, with the publication by Walter Feit and John Thompson of what has become known as the odd-order paper [14], containing one theorem: All finite groups of odd order are solvable. Although a purely group-theoretic proof (not using characters) has been found for Burnside's $p^a q^b$-theorem, the proof of the odd-order theorem contains an apparently essential component based on character theory. Feit and Thompson's proof of it takes about 250 pages of close reasoning which to this day resists significant simplification, so perhaps the 50-year wait following Burnside's statement of the problem is not so surprising.
New Foundations of Character Theory: Issai Schur (1875-1941)
Issai Schur entered the University of Berlin in 1894, to study mathematics and physics. Among his instructors, he expressed special thanks, in a brief autobiographical note at the end of his dissertation, to Professors Frobenius, Fuchs, Hensel, and Schwarz. The dissertation itself, on the classification of the polynomial representations of the general linear group, was a work of such distinction as to place him at once on an equal footing with his illustrious predecessors in representation theory. The difficulty of the proofs of the main theorems in Frobenius's approach to character theory has already been mentioned. If all persons wishing to enter the field had to master the intricacies of group determinants, the representation theory of finite groups might well have remained a closed book to all but a few. Burnside's account of the foundations of the theory made important strides towards greater accessibility. In particular, he was apparently the first to take irreducible representations and complete reducibility as concepts of central importance. A representation T of a finite group G is called reducible if it is equivalent to a representation T' of the form
$$T'(g) = \begin{pmatrix} T_1(g) & A(g) \\ 0 & T_2(g) \end{pmatrix}, \quad \text{for all } g \in G,$$
for representations $T_1$ and $T_2$ of lower degree. If this does not occur, the representation is said to be irreducible. Using results of Loewy [31] and E. H. Moore [33] on the existence of G-invariant hermitian forms, Maschke [32] had proved that every representation T is completely reducible, that is, T is either irreducible or equivalent to a direct sum of irreducible representations. It remained to Schur, however, to give a wholly elementary and self-contained exposition of the main facts about representations and characters [36]. His
starting point was the result, now called Schur's Lemma, which as he pointed out had also played an important role in Burnside's account of the theory. He stated the result, in two parts, as follows: I. Let T and T' be irreducible representations of a finite group G, of degrees $d$ and $d'$, respectively. Let P be a constant $d \times d'$ matrix, such that
T(g)P = PT'(g),
for all g in G.
Then either $P = 0$, or T and T' are equivalent, and P is an invertible $d \times d$ matrix. II. The only matrices P which commute with all the matrices $T(g)$, $g \in G$, for an irreducible representation T, are scalar multiples of the identity matrix. As a consequence, he gave short, understandable proofs of the orthogonality properties of the matrix coefficient functions $\{a_{ij}(g)\}$, and for the characters, of irreducible representations $T(g) = (a_{ij}(g))$, $g \in G$. He also gave a new proof of Maschke's theorem on complete reducibility, replacing an appeal to the existence of invariant bilinear forms by a simple, direct argument, in much the same spirit as the standard proof used today. This work, along with what were by all accounts clear and beautifully presented courses of lectures, put the subject within reach, for students and professional mathematicians, without requiring a specialized background. One of his students, Walter Ledermann, remarking on the popularity of his lectures, recalls attending his algebra course in a lecture theater filled with about 400 students, and sometimes having to use opera glasses to follow the speaker when he was unlucky enough to get a seat in the back [30]. Schur's research opened up two more important lines of investigation. In the first [38], he introduced what are called projective representations of a finite group G, that is, homomorphisms $\tau$ from G into the projective general linear group $PGL_n(\mathbb{C}) = GL_n(\mathbb{C})/\{\text{scalars}\}$. He analyzed precisely when such a representation $\tau$ could be lifted to an ordinary representation T of a suitably defined covering group $\tilde{G}$, so that the diagram

$$\begin{array}{ccc} \tilde{G} & \xrightarrow{\;T\;} & GL_n(\mathbb{C}) \\ \downarrow & & \downarrow \\ G & \xrightarrow{\;\tau\;} & PGL_n(\mathbb{C}) \end{array}$$

is commutative, and the kernel of the homomorphism from $\tilde{G}$ to G is contained in the center of $\tilde{G}$. He constructed a universal covering group $\tilde{G}$, which can be put in the diagram above (for some choice of T) for all projective representations $\tau$. The methods used to construct $\tilde{G}$ and the kernel of the homomorphism from $\tilde{G}$
to G are the beginnings of a major chapter in group theory, known as the cohomology of groups. Another theme was Schur's search for arithmetical properties of representations, which brought out connections with algebraic number theory. The central idea is the concept of a splitting field K of a finite group G. This is a subfield K of the complex field $\mathbb{C}$ with the property that each irreducible representation $T : G \to GL_d(\mathbb{C})$ is equivalent to a K-representation $T' : G \to GL_d(K)$. A splitting field is minimal if no proper subfield is a splitting field. From the work of Frobenius, it was known that a given finite group G has a splitting field K which is an algebraic number field, that is, a finite extension of the rational field. The splitting field problem was to determine, for a finite group G, the algebraic number fields K which are minimal splitting fields. Splitting fields reflect, in some mysterious way, the structure of the group. For example, splitting fields for cyclic groups require the addition of roots of unity to the rational field, while the field of rational numbers is a splitting field for the symmetric groups $S_n$. Both Burnside and Schur were interested in the splitting field problem, and had evidence to support the conjecture that the cyclotomic field of $m$th roots of unity, where $m$ is the least common multiple of the orders of the elements of G, is always a splitting field. Using a subtle device, known as the Schur index, Schur was able to prove the conjecture for all solvable groups [37].
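Part II of Schur's Lemma, stated earlier in this section, can be illustrated numerically. For an irreducible representation T and any matrix A, the sum of $T(g)AT(g)^{-1}$ over the group commutes with every $T(h)$, so by the Lemma it must be a scalar matrix. The Python sketch below checks this for a two-dimensional representation of $S_3$ with integer entries. The particular matrices `R` and `S`, the test matrix `A`, and the helper names are assumptions chosen for illustration, not taken from Schur's paper.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]
R = [[0, -1], [1, -1]]  # assumed image of a 3-cycle; R^3 is the identity
S = [[0, 1], [1, 0]]    # assumed image of a transposition; S^2 is the identity

# The six matrices T(g): the powers of R, and their products with S.
rotations = [IDENTITY, R, matmul(R, R)]
group = rotations + [matmul(M, S) for M in rotations]

def inverse_in_group(M):
    """Find T(g)^{-1} by searching the finite image of T."""
    return next(N for N in group if matmul(M, N) == IDENTITY)

def conjugation_sum(A):
    """Return the sum over g of T(g) A T(g)^{-1}."""
    total = [[0, 0], [0, 0]]
    for M in group:
        term = matmul(matmul(M, A), inverse_in_group(M))
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

A = [[1, 2], [3, 4]]
P = conjugation_sum(A)

# P commutes with every T(h), so Schur's Lemma forces it to be scalar.
commutes = all(matmul(P, M) == matmul(M, P) for M in group)
print(P, commutes)
```

Comparing traces gives the scalar: the trace of P is $|G|$ times the trace of A, so the scalar is $(|G|/d)\operatorname{Trace}(A)$, which is 15 for this choice of A.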
The Dawn of the Modern Age of Representation Theory: Emmy Noether (1882-1935)
Emmy Noether.
By finding simple algebraic ideas to express the essential structure of a mathematical theory, Emmy Noether reshaped many different parts of 20th-century mathematics. Representation theory of finite groups was no exception: it has never been the same since the publication of her article "Hyperkomplexe Grössen und Darstellungstheorie" (1929) [34]. She presented a basic set of ideas underlying the representation theory of a finite-dimensional algebra over an arbitrary field. In the case of representations of finite groups, the algebra involved was the group algebra KG of the group G over a field K. This is the associative ring whose additive group is the vector space over K with a distinguished basis indexed by the elements of the group G. In order to define multiplication in KG, it is enough to define it for a pair of basis elements, corresponding to elements $g$ and $h$ in G; their product is defined to be the basis element corresponding to the product $g \cdot h$ in G. Representations of G over the field K may be viewed as homomorphisms $T : G \to GL(V)$, where V is a finite-dimensional vector space over K and $GL(V)$ is the group of invertible linear transformations on V. The definition given earlier amounts to choosing a basis in V, and taking note of the resulting isomorphism $GL(V) \cong GL_d(K)$, where $d$ is the dimension of V over K. Noether's critical observation was that each representation $T : G \to GL(V)$ defines the structure of a left KG-module on V, with the module operation $a \cdot v$ defined by setting

$$a \cdot v = \sum_{g \in G} \alpha_g T(g)v, \quad \text{for } v \in V \text{ and } a = \sum_{g \in G} \alpha_g g \in KG.$$

(Here we have identified the element $g \in G$ with the basis element of KG corresponding to it.) Conversely, each finitely-generated left KG-module V defines a representation $T : G \to GL(V)$ by reversing the procedure given above. It is easily checked that two representations are equivalent if and only if the KG-modules corresponding to them are isomorphic. Thus the main problem of representation theory, which is the classification of the representations of a finite group G up to equivalence, becomes the problem of construction and classification of modules over the group algebra. The problem makes sense for arbitrary fields K. For a field K of characteristic zero, or of prime characteristic $p$ not dividing the order of the group, the left KG-modules are semisimple, that is, direct sums of simple modules, by a version of Maschke's Theorem. This implies that the group algebra KG is semisimple, and the main facts about representations, such as that the number of equivalence classes of irreducible representations in a splitting field is the same as the number of conjugacy classes, become straightforward applications of the Wedderburn structure theorems for semisimple algebras. For more detailed analysis of the contents of her paper [34], see the article by Jacobson [28].

Richard Brauer (1901-1977) and Modular Representation Theory
The next great surge of activity in the representation theory of finite groups, and one that ties up some of the threads started earlier in this article, centered around the work of Richard Brauer, and its continuation by his students and successors, on modular representation theory. Brauer had been a student in Berlin, and completed his dissertation under Schur's supervision in 1926. His early work on representation theory and the theory of simple algebras, including the invention of what has become known, over his objections, as the Brauer group, firmly established his position in the European mathematical community. When Hitler came to power in 1933, Brauer, Emmy Noether, and many other Jewish university teachers in Germany were dismissed from their positions, and came to the United States. Brauer's major publications on modular representations began to appear soon after his arrival in the United States, and the subject remained a focus of his research throughout the rest of his life. My interest in representation theory was kindled by lectures given by Brauer that I heard as a graduate student, including all four of his 1948 AMS Colloquium Lectures at the summer meetings in Madison, and a lecture on his solution of Artin's conjecture on L-series with general group characters, at another meeting in New York, when he was awarded the Cole Prize of the American Mathematical Society for this work. I can't say I understood the lectures very well at the time, but they made a strong impression. A few years later, in 1954, Irving Reiner and I spent a year at the Institute for Advanced Study in Princeton. Neither of us knew much about representation theory, but it seemed to us to be a subject in which exciting things were happening, especially those connected with Brauer's work. We organized an informal seminar devoted to Brauer's work on modular representations and other topics in character theory.
This led to our book-writing projects, as a way of learning the subject. Modular representation theory is the classification of kG-modules, where kG is the group algebra of a finite group G over a field $k$ of characteristic $p > 0$. In case $p$
divides the order of the group G, the group algebra kG is not semisimple, and the kG-modules are not necessarily direct sums of simple modules, so their classification is much more difficult. Modular representations were first considered by Leonard Eugene Dickson [7], [8], [9]; among other things, he was the first to point out the different nature of the theory in case the order of the group is divisible by the characteristic of the field. One of Brauer's first results in this subject was the theorem ([1], 1935), that the number of equivalence classes of irreducible representations of a finite group G, in a splitting field of characteristic $p > 0$, is equal to the number of conjugacy classes in G containing elements of order prime to $p$. If $p$ does not divide the order of G, every conjugacy class has this property, and the result agrees with known properties of representations in the complex field $\mathbb{C}$. Brauer maintained a steady interest in the relation between properties of the irreducible complex-valued characters and the structure of finite groups. One of his objectives was to use modular representation theory to obtain new information about the values of the irreducible characters in $\mathbb{C}$, and to apply it to problems on the structure of groups. Many of his lectures at meetings and research conferences contained lists of unsolved problems, often involving finite simple groups and properties of their characters. In order to develop the connection between modular representations and complex-valued characters, he introduced what is now called a $p$-modular system, consisting of an algebraic number field K which is a splitting field for G, a discrete valuation ring R with quotient field K, maximal ideal P, and residue field $k = R/P$ of characteristic $p$, for a fixed prime number $p$.
As an application of the character theory he had developed in connection with his prize-winning proof [3] of Artin's conjecture, he had also succeeded in proving the splitting field conjecture of Burnside and Schur [2]. For the cyclotomic field containing the $n$th roots of unity, where $n$ is the order of G, it follows that K, and the residue field $k = R/P$, are both splitting fields. The next step explains how modular representations are related to representations in the field K. Each KG-module V defines a representation $T : G \to GL_d(K)$. Since R is a principal ideal domain, it follows that there exists a representation $T' : G \to GL_d(R)$ which is equivalent to T: $T'(g) = XT(g)X^{-1}$, for all $g \in G$. The homomorphism $R \to R/P = k$ can be applied to the entries of the matrices $T'(g)$, $g \in G$; this procedure yields a representation $\overline{T'} : G \to GL_d(k)$ and a kG-module $M = \overline{V}$. The representation $\overline{T'}$ and the module $M = \overline{V}$ are obtained from T and V by what is called reduction mod P, so $\overline{T'}$ is a modular representation of G. But there is a difficulty connected with this process. The kG-module $M = \overline{V}$ is not determined up to isomorphism by the isomorphism class of the KG-module V. Nevertheless,
in a fundamental joint paper [4], Brauer and his Toronto Ph.D. student Cecil Nesbitt proved that the composition factors of M are uniquely determined. Using the process of reduction mod P, they defined the decomposition matrix D as follows. The rows of D are indexed by the isomorphism classes of simple KG-modules (or, what amounts to the same thing, by the equivalence classes of irreducible representations $T : G \to GL_d(K)$), the columns by the isomorphism classes of simple kG-modules, and an entry $d_{ij}$ of D gives the number of times the $j$th simple kG-module appears as a composition factor in the module obtained by reduction mod P from the $i$th simple KG-module. They also introduced the Cartan matrix C, whose entry $c_{ij}$ counts the number of times the $j$th simple kG-module occurs as composition factor in the $i$th indecomposable left ideal occurring in a suitably indexed list of indecomposable direct summands of kG. In 1937 they proved the remarkable fact that the Cartan matrix and the decomposition matrix satisfy the relation $C = {}^tDD$, where ${}^tD$ is the transpose of D. This establishes a deep connection between the representation theory of G in the field K of characteristic zero and the representation theory of G in the field $k$ of characteristic $p$. Its proof used a result of Frobenius [22], which was a refinement of his previous work on the factorization of the group determinant. The preceding result is only the beginning of Brauer's theory. The refinements of character theory he was seeking came from his theory of $p$-blocks. These describe a partition of the set of irreducible characters in subsets, called $p$-blocks, corresponding to the decomposition of the group algebra RG as a direct sum of indecomposable two-sided ideals. To each $p$-block of irreducible characters, he associated a certain $p$-subgroup of G, called the defect group of the block.
He obtained precise information about the values of the irreducible characters in a given p-block, using the modular representation theory of the defect group and its normalizer. This work, in turn, led to applications of the theory of p-blocks of characters by Brauer, Suzuki, and others to important early steps in the classification of finite simple groups (see [12], Chapters VIII and XII). The belief that representation theory of finite groups had a bright future was shared by Frobenius, Burnside, Schur, Noether, and Brauer. The high level of current research activity in the subject and its connections with other parts of mathematics seem to support their judgment.
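Two of the results on modular representations described in this section, Brauer's count of irreducible modular representations and the relation between the Cartan matrix and the decomposition matrix, can be seen in the smallest interesting case. The Python sketch below works with $S_3$ for the primes 2 and 3. The decomposition matrices are the standard textbook values for $S_3$, supplied as an assumption for illustration rather than derived here, and the names `p_regular_classes`, `decomposition`, and `cartan` are hypothetical.

```python
# Order of the elements in each conjugacy class of S3 (assumed example).
class_element_orders = {"identity": 1, "transpositions": 2, "3-cycles": 3}

def p_regular_classes(p):
    """Conjugacy classes whose elements have order prime to p."""
    return [name for name, order in class_element_orders.items() if order % p != 0]

# Decomposition matrices of S3 (standard values, assumed for illustration).
# Rows: trivial, sign, two-dimensional character; columns: simple kG-modules.
decomposition = {
    2: [[1, 0], [1, 0], [0, 1]],  # columns: trivial, two-dimensional
    3: [[1, 0], [0, 1], [1, 1]],  # columns: trivial, sign
}

def cartan(D):
    """Cartan matrix computed as (transpose of D) times D."""
    cols = len(D[0])
    return [[sum(row[i] * row[j] for row in D) for j in range(cols)]
            for i in range(cols)]

for p, D in decomposition.items():
    # Brauer's theorem: the number of columns equals the number of p-regular classes.
    print(p, len(D[0]), len(p_regular_classes(p)), cartan(D))
```

For each prime the number of columns of D matches the number of classes of elements of order prime to $p$, and the product of the transpose of D with D is a symmetric matrix of nonnegative integers.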
Acknowledgments
I want to thank Walter Ledermann for a helpful letter about this project, and for sending me copies of his articles ([29] and [30]). I am also indebted to Harold Edwards and Gerald Janusz for their comments on a preliminary version of the manuscript, and to Richard Koch for the computer calculation of the factorization of the group determinant.
References
1. R. Brauer, Über die Darstellungen von Gruppen in Galoischen Feldern, Actualités Scientifiques et Industrielles 195, Hermann, Paris, 1935.
2. R. Brauer, "On the representation of a group of order g in the field of gth roots of unity," Amer. J. Math. 67 (1945), 461-471.
3. R. Brauer, "On Artin's L-series with general group characters," Ann. of Math. (2) 48 (1947), 502-514.
4. R. Brauer and C. J. Nesbitt, "On the modular representations of groups of finite order I," Univ. of Toronto Studies, Math. Ser. 4, 1937.
5. W. Burnside, Theory of Groups of Finite Order, Cambridge, 1897; Second Edition, Cambridge, 1911.
6. R. Dedekind, "Zur Theorie der aus n Haupteinheiten gebildeten complexen Grössen," Göttingen Nachr. (1885), 141-159.
7. L. E. Dickson, "On the group defined for any given field by the multiplication table of any given finite group," Trans. A.M.S. 3 (1902), 285-301.
8. L. E. Dickson, "Modular theory of group matrices," Trans. A.M.S. 8 (1907), 389-398.
9. L. E. Dickson, "Modular theory of group characters," Bull. A.M.S. 13 (1907), 477-488.
10. P. G. Lejeune Dirichlet, "Beweis des Satzes, dass jede unbegrenzte arithmetische Progression, deren erstes Glied und Differenz ganze Zahlen ohne gemeinschaftlichen Factor sind, unendlich viele Primzahlen enthält," Abh. Akad. d. Wiss. Berlin (1837), 45-81. Werke I, 313-342.
11. P. G. Lejeune Dirichlet, Vorlesungen über Zahlentheorie, 4th ed. Published and supplemented by R. Dedekind, Vieweg, Braunschweig, 1894.
12. W. Feit, The Representation Theory of Finite Groups, North-Holland, Amsterdam, 1982.
13. W. Feit, M. Hall, and J. G. Thompson, "Finite groups in which the centralizer of any nonidentity element is nilpotent," Math. Z. 74 (1960), 1-17.
14. W. Feit and J. G. Thompson, "Solvability of groups of odd order," Pacific J. Math. 13 (1963), 775-1029.
15. F. G. Frobenius, "Über vertauschbare Matrizen," S'ber. Akad. Wiss. Berlin (1896), 601-614; Ges. Abh. II, 705-718.
16. F. G. Frobenius, "Über Gruppencharaktere," S'ber. Akad. Wiss. Berlin (1896), 985-1021; Ges. Abh. III, 1-37.
17. F. G. Frobenius, "Über die Primfactoren der Gruppendeterminante," S'ber. Akad. Wiss. Berlin (1896), 1343-1382; Ges. Abh. III, 38-77.
18. F. G. Frobenius, "Über die Darstellung der endlichen Gruppen durch lineare Substitutionen," S'ber. Akad. Wiss. Berlin (1897), 994-1015; Ges. Abh. III, 82-103.
19. F. G. Frobenius, "Über Relationen zwischen den Charakteren einer Gruppe und denen ihrer Untergruppen," S'ber. Akad. Wiss. Berlin (1898), 501-515; Ges. Abh. III, 104-118.
20. F. G. Frobenius, "Über die Charaktere der symmetrischen Gruppe," S'ber. Akad. Wiss. Berlin (1900), 516-534; Ges. Abh. III, 148-166.
21. F. G. Frobenius, "Über die Charaktere der alternirenden Gruppe," S'ber. Akad. Wiss. Berlin (1901), 303-315; Ges. Abh. III, 167-179.
22. F. G. Frobenius, "Theorie der hyperkomplexen Grössen," S'ber. Akad. Wiss. Berlin (1903), 504-537; Ges. Abh. III, 284-317.
23. C. F. Gauss, Disquisitiones Arithmeticae, Leipzig, 1801; English translation by A. A. Clarke, Yale University Press, New Haven, 1966.
24. T. Hawkins, "The origins of the theory of group characters," Archive Hist. Exact Sc. 7 (1971), 142-170.
25. T. Hawkins, "New light on Frobenius's creation of the theory of group characters," Archive Hist. Exact Sc. 12 (1974), 217-243.
26. T. Hawkins, "Hypercomplex numbers, Lie groups, and the creation of group representation theory," Archive Hist. Exact Sc. 8 (1971), 243-287.
27. K. Ireland and M. Rosen, A Classical Introduction to Modern Number Theory, Springer-Verlag, New York, 1980.
28. N. Jacobson, Introduction, in Emmy Noether, Ges. Abh., Springer-Verlag, Berlin, 1983; 12-26.
29. W. Ledermann, "The origin of group characters," J. Bangladesh Math. Soc. 1 (1981), 35-43.
30. W. Ledermann, "Issai Schur and his school in Berlin," Bull. London Math. Soc. 15 (1983), 97-106.
31. A. Loewy, "Sur les formes quadratiques définies à indéterminées conjuguées de M. Hermite," Comptes Rendus Acad. Sci. Paris 123 (1896), 168-171.
32. H. Maschke, "Beweis des Satzes, dass diejenigen endlichen linearen Substitutionsgruppen, in welchen einige durchgehends verschwindende Coefficienten auftreten, intransitiv sind," Math. Ann. 52 (1899), 363-368.
33. E. H. Moore, "A universal invariant for finite groups of linear substitutions: with applications in the theory of the canonical form of a linear substitution of finite period," Math. Ann. 50 (1898), 213-219.
34. E. Noether, "Hyperkomplexe Grössen und Darstellungstheorie," Math. Z. 30 (1929), 641-692; Ges. Abh. 563-992.
35. I. Schur, Über eine Klasse von Matrizen, die sich einer gegebenen Matrix zuordnen lassen, Dissertation, Berlin, 1901; Ges. Abh. I, 1-72.
36. I. Schur, "Neue Begründung der Theorie der Gruppencharaktere," S'ber. Akad. Wiss.
Berlin (1905), 406-432; Ges. Abh. I, 143-169. I. Schur, "'Arithmetische Untersuchungen fiber endliche Gruppen linearer Substitutionen" S'ber. Akad. Wiss. Berlin (1906), 164-184; Ges. Abh. L 177-197. I. Schur, "Untersuchungen fiber die Darstellung der endlichen Gruppen durch gebrochene lineare Substitutionen," J. reine u. angew. Math. 132 (1907), 85-137; Ges. Abh. I, 198--205. E. Study, "Uber Systeme von complexen Zahlen," Gfttingen Nach. (1889), 237-268. M. Suzuki, "The nonexistence of a certain type of simple group of odd order," Proc. A.M.S. 8 (1957), 686--695. H. Weber, Lehrbuch der Algebra, vol. 2, Vieweg, Braunschweig, 1896. K. Weierstrass, "Zur Theorie der aus n Haupteinheiten gebildeten complexen Gr6ssen," G6ttingen Nach. (1884), 395-414. A. Weil, "Numbers of solutions of equations in finite fields," Bull. A.M.S. 55 (1949), 497-508; Collected Papers, I, 399--410.
Department of Mathematics, Institute of Theoretical Science, University of Oregon, Eugene, OR 97403-5203 USA
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4, 1992
David Gale* For the general philosophy of this section see Vol. 13, no. 1 (1991). Contributors to this column who wish an acknowledgment of their contributions should enclose a self-addressed postcard.
Ideas
As a graduate student in physics at the University of Michigan many years ago I had the good fortune to take a course in function theory from Norman Steenrod which pretty much changed the course of my life. My experience in that course plus several private conversations convinced me to switch out of physics and into mathematics. I recall one of our sessions particularly, where, in trying to describe what mathematical research was like, Steenrod said that really there were only about a dozen or so ideas in the whole subject which people just use over and over again, and once you have mastered these you are, so to speak, in business. I wish now I'd had the presence of mind to ask him for his list of the top twelve. In any case, I expect that on anyone's list, one of the ideas would be the so-called variational method, which will be the subject of this month's column.
The Variational Method
The variational method is used to give existence proofs. In trying to prove the existence of an object with certain properties, one picks an object which maximizes or minimizes some function. The resulting object is then shown to have the desired property, by showing that if it did not, one could "vary" the object so that the given function would further increase or decrease. Examples are everywhere dense in mathematics. The most familiar is perhaps Rolle's Theorem, and the most historically significant may be Riemann's proof of the celebrated mapping theorem, which he proved by minimizing the Dirichlet integral. As we know, there was a serious gap in Riemann's proof in that he did not show that this minimum existed, and it was not until some years later that Hilbert succeeded in supplying the missing argument. Of course, the method goes back much further. Perhaps the most basic example is the standard proof of the fundamental theorem of arithmetic which goes back to Euclid. A crucial step is to show that every two integers a and b have a greatest common divisor, that is, a divisor, d, of a and b which is divisible by all other divisors of a and b.

* Column editor's address: Department of Mathematics, University of California, Berkeley, CA 94720 USA
The Greatest Common Divisor
Because one is looking for the greatest something, one would expect that this would involve solving a maximum problem. Instead, it turns out that the right approach is to solve not a maximum but a minimum problem; namely, that of finding the smallest positive integer d such that d = ma + nb, where m and n are integers (there is no problem here about the existence of the minimum, which is, in fact, equivalent to one of the Peano axioms). The fact that any divisor of a and b divides d is now immediate, but it remains to show that d itself is a common divisor of a and b, and this is where the "variation" comes in. If, say, d did not divide a, then by the Euclidean algorithm, there is a positive r less than d such that a = qd + r, and r would be a smaller integral linear combination of a and b, a contradiction. A characteristic feature of this proof is that, like many variational proofs, it leads immediately to a simple algorithm for finding the greatest common divisor (gcd) of two numbers. I have gone over this very familiar proof in such detail to point out that it is more than just an application of the variational method. It is an example of a duality theorem. We see that the largest integer dividing a and b is the same as the smallest positive integer which is an integral linear combination of a and b. Such theorems play a central role in, among other subjects, the theory of linear programming. We will return to this later.
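The algorithm alluded to above can be sketched as follows. This is only an illustration, assuming a and b are positive integers; the function name and the bookkeeping variables are mine, not the column's. It keeps two integral linear combinations of a and b and repeatedly replaces the larger by its remainder on division by the smaller, which is exactly the "variation" of the proof.

```python
def smallest_positive_combination(a, b):
    """Return (d, m, n) with d = m*a + n*b the smallest positive such value.

    Assumes a > 0 and b > 0. Throughout the loop both x and y are integral
    linear combinations of a and b:
        x = mx*a + nx*b   and   y = my*a + ny*b.
    """
    x, mx, nx = a, 1, 0
    y, my, ny = b, 0, 1
    while y != 0:
        q, r = divmod(x, y)  # x = q*y + r with 0 <= r < y
        # The remainder r = x - q*y is again a combination of a and b.
        x, mx, nx, y, my, ny = y, my, ny, r, mx - q * my, nx - q * ny
    return x, mx, nx


d, m, n = smallest_positive_combination(84, 60)
print(d, m, n)  # prints "12 -2 3", since -2*84 + 3*60 = 12
```

The returned d is at once the greatest common divisor and the least positive combination, which is the duality statement in miniature.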
The Sylvester Problem
Another feature of the variational method is the fact that it often leads to very short proofs. A striking example of this is the famous problem posed by Sylvester in 1893: to show that if a finite set S of points in the plane has the property that any line through two of them passes through a third, then the points all lie on a line. Neither Sylvester nor any of his contemporaries was able to find a proof, and it was almost 50 years before the first rather complicated proof was published by Gallai. The short proof which follows is now well known. It was discovered in 1948 by L. M. Kelly (Amer. Math. Monthly 55, p. 28).

Suppose points with the Sylvester property are not collinear. Among pairs (p, L) consisting of a line L through two of the points and a point p not on that line, choose one such that the distance d from p to L is a minimum. Let q be the foot of the perpendicular from p to L. Then (variation) by assumption there are at least three points a, b, and c on L; hence two of these, say, a and b, are on the same side of q in the order a, b, q (and c can be on either side) as in Figure 1. But then the distance from b to the line ap is d', which is less than d, a contradiction.

Figure 1. Kelly's proof.

Kelly's proof is indeed short; but, herewith H. S. M. Coxeter (Introduction to Geometry, Wiley, 1961, p. 181): "This matter [Sylvester's problem] of collinearity clearly belongs to ordered geometry. [Indeed, the result is false over the complex numbers or finite fields! You can find an easy 9-point counterexample on the torus.] Kelly's Euclidean proof involves the extraneous concept of distance: it is like using a sledge hammer to crack an almond. The really appropriate nutcracker is provided by the following argument."

Coxeter's lovely proof (which, happily, is also variational) depends on Pasch's Axiom which in its simplest formulation asserts that it is impossible for a straight line to meet only one side of a triangle. (One way of thinking of this is as a very primitive special case of the Jordan curve theorem. If the line enters the triangle through one side, it must cross another side to get back out again.) The picture for the proof is very much like Kelly's, but this time we pick any point p and find a ray R from it which contains no other points of S but crosses at least one line connecting points of S. Each such line intersects R in some point, so we choose a line L whose intersection q with R is closest to p (not in the sense of distance, of course, but as an ordered set on R). Now (variation) there must be two points, a and b of L, on the same side of q. We show that there can be no third point of S on the line ap. There are two cases.

Case I. The third point, y, is between a and p. Then (Figure 2), no matter where c is, from Pasch's Axiom applied to triangle apq, the line cy will intersect R in a point closer to p than q.

Figure 2. Applying Pasch's axiom.

Figure 3. Applying Pasch's axiom to the other case.

Case II. The third point, x or z, is not between a and p. Then, as before, either bx or bz will intersect R in a point closer to p than q. (See Fig. 3.) Voilà! But
There are 2n points in the plane in general position, n of them are red and n are blue. Show that they span n disjoint segments having one end red and the other blue. Of course, by now you have gotten the message and realize that you should choose segments in a way which minimizes the sum of their lengths. The segments will then be disjoint, for (variation) if they crossed, apply Proposition XX of Euclid's Elements (the one about the shortest distance between a red and a blue point being a straight line) to show you could decrease the total length by uncrossing them. Once again, however, we are using the "sledge hammer" of distance on a theorem which is clearly affine (but not projective) invariant. I doubt, however, that this would have bothered the Greeks.
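The uncrossing argument also suggests a procedure, sketched below under the general-position assumption stated in the problem. The function names and the random sample are hypothetical, chosen only for illustration. Start with any red-blue matching and, whenever two segments cross, swap their blue endpoints. Each swap strictly shortens the total length and there are only finitely many matchings, so the loop must stop.

```python
import random


def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): +1, -1, or 0."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def crosses(a, b, c, d):
    """True if segments ab and cd cross (points assumed in general position)."""
    return orient(a, b, c) != orient(a, b, d) and orient(c, d, a) != orient(c, d, b)


def uncross(red, blue):
    """Return match so that the segments red[i]-blue[match[i]] are pairwise disjoint."""
    n = len(red)
    match = list(range(n))  # arbitrary starting matching
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(i + 1, n):
                if crosses(red[i], blue[match[i]], red[j], blue[match[j]]):
                    # Swapping partners replaces the two diagonals of a convex
                    # quadrilateral by two of its sides, which is shorter.
                    match[i], match[j] = match[j], match[i]
                    changed = True
    return match


random.seed(0)  # hypothetical sample: 8 red and 8 blue random points
red = [(random.random(), random.random()) for _ in range(8)]
blue = [(random.random(), random.random()) for _ in range(8)]
match = uncross(red, blue)
```

Note that this finds some crossing-free matching, not necessarily the one of minimum total length; the proof only needs that the minimizer cannot be improved by a swap.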
Birkhoff's Billiard Balls
Let T be a convex billiard table with a smooth (C^1) boundary. Then for any n > 1, there is an n-bounce periodic billiard-ball orbit (theorem of G. D. Birkhoff).

Proof. Among all inscribed n-gons (whose edges may, of course, cross), choose one whose perimeter is a maximum. This will be a billiard-ball orbit, i.e., angle of incidence will equal angle of reflection at each bounce point; for if not, perturb the bounce point on the boundary in the direction of the edge making the larger angle with the tangent. This will increase the perimeter (a nice calculus exercise in what the textbooks used to refer to as "related rates"). ∎

Note that the result is trivial for even-gons since the ball just bounces back and forth along a diameter of the set. In Figure 4 are a pair of 5-bounce Birkhoff billiard-ball orbits on an elliptic table, courtesy of Ben Lotto (as told to Mathematica). Interestingly, not much is known about nonsmooth billiard tables, in particular, about polygons. For example, for acute triangles there is always a 3-bouncer with bounces at the feet of the altitudes, the solution this time of finding the inscribed triangle with minimum perimeter. (Can you prove it?) It has been shown that periodic orbits exist for polygons whose angles are rational multiples of π. However, apparently nothing is known about obtuse triangles with irrational angles.
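The perturbation in the proof can be imitated numerically. The sketch below is only an illustration under assumptions of my own: an elliptic table with semi-axes 2 and 1, a 3-bounce orbit, a fixed step size, and plain gradient ascent on the perimeter. The derivative of the perimeter with respect to one bounce point is the tangent component of the sum of the two unit vectors pointing from its neighbours to that point, so it vanishes exactly when the two edges make equal angles with the tangent.

```python
import math

A, B = 2.0, 1.0  # semi-axes of an illustrative elliptic table (assumed values)


def point(t):
    return (A * math.cos(t), B * math.sin(t))


def tangent(t):
    return (-A * math.sin(t), B * math.cos(t))


def perimeter(ts):
    pts = [point(t) for t in ts]
    return sum(math.dist(pts[i], pts[(i + 1) % len(pts)]) for i in range(len(pts)))


def gradient(ts):
    """Partial derivatives of the perimeter with respect to each parameter t_i."""
    n = len(ts)
    pts = [point(t) for t in ts]
    grad = []
    for i in range(n):
        gx = gy = 0.0
        for q in (pts[i - 1], pts[(i + 1) % n]):  # the two neighbouring bounce points
            d = math.dist(pts[i], q)
            gx += (pts[i][0] - q[0]) / d
            gy += (pts[i][1] - q[1]) / d
        tx, ty = tangent(ts[i])
        grad.append(gx * tx + gy * ty)
    return grad


def maximize(ts, step=0.05, iters=20000):
    """Fixed-step gradient ascent: nudge each bounce point so the perimeter grows."""
    ts = list(ts)
    for _ in range(iters):
        ts = [t + step * g for t, g in zip(ts, gradient(ts))]
    return ts


start = [0.1, 2.0, 4.5]
orbit = maximize(start)
residual = max(abs(g) for g in gradient(orbit))  # how far from the reflection law
print(perimeter(start), perimeter(orbit), residual)
```

A small residual indicates that the reflection law holds at every bounce to within numerical error. Gradient ascent is my substitute for the existence argument in the proof, which simply takes a maximizer.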
The Desegregation Theorem
Speaking of short proofs, here is an example that came to me from Donald Newman via Murray Klamkin. Given any graph, prove that it is possible to color each vertex either black or white in such a way that at least half of the neighbors of each white (black) vertex are black (white). I must admit I didn't see how to prove this, but Jim Propp did and came up with, not a two-line, but a six-word proof. It will be given at the end of the column.
The Stable-Assignment Theorem
Finally, an example from economics. There are n workers and n employers, and if worker i works for employer j, they will together generate $a_{ij}$ units of some good, say bread. Assuming each employer can hire only one worker, who will work for whom and how will the partners divide up the bread they produce between wages, w, of the worker and profits, p, of the employer? (Wages and profits are paid in bread rather than money to emphasize that we are dealing with intrinsically valuable goods rather than paper and coins.) There is a simple economic "equilibrium" condition that gives the answer. If, in an assignment, some worker i is earning wage $w_i$ and some employer j whom i is not working for is making profit $p_j$, then we require that

$$w_i + p_j \ge a_{ij}, \tag{1}$$

for if the inequality went the other way, then i and j could both get more bread by working together. The question is then whether there will always exist an assignment of workers to firms and a division of the bread from each partnership so that (1) is satisfied. Such an arrangement is called a stable or equilibrium assignment.
Figure 4. A pair of Birkhoff billiard-ball orbits on an elliptic table.
Before looking at the existence question, we call attention to a wonderful property of equilibrium assignments which is a special case of what is sometimes called the Fundamental Theorem of Economics. Obviously, for the welfare of the society as a whole, the employer-worker assignments should be such that they produce the maximum possible amount of bread. Such an assignment is called optimal.
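To make the two definitions concrete, here is a small sketch. The 2-by-2 output table, the wage and profit vectors, and the function names are hypothetical, chosen only for illustration. It checks condition (1) for a proposed assignment and split of the bread, and finds the largest possible total output by brute force over all assignments, which is feasible only for small n.

```python
from itertools import permutations


def is_stable(a, partner, w, p):
    """True if partners split their output exactly and condition (1) holds."""
    n = len(a)
    # Each worker i and employer partner[i] divide exactly what they produce.
    splits = all(w[i] + p[partner[i]] == a[i][partner[i]] for i in range(n))
    # No worker i and employer j could both do better by pairing up.
    no_defection = all(w[i] + p[j] >= a[i][j] for i in range(n) for j in range(n))
    return splits and no_defection


def optimal_output(a):
    """Maximum total output over all n! assignments (brute force)."""
    n = len(a)
    return max(sum(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


a = [[4, 2], [2, 4]]   # hypothetical outputs a[i][j]
partner = [0, 1]       # worker i works for employer partner[i]
w, p = [2, 2], [2, 2]  # proposed wages and profits
print(is_stable(a, partner, w, p), sum(w) + sum(p), optimal_output(a))
# prints "True 8 8"
```

In this toy instance the stable split hands out exactly the maximum total of 8 units, whereas the crossed assignment partner = [1, 0] with w = p = [1, 1] fails condition (1).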
Theorem. An equilibrium assignment is optimal.

The proof is a two-liner.

Proof. For notational convenience, let us suppose that in the equilibrium assignment worker i works for employer i, so that

$$a_{ii} = w_i + p_i. \tag{2}$$

Summing (2) on i gives

$$\sum_i a_{ii} = \sum_i w_i + \sum_i p_i. \tag{3}$$

Now consider any other assignment σ where worker i is assigned to employer σ(i). From the equilibrium condition (1), we have

$$\sum_i a_{i\sigma(i)} \le \sum_i w_i + \sum_i p_{\sigma(i)} = \sum_i w_i + \sum_i p_i, \tag{4}$$

the last equation following because σ is a bijection. From (3) and (4), it now follows that $\sum_i a_{ii} \ge \sum_i a_{i\sigma(i)}$, which is precisely optimality. ∎

(It is because of theorems like the above that economists go around extolling the virtues of "market economies.")

We see then that the equilibrium property is related to a very natural maximum problem, which suggests using this maximum problem to prove the existence of equilibrium assignments. April Fool! It turns out, as in the case of the gcd, that instead one should consider the "dual" minimum problem. Namely, among all 2n-tuples $(w_1, \ldots, w_n, p_1, \ldots, p_n)$ satisfying the stability condition (1), choose one which minimizes $\sum_i w_i + \sum_i p_i$. With these minimizing values of the w's and p's consider the set S of all pairs (i, j) such that (1) is satisfied as an equation:

$$w_i + p_j = a_{ij}. \tag{5}$$

Let us say that if (5) holds for a pair (i, j), then i and j are compatible. If it is now possible to choose a subset of n disjoint compatible pairs from this set, then one easily sees that this gives the desired equilibrium assignment. The punch line in the proof now uses the famous "marriage theorem" of Philip Hall that if no such subset exists, then there must be a "bottleneck," namely, a set of k employers who are compatible with fewer than k workers; but in that case (variation), by the Law of Supply and Demand, one could increase by ε the $w_i$ of these "overdemanded" workers which would decrease the $p_j$ of the k firms and this would decrease $\sum_i w_i + \sum_i p_i$, a contradiction. ∎

As in our other examples, this proof leads to a good ($O(n^2)$) algorithm for finding an equilibrium assignment, the so-called Hungarian method due to Harold Kuhn.

Problems
Colored triangles (92-1) by Imre Barany, Budapest
A point p in the plane lies in the intersection of a red, a white, and a green triangle [editor: these are the colors of the Hungarian flag!]. Prove that one can choose one vertex from each of the triangles such that p lies in their convex hull.

Here is Propp's proof of the Desegregation Theorem:

Maximize the number of interracial neighbors.

Note that once again the proof leads to a good coloring algorithm which terminates in at most E iterations, where E is the number of edges of the graph. Namely, start with any coloring, and if a vertex does not satisfy the conditions, change its color. (Since this was written, Alexandre Giventhal proposed a three-word proof: Maximize Ferromagnetic Energy.)
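The coloring algorithm that goes with Propp's proof of the Desegregation Theorem can be sketched as follows. The adjacency-dictionary representation, the function name, and the four-vertex example graph are my own assumptions for illustration. Each flip strictly increases the number of edges whose ends have different colors, so there can be at most E flips in all.

```python
def desegregate(adj):
    """Color the vertices 0/1 so at least half of each vertex's neighbours differ from it.

    adj maps each vertex to the set of its neighbours (undirected simple graph).
    Returns the coloring and the number of color changes performed.
    """
    color = {v: 0 for v in adj}  # start with any coloring
    flips = 0
    changed = True
    while changed:
        changed = False
        for v in adj:
            same = sum(1 for u in adj[v] if color[u] == color[v])
            if 2 * same > len(adj[v]):  # fewer than half the neighbours differ
                color[v] ^= 1           # flipping gains same - (deg - same) > 0 two-colored edges
                flips += 1
                changed = True
    return color, flips


# Hypothetical example: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
color, flips = desegregate(graph)
print(color, flips)
```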
After much cajoling on my part, the retiring director of the Berkeley Mathematical Sciences Research Institute agreed to contribute the following opus which perhaps reflects some of the things he learned during his tenure in office.
The Deep Young Man A parody of Bunthorne's song in Patience, with admiration for, and apologies to Sir William Gilbert and Sir Arthur Sullivan
By Irving Kaplansky

If you wish to cut a path
In the modern world of math
You must spout a lot of things.
You must show you know topology
And étale cohomology,
In physics, mention strings.
You can live a life of pleasant ease
Because a set of PDE's
Is just a space of loops.
You should act a bit convivial
And tell them that it's trivial
If they think of quantum groups.

Chorus:
And everyone will beam
When you tell them of your scheme
To make them see
À la Bourbaki
That if one only tries,
Then hyperbolic manifolds and affine planes
Are wavelets in disguise.

And of course you'll give a lecture
On the Poincaré conjecture;
Make it full of double talk.
But don't hint that you can do it,
Others did and lived to rue it,
And use plenty of colored chalk.
You will want to put in lots
Of invariants for knots;
Keep the lemmas coming fast.
Then if their eyes grow bleary,
Just invent a subtle theory
That may settle Fermat's last!

Chorus (Gilbert's words unchanged):
And everyone will say
As you walk your mystic way:
If this young man expresses himself
In terms too deep for me,
Why, what a very singularly deep young man
This deep young man must be.
Jet Wimp*

* Column Editor's address: Department of Mathematics, Drexel University, Philadelphia, PA 19104 USA.

The Bernoulli Edition: The Collected Scientific Papers of the Mathematicians and Physicists of the Bernoulli Family
Basel: Birkhäuser, 1935-ongoing
Reviewed by David Speiser

In this announcement, I want to describe an extensive ongoing publishing activity called The Bernoulli Edition. I also wish to extend a worldwide invitation to scientists, mathematicians, and physicists to collaborate in its preparation. The Bernoulli Edition is a multi-volume effort prepared under the auspices of the Naturforschende Gesellschaft, Basel. Its purpose is to make accessible to scientists and historians the papers of the members of the Bernoulli family of the eighteenth century. The Edition will eventually include the work and letters of all scientific members of the family. It will contain the published work of the Bernoullis and their most important unpublished work. The Edition will document all other work and indicate its location. It will reproduce the important letters in their entirety, others in part.

Let me provide the reader with some biographical data concerning the eight Bernoullis and their fellow scientist Jacob Hermann, a disciple of the older Jacob.

Jacob I (1654-1705) is known primarily for his contributions to the development of the infinitesimal calculus, the calculus of variations, differential geometry, analysis, the theory of probability ("Law of Large Numbers"), and mechanics, especially the dynamics of rigid bodies and the theory of elasticity.

Johann I (1667-1748) was the younger brother of Jacob. He is known today mainly for his solutions of the problems of the catenary and the brachystochrone. It was Johann who presented the new infinitesimal calculus in a systematic way in his private lectures to the Marquis de l'Hôpital. He also made important contributions to the calculus of variations, differential geometry, the problem of orthogonal trajectories, and hydrodynamics. He corresponded with Newton, whose results he improved. With Newton and his brother Jacob he stands as one of the towering figures in the origins of analytical mechanics. To historians he is especially important as the foremost teacher of his time. His influence not only on his sons but on many others, especially Euler, his disciple, was enormous. His ongoing quarrels and disputes are notorious in the chronicles of mathematical infighting. He engaged in bitter arguments with his brother, with Cartesians to whom he championed the new calculus, and especially with English mathematicians against whom he defended Leibniz as an independent inventor of the calculus.

Jacob Hermann (1678-1733) was a disciple of Jacob I and a protégé of Leibniz. The first issue of the newly founded Commentarii Academiae Scientiarum Imperialis Petropolitanae opened with one of his articles. He was famous for his Phoronomia, the first systematic presentation after Newton of mechanics that used infinitesimal calculus. He also published more than 50 papers in analysis, geometry, and especially mechanics. He was one of the first mathematicians to use creatively ideas in the Principia, and thus, while initially suffering neglect, his work is now gaining increased attention from historians of mathematics.

Daniel (1700-1782) we remember for his achievements in hydrodynamics, for his researches on what we now call the Bernoulli equation, and for his seminal work in the kinetic theory of gases. Today we recognize him as the originator of the theory of oscillations; in particular, the discoverer of normal modes of multiple oscillations. His theory of tides was among the first physical theories to exploit Newton's theory of gravitation. He found practical, technical questions interesting, e.g., the precision of clocks, and with an artisan from Basel he constructed the first apparatus for measuring the magnetic inclination.
The "minor" Bernoullis made contributions which, although of smaller importance than those of their relatives, are essential to an understanding of the entire corpus of the family. The five in question are: Nicolaus I (1687-1759), nephew of Jacob I and Johann I; Nicolaus II (1695-1726), eldest son of Johann I; Johann II (1710-1790), third son of Johann I; Johann III (1744-1807), eldest son of Johann II; and Jacob II (1759-1789), youngest son of Johann II. Of their accomplishments I mention only these: Nicolaus I's evaluation of the series
$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$$
and his formulation of the notorious "St. Petersburg Paradox" of probability theory; Nicolaus I's and Nicolaus II's research into orthogonal trajectories; Johann II's attempt to establish a wave theory of light in terms of transverse waves; Jacob II's (unsuccessful) attempt to provide a theory of the oscillating plate.

The works of the Bernoullis cover a period of over 100 years, from 1682 until 1789. They wrote their published papers in Latin and in French.

O. Spiess started The Bernoulli Edition in 1935. His colleagues and successors carried on Spiess' work. The efforts of these workers have to date resulted in the preparation for publication of the works of Jacob I (six volumes); a volume of the works of Jacob I and Johann I on the calculus of variations; a facsimile volume of Jacob I's scientific diary, Meditationes; the letters from and to Jacob Bernoulli¹; the correspondence of Johann I with his brother, with the Marquis de l'Hôpital and with P. Varignon²; and the complete works of Daniel Bernoulli (without letters) in eight volumes³.

The final stage of the publication will address the works of Johann I Bernoulli, Jacob Hermann, and the five minor Bernoullis. P. Radelet-de Grave recently submitted a draft plan to the scientific committee, allowing for 13 volumes. A definitive plan will appear next year. F. Nagel has compiled an inventory of the ca. 4700 known letters by or addressed to a Bernoulli (but for Johann III). We hope to have by 1993 a plan for the edition of the correspondence stipulating which letters are to be printed in their entirety. The correspondence will fill another 10-15 volumes.

The Bernoulli Edition is under the patronage of the Naturforschende Gesellschaft in Basel, directed by the Kuratorium of the O. Spiess Foundation (president: J.-L. von Planta), and by the board of the Verein zur Förderung der Bernoulli-Edition (president: H. Stähelin). The chief editors for The Edition are D. Speiser, general editor (Basel); P.
Radelet-de Grave, editor for the works (Louvain-la-Neuve, Belgium); F. Nagel, editor for the correspondence (Basel); M. Mattmüller, permanent assistant in charge of the preparation of the text (Basel).

While The Edition originates in Switzerland and is financed mainly by the Swiss National Science Foundation, the editing of the works and letters is a worldwide enterprise. Scientists and scholars of nine countries and three continents have collaborated in this effort. Among the editors of volumes in print or in preparation are A. Weil (Princeton), responsible for three volumes, C. A. Truesdell (Baltimore), A. Englebert (Brussels), B. L. van der Waerden, C. S. Roero (Turin), U. Troehler and V. Zimmermann (Göttingen), U. Bottazzini (Milan), J. Lederer (Brussels), G. Mikhailov (Moscow), J. Peiffer (Paris), P. Radelet-de Grave (Louvain-la-Neuve), and D. Speiser (Basel).

Any reader of The Mathematical Intelligencer who is interested in editing one or several volumes should write to me. Please indicate which of the volumes interest you. Please provide some information about your professional background and experience as a historian (publications, conferences, lectures, etc.). The editors are expected to furnish an introduction and a commentary for each volume. Each volume will contain an index of the complete works of the author and auxiliary material of help to the reader. The allocation of material in the volumes will be such that never more than two, or possibly three, editors will be required for any one volume. The texts will be printed in the original language; introductions and commentaries preferably in English. Each editor will receive photocopies of the works or letters to be printed in the volume. The chief editor will provide all additional necessary information.

Bromhübelweg 5
CH-4144 Arlesheim
Switzerland

¹ Of these, four volumes have appeared, two are in print, two more are in preparation.
² Of these, two volumes have appeared, one volume is in print, one volume is in preparation.
³ Two volumes have appeared, two volumes will be ready by the beginning of the next year, two more are in preparation.
Ethnomathematics: A Multicultural View of Mathematical Ideas
by Marcia Ascher
Pacific Grove, Calif.: Brooks/Cole Publishing Company, 1990. ix + 203 pp.
Reviewed by David Wheeler

Like a good movie or a good novel, this book makes its effects at more than one level. It gives us glimpses of small-scale mainly nonliterate societies behaving in mathematical ways; it shows familiar mathematics in some unfamiliar contexts; it underlines the interplay of mathematical ideas and culture. The book suggests the need for a revaluation and revision of the stories that our own culture tells about mathematics, particularly about its history.

The body of the book comprises six chapters, which in turn focus on number words and symbols, graphs, kinship, game strategies, models of space, and strip patterns. The author intersperses her reports of localised cultural phenomena with accounts of the "Western" mathematical ideas she finds exemplified in them. The chapter "Tracing graphs in the sand," for example, begins with an anecdote about an ethnologist challenged by the tribe he is studying to draw a certain
pattern in the sand with a continuous finger movement, then goes on to give a brief background of the basic vocabulary of graph theory, including the idea of "Eulerian path," and continues with examples of unicursal figures occurring in the drawings of the Bushoong and Tshokwe peoples of West Africa. (The author provides brief anthropological sketches of each society.) A shift to the Malekula in New Hebrides reveals another ethos, ascribing a central place to the continuous drawing of two-dimensional configurations, some treated as representational, others ritualistic, others mythic. The author remarks that the drawing procedures, which she divides into three classes, are systematic, the rules having been passed from one generation to the next. After introducing some notation, she shows that the Malekula system of classifying "nitus," as the drawings are called, is compatible with the Eulerian classification of graphs, and makes some interesting inferences about the way the drawing procedures produce symmetrical configurations. The chapter ends with a brief comment pointing out that "different peoples have pondered the same problem . . . and found the idea sufficiently intriguing to elaborate it well beyond practical necessity" (p. 62).

I am never sure how much surprise to feel at examples of this kind: examples of the evidence, that is, of other, very different, societies being in possession of procedures that we could call algorithmic and of knowledge that we might call theorems. However astonished I am by the similarities, it does seem important to keep a firm hold on the crucial differences. When Marcia Ascher introduces the material on the Malekula drawings, she tells us that the drawings relate to myths about passage to the Land of the Dead and to beliefs about the origins of death. The next paragraph, beginning "Viewed solely as graphs defined by vertices and edges, the nitus vary in complexity . . ." (p.
46), startles me into recalling the gulf between the significations of these drawings in the Malekula and in our own mathematical culture, of the wealth of meaning that we often discard in order to transport the phenomenon into a form suitable for mathematical discussion. Do the gains compensate for the loss? Possibly for us they do because we have learned how to suspend judgment about the pay-offs from mathematisation. The Malekula, however, must surely find a reduction of their nitus to the austerity of mathematical graphs even more puzzling than we find their pictures of mortality. A culture is as much shaped by what it ignores as by what it chooses to emphasise. Even where a society like the Malekula develops a system of actions and ideas that appears to us strangely like familiar mathematics, that system does not function in that society remotely the way mathematical systems function in ours. Western society has discarded the kind of rituals
A nitus (graph with a unicursal route) of the Malekula.
the Malekula value: Indeed, it tends to throw away everything nontransferable and nontransformable with a profligacy that seems deeply and characteristically related to the search for multivalence that is at the heart of our mathematics. From this perspective, it is not clear what validity we could ascribe to the claim that the same mathematical ideas turn up in different cultural settings. However, the questions persist; the many examples from other cultures of that "strange similarity" continue to snag our interest. Perhaps, as with the Eulerian paths, a particular "problem" yields the "solution" it must to anyone anywhere exploring it diligently enough, a solution that then may or may not be recorded and stored in the community memory and may or may not lead to other "discoveries." Many similarly nearly explicable examples, especially those to do with games and pattern-making, appear in the book. Ideas about space and time, though, are not solutions to well-posed problems; they do not arise out of situations that we can externalise in a straightforward way and then study. "We are in space, we move through space, and we act on space . . . . We . . . conceptually structure space in order to have a context for perceiving and describing objects and motion within it . . . . And for the people of each culture, their physical and conceptual structuring of space-time is such an integral part of their world and their world view that it seems both obvious and natural" (pp. 149-150). Marcia Ascher's swing through some examples from the Navajo, the Inuit, and the Caroline Island navigators is exhilarating, making this, for me, the most interesting of her six core chapters. Here one is struck by the differences between the various cultural "answers"
and the official mathematical story; but it seems to me that story, in either the Euclid or the Riemann edition, falls a long way short of accurately expressing people's experience of space and time, even of those people brought up with the benefits of the language and conceptual framework of Western mathematics. Michel Serres points out that we each inhabit many geometrical spaces, only one of them--the space of labour, he wryly suggests!--corresponding to the space of Euclidean geometry. These space-time examples do a better job of convincing me of the value of taking "a multicultural view of mathematical ideas" (to quote the book's subtitle) than the material on, say, kinship, where the graphs, group tables, and Cayley diagrams easily can take on the appearance of "neat stuff," encouraging a patronising "Gee! Look how smart these primitive people really are!" attitude. I hasten to add that there is not a hint of patronage in Marcia Ascher's own approach. She makes it clear in her concluding reflections on ethnomathematics that one of her main motivations in writing the book was to contribute to modifying the cultural assumptions underlying much of current writing about the history of mathematics. She finds that a late nineteenth-century evolutionary outlook still informs our views of mathematics--to be later is always better and to be non-Western is necessarily worse. Anyone with the slightest interest in mathematics as a pervasive ingredient of human culture is in debt to the author for making so much fascinating material accessible. The book will enable us to modify in an appropriate way the stories we tell about mathematics in significant, concrete instances. The text is marvellously well written, composed by a writer who knows exactly how to take her readers into account. The book production is sturdy and attractive. My sole complaint is the absence of a comprehensive listing of the references given in the detailed notes following each chapter.
I hope mathematicians are not too busy to ponder, if only on Sundays, the issues raised by this book.
206-1273 Merklin Street
White Rock, British Columbia V4B 4B8
Canada
The Crest of the Peacock
by George Gheverghese Joseph
London, New York: I.B. Tauris & Co., Ltd., 1991. xvv + 368 pp. Distributed by St. Martin's Press, New York
Reviewed by D. J. Struik
The title of this book, written by the Indian-born Senior Lecturer in Econometrics and Social Statistics at the University of Manchester, England, is taken from an ancient Indian saying that "like the crest of a peacock," so is "mathematics at the head of all knowledge." It is primarily an eloquent plea for a broader understanding of the history of mathematics than is usually presented in our textbooks and lectures on this subject, where the presentation, as the author claims, is tainted with Eurocentrism, a polite term for the spirit of colonialism. He proposes to show that
The standard treatment of the history of non-European mathematics exhibits a deep-rooted historiographical bias in the selection and interpretation of facts, and that mathematical activity outside Europe has as a consequence been ignored, devalued or distorted. (p. 3)
Indeed, the traditional way to sketch the history of mathematics, at any rate in what we call "the West," is to accentuate the Greek achievement and from there to jump over to the Italian Renaissance and the subsequent development associated with Descartes, Newton, Leibniz, Euler, and their European followers. In this process a polite bow is made to the Babylonians with their sexagesimals, to the Egyptians with their curious unit fractions, and to the Hindus with their decimal position system. As to the Arabs, they are supposed to have distinguished themselves mainly by passing "the torch of Greek science to the Europeans," to quote Cajori (1919), in this process, admittedly, bringing some beginnings of algebra. This is, I believe, not a bad way to explore, for those who like to know how our college and academic mathematics came into being, the way from the theorem of Pythagoras to the integral of Lebesgue. We must realize, however, that the scope of mathematics history has been widening considerably during this century, with results not only of scholarly and cultural interest, but also of importance to the philosophy and education of mathematics, even affecting questions concerning the nature of mathematics itself.
This has caused the traditional way to become highly unsatisfactory, if we like to consider the history of mathematics as a whole (as our textbooks claim they do), especially when we look at this history as an aspect of cultural history. Professor Joseph has written his book to point this out in a forceful way and to warn us that forgetting or underestimating these changes in our knowledge, especially of non-European mathematical structures, has become a deplorable sign of Eurocentrism. Let us have a quick look at the many changes in our outlook on non-European mathematics. There is, to begin with, the work of Neugebauer and Thureau-Dangin* in the 1920s and later, which has opened an entirely new world, that represented by Babylonian cuneiform tablets. I still remember the amazement (perhaps you might call it cultural shock) when, around 1930, I learned that the theorem of Pythagoras was known more than a millennium before the time of the Greek sage, together with results such as solutions of systems of equations, compound interest, and Pythagorean triples. Our understanding and respect for ancient Egyptian mathematics was deepened by Gillings, Van der Waerden, and other authors, and so was the history of Indian mathematics by such men as Gupta and Rajagopal. This led, among other surprises, to the discovery that Madhava as early as the 15th century proposed infinite series expansion of circular and trigonometric functions, thus perhaps even anticipating the calculus. For me, at any rate, came another new understanding when Needham and Wang Ling (1959) made us better acquainted with the highly sophisticated Chinese mathematical culture, several millennia old, and through its Arabic and Indian connections a formative factor in the history of mathematics. While Archimedes had π to about two decimals, Tsu Chung Chih had π between 3.1415926 and 3.1415927 (circa A.D. 450), Madhava had π correct to 11 decimal places (circa 1400) and Al-Kashi to 16 decimal places (1429). This makes Viète a newcomer, with only nine decimal places (1579). After him, the "West" picks up.
* Titles of papers and books published by authors who are mentioned in this article can be found in D. J. Struik, Concise History of Mathematics, 4th ed. (New York: Dover, 1987).
[Figure caption: The Kou-ku (Pythagorean) theorem illustrated in the earliest extant Chinese mathematical text.]
Arabic history has also undergone a face-lifting. These Arabs were, of course, seldom Arabs, but Persians, Tadjiks, Jews, Moors, and others, but they all used the Arabic language, just as their Western and Central European contemporaries used the Latin language. We now know, much better than in earlier days, especially through Russian authors, that the achievements of these Arabic scholars were far more than mere translations from Greek, Sanskrit, Babylonian, or Indian sources, but formed an impressive corpus in their own right, and not just a transmission belt. To point this out has been the main task Professor Joseph has set himself. But he also points out that new insight has come from the American side of the Atlantic. Discovering the Maya remains and deciphering their symbols has given us information on their type of mathematics based on a curious vigesimal system especially used in astronomy. And we have, mainly through the work of R. and M. Ascher, received new knowledge on the quipu reckoning of the Incas. (A quipu is a structure consisting of a colored cotton cord with many cords hanging from it, with knots representing numbers in the decimal position system.) These quipus, it seems, could give a satisfactory arithmetic for the statistical data of Inca bureaucracy. This last discovery has taught us that nonliterate societies (the Incas had no script) still can possess their own mathematical methods. Has this also happened to other nonliterate cultures, say perhaps that of the "druids" of Stonehenge? Quipus have been preserved in arid Andean graves, but Stonehenge records, if they ever existed, seem to have been lost forever.
This brings us to the question: What do we know of the mathematics of nonliterate people in general, from Stone Age times to the present? Professor Joseph has also turned his attention to Africa, where the oldest artifact of mathematical interest has been found, the Ishango Bone. This bone tool, dating back to about 35,000 B.C. (and now in Brussels), has two sides with clearly visible notches, arrayed in groups, which may represent an arithmetical game of some sort, or some ritual, or hunting, or perhaps an astronomical record. Professor Joseph does not devote much further space to the records of African mathematics, but this has been done by Claudia Zaslavsky, Paulus Gerdes, and others in their study of present-day mathematical notions of peoples of Nigeria, Zambia, Mozambique, etc. Here pottery, tiles, fishnets, knots, games, decorations, sand drawings (graphs we call them), and even kinship relations present mathematical patterns. Or shall we use the term "proto-mathematics"? But Molière's M. Jourdain did not know either that he had been talking prose.
This has led us into a field now known as ethnomathematics, about which we can learn a good deal in Marcia Ascher's book of that title (see review on p. 66). But we wander away from the Joseph book because its intention is not ethnomathematics, but is a plea for abandoning an approach to the history of mathematics which looks with condescension on non-European achievements, even seeing in them merely the "childhood" of our own superior European-bred mathematics. For this purpose he emphasizes "the global nature of mathematical pursuits and creations." Thus he places in the forefront of our attention full accounts of the achievements of the Egyptian (Chapters 3 and 5), Babylonian (Chapters 4 and 5), Chinese (Chapters 6 and 7), Indian (Chapters 8 and 9), and Arabic mathematics (Chapter 10). There is no other book, I believe, in which such a thorough exposition of these mathematical cultures is found under one cover. The first two chapters give us a justification for the book and an account of mathematics of "bones, strings and standing stones." There is an interesting bibliography at the end. It is also pointed out that we may not neglect the influence on Greek science coming from Babylonian and Egyptian sources (of which the ancient Greeks were well aware), and the effect of African influences on the whole of Egyptian culture. This is the topic of M. Bernal's book Black Athena (London, 1987). Just now an exposition of Nubian culture at Harvard reminds us of this same subject. Sometimes Professor Joseph goes a little too far. He speaks of Europe's "deep slumber" in the "Dark Ages" (p. 9, also p. 345). But there was some interesting mathematical thinking by the medieval scholastics, mostly of a very philosophical nature, concerning such concepts as the infinite and the indivisibles. Moreover, the builders of the cathedrals must have had their more practical mathematics. It is with this mathematics as with the Chinese and other non-European achievements: We have to understand and appreciate them in their own cultural terms rather than consider them merely as "childhood" stages of our own academic mathematics. The ethnomathematicians will also have to say something about this since several of them extend their ideas from nonliterate peoples to other ethnic groups, which may be literate cultures, urban or rural ghettos. Incidentally, the term "medieval," which occasionally finds a place in the book (cf. p. 20), is itself Eurocentric. The word "Middle Ages," medium aevum, is a seventeenth-century term based on the supposed "dark ages" between the Greeks and the Renaissance in the Europe of the fifth to fifteenth century. Each culture then has its own idea of what mathematics is and what it will accomplish, and this also holds for "proto-mathematics," in whatever way we think about it. Our present book takes more or less for granted that we recognize mathematics when we see it (not difficult in the non-European mathematics it describes), though the ethnomathematicians may have trouble (see, e.g., J. Høyrup's review of Ascher's book, Math. Reviews, March 1992). Does the potter producing urns with beautiful geometric shapes commit himself to mathematics (G. D.
Birkhoff, in his day, had his ideas about it in Aesthetic Measure, 1933)? Since the linguists discover mathematical structure in our languages, shall we include language forms in our ethnomathematics? We had better be a little careful. I am again straying away from the Joseph book, which wants to cure us of our Eurocentrism, a trait we share with other "Western" historians. Recognizing that there also exists a non-European chauvinism (e.g., on the part of Arab, Chinese, and Indian scholars), Professor Joseph writes:
While non-European chauvinism does persist, the "arrogant ignorance" (as J. D. Bernal, 1969, described the character of European scholarship in writing history of science) is the other side of the same coin. But the latter tendency has done far more harm than the former because it rode upon the political domination imposed by the West, which imprinted its own version of knowledge on the rest of the world. (p. 216)
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4, 1992
This criticism even extends to our academic "Development Studies," Anthropology and Oriental Studies. These subjects in turn serve as the basis for which more elaborate Eurocentric theories of social development and history are developed and tested. (p. 3) Even the logic prevalent in the different cultures may be different. They do not all follow Euclid, Frege, or Russell. Logic, as M. Ascher has pointed out, may even come through dream messages directing rituals in a "proto-mathematical" way. Joseph does not go that far, but points out that what passes for rigor in different cultures may not be the same. In "Western" mathematics, we base our proof on axiomatic procedures dating back to Euclid. "But," asks our author, is this not taking a highly restrictive view of what is a proof? Could we not expand our definition to include, as suggested by Lakatos (1970), explanations, justifications and elaborations of a conjecture constantly subjected to counter-examples? . . . He continues, It is possible to distinguish between logically deductive and axiomatically deductive algebraic reasoning. Once Hilbert and Russell had laid the foundations of mathematical logic, it became possible to construct an algebra from a limited set of axioms. Previously, what great mathematicians such as Euler, Gauss and Lagrange had considered as proof was logically deductive proof. (p. 127) We can compare the reasoning in the Chinese "Nine Chapters on the Mathematical Arts" with that in Euclid's Elements, books which occupy a similar position in both cultures and are roughly speaking contemporary. The certainty of reasoning they gave in both cultures seems to have been the same. We know how the Elements are constructed. The "Nine Chapters" is a collection of 246 problems in the algebraic-arithmetic tradition similar to that of Babylonian mathematics. This seems to have presented a method with satisfactory certainty.
It is not only this, but we must realize also the fact that in "Western" mathematics the notion of rigor has also been subjected to changes. Rigor, we can say in Joseph's term, is a cultural concept. The book's last paragraph reflects some of the author's spirit: If there is a single universal object, one that transcends linguistic, national and cultural barriers, and is acceptable to all and denied by none, it is our present set of numerals. From its remote beginnings in India, its gradual spread in all directions remains the great romantic episode in the history of mathematics. It is hoped that this episode, together with other non-European mathematical achievements highlighted in this book, will help to extend our horizons and dent the parochialism that lies behind the Eurocentric perception of the development of mathematical knowledge.
Department of Mathematics Massachusetts Institute of Technology Cambridge, MA 02139 USA
A Diary on Information Theory
by Alfréd Rényi
Chichester: John Wiley & Sons, 1987. ix + 125 pp. Hardcover, US$54.95 (ISBN 0-471-90971-8)
Reviewed by Gregory J. Chaitin
Can the difficulty of an exam be measured by how many bits of information a student would need to pass it? This may not be so absurd in the encyclopedic subjects but in mathematics it doesn't make any sense since things follow from each other and, in principle, whoever knows the bases knows everything. All of the results of a mathematical theorem are in the axioms of mathematics in embryonic form, aren't they? I will have to think this over some more.
A. Rényi (A Diary on Information Theory, p. 31)
This remarkable quotation comes from Rényi's unfinished 1969 manuscript, written in the form of a fictitious student's diary. This "diary" comprises the bulk of Rényi's posthumous work, A Diary on Information Theory, a stimulating introduction to information theory and an essay on the mathematical notion of information, a work left incomplete at Rényi's death in 1970 at the age of 49. Alfréd Rényi was a member of the Hungarian Academy of Sciences. The Diary, as well as the material on information theory in his two books on probability theory [1,2], attest to the importance he attached to the idea of information. This Diary also illustrates the importance that Rényi ascribed to wide-ranging nontechnical discussions of mathematical ideas as a way to interest students in mathematics. He believed the discussions served as vital teaching tools and stimuli for further research. Rényi was part of the tidal wave of interest in information theory provoked by Claude Shannon's publications in the 1940s. The many papers actually published with titles like "Information Theory, Photosynthesis, and Religion" illustrate the tremendous and widespread initial interest in information theory. When Rényi wrote his Diary, the initial wave of interest in information theory was dying out. In fact, Rényi was unaware of a second major wave of interest in information theory slowly beginning to gather momentum in the 1960s.
At that time, Andrei Kolmogorov and I independently proposed a new algorithmic information theory to capture mathematically the notion of a random, patternless sequence as one that is algorithmically incompressible. The development of this new information theory was not as dramatically abrupt as was the case with Shannon's version. It was not until the 1970s that I corrected the initial definitions. The initial definitions Kolmogorov and I proposed had serious technical deficiencies which led to great mathematical awkwardness. It turned out that a few changes in the definitions
led to a revised algorithmic information theory whose elegant formulas closely mirror those in Shannon's original theory in a radically altered interpretation [3]. In the 1970s I also began to apply algorithmic information theory to extend and broaden Gödel's incompleteness theorem, culminating in the 1980s in an explicit constructive proof that there is randomness in arithmetic [4]. (For recent discussions of algorithmic information theory directed to the general scientific public, see [5-16].) Rényi's Diary stops at the brink between Shannon's ensemble information theory and the newer algorithmic information theory applying to individual sequences. With the benefit of hindsight, one can detect the germ of ideas that, if Rényi had pursued them properly, might have led him in the direction of algorithmic information theory. Let us take the quotation at the head of this review. If Rényi had developed it properly, it might have led him to my insight that incompleteness can be obtained very naturally via metatheorems whose spirit can be summarized in the phrase, "a theorem cannot contain more information than the axioms from which it is deduced." I think this new information-theoretic viewpoint makes incompleteness seem a much more menacing barrier than before. A second instance occurs later in Rényi's Diary, p. 41: Therefore, the method of investigating the redundancy of a text by erasing and reconstruction is not appropriate. By this method, we would get a correct estimation of the real redundancy only if the reconstruction could be done by a computer. In that case, the meaning of the text wouldn't be a factor because a computer wouldn't understand it and could reconstruct it only by means of a dictionary and grammatical rules. If Rényi could have formalized this, perhaps he might have discovered the complexity measure used in algorithmic information theory.
(In algorithmic information theory, the complexity of a string or sequence of symbols is defined to be the size of the smallest computer program for calculating that string of symbols.) So Rényi's Diary balances on the edge between the old and the new versions of information theory. It also touches on connections between information theory and physics and biology that are still the subject of research [7,8]. In what remains of this review, I would like to flesh out the above remarks by discussing Hilbert's tenth problem in the light of algorithmic information theory. I will end with a few controversial remarks about the potential significance of these information-theoretic metamathematical results, and their connection with experimental mathematics and the quasi-empirical school of thought regarding the foundations of mathematics.
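The definition of complexity given in parentheses above cannot be computed exactly, but its flavour can be suggested with an ordinary compressor. The following is only a rough sketch under stated assumptions: it uses Python's standard `zlib` module as a stand-in (this is not the measure used in algorithmic information theory), and the helper name `compressed_size_bits` and the two test strings are hypothetical choices made for illustration. A decompressor together with the compressed data is one particular program that prints the string, so the compressed size is, up to an additive constant, an upper bound on the true complexity.

```python
import random
import zlib


def compressed_size_bits(data: bytes) -> int:
    """Size in bits of a zlib encoding of `data` (illustrative helper).

    Assumption: zlib is used only as a crude, computable proxy. It bounds
    program-size complexity from above; it does not compute it.
    """
    return 8 * len(zlib.compress(data, 9))


# A highly patterned string: the two bytes "01" repeated 5000 times.
patterned = b"01" * 5000

# A patternless-looking string of the same length from a seeded generator.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(len(patterned)))

for name, data in [("patterned", patterned), ("noisy", noisy)]:
    raw_bits = 8 * len(data)
    print(name, raw_bits, "bits raw,", compressed_size_bits(data), "bits compressed")
```

The patterned string shrinks to a small fraction of its raw size, while the second string does not shrink at all. Note the limitation this exposes: the second string was itself produced by a short program (a seeded pseudo-random generator), so its true program-size complexity is small even though the compressor cannot detect that. A compressor can only ever supply an upper bound.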
Consider a diophantine equation

$$P(k, x_1, x_2, \ldots) = 0$$

with parameter $k$. Ask the question, "Does $P(k) = 0$ have a solution?" Let

$$q = q_0 q_1 q_2 \cdots$$

be the infinite bit string with

$$q_k = \begin{cases} 0 & \text{if } P(k) = 0 \text{ has no solution,} \\ 1 & \text{if } P(k) = 0 \text{ has a solution.} \end{cases}$$

Let

$$q^n = q_0 q_1 \cdots q_{n-1}$$

be the string of the first $n$ bits of the infinite string $q$, that is, the string of answers to the first $n$ questions. Let $H(q^n)$ be the complexity of $q^n$, that is, the size in bits of the smallest program for computing $q^n$. If Hilbert had been right and every mathematical question had a solution, then there would be a finite set of axioms from which one could deduce whether $P(k) = 0$ has a solution or not for each $k$. We would then have

$$H(q^n) \le H(n) + c.$$

The $c$ bits are the finite amount of information in our axioms, and this inequality asserts that if one is given $n$, using the axioms one can compute $q^n$, that is, decide which among the first $n$ cases of the diophantine equation have solutions and which do not. Thus, the complexity $H(q^n)$ of answering the first $n$ questions would be at most order of $\log n$ bits. We ignore the immense time it might take to deduce the answers from the axioms; we are concentrating on the amount of information involved. In 1970, Yuri Matijasevič showed that there is no algorithm for deciding if a diophantine equation can be solved. However, if we are told the number $m$ of equations $P(k) = 0$ with $k < n$ that have a solution, then we can eventually determine which do and which do not. This shows that

$$H(q^n) \le H(n) + H(m) + c'$$

for some $m \le n$, which implies that the complexity $H(q^n)$ of answering the first $n$ questions is still at most order of $\log n$ bits. So from an information-theoretic point of view, Hilbert's tenth problem, while undecidable, does not look too difficult. In 1987, I explicitly constructed [4] an exponential diophantine equation

$$L(k, x_1, x_2, \ldots) = R(k, x_1, x_2, \ldots)$$

with a parameter $k$. This equation gives complete randomness as follows. Ask the question, "Does $L(k) = R(k)$ have infinitely many solutions?" Now let

$$q = q_0 q_1 q_2 \cdots$$

be the infinite bit string with

$$q_k = \begin{cases} 0 & \text{if } L(k) = R(k) \text{ has finitely many solutions,} \\ 1 & \text{if } L(k) = R(k) \text{ has infinitely many solutions.} \end{cases}$$

As before, let

$$q^n = q_0 q_1 \cdots q_{n-1}$$

be the string of the first $n$ bits of the infinite string $q$, that is, the string of answers to the first $n$ questions. Let $H(q^n)$ be the complexity of $q^n$, that is, the size in bits of the smallest program for computing $q^n$. Now we have

$$H(q^n) \ge n - c'',$$

that is, the string of answers to the first $n$ questions $q^n$ is irreducible mathematical information and the infinite string of answers $q = q_0 q_1 q_2 \cdots$ is now algorithmically random. Surprisingly, Hilbert was wrong to assume that every mathematical question has a solution. The above exponential diophantine equation yields an infinite series of independent irreducible mathematical facts. It yields an infinite series of questions which reasoning is impotent to answer because the only way to answer these questions is to assume each individual answer as a new axiom! Here one can get out as theorems only what one explicitly puts in as axioms, and reasoning is completely useless! I think this information-theoretic approach to incompleteness makes incompleteness look much more natural and pervasive than has previously been the case. Algorithmic information theory provides some theoretical justification for the experimental mathematics made possible by the computer and for the new quasi-empirical view of the philosophy of mathematics that is displacing the traditional formalist, logicist, and intuitionist positions [5].

References
1. Alfréd Rényi, Introduction to information theory, Probability Theory, Amsterdam: North-Holland (1970), 540-616.
2. Alfréd Rényi, Independence and information, Foundations of Probability, San Francisco: Holden-Day (1970), 146-157.
3. Gregory J. Chaitin, A theory of program size formally identical to information theory, Information, Randomness & Incompleteness--Papers on Algorithmic Information Theory, Second Edition, Singapore: World Scientific (1990), 113-128.
4. Gregory J. Chaitin, Algorithmic Information Theory, Cambridge: Cambridge University Press (1987).
5. John L. Casti, Proof or consequences, Searching for Certainty, New York: Morrow (1990), 323-403.
6. Martin Gardner, Chaitin's omega, Fractal Music, Hypercards and More . . . , New York: Freeman (1992), 307-319.
7. David Ruelle, Complexity and Gödel's theorem, Chance and Chaos, Princeton: Princeton University Press (1991), 143-149.
8. David Ruelle, Complexité et théorème de Gödel, Hasard et Chaos, Paris: Odile Jacob (1991), 189-196.
9. Luc Brisson and F. Walter Meyerstein, Que peut nous apprendre la science?, Inventer L'Univers, Paris: Les Belles Lettres (1991), 161-197.
10. Gregory J. Chaitin, Le hasard des nombres, La Recherche 22 (1991) no. 232, 610-615.
11. John A. Paulos, Complexity of programs, Gödel and his theorem, Beyond Numeracy, New York: Knopf (1991), 47-51, 95-97.
12. John D. Barrow, Chaotic axioms, Theories of Everything, Oxford: Clarendon Press (1991), 42-44.
13. Tor Nørretranders, Uendelige algoritmer, Mærk Verden, Denmark: Gyldendal (1991), 65-91.
14. Gregory J. Chaitin, A random walk in arithmetic, The New Scientist Guide to Chaos (Nina Hall, ed.), Harmondsworth: Penguin (1991), 196-202.
15. Paul Davies, The unknowable, The Mind of God, New York: Simon & Schuster (1992), 128-134.
16. Gregory J. Chaitin, Zahlen und Zufall, Naturwissenschaft und Weltbild (Hans-Christian Reichel, ed.), Vienna (1992).

IBM Research Division
Yorktown Heights, NY 10598 USA

Sets, Numbers and Universes
Stanislaw Ulam
Cambridge, Massachusetts: The MIT Press, 1974. 645 pp. US $70.00

Science, Computers, and People. From the Tree of Mathematics
Stanislaw Ulam
Boston: Birkhäuser, 1986. xxi + 264 pp. US $39.00

Analogies Between Analogies. The Mathematical Reports of S. M. Ulam and His Los Alamos Collaborators
Edited by Al Bednarek and Françoise Ulam
Berkeley, California: University of California Press, 1990. xvii + 565 pp. US $65.00

Reviewed by Reuben Hersh

In 1964 I came from an instructorship at Stanford to an assistant professorship at the University of New Mexico in Albuquerque (where I remain to this day.) That year I was able to invite some of my Stanford colleagues for brief visits to New Mexico.
One of our visitors was Paul Cohen. This was only a year or so after Cohen had proved the independence of the continuum hypothesis and of the axiom of choice, relative to the other axioms of Zermelo-Fraenkel set theory. Shortly before Paul's visit, someone relayed to me a message from a fellow New Mexican whom I had not yet met: "Stan Ulam wants to meet Paul Cohen."
Stan worked in Los Alamos and lived in Santa Fe, midway between Los Alamos and Albuquerque. I drove Paul up to Santa Fe. We were to meet Stan in the bar of La Fonda, a well-known hotel across from the Plaza. For a brief interval I was alone with Stan, awaiting the rest of our group. In an effort at small talk I said, "I read somewhere that you are the father of the H-bomb. But I'm sure that's just journalistic nonsense."
"But it's true!" Stan replied, proudly and without hesitation.
With characteristic generosity, Stan wanted to present both of us with The Scottish Book, the famous book of mathematical problems assembled in Lwów before the War, in a haunt of pre-war Lwów mathematicians, the Szkocka (Scottish) Coffee House. We drove up to Los Alamos, where he had numbers of them in his office. It was one of my first trips to Los Alamos. I remember the undistinguished grayness of the lab buildings. However, Paul Cohen was impressed by the mountain scenery between Los Alamos and Santa Fe. He called it "an exaltation of the spirit."
Every year or two after that I would bump into Stan. He always asked me about "your friend Paul Cohen," whom I saw even more rarely than I saw Stan. By various circumstances I came to know two close friends of Stan's: Mark Kac and Gian-Carlo Rota. Mark and Stan had Lwów in common. Mark had followed Stan as a student at Lwów Technical University. There he heard talk about the legendary Stanislaw Ulam, who had passed through a few years earlier. Gian-Carlo and Stan had Los Alamos in common. Stan worked there in the glory days of the Manhattan Project, then returned after a few unsatisfactory years in Academe. It was on this second stint that he worked on the H-bomb. Gian-Carlo has long been a frequent visitor to the lab, as consultant and collaborator. With Bill Beyer of Los Alamos and Jan Mycielski of Boulder, he edited the first book listed above. With Mark Reynolds, he edited the second. It contains the eulogy Gian-Carlo delivered at Stan's memorial service. At times I had the privilege of joining Mark or Gian-Carlo at Stan's Santa Fe home. Much of the evening's warmth arose from the hospitality of Françoise, Stan's wife. On these occasions, I couldn't help appreciating Stan's wit and good humor. However, one thing puzzled me. After a single witty remark on any subject, that subject would be dropped for the evening.
Stan died on May 13, 1984, at age 75. In 1987 Los Alamos Science, a semi-annual periodical produced at the Lab, published a handsome memorial issue. Two years later Cambridge University Press re-published it under the title From Cardinals to Chaos. It contains memoirs by family and friends, and articles on Stan's principal scientific interests--set theory, probability, computation, dynamical systems, biology, and physics. Gian-Carlo wrote the lead article, an interpretive biography of his friend. It is remarkable for offering not only anecdotes and biographical data, but an actual theory. The theory is that an attack of encephalitis suffered by Stan in 1946 left him with permanent brain damage. Stan's inability or unwillingness to sustain mental effort, even for elementary algebra calculations, and his inability or unwillingness to stick to a subject in conversation, might have been, Rota suggested, aftereffects of encephalitis. Stan's acquaintances continue to debate this theory energetically. To my mind, it does Stan no discredit. Think of continuing for decades to work at mathematics, while suffering such a disability! We applaud Monet who, while losing his eyesight, created his magnificent water-lily canvasses. We applaud Beethoven, who, after losing his hearing, composed his Ninth Symphony. If Gian-Carlo is right, Stan deserves equal admiration. Ulam's papers are contained in three volumes. The first, Sets, Numbers and Universes (SNU), has two parts. The first part, titled "Mathematics," reprints 38 of Ulam's published mathematical papers. There are some 20 early papers in topology with Polish collaborators: H. Auerbach, K. Borsuk, K. Kuratowski, Z. Lomnicki, S. Mazur, J. Schreier. There are three papers written with J. C. Oxtoby, including a famous one, "Measure Preserving Homeomorphisms and Metrical Transitivity," and other papers with D. H. Hyers, C. J. Everett, G. H. Meisters, W. A. Beyer, P. Erdős and J. Mycielski. The second half of the book is "Computations, Games, and Numbers." It has an expository article on the Monte Carlo method co-authored with Nicholas Metropolis (see discussion below, in ABA), and several reports of numerical experiments, including the famous Fermi-Pasta-Ulam paper, "Studies on non linear problems" (discussed below, in ABA.)
The next collection, Science, Computers, and People, contains non-technical writings accessible to the general reader. It was a well-known "secret" about Stan that he didn't write. On the other hand, Françoise Ulam is a skilled writer who reviews books for several publications. Her article "Non-Mathematical Reminiscences about Johnny von Neumann" was an invited address to an A.M.S. memorial meeting for von Neumann. Stan dictated his autobiography, Adventures of a Mathematician, to a tape recorder and then handed it to Françoise to turn into a book. The same is true of the essays in Science, Computers, and People. Martin Gardner tells the reader as much in his introduction to the book. In her introduction to it, Françoise Ulam describes how Stan got published though "the physical act of taking pen to paper has always been painful for him."

The essays in this book fall into three groups:

1. Six are on computing: computer chess, parallel computing, pattern generation by computer, computing in mathematics, and a "personal retrospective" on computing.
2. Five are exposition and speculation on the applications of mathematics to physics or biology.
3. Nine are historical and/or biographical: "Thermonuclear Devices," "The Orion Project," three articles on von Neumann, and one each on George Gamow, Marian Smoluchowski, Stefan Banach, and Kazimierz Kuratowski. (Stan was Kuratowski's first doctoral student at Lwów Polytechnic Institute.)

Ulam's main article on von Neumann, "John von Neumann, 1903-1957," is the best piece in the book. The reader feels the intellectual comradeship between Stan and Johnny, and Stan's sense of loss at Johnny's death.

"Thermonuclear Devices" tells about work on the hydrogen bomb at Los Alamos. Stan aims to show that this was a very difficult, very important scientific breakthrough, and that it has promise of peaceful applications. He makes clear his grounds for claiming to be the Bomb's father. He does not discuss its military significance, nor the people (atomic scientists included) who opposed its production.

The Orion Project was a scheme to travel around the solar system by nuclear propulsion. One "nuclear bomblet" after another would drive a large heat- and radiation-resistant plate fastened to a space ship. Freeman Dyson tells this fantastic story in Disturbing the Universe. Ulam's article explains and advocates the project. Fortunately or unfortunately, an international treaty banning nuclear explosions in the atmosphere rendered the project illegal. ABA contains two technical discussions of the Orion Project, "On a Method of Propulsion of Projectiles by Means of External Nuclear Explosions," co-authored with C. J. Everett, and "Some Schemes for Nuclear Propulsion, Part I," co-authored with Conrad Longmire.

There is also an article presenting some mathematical models of space and space-time, and an article on philosophical implications of recent scientific discoveries.
The most recent Ulam anthology is Analogies Between Analogies (ABA), edited by Françoise Ulam and Al Bednarek of the University of Florida. It consists of Los Alamos technical reports, most of them not previously published outside the Lab. Prof. Bednarek writes, "Many of these reports and much of Ulam's work was done in collaboration. He liked to stress the importance of working with collaborators." The collaborators here were Bednarek himself, William Beyer, C. J. Everett, Enrico Fermi, David Hawkins, Conrad Longmire, John von Neumann, John Pasta, Robert Richtmyer, Robert Schrandt, M. L. Stein, Paul Stein, Temple-Smith, and Mary Tsingou-Menzel.

Analogies Between Analogies contains several pieces that were important in their time, and still have genuine historical interest. Second to the H-bomb, Stan's name was most often coupled with the "Monte Carlo method." This is, in brief, the process of making repeated trials of a suitably chosen chance experiment, to calculate nonrandom quantities of physical or mathematical interest. The name "Monte Carlo," suggested by Nick Metropolis, is a jazzy synonym for "random" or "statistical." The repeated random trials are performed on a computer, using pseudo-random numbers. This book contains a 1947 report, "Statistical Methods in Neutron Diffusion," presented by Ulam as "the first published ideas and proposals for the Monte Carlo Method." Bednarek writes, "the 'report'--of which only eight copies were made--consists of two letters, and handwritten calculations photographed and stapled together. Its cover specifies that the 'work' was 'done' by Ulam and von Neumann and 'written' by von Neumann and Robert Richtmyer--then head of the Laboratory's Theoretical Division." (The quotation marks are Bednarek's.)

In fact, Enrico Fermi in the 1930s successfully used the "Monte Carlo method," of course without giving it that name. The method often requires extremely lengthy computations, and for that reason many scientists had dismissed it as impractical. At the Manhattan Project, high-speed computers were available, essentially for the first time in history. (Not high-speed by today's standards, of course!) Stan and Johnny realized that with these machines Monte Carlo could be competitive. Stan convinced some physicists to try the method. Before long, applied mathematicians adopted the "Monte Carlo method" as part of their tool kit. Stan contributed to this acceptance by writing and talking about Monte Carlo, and by directly influencing people to try it. (See also "The Monte Carlo Method" with Metropolis in SNU.)

Since World War II, branching processes have become an established part of the theory of stochastic processes.
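The Monte Carlo idea described above, repeated random trials used to estimate a nonrandom quantity, can be made concrete with a short Python sketch applied to a simple branching ("multiplicative") process. Everything in it is an illustrative assumption rather than anything taken from the Los Alamos reports: the function names goes_extinct and estimate_extinction_probability are hypothetical, each particle is assumed to leave 0, 1, or 2 offspring with probabilities 0.25, 0.25, and 0.5, and a population that reaches 100 is simply treated as having survived.

```python
import random


def goes_extinct(rng, p0=0.25, p1=0.25, cap=100, max_generations=200):
    """Run one random trial of a branching process started from one particle.

    Each particle independently leaves 0 offspring with probability p0,
    1 offspring with probability p1, and 2 offspring otherwise.
    These probabilities are illustrative assumptions.
    Returns True if the population dies out, False if it reaches `cap`
    (treated here as survival) or is still alive after `max_generations`.
    """
    population = 1
    for _ in range(max_generations):
        if population == 0:
            return True
        if population >= cap:
            return False
        offspring = 0
        for _ in range(population):
            u = rng.random()  # pseudo-random number in [0, 1)
            if u < p0:
                continue  # this particle leaves no offspring
            elif u < p0 + p1:
                offspring += 1
            else:
                offspring += 2
        population = offspring
    return population == 0


def estimate_extinction_probability(trials=5000, seed=0):
    """Estimate the extinction probability as the fraction of extinct trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    extinct = sum(goes_extinct(rng) for _ in range(trials))
    return extinct / trials


if __name__ == "__main__":
    print(estimate_extinction_probability())
```

For these assumed probabilities the exact extinction probability is the smaller root of q = 0.25 + 0.25q + 0.5q^2, namely q = 0.5, so the estimate should land near that value. The sampling error shrinks roughly like one over the square root of the number of trials, which is why the method only became practical once fast machines were available.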
Ulam and collaborators in Los Alamos conducted two early studies of branching processes, under the name "multiplicative processes." They are included in this volume. The first, coauthored with Hawkins, is "Theory of Multiplicative Processes." The second, with Everett, is "Multiplicative Systems in Several Variables."

One must pay respects to the most famous of Ulam's Los Alamos reports, the "Fermi-Pasta-Ulam" paper, "Studies of Nonlinear Problems" (also reprinted in SNU). Following suggestions made by von Neumann in the 1940s, Ulam was one of the earliest and most ardent advocates of machine computation to help in exploring unknown mathematical territory. This much-referenced three-author paper was a numerical study of the motion of a one-dimensional system of springs coupled together nonlinearly (cubically and quartically). To the surprise of the authors, the evolution of the system was not ergodic--the total energy of the vibration did not tend to become equally distributed among all admissible modes or frequencies. Instead, the evolution was periodic, or almost periodic!¹ Analysis of the phenomenon by Norman Zabusky and Martin Kruskal revealed a link with the Korteweg-de Vries equation of fluid dynamics, and gave rise to the vast, ever-expanding soliton industry of today.

It is amusing, perhaps ironic, that the Fermi-Pasta-Ulam paper contradicts the main result of another well-known paper of Ulam's, his early "Measure Preserving Homeomorphisms and Metrical Transitivity," written with Oxtoby in the '40s. According to Oxtoby-Ulam, the ergodic functions are a "residual set" in the space of continuous functions. (They are the complement of a countable union of nowhere dense sets.) One might therefore expect a flow chosen at random to be almost surely ergodic. In the Fermi-Pasta-Ulam experiments, however, numerical calculations consistently generated non-ergodic flows. The fact that smooth Hamiltonian systems are all exceptional in the Oxtoby-Ulam category--a consequence of the Kolmogorov-Arnold-Moser theorem--provides the explanation for this paradox.

I would like to thank Peter Lax for invaluable conversations about the subject of this review.

Department of Mathematics
University of New Mexico
Albuquerque, NM 87106 USA
Addendum to the Review: A History of Non-Euclidean Geometry by B. Rosenfeld, translated by Abe Shenitzer with the editorial assistance of Hardy Grant, New York: Springer-Verlag, 1988.
In my review that appeared in volume 14, issue 3 of The Mathematical Intelligencer, I failed to acknowledge the debt of gratitude we owe to the translators of this book, Abe Shenitzer with the editorial assistance of Hardy Grant. That the scope and depth of the book are as majestic, and the story it tells as consistent, is due to their careful work.
John McCleary Vassar College Poughkeepsie, NY 12601 USA
¹ In a recent issue of Daedalus (Winter, 1992, vol. 121, no. 1, p. 129), Nick Metropolis revealed that the Fermi-Pasta-Ulam discovery was reminiscent of the "accidental" discoveries often made by chemists and microbiologists. "By accident one day," Metropolis writes, "they let the program run long after their steady state had been reached. When they realized their oversight and came back to the computer room, they noticed that the system, after remaining in the steady state for awhile, had then departed from it, and reverted to the initial distribution of energy (to within two percent) . . . Fermi believed this computer-simulated discovery to be his greatest contribution to science."
Robin Wilson*
Greek Mathematics II

From about 500 to 300 B.C., Athens was the most important intellectual center in Greece, numbering among its scholars Democritus, Plato, and Aristotle. Although none of them is remembered primarily as a mathematician, they helped to set the stage for the "Golden Age" of Greek mathematics in Alexandria.

Democritus (circa 460-370 B.C.) made important contributions to both physics and mathematics. In physics, he first proposed the view that matter is composed of indivisible elements, called atoms. In mathematics, he stated the formula for the volume of a pyramid and wrote works on numbers and geometry.

Plato (428-347 B.C.) and Aristotle (384-322 B.C.) were two of the greatest philosophers of all time. Around 387 B.C., Plato founded his Academy in Athens, which soon became an important center for mathematical study and philosophical research. Aristotle studied there for 20 years and systematized the study of logic and deductive reasoning.

In a lighter vein, we note that the Greeks were fond of recreational mathematics. The Cretan maze shown here dates from about 350 B.C. and is, according to Greek mythology, the labyrinth in which the Minotaur was confined.
* Column editor's address: Faculty of Mathematics, The Open University, Milton Keynes, MK7 6AA England.
THE MATHEMATICAL INTELLIGENCER VOL. 14, NO. 4 © 1992 Springer-Verlag New York