Letters to the Editor
The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the editor-in-chief, Chandler Davis.
Pygmies and their Shadows
In my review of Indiscrete Thoughts, by Gian-Carlo Rota [4], I noted that the last sentence of the book, "When pygmies cast such long shadows, it must be very late in the day," was an adaptation of Erwin Chargaff's dictum [1, p. 641] "That in our days such pygmies throw such giant shadows only shows how late in the day it has become." I am grateful to Professor Kurt Bretterbauer of Technische Universität Wien for pointing out that Chargaff's formulation is itself based on a well-known saying attributed to the Viennese satirist and critic Karl Kraus: "Wenn die Sonne der Kultur tief steht, werfen selbst Zwerge lange Schatten" ("When the sun of culture is low, even pygmies cast long shadows"); cf. [3, p. 421]. Since Chargaff cites Kraus in his autobiography as having been "the deepest influence on my formative years" and "truly my only teacher" [2, p. 14], there can be no doubt that he was familiar with Kraus's mot. It will not have escaped the careful reader that Kraus's formulation does not say quite the same thing as those of Chargaff and Rota; indeed, the latter provide a kind of incomplete or defective converse to the former. The defect in question was pointed out already in my review ("Can I be the only one to have noticed that shadows are just as long in the morning, or are we all late sleepers?").
REFERENCES
1. Erwin Chargaff, Preface to a Grammar of Biology, Science 172 (1971), 637-642.
2. Erwin Chargaff, Heraclitean Fire, Rockefeller Univ. Press, New York, 1978.
3. Johannes John, Reclams Zitaten Lexikon, Stuttgart, 1992.
4. The Mathematical Intelligencer 21 (1999), no. 2, 72-74.
Lawrence Zalcman
Department of Mathematics and Computer Science
Bar-Ilan University
52900 Ramat-Gan
Israel
e-mail:
[email protected]
Taking the Easy Way
...till, demanding proof,
And seeking it in everything, I lost
All feeling of conviction, and, in fine,
Sick, wearied out with contrarieties,
Yielded up moral questions in despair,
And for my future studies, as the sole
Employment of the enquiring faculty,
Turn'd towards mathematics, and their clear
And solid evidence ...

William Wordsworth, The Prelude
© 2000 SPRINGER-VERLAG NEW YORK, VOLUME 22, NUMBER 2. 2000
SAMUEL S. HOLLAND, JR.
My Years as a Full-Time Industrial Mathematician
Industrial mathematics has been getting more attention lately; witness the articles [2,6,15,16]. Much of this renewed attention and interest surely derives from the employment concerns of our graduate students, who have been facing a dismal academic job market. Many, many years ago, from 1954 through 1965, I made my living as an industrial mathematician.* While my work since that time has been solely in "pure" mathematics in an academic setting, many of the memories from those early days are still fresh with me. So it seemed to me that I might still usefully contribute to the ongoing discussion by putting on record a couple of my own experiences from those days long gone by, and by putting forth some of my own conclusions drawn from them, about industrial mathematics itself and about the training of industrial mathematicians. I shall describe two projects in which I was involved, one at the beginning of my time with Technical Operations, Inc. (Tech/Ops), then a fledgling Massachusetts firm, and one at the end of my service with that company.

*Actually 1954-1959 full-time, and 1960-1965 full-time summers but part-time otherwise.

Project One

Project One, a study of the penetration of neutrons in air, was supported in part by the United States Air Force under a contract monitored by the Director, Research Directorate, Air Force Special Weapons Center (AFSWC), Kirtland Air Force Base, New Mexico. This work was done in 1955. Dr. Paul I. Richards was the primary investigator on the project, and I was the other half of the team, having just joined the company (Tech/Ops). Testing of nuclear weapons was still underway during this "cold war" period, and the Air Force wanted to know what neutron flux to expect from a nuclear weapon explosion in air, not only as a function of the horizontal distance from the explosion, but also as a function of neutron energy. They gave us three months, and they wanted numbers.

At this point it might be well to point out to the aspiring industrial mathematician the difference between working in a small start-up company, like Tech/Ops in 1955, where the entire company could go to lunch together, and a large multinational firm like AT&T. A small company is, and must be, very customer-oriented. There is no place to hide: each scientific staff person is usually directly involved in some sort of commitment to a customer, and needs to get the job done, i.e., fulfill the contract to the customer's satisfaction. There is usually very little time to pursue ancillary mathematical questions that come up, however interesting they might be. In contrast, a large firm can internally finance research, and can allot more time for basic studies that have the potential to enhance the firm's corporate expertise [2].

So, in essence, AFSWC had a contract with Paul Richards and me; we were the "team." We began by settling on a simplified model that we could realistically hope to analyze, yet would capture the essential features of the actual situation. Our model: an isotropic, monoenergetic point neutron source in an infinite, constant-density air medium.

The physical process is this: a neutron, emitted at energy $E_0$ by the source, proceeds unimpeded until it strikes an air molecule, either nitrogen (N2) or oxygen (O2). In this collision three things can happen: (1) the neutron can scatter elastically (a "billiard ball" collision), (2) the neutron can scatter inelastically, leaving behind some of its energy in the scattering molecule, or (3) the neutron can be absorbed by the struck molecule, in which case it disappears. In the first two cases the neutron loses energy and changes direction, then proceeds to the next collision. The probability of a collision of any one of these three types depends on the energy of the impacting neutron, and is generally known only numerically, where it is known at all.

One has analyzed this steady-state process completely when one knows the neutron flux $N(r, \Omega, E)$, the number of neutrons per second per unit energy $E$ crossing a unit area orthogonal to the unit vector $\Omega$, at distance $r$ from the source. The flux $N(r, \Omega, E)$ satisfies, and is determined by, the transport equation. Thus we have modeled this industrial problem, as is very often the case, with a differential equation: the transport equation. Actually, in this case, it is an integro-differential equation of some considerable complexity. And now, having set up the mathematical model, our job is to use it to provide the information desired by the customer.
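To make the three collision outcomes concrete, here is a minimal Monte Carlo sketch of a single neutron's energy history. It is not the method used in the project (which worked from the transport equation), and it ignores position and direction entirely. The collision probabilities and the inelastic energy-loss factor are illustrative placeholders, not measured cross-section data, and are taken as energy-independent purely for simplicity. The one physical ingredient is the standard result that, for elastic scattering that is isotropic in the center-of-mass frame off a nucleus of mass number A, the retained energy fraction is uniform on [((A-1)/(A+1))^2, 1].

```python
import random

# Smallest energy fraction a neutron can retain after one elastic collision
# with a nitrogen-14 nucleus (isotropic scattering in the center-of-mass frame).
ALPHA_N14 = ((14 - 1) / (14 + 1)) ** 2

# Hypothetical placeholder: fraction of energy kept after an inelastic collision.
INELASTIC_KEEP = 0.5


def follow_neutron(e0, p_elastic, p_inelastic, rng):
    """Follow one neutron until it is absorbed.

    Returns the list of energies after each scattering collision.
    p_elastic and p_inelastic are assumed, energy-independent probabilities;
    the remaining probability is absorption.
    """
    energy = e0
    history = []
    while True:
        u = rng.random()
        if u < p_elastic:
            # (1) elastic "billiard ball" collision
            energy *= rng.uniform(ALPHA_N14, 1.0)
        elif u < p_elastic + p_inelastic:
            # (2) inelastic collision: some energy stays in the struck nucleus
            energy *= INELASTIC_KEEP
        else:
            # (3) absorption: the neutron disappears
            return history
        history.append(energy)


rng = random.Random(0)
energies = follow_neutron(1.0, p_elastic=0.7, p_inelastic=0.2, rng=rng)
print(len(energies), "scatterings before absorption")
```

A realistic calculation would replace the constant probabilities with tabulated, energy-dependent cross-sections and would track the flight path between collisions.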
The industrial mathematician frequently tackles problems in two stages: first, set up a mathematical model for the process under study (which we have just done); second, solve (in some fashion) the mathematical model. As we have just finished with the first stage of this particular project, we have reached a natural point for the question: What lessons are here for the training of today's industrial mathematician? As regards this training, I would first like to look at it apart from its mathematical component; then, at the end of this section, I give my suggestions for the mathematical preparation of industrial mathematicians.

First we must note that this specific kind of problem (penetration and diffusion of radiation in matter) no longer has any contemporary interest. The day of nuclear testing, and the days of intense interest in nuclear reactors, are over. That particular study, so prominent 40 years ago, has gone out of fashion. Yet I think lessons can be drawn from our project that are still relevant today. Indeed, 40 years from now many of today's hot topics will probably have given way to new, difficult-to-imagine subjects. One would hope to formulate advice that shall remain valid even then.

To set up our mathematical model, we needed to know the language and the basic aspects of physics, including atomic physics. I believe that basic instruction in physics and chemistry is a must for the aspiring industrial mathematician, indeed for any serious student of science.
Further, today's industrial mathematician may have to know the language of molecular biology, or be conversant with the fundamentals of modern electronic circuitry. And as science advances, the training program must keep pace. The confidence of an industrial client will rise in proportion to the consultant's familiarity with his terminology and underlying concepts.

But, however broad this training in the basic sciences, it seems impossible to have a program that covers all possibilities. Industrial mathematics includes an enormous variety of areas, and requires many different skills. Here is a sample of the variety of jobs that flowed through Tech/Ops while I was there (I've listed the principals in parentheses): Penetration of neutrons in air (Paul Richards and me; this is Project One); Analysis of the method of constructing raised topographic maps from aerial stereo-photographic data (me); Mathematical explanation of "shock waves on the highway" (Paul Richards; these are traffic jams that persist long after the accident has been cleared); Fortran code for Monte Carlo calculations (Dom Raso); Is it possible to construct an engine that burns boron? (Everett Reed); Analysis of the exploding-foil hypervelocity gun (me; this is Project Two). And this tiny sample is taken from memory of work done by a small firm 40 years ago.

One generalization that may hold up: industrial mathematics necessitates continuous on-the-job training. This is part of the fun of industrial mathematics, and part of the challenge. Often a given project will take a period of intense effort considerably longer than the three months Paul and I put into the AFSWC contract. Once a project is finished, then it is on to another totally different one. Variety is the spice of industrial mathematics.
Getting back now to our neutron penetration problem: having settled on the transport equation to model the process, Paul Richards and I had three months to use this equation to provide the Air Force with their report containing quantitative predictions. The Air Force wanted $N(r, E) = \int N(r, \Omega, E)\, d\Omega$, the neutron flux integrated over all solid angles. This tells them the number of neutrons with energies between $E$ and $E + dE$ per unit area striking a small spherical target at distance $r$ from the source. The transport equation, a linear but very complicated integro-differential equation, is basically just a conservation equation: it counts neutrons at a particular energy $E$ entering and leaving an infinitesimal volume in space. So advanced calculus training prepares one to understand this equation. But getting numerical answers from it is another matter, especially in our case, where the input data, the various "cross-sections," were known only as rapidly varying numerical functions of the neutron energy. Furthermore, the angular distributions of scattered neutrons were also complicated and generally numerically given.

The transport equation, in the early 1950s, was the focus of a great deal of attention, occasioned by generous government funding. Thus, when Paul and I began to work, we had a substantial high-quality literature to consult. The work of Lewis V. Spencer was crucial [17,18]. Spencer solved the transport equation this way: He noted that it was
equivalent to an infinite linked system of Volterra integral equations in the moments, $\int r^n N(r, E)\, dr$, of the distribution function. These integral equations admit numerical solutions, so that the first few moments can be calculated. Spencer then devised techniques to reconstruct $N$ with reasonable accuracy from a knowledge of its first few moments. His method was tailor-made for our problem. We applied it successfully, and gave the Air Force their numbers in the allotted time [7,8,9].

In this pioneering, outstanding work of L. V. Spencer, an exemplary model of industrial mathematics in my opinion, one finds the following topics: series expansions in polynomials orthogonal with respect to a given weight function; the Fourier-Laplace transform and its inverse, particularly the relation between the strength of a singularity of the transform and the asymptotic behavior of the original function; Gaussian quadrature; and Bessel and other special functions. These topics are representative of a general area, which one might call classical analysis, that should certainly be part of our graduate training in industrial mathematics.

This training in classical analysis can be general. To handle the variety of special applications that can arise in practice, one may consult reference books and on-line sources. On special functions, there is the classic work by Magnus, Oberhettinger, and Soni [13] and the three volumes of the Bateman Manuscript Project [4]. The comprehensive Handbook [1] edited by Abramowitz and Stegun is currently being revised and put on the Web [12]. On integral tables and integral transforms, there are the two volumes [5], also of the Bateman Manuscript Project. The book [3] by Campbell and Foster has an extensive table of Fourier integrals. As for numerical tables and graphs of the special functions, they are rapidly becoming, like me, a relic of the past.
Modern software packages such as Mathematica have built-in subroutines for most of the special functions, like Bessel functions, that will print out tables and graphs with a few touches of the keys. And these software packages enable one to use the special functions in programmed calculations just like the sine and cosine.

Beyond classical analysis, there are no other graduate courses that had a direct bearing on my work in industry. But then, I worked only briefly in one small company. Recommendations from other industrial mathematicians would be welcome here. In addition to course work, I believe that any industrial mathematics graduate program should provide the opportunity for its students to spend a year or two in industry. One can give all the academic courses in industrial mathematics that one likes; there is no substitute for being there.

As for the undergraduate program in mathematics, I would recommend broad training at this stage, rather than a premature specialization. This comes from my own academic experience with undergraduate mathematics majors, who really need to see various kinds of mathematics before they decide in which direction they would like to tilt. Summer work in industry is valuable, both financially and academically, for any undergraduate mathematics major.

Apart from the course listings in mathematics, my own
experience (on both sides of the aisle) has left me with the conviction that the manner in which the mathematics is conveyed is at least as important as what is taught. I have come to the earnest belief that, in the teaching of mathematics, understanding should come before rigor; that motivation, geometric meaning, and physical connections (where applicable), even numerical experimentation, should occupy just as prominent a part of the presentation as proofs. This holds especially at the undergraduate level for students tilting toward applicable mathematics, but it is appropriate to some extent for all students at all levels. In my university life, I put considerable effort into implementing this philosophy through the writing of my book [10]. (This was solely at the undergraduate level, as most of my graduate teaching centered around my own research in algebra.)

My book deals with orthogonal function expansions, the various classical ordinary and partial differential equations, and Dirac's delta function, among other things. My industrial experience, together with my subsequent experience in the undergraduate classroom, combined in my mind to shape my presentation of these subjects according to the above-mentioned philosophy.

For example, in treating Fourier series and other orthogonal function expansions, I had the students compute a few partial sums of the expansion and graph the result against the function. For Legendre polynomials, I explained how the Legendre partial sum is a global approximation while the Taylor polynomial is local, again having them illustrate this by numerical tables and graphs. All this numerical work is easy with currently available hardware and software. As for convergence of these series, many textbooks spend a great deal of time discussing various hypotheses that will guarantee pointwise, uniform, or absolute convergence. In our work on the transport equation, Paul Richards and I took for granted that any such series "represented" its function.
In this we were following all those we consulted and all those whose work we read. Not wishing to be this cavalier in class lest my academic colleagues suffer from shock, I took a cue from a remark I heard Professor George Mackey make (in connection with his work on quantum mechanics), namely that $L^2$ convergence is more appropriate for physical applications because a physical measurement is an average. So in my book I used $L^2$ convergence exclusively. Testing for $L^2$ convergence amounts to determining whether the integral of $|f|^2$ is finite or infinite, so it falls under the calculus topic of "improper integrals." (Lebesgue measurability is irrelevant when dealing with functions that arise in practice.)

In dealing with the wave equation, the heat equation, and Schrödinger's equation, I took pains to derive them in detail from basic physical principles. (This was a special challenge for Schrödinger's equation.) I did these derivations to emphasize the physical connections, and to illustrate how one captures the essence of a complex situation in a mathematical model. This is especially valuable training for an aspiring industrial mathematician and, I think, valuable also for any student of mathematics, no matter what his or her eventual specialty.
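The classroom exercise described above (compute a few partial sums, compare them with the function, and judge convergence in the $L^2$ sense) can be sketched in a few lines. This is only an illustration under assumed choices, not material from the book [10]: the function $f(x) = x$ on $(-\pi, \pi)$, its standard sine series $2\sum_k (-1)^{k+1}\sin(kx)/k$, and a midpoint-rule estimate of the $L^2$ error are all picked here for convenience.

```python
import math


def fourier_partial_sum(x, n_terms):
    """Partial sum S_N(x) of the Fourier sine series of f(x) = x on (-pi, pi)."""
    return sum(2.0 * (-1) ** (k + 1) * math.sin(k * x) / k
               for k in range(1, n_terms + 1))


def l2_error(n_terms, n_grid=2000):
    """L2 norm of f - S_N on (-pi, pi), estimated with the midpoint rule."""
    h = 2.0 * math.pi / n_grid
    total = 0.0
    for i in range(n_grid):
        x = -math.pi + (i + 0.5) * h
        total += (x - fourier_partial_sum(x, n_terms)) ** 2 * h
    return math.sqrt(total)


# The L2 error shrinks as more terms are kept, even though the series
# does not converge to f at the endpoints x = -pi and x = pi.
for n in (1, 2, 4, 8, 16):
    print(n, l2_error(n))
```

By Parseval's relation the squared error should equal $2\pi^3/3 - 4\pi\sum_{k \le N} 1/k^2$, which gives an independent check on the numerical estimate.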
Dirac's delta function figures prominently in our work on the transport equation. And we used it, as did all others in this area, as a function. While the work of Sobolev, Schwartz, Lighthill, and others demonstrates that Dirac's formalism has a rigorous foundation, when it comes to practical calculations no alternative formalism comes anywhere near Dirac's in elegance and simplicity. So, in my book, I explained the delta function just as Dirac explained it, and used it just as he used it. I feel that such training provides the students with a very useful tool. And, when properly explained, it does not corrupt their mathematical education in any way.

The course based on my book ran for many years here at UMass. It was taught not only by me, but also by Professors Richard Ellis, H. T. Ku, and Peter Norman. But, one semester before I retired, the course was dropped. I know of no other university that developed a similar course. The lack of enthusiasm for my book exceeded my wildest expectations. No matter: while I may have overreached to some extent, I know that, basically, I am right.

Project Two

Project Two, done ten years later, in 1964, dealt with the exploding-foil hypervelocity gun. This project was done under contract between Tech/Ops and the Air Force Materials Laboratory at Wright-Patterson Air Force Base, Ohio. Its purpose is well described in the introduction to the final report:

. . . The goal of the over-all investigation was to develop a system that could accelerate milligram-size particles to velocities in excess of 30 km/sec. With such a capability, the effects of micrometeoroid impacts on materials could be studied in the laboratory as a requisite first step in the development of protective devices for space vehicles and missiles subject to damage and destruction by solid particles moving at high velocities in space.

The "system" to generate these high-velocity particles was the exploding-foil hypervelocity gun. The experimental program to develop this gun had begun in 1961, and was already at a mature stage in 1964 when I was asked to contribute a theoretical analysis. So there was an intense learning period for me as I was brought up to speed on this complex experimental program.

The exploding-foil hypervelocity gun is a sandwich. The ham is an insulator sheet 0.080 in thick. The lower piece of bread is a 1-cm-thick Lucite block, and the upper piece a 0.161-cm Fiberglass plate. This upper piece has a 0.32-cm hole in its center. Between the ham and the upper piece of bread is interposed a thin (0.0254 cm) sheet of Dupont Mylar. Flat copper strips, fed in through the side of the sandwich on either side of the ham, are joined at the center by an aluminum foil loop (the ham has to be penetrated to complete the loop). Hence, looking down on the sandwich from the top one sees, through the 0.32-cm hole in the top Fiberglass plate, first the Mylar sheet, next the top half of the aluminum foil loop, then the separating insulator sheet (the ham), then the bottom half of the foil loop, and finally the backup Lucite block.

The gun is fired by discharging a large capacitor through the foil. The explosion of the lower half of the foil is totally contained, since it is between the separating insulator and the Lucite block. The upper half of the exploding foil impinges on the Mylar sheet, blowing a circular piece through the 0.32-cm hole in the top of the sandwich. This punched-out Mylar disk is the projectile of the gun. A typical firing sequence uses a 1-microfarad capacitor bank charged to 100 kilovolts, which thus contains 5000 joules of stored energy. The firing destroys the gun.

As I mentioned, the standardized gun described above was the end product of a three-year experimental program. This program included not only trial-and-error evaluation of various gun designs, but also development of complex optical and electronic ultra-high-speed diagnostic systems to measure the various physical parameters, such as energy deposition rate and particle velocity. Now, with the testing phase of the project winding down, and with the more-or-less standardized guns being produced in quantity, Tech/Ops, and the Contractor, sought a theory for the gun: a theory that might predict some of the observed phenomena, especially particle velocity, and might suggest means for improving the performance of the gun, means more economical than cut-and-try.

Standard ballistic theories do not apply to this gun for many reasons: (1) the energy is deposited in the breech electrically rather than chemically, (2) the rate of energy deposition is very great, on the order of 500 joules per microsecond, (3) the time scale is very short (the whole firing sequence is over in a few microseconds), (4) the projectile mass is of the order of milligrams, comparable to the mass of the driver gas, and (5) the projectile velocity is in the centimeter-per-microsecond range. A theoretical analysis needs to incorporate these conspicuous features.

There is no "typical" industrial mathematics problem. But the problem which I have just described in some considerable detail does exemplify certain aphorisms that I have alluded to earlier. Industrial mathematics is fun. It is exciting. The industrial mathematician sees new things and has a wide variety of experiences. He can be led into a room, wait while a capacitor is charging, see a brilliant flash and hear a tremendous crack, then be led away with the entreaty, "We need a theory for that." Industrial mathematics differs from academic pure mathematics, which relies primarily on self-motivation, and sometimes suffers from lack of such motivation. In industrial mathematics, the problem is here and the time is now.

My theory of the exploding-foil gun was based on a number of simplifying assumptions: (1) thermal equilibrium is maintained in the breech, (2) energy losses to the breech, projectile friction losses, and blow-by losses are negligible, (3) the pressure and temperature of the breech gas are functions only of time, not of position, (4) the cumulative energy in the breech is a linear function of time (experimentally observed), and (5) the breech is filled with a perfect, monatomic, non-ionizing gas.

Let $x(t)$ denote the distance the projectile has moved in time $t$, with $x(0) = x'(0) = 0$. If $V_0$ is the initial breech volume and $A$ the area of the breech (same as the area of the projectile), then $L = V_0/A$ has the dimension of length, and $y(t) = 1 + x(t)/L$ is dimensionless. Combining the equation of state of a perfect gas, the formula for the internal energy in a perfect monatomic gas, Newton's law $F = ma$, and conservation of energy, one gets the following nonlinear second-order differential equation for $y$:

$$3y\,\frac{d^2y}{dt^2} + \left(\frac{dy}{dt}\right)^2 = at, \qquad y(0) = 1, \quad y'(0) = 0. \tag{1}$$

Here $a = \dfrac{2\lambda}{(m + M/3)\,L^2}$, where $\lambda$ is the constant energy deposition rate (joules/sec), $m$ is the mass of the projectile, and $M$ the mass of the driver gas.

While $y(t)$ is dimensionless, $t$ is not. Introduce the dimensionless variable $s = at^3$. Equation (1) then becomes

$$(2z' + 3sz'')\,z + s(z')^2 = \frac{1}{9}, \qquad z(0) = 1, \quad z'(0) = \frac{1}{18}, \tag{2}$$

where $z(s) = y(t)$, and primes denote differentiation with respect to $s$. Equation (2) would seem to require a numerical solution. To cover the time period of interest, about 3 microseconds, we need information over the range $0 \le s \le 10^5$. Hence any numerical solution method needs both stability and convenience. At this point there comes into play an important rule of numerical methods: numerical integration is inherently more accurate than numerical differentiation. Make the substitution $w = z^{4/3}$ in (2), then integrate once to get

$$3sw' = w + \frac{4}{27}\int_0^s \frac{d\sigma}{\sqrt{w(\sigma)}} - 1, \qquad w(0) = 1, \quad w'(0) = \frac{2}{27}. \tag{3}$$

Still nonlinear, and still requiring a numerical solution, but much easier to solve accurately than (2).

My numerical procedure to solve (3) was coded in Fortran for me by Peter Flusser, a company expert in programming, and was run on an IBM 7094 (this was in 1964). The single solution to the dimensionless equation (3) allows one to compute projectile velocities and other gun parameters for any particular gun configuration [11,14]. The maximum difference between theory and experiment was 12 percent, despite all the simplifying assumptions. In industrial mathematics, as in life, it sometimes pays to be lucky.
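For readers who want to experiment, here is one way the integrated form might be solved today. It is a sketch under stated assumptions, not a reconstruction of the 1964 Fortran procedure, whose details the text does not give. It assumes the integrated equation reads $3sw' = w - 1 + \tfrac{4}{27}\int_0^s d\sigma/\sqrt{w}$ with $w(0) = 1$, $w'(0) = 2/27$, introduces the auxiliary unknown $I(s) = \int_0^s d\sigma/\sqrt{w}$ so that the problem becomes a pair of first-order equations, and applies the classical fourth-order Runge-Kutta method. Because the equation degenerates to $0 = 0$ at $s = 0$, the integration starts at a small $s_0$ from a short power series; the coefficient `C2` comes from substituting a series into the equation and is part of this sketch, not of the article.

```python
import math

C1 = 2.0 / 27.0    # w'(0), the stated initial slope
C2 = -C1 / 135.0   # next series coefficient (derived for this sketch)


def rhs(s, w, integral):
    """Derivatives of (w, I), where I(s) is the integral of 1/sqrt(w) from 0 to s."""
    dw = (w - 1.0 + (4.0 / 27.0) * integral) / (3.0 * s)
    di = 1.0 / math.sqrt(w)
    return dw, di


def solve_w(s_end, s0=0.01, h=0.001):
    """Integrate from s0 to s_end with classical RK4 and return w(s_end)."""
    n_steps = max(1, round((s_end - s0) / h))
    h = (s_end - s0) / n_steps          # adjust h so the last step lands on s_end
    # Series start values near s = 0.
    w = 1.0 + C1 * s0 + C2 * s0 * s0
    integral = s0 - C1 * s0 * s0 / 4.0
    for i in range(n_steps):
        s = s0 + i * h
        k1w, k1i = rhs(s, w, integral)
        k2w, k2i = rhs(s + h / 2, w + h * k1w / 2, integral + h * k1i / 2)
        k3w, k3i = rhs(s + h / 2, w + h * k2w / 2, integral + h * k2i / 2)
        k4w, k4i = rhs(s + h, w + h * k3w, integral + h * k3i)
        w += h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        integral += h * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
    return w


w1 = solve_w(1.0)
print("w(1) =", w1, " z(1) =", w1 ** 0.75)   # z = w^(3/4) undoes the substitution
```

Covering the full range of $s$ mentioned in the text would call for a step size that grows with $s$; the fixed step here is only meant for small and moderate $s$.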
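REFERENCES
[1] M. Abramowitz and I. A. Stegun, Handbook of mathematical functions, N.B.S. Appl. Math. Ser. 55, U.S. Gov't. Printing Office, 1965.
[2] R. Calderbank, A personal perspective on mathematics research in industry, Notices Amer. Math. Soc. 43 (1996), 569-571.
[3] G. A. Campbell and R. M. Foster, Fourier integrals for practical applications, D. Van Nostrand, 1954.
[4] A. Erdelyi (Editor), Higher transcendental functions, 3 vols., McGraw-Hill, 1953.
[5] A. Erdelyi (Editor), Tables of integral transforms, 2 vols., McGraw-Hill, 1954.
[6] A. Friedman and F. Santosa, Graduate studies in industrial mathematics, Notices Amer. Math. Soc. 43 (1996), 564-568.
[7] S. S. Holland, Jr. and P. I. Richards, Penetration of neutrons in air, AFSWC-TR-55-27 (Unpublished).
[8] S. S. Holland, Jr. and P. I. Richards, Neutron flux spectra in air, J. Appl. Phys. 27 (1956), 1042-1050.
[9] S. S. Holland, Jr., Neutron penetration in infinite media; calculation by semi-asymptotic methods, J. Appl. Phys. 29 (1958), 827-833.
[10] S. S. Holland, Jr., Applied analysis by the Hilbert space method, Dekker, 1990.
[11] S. S. Holland, Jr., The exploding-foil hypervelocity gun, 1964, preprint.
[12] http://math.nist.gov/DigitalMathlib/
[13] W. Magnus, F. Oberhettinger, and R. P. Soni, Formulas and theorems for the special functions of mathematical physics, 3rd enl. ed., Springer-Verlag, 1966.
[14] R. W. O'Neil, S. S. Holland, Jr., T. Holland, V. E. Scherrer, and H. Stevens, Effects of hypervelocity impacts on materials, Tech/Ops Report AFML-TR-65-14 (Unpublished).
[15] D. G. Schaeffer, Memoirs from a small-scale course on industrial math, Notices Amer. Math. Soc. 43 (1996), 550-557.
[16] J. Spanier, The mathematics clinic: an innovative approach to realism within an academic environment, Amer. Math. Monthly 83 (1976), 771-775.
[17] L. V. Spencer and U. Fano, Penetration and diffusion of x-rays. Calculation of spatial distributions by polynomial expansion, J. Res. Nat'l. Bur. Stds. 46 (1951), 446-456.
[18] L. V. Spencer, Penetration and diffusion of x-rays: Calculation of spatial distributions by semi-asymptotic methods, Phys. Rev. 88 (1952), 793-803.

AUTHOR
SAMUEL S. HOLLAND, JR.
Department of Mathematics and Statistics
University of Massachusetts
Amherst, MA 01003-4515
USA
e-mail: [email protected]

Samuel S. Holland, Jr., received his B.S. in Physics in 1950, and his Ph.D. in Mathematics under Lynn Loomis at Harvard in 1961. His Bachelor's thesis was published in Physical Review. Much of his life between then and his present status of Professor Emeritus is under consideration in this article. He enjoys his wife Mary of 41 years, his children and grandchild, his friends and colleagues, the ocean, downhill skiing, chocolate, scotch whiskey, and a good cigar.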
RICHARD KAYE
Minesweeper is NP-complete

NP-completeness

Many programming problems require the design of an algorithm which has a "yes" or "no" output for each input. For example, the problem of testing a whole number for primality requires an algorithm which answers "yes" if the input number x is prime, and "no" otherwise. In trying to devise an algorithm to solve a given problem, one aspect of obvious practical importance is the time it takes to run. Since a typical algorithm may take more time on some inputs than others, the running time of an algorithm is usually regarded as a function of the input. For technical reasons, it is convenient to consider the way this function varies with the number of symbols required to write the input. (This number of symbols is usually denoted by n.) For example, for the input 17, our algorithm may require this number to be written in binary (as 10001), so here n = 5.

Different algorithms for the same problem may run in different amounts of time, due perhaps to the different coding methods used or to different theoretical bases for the algorithms. However, it may be that for a particular problem, all valid algorithms can be shown to take at least a certain amount of time, due to the inherent difficulties in the problem being solved. Complexity theory aims to study the inherent difficulties of problems, rather than the time or memory resources used by any particular algorithm or program.

It is certainly possible to find problems that can only be solved on a computer using a huge amount of time. It is also possible to find sensible-sounding problems that cannot be solved on a computer at all! However, there are two classes of problems that are of greatest interest for complexity-theorists.

The first of these classes is the collection, P, of Polynomial-time computable problems. These are the problems that can be solved on a normal computer and within an amount of time of order $n$, or $n^2$, or $n^3$, or $n^4$, . . . (As before, n is the number of symbols required to write down the input to the problem. Note in particular that the running time of such a program is bounded by a polynomial in the length of the input, not the input itself.)

Of course, for a rigorous treatment of the subject, a precise definition of the mathematical model of computer we are using, and of what constitutes the running time of the computer, must be given. For the purposes of this article I will be less precise, but give here the two main points. Firstly, our computers will have an unlimited amount of memory; that is to say, they always have enough memory to complete the computation in hand. This does not seem particularly restrictive, as any terminating computation can only use a finite amount of memory anyway, and for most algorithms considered here, the amount of memory required for any particular computation can be estimated fairly accurately in advance. Secondly, the time taken by the computer is the number of steps required, where a single step can only process a single character's worth of information and a "character" comes from a fixed alphabet. (Characters could be single bits, or bytes, or 32-bit words, or symbols from some other finite set, provided this finite set is specified in advance.) To give an illustrative example, observe that arbitrary natural numbers can be represented on such computers (as sequences of binary digits, for example) and two such numbers can be multiplied together, but the time taken to multiply these numbers will not be a single step: it will instead be a function of the length of the numbers, for the computer can only process the numbers character-by-character.

A large amount of heuristic evidence exists supporting the thesis that the notion of a polynomial-time computable
© 2000 SPRINGER-VERLAG NEW YORK, VOLUME 22, NUMBER 2, 2000
problem is independent of the particular computer model used. That is, if a problem is solved in polynomial time on one computer then the algorithm used can be transferred to a different kind of computer and will also run in polynomial time there. There is also strong evidence that suggests that the complexity class P consists of precisely those problems that are soluble in practice on an ordinary computer. Problems not in P may be theoretically soluble, but only with impractical running times even on the very fastest computer.
The second class of problems of interest is the class of Nondeterministic Polynomial-time computable problems, NP. These are problems that can be solved in polynomial time as before, but on a special "enhanced" computer able to perform "nondeterministic" algorithms. The reason for the interest in NP is that this class contains a great many problems of significant practical importance that are not known to be soluble by an ordinary polynomial-time algorithm, including some very well-known problems such as that of the "travelling salesman."
To define NP, we just need to explain the idea of a nondeterministic algorithm. These algorithms are like ordinary ("deterministic") ones except that there is an extra kind of instruction allowed which instructs the computer to guess a number. The computer performing this instruction is assumed to have the very special ability always to make a correct guess if one is available, and it is this aspect of nondeterminism that is difficult to implement in practice! Having made a guess, the nondeterministic algorithm is required to verify that the guess was indeed a correct one, because only by doing this can it determine whether a correct guess was possible at all.
For example, it is easy to use nondeterminism to tell if a whole number input x is composite (i.e., not prime). The machine should guess two whole numbers y, z > 1 and compute their product, yz. If yz = x then the machine has verified that the guess was correct, so may answer yes, the number x is composite. If yz ≠ x then the machine may safely answer no, as in this case it is allowed to assume that no better guess was available, i.e., that x really is prime. Since a single multiplication can be carried out rather quickly, this nondeterministic machine will decide if a number is composite very rapidly without any lengthy search over all the possible factors.
A nondeterministic machine is not allowed to guess the answer ("yes" or "no") to the problem and output that, because the machine would not have verified this guess. The special power of these machines lies in the fact that it is not necessary to verify that any particular guess was incorrect (because only correct guesses are chosen if they are available). It is only required to verify that a guess is correct.
Because of the different nature of these "yes" and "no" answers, it is not always true that the complement of a problem solvable using nondeterminism is as easy to solve nondeterministically. In the case of composite and prime numbers, for example, it is not immediately obvious how one might show that the set of primes (the complement of the set of composites) is recognizable in polynomial time by a nondeterministic algorithm. The problem here is to guess something that shows the input x is prime, and then to verify our guess quickly, but what should we guess? In fact, there is just such a "certificate of primality," as was first observed by Pratt1 (see Figure 1).
This algorithm is based on the property that a number $x > 2$ is prime if and only if there is $y$ such that $y^{x-1} = 1 \bmod x$ and $y^{q} \ne 1 \bmod x$ for all $q < x - 1$. It is recursive in the sense that it calls itself with smaller values.
1. On input x, if x = 2 answer "yes," and if x = 1 answer "no." Otherwise go to the next step.
2. Guess $y$ and verify that $y^{x-1} = 1 \bmod x$. (If this fails, answer "no" and stop.)
3. Guess a prime factorisation $a_1 a_2 \cdots a_n$ of $x - 1$ and run the algorithm recursively to check that each $a_i$ is prime.
4. Verify that $y^{(x-1)/a_i} \ne 1 \bmod x$ for each prime factor $a_i$ of $x - 1$. If any of these fail, answer "no;" otherwise answer "yes."
Figure 1. Pratt's nondeterministic algorithm for primality.
Needless to say, no "nondeterminism chip" has been developed to use in real computers (though some believe that quantum mechanics implies that something rather like nondeterminism might be built into a usable device).
As already mentioned, the class NP of Nondeterministic Polynomial-time problems is the class of problems that can be solved in polynomial time on a nondeterministic machine. It is generally believed that nondeterminism really does introduce problems that were not already in P, and also that there are NP problems whose complement does not lie in NP, but here lies the main problem. To date, no one has managed to find an NP problem and prove it is not in P. The famous "P = NP" question is whether there is such a problem. This is one of the most important open problems in mathematics, perhaps even the most important open problem. It has the same status as Fermat's last theorem before Wiles's solution, with a long history (going back well before computers). The majority of mathematicians believe that P and NP really are different (though several well-respected mathematicians consider it quite plausible that P = NP), but no one has a proof. Every mathematician dreams of solving a problem like this, and a huge number have tried, but no one has succeeded.
The difficulty of proving that P ≠ NP is not due to lack of examples of interesting problems in NP. In fact, mathematicians now have a huge list of problems, including the travelling salesman and many others of practical interest,
1V.R. Pratt, "Every prime has a succinct certificate," SIAM J. Comput. 4 (1975), 214-220.
THE MATHEMATICAL INTELLIGENCER
which are in NP and for which we have a proof that if P ≠ NP, then the problem is not in P. A problem, A, is typically shown to be of this type by proving that it is NP-complete, i.e., that every other NP problem, B, can be solved by a deterministic polynomial-time program which converts its input, x, for the problem B to an input, f(x), for the problem A, with the property that the answer to problem B for input x is the same as the answer to problem A for input f(x). If there is a polynomial-time computable function f(x) with these properties, we say the problem B reduces to the problem A. Loosely speaking, a problem B reduces to a problem A, if A "includes" all instances of B as special cases, and the NP-problem A is NP-complete if it "includes" (in this sense) all other NP-problems.
To see the importance of this, consider a problem B in NP, and suppose also that we are given an NP-complete problem, A. Then there is a polynomial-time computer program that converts each instance, x, of the problem B to an instance, f(x), of the problem A. But if our NP-complete problem A is actually in P, the problem A for f(x) can be solved in polynomial time by a deterministic algorithm, hence B also can be solved in deterministic polynomial time, because the answers for A on input f(x) and B on input x are the same.2 This also applies to any other C in NP (with a different function f(x) of course), so if A is in P, then every problem in NP will be in P, i.e., P = NP.
Cook3 and, independently, Levin4 first showed that NP-complete problems exist. In particular, the problem SAT of logical satisfiability (see Figure 2) is NP-complete.
A    ¬A
T    F
F    T
A    B    A ∨ B    A ∧ B    A + B
T    T      T        T        F
T    F      T        F        T
F    T      T        F        T
F    F      F        F        F
A boolean circuit is a circuit built of the familiar logic gates such as AND (∧), OR (∨), XOR (+), and NOT (¬), each with inputs that may be true (T) or false (F). A circuit will have several inputs labelled $p_1, p_2, \ldots, p_n$ and an output $q$. The problem SAT is: Given a boolean circuit C, is there some combination of true/false values for the inputs of C so that the output of C is true? There are algorithms to answer this question, but none running in polynomial time is known. The obvious algorithm (to check all possible combinations of the inputs of C) takes too long, as there are $2^n$ combinations for n inputs. SAT is NP-complete.
Figure 2. The NP-complete problem SAT.
Although there are a great many NP-complete problems of practical importance, no one has found one which may be solved by a polynomial-time algorithm, and it is widely believed that no such exist. Turning a necessity into a virtue, many people have attempted to design cryptosystems so that a potential codebreaker would have to solve an NP-complete problem in order to break the code, taking too much time even on the fastest computer. Either way, an answer to the P = NP question would have significant practical importance.
The Minesweeper Game
Many of the ideas mentioned above may be illustrated effectively with a game many readers will be familiar with. Minesweeper comes with Microsoft's Windows operating system.5 In it, the player is presented with an initially blank grid. Underneath each square there may be a mine, and the object of the game is to locate all these mines without being blown up. You select a square to be revealed; if it is a mine you are blown up (and the game is over), but with luck, perhaps it isn't. In this second case, when the square is revealed you see a number from 0 to 8, which is the number of mines in the eight immediately neighbouring squares.
Figure 3 shows a typical position in such a game. The numbered squares are the squares that have been revealed, and no others have been uncovered yet. Two of the unrevealed squares are marked with a *, and these squares have already been identified as having mines in them. The others have been labelled with letters for identification.
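The numbering rule just described is simple enough to state as a few lines of code. The following Python sketch computes the number each square would show if it were revealed; the encoding of a board as a list of strings, with '*' for a mine and '.' for an empty square, is an assumption made here for illustration and is not taken from the game itself.

```python
def neighbour_counts(mines):
    """Return what a fully revealed board would show.

    `mines` is a list of equal-length strings: '*' marks a mine and '.' an
    empty square (a hypothetical encoding chosen for this sketch).
    """
    rows, cols = len(mines), len(mines[0])
    shown = []
    for r in range(rows):
        line = []
        for c in range(cols):
            if mines[r][c] == "*":
                line.append("*")
                continue
            # Count mines among the (up to) eight neighbouring squares.
            count = sum(
                mines[rr][cc] == "*"
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            )
            line.append(str(count))
        shown.append("".join(line))
    return shown


board = ["*..", "...", "..*"]
for row in neighbour_counts(board):
    print(row)
```

Squares on the edge of the grid simply have fewer neighbours, which is why the ranges are clipped to the board.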
2There is an important technical consideration omitted from the argument here: if A is in P, then the running time for the algorithm for A on input f(x) is bounded by a polynomial in the length of f(x), not the length of x itself. However, f(x) itself is computed by a polynomial-time algorithm, and it is straightforward to deduce from this that the length of f(x) is itself bounded by a polynomial in the length of x, so the algorithm just outlined for B is really polynomial time in the input x.
3S.A. Cook, "The complexity of theorem proving procedures," Proc. Third Annual ACM Symposium on the Theory of Computing (1971), 151-158.
4L. Levin, "Universal search problems," Problems of Information Transmission 9 (1973), 265-266.
5"Windows" is a trademark of Microsoft. The author has no connections with Microsoft, and nothing here should be regarded as comment on any of Microsoft's products.
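Returning for a moment to the guess-and-verify examples for composite and prime numbers: no real computer can guess, but the verification half of each nondeterministic algorithm is ordinary deterministic code. The Python sketch below checks a guessed factorisation for compositeness and a guessed certificate in the style of Figure 1 for primality. The dictionary layout of the certificate (each odd prime mapped to its witness y and the guessed prime factors of x - 1) and the function names are assumptions made for this illustration, not Pratt's own formulation.

```python
def verify_composite(x, y, z):
    """Check the guess y, z > 1 with y * z == x."""
    return y > 1 and z > 1 and y * z == x


def verify_pratt(x, cert):
    """Check a certificate that x is prime, following steps 1-4 of Figure 1.

    `cert` maps each odd prime p met in the recursion to a pair (y, factors):
    the guessed witness y and the guessed prime factors of p - 1, listed with
    repetition. This layout is an assumption of the sketch.
    """
    if x == 2:                       # step 1
        return True
    if x < 2 or x not in cert:
        return False
    y, factors = cert[x]
    if pow(y, x - 1, x) != 1:        # step 2: y^(x-1) = 1 mod x
        return False
    product = 1
    for a in factors:                # step 3: the factors multiply to x - 1 ...
        product *= a
    if product != x - 1:
        return False
    if not all(verify_pratt(a, cert) for a in factors):   # ... and are prime
        return False
    # step 4: y^((x-1)/a) != 1 mod x for every prime factor a of x - 1
    return all(pow(y, (x - 1) // a, x) != 1 for a in set(factors))


certificate = {7: (3, [2, 3]), 3: (2, [2])}
print(verify_pratt(7, certificate))      # True
print(verify_composite(15, 3, 5))        # True
```

Every check is a modular exponentiation or a multiplication, so the verification is fast; all the difficulty of finding y and the factorisation has been pushed into the guess.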
F A 2 0 0 0
D A 2 0 1 1
2 3 3 1 1 C
1 2 4 * 5 1 4 1 2 E E *
1 B B B B E
Figure 3. An example position in Minesweeper.
2 2 2 2
2 0 0 2
2 0 0 2
2 2 2 2
Figure 4. Determine the location of all mines.
Faced with such a position in a game, there are several things one can deduce about the position of the mines, and which squares can be revealed safely. First, the squares marked A have mines, because of the 2s just below them. Next, the squares marked B also have mines because of the 4s and the 5 to their left. (These numbers include the two previously identified mines marked with stars.) Similarly, the square C has a mine. It follows that the squares marked D and E are clear since the mines at A, B and C account for the numbers neighbouring these squares. At this stage, it is not possible to determine if square F has a mine or not. However, the player may mark the identified mines A, B, C and uncover the safe squares D and E, and from the number revealed at square D (a 2 or a 3) determine if square F is safe or has a mine, thereby clearing the whole board.
Now that the rules of the game have been explained, the reader may like to consider the configuration in Figure 4. This particular game is played on a 6 × 6 board, and sixteen squares are revealed as shown. It is possible to deduce the location of all the mines from the information given.
The general Minesweeper problem is: Given a rectangular grid partially marked with numbers and/or mines, some squares being left blank, to determine if there is some pattern of mines in the blank squares that gives rise to the numbers seen. In other words, to determine if the data given are consistent. This is a typical yes/no problem, as discussed above, and if we could solve this problem efficiently on a computer, we would have an excellent method for playing the game. To determine if a square is safe, we could write down the configuration we currently see with a single change made by marking the square in question with a mine, and feed this into the computer; if the computer says this pattern is inconsistent, then there is no mine at the square in question and it is safe to reveal it, otherwise there may be a mine. Similarly, by changing the description of the square in question to one containing a "0", then a "1", and so on up to "8", we may determine if it is correct to identify a mine at that square.
The Minesweeper problem is in NP, for to determine if an incomplete description is consistent, it suffices to guess the positions of the mines and then verify that these mines produce the numbers seen. It is not at all clear whether the complementary problem (whether some input configuration is inconsistent) is in NP, for what might we guess to show inconsistency? It is also reasonably straightforward to see that the Minesweeper problem can be reduced to SAT, for the rules of the game and any particular configuration can be described by a boolean circuit (see Figure 5).
In fact, the Minesweeper problem is NP-complete. This means it is just as difficult as any of the other NP-complete problems (such as SAT, the travelling salesman, and so on) and it is highly unlikely that there is an efficient algorithm
Consider a three-by-three block of squares labelled as shown.
a b c
d e f
g h i
Let $a_m$ denote "there is a mine at a," and for $0 \le j \le 8$ let $a_j$ denote "there is no mine at a and precisely j mines in the neighbouring squares around a"; and similarly for b, c, d, ..., i. Then the rules of Minesweeper for the centre square e can be described by the following statements:
1. precisely one of $e_m, e_0, e_1, \ldots, e_8$ is true;
2. for $k = 0, 1, \ldots, 8$, if $e_k$ is true then precisely $k$ of $a_m, b_m, c_m, d_m, f_m, g_m, h_m, i_m$ are true;
and these can all be expressed (in a rather cumbersome fashion) by boolean circuits in the 90 inputs $a_m, a_0, \ldots, i_8$. If we let C be the circuit consisting of all of these circuits for all points in the rectangular grid in place of e, the outputs of all these being combined into a single AND gate, then the Minesweeper problem becomes equivalent to an instance of SAT: given certain inputs for C being true or false, are there truth values for the other inputs that make the output of the whole circuit C true?
Figure 5. Reduction of the Minesweeper problem to SAT.
for solving it. One way to prove Minesweeper is NP-complete is to show how to build "computers" using Minesweeper configurations. Since computers can be thought of as being made out of wires and logic circuits, that is what we will try to imitate in Minesweeper. In fact, as SAT is NP-complete, this suffices, because we will have shown how SAT reduces to Minesweeper, and Cook's result shows that any NP problem can be reduced to SAT.
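Before the construction begins, it may help to see the Minesweeper problem itself written out as a program. The argument given earlier for membership in NP has two halves, a guess and a verification. In the Python sketch below, `verify` is the fast verification step, while `is_consistent` replaces the guess by an exhaustive search over all $2^k$ ways of placing mines in the k blank squares, so it is deliberately exponential. The grid encoding (digits for revealed squares, '*' for an identified mine, '?' for a blank square) and the function names are assumptions made for this illustration.

```python
from itertools import product


def verify(grid, mines):
    """Fast check: do the mines in the set `mines` produce every number shown?"""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            if grid[r][c].isdigit():
                seen = sum(
                    (rr, cc) in mines
                    for rr in range(max(r - 1, 0), min(r + 2, rows))
                    for cc in range(max(c - 1, 0), min(c + 2, cols))
                )
                if seen != int(grid[r][c]):
                    return False
    return True


def is_consistent(grid):
    """Deterministic stand-in for the guess: try every pattern of mines."""
    squares = [(r, c, ch) for r, row in enumerate(grid) for c, ch in enumerate(row)]
    known = {(r, c) for r, c, ch in squares if ch == "*"}
    blanks = [(r, c) for r, c, ch in squares if ch == "?"]
    for choice in product([False, True], repeat=len(blanks)):
        mines = known | {sq for sq, has_mine in zip(blanks, choice) if has_mine}
        if verify(grid, mines):
            return True
    return False


def is_safe(grid, r, c):
    """A blank square is safe if marking it as a mine gives an inconsistent grid."""
    marked = [row[:c] + "*" + row[c + 1:] if i == r else row
              for i, row in enumerate(grid)]
    return not is_consistent(marked)


print(is_consistent(["1?", "??"]))   # True: a single mine in any blank square works
print(is_safe(["0?"], 0, 1))         # True: a mine there would contradict the 0
```

An efficient replacement for the loop in `is_consistent` is exactly what the NP-completeness result makes unlikely; the function `is_safe` shows how such a replacement would immediately give a method for playing the game.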
Boolean Circuits in Minesweeper
Examine the configuration in Figure 6. (Here, again, the letters x and x' label unrevealed squares which may or may not contain mines.) A moment's thought will show that there are just two possible configurations: either all of the squares marked x contain a mine and those marked x' do not, or else the other way round. We shall regard this as a wire carrying a value which is true or false depending on whether the xs or the x's have the mines. To define the value true or false carried in the wire precisely, we arbitrarily choose a direction for the wire (here going from left to right) and say that the value is true if the xs are mines, and false otherwise. In other words, if the squares just behind the centre 1s are mines ("behind" meaning in the sense of the chosen direction of the wire) then the value carried is true, and it is false otherwise. Note in particular that the truth of the signal in the wire is defined relative to its direction and the position of the centre 1s, not in terms of any absolute position on the grid.
      x →
...  0  0  0  0  0  0  0  0  0  ...
...  1  1  1  1  1  1  1  1  1  ...
...  x  1  x' x  1  x' x  1  x' ...
...  1  1  1  1  1  1  1  1  1  ...
...  0  0  0  0  0  0  0  0  0  ...
Figure 6. A wire.
We will need to be able to bend wires, and to split them. Figures 7 and 8 show how to do this. Figure 7(a) is a simple 90° bend in the wire. Figure 7(b) shows how a wire can be terminated. In these two diagrams, the squares marked * have mines in them. Such configurations can be given by explicitly marking the square as having a mine, but in all of the configurations here it is not necessary to do so. In these and all the following diagrams, the areas outside the bounding lines are assumed known to contain 0s, and in particular do not contain mines, and all the positions of the mines indicated by *s on the diagrams can be deduced from the numbers given. For (a), the mines are located by the 1221 to the top and to the right and the 3 between them, and in (b) the mines are located by the 12321 to the left.
[Figure 7. (a) A bent wire. (b) A terminated wire.]
Figure 8 shows a way of "splitting" a wire. Notice that the outputs are two signals X and an inverted signal X'. Any of these wires can be terminated by a piece as in Figure 7(b) and the splitter can be combined with bends and further splitters, to make splitters with any number of outputs.
[Figure 8. A three-way splitter.]
Figure 9 shows how to construct a NOT gate (similar to part of the splitter in Figure 8). This is obviously an important device for logic circuits, but it is useful here in one other important respect: since we defined truth/falsity in a wire relative to the position of the centre 1s in the wire, we may find a problem when we want to combine two or more signals if they are not aligned correctly. Figure 10 gives a configuration made from two NOT gates and provides one possible solution: this device enables the alignment of the 1s in three-by-three blocks to change so that the wire in question can be used as the input to some other device placed just about anywhere on the grid. (It is also possible to make a phase-changer out of three bent pieces of wire of the correct length.)
[Figure 9. A NOT gate.]
So far the configurations have been comparatively simple, but in order to mimic arbitrary boolean circuits we will need to have AND, OR, XOR gates, and so on. At first sight, it would seem that just the AND gate will suffice, because (as is well known from digital circuits) any gate or boolean circuit can be made from a combination of AND and NOT gates. For example, we can make up an OR gate from AND and NOT gates by the familiar formula A ∨ B = ¬(¬A ∧ ¬B). But in principle there could still be a difficulty, in that we have not
yet provided any method by which two signals can cross over each other. In fact, this turns out not to be a problem after all. Crossing two wires over is clearly not going to be possible in the plane, but Figure 11 shows that a crossing of two wires can be simulated in the plane by using three splitters and three XOR gates. An XOR gate can in turn be made out of AND and NOT gates, as Figure 12 shows. (Planar circuits like these were discovered by Goldschlager.6)
[Figure 10. A phase-changer made from two NOT gates.]
[Figure 11. Crossing two wires with three XOR gates.]
[Figure 12. Making an XOR gate with AND and NOT gates.]
Figure 13 shows how an AND gate can be constructed. This is rather more complicated than previous ones. It takes two input wires U and V, has one output wire, labelled T, and has a central square at the heart of the gate (containing a 4) which is where the signals are combined. The AND gate has two internal wires, R, S, which are aligned and looped back to a splitter at the output T via a pair of devices labelled $a_1, a_2, a_3$ and $b_1, b_2, b_3$.
To analyse what happens here, we first see what happens if the output T is true, i.e., if the ts are mines and the t's are clear. In this case, from the 3 above and below $a_3$, we have that $a_2$ and $a_3$ must be mines, so $a_1$ is clear, and s is a mine. Similarly r is a mine. Thus the central 4 already sees four mines (r, s, t, and the *), so u', v' are both clear and the inputs U, V are both true. This shows that if t is a mine, all the other unknown squares are determined, and it is straightforward to check that these values are consistent with the data given.
Now suppose one or both of u, v is clear of mines, i.e., at least one of the inputs is false. Then, as we have just seen, t must be clear, and t' must be a mine. The central 4 sees 2 or 3 mines out of the u', v', and the *. So either one or both of r, s must be mines. We need to check both cases are possible. But if s is a mine, it is easy to check that $a_1$ and $a_3$ being mines and $a_2$ clear is consistent with the data given. Likewise, if s is clear, then $a_1$ and $a_2$ being mines and $a_3$ clear is consistent. The argument is identical for r, so if one or both of the inputs U, V is false then the output T is false, and each case is consistent with the data given. Therefore the whole configuration represents an AND gate, as required.
[Figure 13. An AND gate.]
With these building blocks, we now have enough information on how to convert boolean circuits to Minesweeper configurations. Figure 14 illustrates the idea for the formula (P ∨ Q) ∧ (R ∨ ¬Q). We write a program which, given as input a boolean formula, constructs a Minesweeper configuration such as that in the figure. The crossed lines are cross-overs, the filled-in circles are splitters, boxes denote gates, and the lines are wires. Square brackets denote terminators, as in Figure 7(b), except that the terminator marked T forces this wire to have the value 'true'. (This can be done by simply cutting a wire going from left-to-right just after a vertical row of three 1s.) It is clear from the diagram how to devise an algorithm that converts an arbitrary boolean formula to a Minesweeper configuration that is consistent if and only if the boolean formula is satisfiable. Each gate, terminator, and crossover can be put inside an N × N box (for some fixed value of N that can be predetermined), and the overall size of the configuration is therefore of the order of $N^2 n^2$, where n is the number of symbols in the boolean formula. It follows that our program runs in polynomial time, and hence that SAT is reducible to the Minesweeper problem. But SAT itself is NP-complete, hence so is the Minesweeper problem.
[Figure 14. A Minesweeper circuit for (P ∨ Q) ∧ (R ∨ ¬Q).]
It is worth pointing out that the NP-completeness proof just given is slightly stronger than originally stated. First, as has been pointed out, no *s need be given in the Minesweeper configurations used to test the satisfiability of boolean formulas. This is because the position of these mines can rapidly be deduced from the numbers in neighbouring squares. More interestingly perhaps, as the referee of this article has pointed out to me, the configurations may be taken to satisfy the condition that all squares neighbouring one marked 0 are uncovered. (Certainly all the gadgets in the figures satisfy this restriction.) This means that the action of the Microsoft Minesweeper program to automatically clear all such squares does not give any significant help for solving the Minesweeper problem.
Of course the configurations you can get in an actual game (where the mines are set at random by a computer) are unlikely to be like any of these boolean circuit configurations, so there remains a considerable art to playing the game, and there are many nuances and different kinds of deductions that one can make other than those used here. So it may even be that some polynomial-time algorithm is "good enough" at solving the sort of Minesweeper problems that occur in practice, even though (assuming P ≠ NP) it cannot actually solve all theoretically possible configurations. Many of the other NP-complete problems known are studied in the same way, with a view to finding algorithms that are not guaranteed to work, but do seem to work in most cases of interest.
Finally, it is nice to know that to current knowledge, there may still be an efficient algorithm for Minesweeper, and finding it could solve one of mathematics's most important open problems.
6L.M. Goldschlager, "The monotone and planar circuit value problems are log space complete for P," SIGACT News 9(2) (Summer 1977), 25-29.
Acknowledgment
The author would like to thank the anonymous referee for particularly helpful remarks on the first draft of this paper, which have proved invaluable in its revision.
AUTHOR
RICHARD KAYE
School of Mathematics and Statistics
The University of Birmingham
Birmingham B15 2TT
England
e-mail: [email protected]
Richard Kaye studied first at Cambridge and then at Manchester, obtaining a PhD for work on models of arithmetic under the supervision of Jeff Paris. After six years of postdoctoral work at Oxford he went to his present position as senior lecturer at Birmingham. He has written a monograph on models of arithmetic and a textbook on linear algebra. He is a keen amateur trombonist, and he currently leads the trombone section of the Oxford Symphony Orchestra.
A Cultural Gap Revisited
Alexander Shen, Editor
This column is devoted to mathematics for fun. What better purpose is there for mathematics? To appear here, a theorem or problem or remark does not need to be profound (but it is allowed to be); it may not be directed only at specialists; it must attract and fascinate. We welcome, encourage, and frequently publish contributions from readers: either new notes, or replies to past columns.
Please send all submissions to the Mathematical Entertainments Editor, Alexander Shen, Institute for Problems of Information Transmission, Ermolovoi 19, K-51 Moscow GSP-4, 101447 Russia; e-mail: [email protected]
Fourteen years ago The Mathematical Intelligencer published an article by E.W. Dijkstra ("On a Cultural Gap," vol. 8, no. 1, 48-52) that discussed the roots of a cultural gap between the typical computer scientist and the typical mathematician. According to Dijkstra, this gap is a significant obstacle both for mathematicians and programmers and should disappear in the future when "programs will display all the beauties of a crisp argument."
Looking at today's software (commercial and even free), we have to admit that this future hasn't come yet. But Dijkstra's arguments are still convincing, and such a future indeed looks possible (though it may never come due to commercial reasons).
Nevertheless, there may be a cultural gap between mathematics and computer science (or programming) at a more subtle level. To explain it, let us consider the following puzzle.
There are N objects that seem to be identical, but in fact belong to several different types. One of the types forms a majority (more than N/2 objects belong to that type). Our task is to point out one of the objects from this majority. The only tool we have is a detector that cannot tell the type of an object but that when applied to any two objects will say whether those two objects are of the same type or not.
Of course, one can apply the detector to all pairs of objects and get a complete classification, but the number of measurements will be about $N^2/2$. How many measurements do we really need?
It turns out that a more efficient approach is possible, where the number of measurements is proportional to $N$, not $N^2$. I am now going to present two proofs of this claim, a mathematician's proof and a programmer's proof. See if you agree with me about the difference in viewpoint.
Both proofs start with the following simple observation: if two objects have
different types, both of them can be discarded without changing the ma jority type. Indeed, in discarding two
objects of different types, we discard at most one "good" object and at least one "bad," so the majority remains a majority. For the same reason, two friends who are going to vote for different candidates may agree to ignore the election if both are sure that there is some candidate who has more than 50% support.

The mathematician's proof continues: Let us assume for the moment that N is even. Then we can group our N objects into N/2 pairs and apply the detector to each pair, making N/2 measurements. Pairs that contain different objects are discarded. After that, we are left with a number of pairs, each consisting of objects of the same type; from each pair we retain only one object. Then we have the same problem of finding the majority representative, but with (at most) N/2 objects. Therefore we get the recurrence T(N) ≤ N/2 + T(N/2), where T(n) is the minimal number of measurements required to solve the problem for at most n objects. Taking into account that T(2) = 0, by induction we see that T(N) < N.

It remains to explain why odd values of N do not spoil this nice picture. In the odd case, after grouping elements into pairs and discarding pairs with different elements, we have some pairs with equal elements and one unmatched element. For example, we may have pairs (a,a), (b,b), (c,c) and one unmatched element d. In this example 7 elements remain. We know that "winning" elements form a majority among them, so there must be at least 4 winning elements. But then at least two winning pairs exist, otherwise there would be at most 3 winning elements. Therefore winning elements form a majority among a,b,c, and we can drop d completely.

The situation is different if after discarding unequal pairs we have four pairs (a,a), (b,b), (c,c), (d,d) and one unmatched element e. Here we need 5 elements to form a majority, and it may be achieved using two pairs and e. So winning elements do not need to form a majority among a,b,c,d. But they do need to form a majority among a,b,c,d,e; for otherwise there would be at least 5 losing elements. So in this case we should retain e in the sample. It is easy to see that one of these two arguments is applicable; we need only to keep the sample's size odd. (End of mathematician's proof.)

The programmer replies: Imagine that you are locked in a magic room with the detector and N objects. There are three big boxes in the room. The boxes are labeled: UNTESTED, IDENTICAL, and DISCARDED. Initially all the objects are in the box labeled UNTESTED, and it is guaranteed that most of them have the same type ("winning type"). The magic room has two laws, also called "invariant relations." (If you violate one of them, you are executed immediately.) Here they are:

All objects in the IDENTICAL box must be identical (if the box is not empty).

Objects of the winning type (which is determined in advance but unknown) must form a majority among non-discarded objects (i.e., among objects that are in either the UNTESTED or the IDENTICAL box).

Evidently, these conditions are satisfied in the initial state. When the UNTESTED box (U for short) becomes empty, the door is unlocked and you are free. (You deserve it, for at that point all non-discarded objects are identical, so you have found at least one object of the winning type.)

What will you do after the rules are explained? Some observations are almost evident.

First, if the IDENTICAL box (I for short) is empty (while U is not), one can safely move one object from U to I. (The set of non-discarded objects remains the same, so the laws are not violated.)

Second, if both I and U boxes are not empty, one can take one object from each and compare them, using the detector. If they have the same type, it is safe to put both objects into I; if they have different types, it is safe to discard both (as we have seen earlier).

It remains to point out that (1) in any situation one of these observations can be applied (unless U is empty); (2) the number of untested objects decreases at each step; (3) the detector is used at most once at each step. Therefore the U box becomes empty after at most N steps, and the detector is used fewer than N times, because the first step does not involve it. (End of programmer's proof.)

It is easy to transform the programmer's story into a short program (the curious reader may find it on pp. 71-72 of my book Algorithms and Programming: Problems and Solutions, Birkhäuser, 1997). The mathematician's argument, of course, also can be transformed into a program, but this program is much more complicated.

Looking at this example, one may try to interpret the difference between the mathematician's and the programmer's viewpoints. Here is one possible explanation. If some problem P is decomposed into many similar subproblems P1, . . . , Pn all of which are trivial, the mathematician is in the habit of considering P as trivial and may write something like, "P1 having been proved, we may leave all other Pi as exercises for the pedantic reader." The programmer, on the other hand, knows very well that it is her/his task to write programs for all Pi, so even if all these programs are short, for a big n (s)he has a lot of work, and it would be better to find another solution for P that does not involve many subproblems.
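The programmer's three-box procedure can be sketched in a few lines of Python. This is only an illustration of the story told above, not the program from the book cited; the names `find_winner` and `same_type` are hypothetical, `same_type(x, y)` stands in for the detector, and the input is assumed to contain a strict majority of one type.

```python
def find_winner(objects, same_type):
    """Return (an object of the winning type, number of detector uses).

    Assumes more than half of `objects` share one type; `same_type(x, y)`
    plays the role of the detector (a hypothetical callback).
    """
    untested = list(objects)  # box U
    identical = []            # box I: all objects here have the same type
    detector_uses = 0

    while untested:
        x = untested.pop()
        if not identical:
            # First observation: I is empty, so moving x into it is safe.
            identical.append(x)
            continue
        # Second observation: compare x with one object taken from I.
        y = identical.pop()
        detector_uses += 1
        if same_type(x, y):
            identical.extend([y, x])  # both go into I
        # Otherwise both are discarded: they are simply not put back.

    return identical[0], detector_uses


winner, uses = find_winner("abacaba", lambda x, y: x == y)
print(winner, uses)
```

The DISCARDED box is left implicit here, since discarded objects are never looked at again; every pass through the loop removes one object from U and calls the detector at most once, which is the whole proof.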
Pentangram: Correction
We received the following letter in response to the column about pentangrams:

I would like to point out that there is an error in the article about pentangrams in the spring 1999 issue of The Mathematical Intelligencer. In the 3rd paragraph of the left-hand column of page 16, the author states that for regular polygons with 7, 9, 11, 13 or 19 sides, the ratio of the chords and sides is transcendental. This is plainly wrong. The ratios are algebraic.

Gabor Megyesi
e-mail: [email protected]

The same error was also pointed out by Prof. John Sharp (Watford, England). I apologize for not finding this error earlier. Clearly, for any n the nth roots of unity are algebraic and all ratios formed from them are algebraic too.
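As a small numerical illustration of the correction (a sketch, not part of the letter): in a regular heptagon with unit circumradius, the ratio of the shortest diagonal to the side is sin(2π/7)/sin(π/7) = 2cos(π/7), and this number is a root of the integer polynomial x³ − x² − 2x + 1, hence algebraic. The function names below are hypothetical.

```python
import math

def chord_to_side_ratio(n, k):
    """Ratio of the chord spanning k sides to the side of a regular n-gon."""
    return math.sin(k * math.pi / n) / math.sin(math.pi / n)

def heptagon_polynomial(x):
    """Integer polynomial x^3 - x^2 - 2x + 1, which has 2*cos(pi/7) as a root."""
    return x**3 - x**2 - 2 * x + 1

ratio = chord_to_side_ratio(7, 2)
# The second printed value should be zero up to floating-point rounding.
print(ratio, heptagon_polynomial(ratio))
```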
VOLUME 22, NUMBER 2, 2000
Why Circles?

The following question was asked by Prof. Jacques Mazoyer (École Normale Supérieure de Lyon, France). Make two identical transparencies that contain a random pattern with gray and white squares like this:

[Figure: random pattern of gray and white squares]

Put them together. If they match exactly, you see the same pattern. But if their positions are slightly different, you can see circles:

[Figure: the two patterns superimposed, showing circles]

Why do these circles appear?

Here is one of the possible explanations. Two positions of the pattern differ by a rotation. Near the center of the rotation the difference is negligible and the patterns fit each other exactly. (This area is clearly visible near the center point of the circles.) Farther from the center point, gray dots on the two transparencies do not coincide but overlap, and look like an arc:

[Figure: overlapping gray dots forming an arc around the center]

These arcs are extended by the eye to the circles that we see.

According to this explanation, circles are best seen at distance d/a from the center point, if d is the dot radius and a is the rotation angle (in radians). In fact bigger circles are clearly visible too. Why? Maybe the average density of gray dots has fluctuations that are perceived as "dots" of bigger size. Or our vision has some kind of "inertia" that extrapolates circular patterns onto the area where they are almost nonexistent. (Indeed, if we cover the lower half of the picture by a sheet of paper, it is much harder to find circles in the upper half.)

Anyway, these transparencies are easy to make and worth thinking about.
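The distance d/a in the explanation above can be checked with a short computation. Under a rotation by angle a, a point at distance r from the center moves along a chord of length 2r·sin(a/2), which is close to r·a for small a; at r = d/a the shift is therefore about one dot radius. The sketch below uses hypothetical values for d and a.

```python
import math

def displacement(r, a):
    """Distance moved by a point at distance r from the center
    when the pattern is rotated by the angle a (in radians)."""
    return 2.0 * r * math.sin(a / 2.0)

d = 0.5                # dot radius, in arbitrary units (assumed value)
a = math.radians(2.0)  # rotation angle (assumed value)
r_best = d / a         # distance at which the shift equals about one dot radius

print(r_best, displacement(r_best, a))
```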
THE MATHEMATICAL INTELLIGENCER © 2000 SPRINGER-VERLAG NEW YORK
FLORIN DIACU
A Century-Long Loop*

The first goal of this work . . . is to . . . deal with a branch of geometry called Analysis Situs, which describes the relative position of points, lines, and surfaces, with no regard for their size.
Henri Poincaré, in Analysis Situs, 1895

Most mathematical theories are like unfaithful offspring: they forget their origins. But some remember them. In 1892, while pursuing his studies on the 3-body problem, Henri Poincaré laid the foundations of algebraic topology. The new field flourished, finding applications in many branches of mathematics. A hundred years later its tools were used to answer Poincaré's initial question. This is a tale about a theory that glimpsed back to its roots, solving one of the problems that had created it. But before telling this story, let me generalize on how mathematical theories arise.

The Birth of New Theories
Two basic mechanisms, one internal, the other external, govern the development of mathematics. The internal one acts when posing a purely mathematical problem. This leads to theorems, new concepts, generalizations, and higher levels of abstraction. Galois theory, for example, grew as a result of repeated attempts to find a formula for the solutions of a polynomial equation of any degree. The external mechanism is triggered by other fields. Some of the questions they pose stimulate the rise of new mathematical branches. In this sense, Newton founded the theory of differential equations while trying to explain the motion of the moon.

Most theories grow under the influence of both internal and external factors, thus closely relating the evolution of pure and applied mathematics. A new field is usually born under the weight of one or more questions asked at the right time, which are intriguing enough to rouse attention and to keep the interest alive. But even at maturity, a theory may be unable to solve the original problems. Most researchers of the new domain are unaware of or uninterested in the initial setting, whereas those who still seek answers are usually overwhelmed by the growth of the new field. If enough time elapses, changes in fashion may discard the original statements or, in rare cases, promote them to the rank of famous conjectures.

So in many cases the birth of new mathematical branches is a deflective phenomenon: when unable to solve a problem, mathematicians create a theory in order to answer the initial question. Often the question fades out of the collective memory and the new field takes on different paths (see Figure 1).¹ Algebraic topology followed the same rule. But at least one of its original questions stayed alive due to the growing interest in the qualitative theory of dynamical systems.

Poincaré's Question
Soon after finishing his doctoral degree (1878), Poincaré started working on the 3-body problem. At that time he had no idea that his results in celestial mechanics would launch such a brilliant career. In 1889 he was awarded the prestigious prize of King Oscar II of Sweden and Norway (see [BG, 1997] and [DH, 1996]). Poincaré published the prize paper a year later and developed it during the next decade into the 3-volume masterpiece Les Méthodes nouvelles de la Mécanique céleste.

Figure 1. Deflective development: an original question leads to new concepts and to a new theory, which answers other questions but may or may not answer the original one.

*To the memory of Aristide Halanay (1924-1997) of the University of Bucharest, founding editor of the Journal of Differential Equations.

¹This is not the paradigm-type evolution described by Thomas Kuhn [K, 1970]. Unlike scientific theories, which replace each other, mathematical fields live together and aim to make connections. To relate Kuhn's model to this one would take another paper.
The ideas he promoted in the theory of differential equations and in celestial mechanics were revolutionary. Instead of seeking particular solutions or attempting to reduce the order of the system (methods that applied only to a few classes of problems), he developed a qualitative point of view, following the Swiss mathematician Charles Sturm, who had started on this path in 1836. For a given system of differential equations,

$$x' = f(x),$$

Poincaré considered the n-dimensional set of the variable x, called phase space, and viewed the solutions of the system as curves in this space. His goal was to offer a global geometric description of the solution curves. He explained this in his address to the Fourth International Congress of Mathematicians held in 1908 in Rome:

In the past an equation was only considered solved when one had expressed the solution with the aid of a finite number of known functions; but this is hardly possible one time in a hundred. What we can always do, or rather what we should always try to do, is to solve the qualitative problem so to speak, that is, to find the general form of the curve representing the unknown function.

But applying this strategy to the 3-body problem was no easy task. The equations describing the gravitational motion of 3 point masses $m_1, m_2, m_3$ in physical space are
$$\begin{cases} \dot q_i = m_i^{-1} p_i, \\ \dot p_i = \dfrac{\partial U(q)}{\partial q_i}, \end{cases} \qquad i = 1, 2, 3,$$

where $q_i = (q_i^1, q_i^2, q_i^3)$ and $p_i = (p_i^1, p_i^2, p_i^3)$, $i = 1, 2, 3$, represent the position vectors and the momenta (i.e., mass × velocity), respectively;

$$U(q) = G\left(\frac{m_1 m_2}{|q_1 - q_2|} + \frac{m_2 m_3}{|q_2 - q_3|} + \frac{m_3 m_1}{|q_3 - q_1|}\right)$$

is the potential function (the negative of the potential energy); $q = (q_1, q_2, q_3)$ is the configuration of the particle system; and $G$ is the gravitational constant.

The 18-dimensional phase space can be first reduced to a 12-dimensional one. More precisely, the equations of motion remain unchanged if one shifts the origin of the reference frame to the center of mass of the particle system, a change that can be expressed by 6 scalar equations derived from first integrals (i.e., functions that are constant along solutions):

$$m_1 q_1 + m_2 q_2 + m_3 q_3 = 0 \quad \text{and} \quad p_1 + p_2 + p_3 = 0.$$

The reduction can be carried further by using other first integrals. The energy integral

$$T(p) - U(q) = h, \qquad (1)$$

where $T(p) = \frac{1}{2}\left(m_1^{-1}|p_1|^2 + m_2^{-1}|p_2|^2 + m_3^{-1}|p_3|^2\right)$ is the kinetic energy, $p = (p_1, p_2, p_3)$ is the momentum, and $h$ a real constant, foliates the 12-dimensional phase space into 11-dimensional "slices," which can be subsequently foliated using the 3 angular-momentum integrals

$$q_1 \times p_1 + q_2 \times p_2 + q_3 \times p_3 = c, \qquad (2)$$

where $c$ is a constant vector. This means that the initial 18-dimensional space is partitioned into infinitely many 8-dimensional so-called integral manifolds, $M = M(h, c)$. Since the equations of motion are invariant under rotations, the study can be further reduced to the 7-dimensional components $M^7 = M/SO_2$ ($M$ factorized to rotations), called reduced integral manifolds. If we consider the planar 3-body problem instead of the spatial one, the integral manifold corresponding to $M$, say $m$, is 6-dimensional; the one corresponding to $M^7$ is 5-dimensional; let us denote it by $m^5$.

To apply his program of describing the qualitative behavior of solution curves in $M^7$ and $m^5$, Poincaré had first to understand the shape of these sets, or their topology, as we call it today. For this he needed a language, so he searched the literature for appropriate tools. He found something² in a posthumously published fragment of a manuscript by Bernhard Riemann [R, 1953] and in a paper of Enrico Betti [B, 1871]. The two had worked together in Pisa, where Riemann, happy to leave the wet climate of Göttingen, which worsened his tuberculosis, had accepted several long-term invitations of his Italian colleague. But Poincaré found he needed to develop these notions further. Thus algebraic topology was born.

²Other aspects of topology started developing at about the same time (see [Sc, 1994] and [Da, 1994]), viewing manifolds from a different perspective.

In Search of the Origins
This is the most plausible scenario. Unfortunately there is no clear proof that Poincaré thought exactly this way. No written statement has yet been traced. The American mathematician George David Birkhoff, an expert in Poincaré's work in dynamics, wrote a few decades later ([Bi, 1927], p. 288),

The manifold $M^7$ has fundamental importance for the problem of three bodies, but, so far as I know, it has nowhere been studied, even with respect to the elementary question of connectivity. The work of Poincaré . . . does not consider $M^7$ in the large.

This is of course no proof that Poincaré ignored the problem. (Painlevé, for example, attributes to Poincaré what is known today in celestial mechanics as Painlevé's conjecture, though there is no trace of it in Poincaré's written heritage (see [DH, 1996]). Most probably Painlevé learned of it during a private discussion.) There is, however, clear evidence that Poincaré connected algebraic topology (or analysis situs, as he called it following Riemann [R, 1953]) to the 3-body problem. In a 1901 paper, published posthumously in 1921, Poincaré wrote ([P, 1921], p. 101),
All the various ways in which I have successively engaged myself have led me to Analysis Situs. I needed the results of this science to pursue my studies of the curves defined by differential equations and for extending them to higher-order differential equations, in particular to those of the 3-body problem. I needed them for the study of nonuniform functions of two variables. I needed them for the study of periods of multiple integrals and for applying this study to expanding the perturbation function. Finally, I would foresee in Analysis Situs a way of approaching an important problem of group theory, the research of discrete groups or of finite groups contained in a given continuous group.

Poincaré's first contributions to algebraic topology appeared in 1892 in a Comptes Rendus note, which he developed in 1895 into a longer article entitled Analysis Situs. This happened at a time when he was deeply involved in research in celestial mechanics and especially in the 3-body problem. Five more papers on topology appeared between 1899 and 1904 (see [P, 1953]), after the publication of Les Méthodes nouvelles.

How could Poincaré connect analysis situs to the 3-body problem without thinking of the topological description of $M^7$ or $m^5$? He was interested in periodic orbits, and he needed to determine the topology of the space in order to find them. Periodic orbits are crucial for understanding what Poincaré finally aimed at: the geometry of the flow, which he could not possibly study without knowing the shape of the integral manifolds.

Like most of us, Poincaré reached his results from example to theorem, i.e., from the system describing the 3-body problem (on which he worked intensely at the beginning of the 1890s to expand his prize paper into the first two volumes of Les Méthodes nouvelles) to the general theory of differential equations. But like most of us too, he presented his results the other way around, considering the 3-body problem as an application of his theory. The previous quotation follows the same pattern of thinking. On the other hand, Poincaré made this statement almost two decades after publishing his first paper on analysis situs. Initially he had foreseen applications only "to higher-order differential equations and, in particular, to those of celestial mechanics" (see the Introduction in [P, 1895]).

These are other arguments favoring the idea that Poincaré thought of the topological description of $M^7$ and $m^5$. But why then did he never state the problem explicitly? Perhaps because the tools he invented were too crude to help him make significant progress, so his interests shifted towards more promising directions. This would be no wonder, for the problem is very difficult. The first to publish a statement and make a step towards solving it was Birkhoff, who had become famous in 1912 by providing his fixed point theorem³, thus answering another question unsuccessfully attacked by Poincaré (see [DH, 1996]) and also rooted in the 3-body problem.

Though we will never be sure of what was in Poincaré's mind, it seems likely that in developing analysis situs he was also targeting the topological description of $M^7$ and $m^5$.
From Betti Numbers to Cohomology
The main tools Poincaré used for the topological characterization of a manifold were the Betti numbers, which he named after the Italian mathematician Enrico Betti (1823-1892), who had previously introduced certain topological invariants. Poincaré defined the Betti numbers of a manifold in his first paper on analysis situs, then reconsidered them in some later articles. In today's terminology, if $X$ is an n-dimensional topological space, the Betti numbers $\beta_0, \beta_1, \dots, \beta_n$ are defined as: $\beta_0$, the number of connected components of $X$; and $\beta_k$ ($k \ge 1$), the number of k-dimensional holes of $X$ (see Figure 2).

Figure 2. Betti numbers: (a) A point is 0-dimensional and has one component, so $\beta_0 = 1$. (b) The space formed by two points is 0-dimensional and has two components, so $\beta_0 = 2$. (c) A circle is 1-dimensional, has one component and one 1-dimensional hole, so $\beta_0 = \beta_1 = 1$. (d) A disc is 2-dimensional, has one component and no holes, so $\beta_0 = 1$, $\beta_1 = 0$, and $\beta_2 = 0$. (e) A sphere is 2-dimensional, has one component, no 1-dimensional holes, but one 2-dimensional hole, so $\beta_0 = 1$, $\beta_1 = 0$, and $\beta_2 = 1$.

In connection with Betti numbers, Poincaré introduced the notion of homology, which later on developed into the concept of homology group. If in its early times algebraic topology paid more attention to numerical invariants, as fashion dictated, it soon became clear that a structural description is richer. In fact, today's idea of algebraic topology is to reduce topological problems to algebraic ones, i.e., to obtain information about homeomorphic maps by studying the induced isomorphisms between the corresponding groups.

Thus new concepts appeared, the cohomology group among them, which in a certain sense is the "dual" of a homology group. Cohomology groups are not "better" than homology groups (in fact they are less natural, so it took quite a while for this notion to crystallize), but they reveal different topological aspects of the manifolds studied. Also they offer an alternative language for expressing certain topological properties. Similar things can be said about homotopy groups, which emerged in the 1930s from the work of Witold Hurewicz and Heinz Hopf.

The growth of algebraic topology has been far from linear, and its history is now a research subject in itself. Jean Dieudonné dedicates an entire volume to the period 1900 to 1960 [D, 1989], claiming that the next twenty years would easily fill another volume. Though Poincaré's work on the subject was barely mentioned during the first decades after his death (1912), things changed afterwards. In the preface to his historical account, Dieudonné wrote [D, 1989],

At first, algebraic topology grew very slowly and did not attract many mathematicians; until 1920 its applications to other parts of mathematics were very scanty (and often shaky). This situation gradually changed with the introduction of more powerful algebraic tools, and Poincaré's vision of the fundamental role topology should play in all mathematical theories began to materialize. Since 1940, the growth of algebraic and differential topology and of its applications has been exponential and shows no signs of slackening.

³At the end of October 1912, Birkhoff presented to the American Mathematical Society a communication entitled "Proof of Poincaré's geometric theorem." The paper derived from this communication appeared a year later ([Bi, 1913]).

The Topology of Integral Manifolds
As I mentioned earlier, the first who explicitly dealt with the topology of $M^7$ and $m^5$ was Birkhoff. In 1927 he considered the problem but achieved only little success. In his now famous Dynamical Systems ([Bi, 1927]), Birkhoff made a few unsatisfactory arguments for the following statement:

Birkhoff's Statement. For $h < 0$, the topologies of $M^7$ and $m^5$ can change only at points that correspond to relative equilibria.
A relative equilibrium is a point (q, p) in phase space which, if taken as an initial condition for the 3-body problem, leads to a uniform motion of the bodies on concentric circles. The q-component of a relative equilibrium is called a central configuration, and it is always such that the gravitational force has the same direction as the position vector, i.e., q'' = kq for some constant k > 0. The only possible central configurations for three bodies are the equilateral triangle and the straight-line position in which the ratio of the distances satisfies a relation depending on the masses (see Figure 3). If released with zero velocity from such a configuration, the bodies move homothetically towards a simultaneous total collapse. Because there are three ways of arranging the masses on a line, the spatial 3-body problem has four central configurations. If the 3-body problem is restricted to a plane, the equilateral triangle has two possible orientations, so the number of central configurations increases to five.

Figure 3. The equilateral and straight-line configurations of the 3-body problem.

The next notable statement was made by the Austrian mathematician Aurel Wintner in a book containing the most significant mathematical results obtained on the n-body problem up to 1941 [Wi, 1941]. In his bibliographical notes, Wintner mentions that ". . . nothing explicit is known as to the topological structure of $M^7$." Three decades of silence followed until the American mathematicians Stephen Smale [S, 1970a], [S, 1970b] and Robert Easton [E, 1971] came up with important results. Employing Morse theory, Smale found the bifurcation types and proved Birkhoff's statement for $m^5$. Independently, Easton described the topology of $m^5$ in case of equal masses in terms of products of intervals, spheres, and tori. At that time Easton had just obtained his Ph.D. degree and was seeking recognition. He had considered the problem without knowing about Birkhoff's and Wintner's remarks, but in the summer of 1970 found out from a famous Brazilian mathematician, Mauricio Peixoto, that Smale had already obtained results in this sense. Anxious, he sent his paper to Smale, who replied with congratulations, saying that these contributions paralleled his.

Using the notes of his mentor Aristide Halanay of the University of Bucharest, who had attended Smale's lectures on the subject, the Romanian mathematician Andrei Iacob rewrote and completed Smale's program [I, 1973]. Iacob's results were later included in the second edition of the book by Abraham and Marsden (see [AM, 1978], Theorem 10.4.21).

A few years later the Chinese mathematician X.Y. Chen published some interesting results on $m^5$ [Ch, 1978]. Rotating configurations into the plane, Chen reduced the spatial problem to the planar one. Unfortunately, due to technical difficulties, Chen missed the complete description of the topology of $M^7$. Though the planar case could now be considered understood, the spatial one still resisted. But the offensive was strong and all the recent progress was seeding hope.

The attack on the topology of $M^7$ was launched by the Brazilian mathematician Hildeberto Cabral, who in 1973 published the results of his Ph.D. thesis written in Berkeley under the supervision of Smale. Besides some results on $m^5$, Cabral characterized $M^7$ for negative energy and zero angular momentum. Robert Easton made the next step [E, 1975]. He extended some of his previous results by projecting the 3-dimensional problem onto the plane.
The idea of the projection method had already appeared in Cabral's paper, but the Brazilian mathematician did not pursue it. Easton obtained a series of nice results but unfortunately missed the fact that the topology of $M^7$ may change not only at central configurations, but also at so-called critical points at infinity. These are values of the parameter $\nu$ that are not central configurations and appear in connection with the behavior of the energy function restricted to certain level-manifolds of the angular momentum (for a technical definition see [Al, 1993], p. 475). To give an idea of what a critical point at infinity means, here is an analogy. Imagine the intersection $L \cap C$ of the curve $C$ in Figure 4 and the line $L$, which is parallel to the horizontal axis. When the line $L$ moves up and down, the topology of the set $L \cap C$ changes at the finite critical points $x_1$, $x_2$, and $x_3$, but also at $+\infty$, because the curve $C$ is asymptotic to the horizontal axis.

In 1970 Stephen Smale was the first to point out the possible existence of a bifurcation point at infinity. A few years later the Spanish mathematician Carles Simó [Si, 1975] proved the existence of three such points at which the topology of $M^7$ can change. In 1993, in a paper in which he attacked the more difficult problem of understanding the topology of integral manifolds in the n-body problem, the French mathematician Alain Albouy [Al, 1993] showed that three was the maximum number of critical points at infinity at which the topology of $M^7$ can change. All these implied that Birkhoff's statement was false in the 3-dimensional case.
Figure 4. The idea of a critical point at infinity can be seen from the above picture: when the line $L$ moves up and down, the topology of the set $L \cap C$ changes not only at the finite critical points $x_1$, $x_2$, and $x_3$, but at $\infty$ too.
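The analogy of the moving line can be tried numerically. The sketch below does not use the curve of Figure 4, which is not reproduced here; it assumes a simpler hypothetical curve, y = x·e^(−x), with a single finite critical point at x = 1 (height 1/e) and the horizontal axis as asymptote for x → +∞. The function names and the search window are likewise assumptions made for illustration.

```python
import math

def curve(x):
    """Hypothetical curve: one finite critical point (x = 1), y -> 0 as x -> +infinity."""
    return x * math.exp(-x)

def count_crossings(level, x_min=-5.0, x_max=60.0, steps=65000):
    """Count sign changes of curve(x) - level on a uniform grid over [x_min, x_max]."""
    count = 0
    dx = (x_max - x_min) / steps
    previous = curve(x_min) - level
    for i in range(1, steps + 1):
        current = curve(x_min + i * dx) - level
        if previous * current < 0:
            count += 1
        previous = current
    return count

# Heights of the horizontal line L: above the maximum, below it, near 0, below 0.
for level in (0.5, 0.1, 0.01, -0.1):
    print(level, count_crossings(level))
```

The number of intersection points changes when the line passes the height 1/e of the finite critical point, and again when it passes height 0, where no finite critical point exists: as the line descends towards the axis, one intersection point runs off to +∞. Because the search window is finite, levels extremely close to 0 would lose that point early, which is exactly the escape the analogy describes.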
Other important results were obtained by the American mathematician Donald Saari between 1984 and 1987. Simó had proved the presence of a bifurcation point of the parameter having an interesting property (see [Si, 1975]). For values larger than it, there appear restrictions on the orientation of the plane of motion (e.g., the angular-momentum vector cannot lie in this plane), and for values smaller than it there are configurations with unrestricted orientation. Using a clever decomposition of the angular momentum, Saari gave a geometrical description of the integral manifolds in terms of a sphere bundle and completely explained the above restrictions ([Sa, 1984], [Sa, 1987a], [Sa, 1987b]).

The Solution of the Original Problem

The announcement of the complete solution of the problem came in the fall of 1994 at a conference on Hamiltonian Systems and Celestial Mechanics in Cocoyoc, a small town south of Mexico City, in a hacienda founded by Hernando Cortez. The three authors, Chris McCord, Kenneth Meyer, and Quidong Wang of the University of Cincinnati, were all present. McCord, trained as a topologist, had learned about the problem from his colleague Ken Meyer, a leading researcher with many important results in celestial mechanics. Wang, who had just completed his Ph.D. degree with Meyer, was already known for the convergent power series solution he had obtained for the singularity-free n-body problem, a quest generalizing the problem attacked by Poincaré for King Oscar's competition (see [W, 1991] and [Di, 1992]).

Meyer had first heard of the possibility of bifurcations due to critical points at infinity from Alain Albouy at the 1991 conference on celestial mechanics in Guanajuato. Upon his return to Cincinnati, Meyer asked Wang whether he would like to read Albouy's paper. Wang was already familiar with Chen's and Saari's work and showed immediate interest. Then McCord joined the team.

This collaboration was a happy one. McCord was a master of algebraic topology. Beyond his erudition in dynamical systems, Meyer had a rich and fruitful research experience and a good feeling for avoiding traps; Wang brought to the team his courage, decisiveness, and enthusiasm. Having together all the ingredients for success at a time when the problem was ready to yield, McCord, Meyer, and Wang provided after many months of intense work a complete topological description of the integral manifolds associated to the 3-body problem. Their 90-page paper appeared in 1998 in the Memoirs of the American Mathematical Society [MW, 1998] to rave reviews.

The main idea followed by McCord, Meyer, and Wang was to modify the rotation Chen used in [Ch, 1978] and thus simplify some of the derived algebraic equations. This allowed them to overcome the difficulties that had stopped Chen. Their work involves several algebraic-topological techniques accessible only to specialists: Gysin and Mayer-Vietoris sequences, results due to Seifert and Van Kampen, Thom classes, bootstrapping, etc. The final result, however, is easy to grasp.

The integral manifolds are analysed with respect to the values of the parameter $\nu = -c^2 h$, where $h$ is the energy constant in (1) and $c$ is derived from (2) by taking the reference frame such that the angular-momentum constant is of the form $c = (0, 0, c)$. There are 9 special values for $\nu$ at which the topology of integral manifolds may change. We must therefore ask about the topology in each of the ten intervals for $\nu$: I = $(-\infty, \nu_1)$, II = $(\nu_1, \nu_2)$, III = $(\nu_2, \nu_3)$, IV = $(\nu_3, \nu_4)$, V = $(\nu_4, \nu_5)$, VI = $(\nu_5, \nu_6)$, VII = $(\nu_6, \nu_7)$, VIII = $(\nu_7, \nu_8)$, IX = $(\nu_8, \nu_9)$, X = $(\nu_9, \infty)$. The values $\nu_1$, $\nu_2$, $\nu_3$, $\nu_4$, and $\nu_5$ correspond to critical points at infinity ($\nu_1$ is due to the change from $h > 0$ to $h < 0$, and $\nu_2$, $\nu_3$, and $\nu_4$ were found by Simó), whereas $\nu_6$, $\nu_7$, $\nu_8$, and $\nu_9$ are due to relative equilibria. Apparently the existence of $\nu_5$ contradicts Albouy's finding that there are no other critical points at infinity except the ones of Smale and Simó. But in fact $\nu_5$ is there; it's just that the topology of the integral manifolds remains unchanged at $\nu_5$. This came out from the results of McCord, Meyer, and Wang, who computed the cohomology groups of $M^7$ in each case. The following table summarizes their conclusions in terms of Betti numbers.

[Table: Betti numbers $\beta_0$ through $\beta_7$ of $M^7$ for each of the intervals I through X.]

$\nu_5$, the critical point at infinity, was particularly troublesome for the team. The preliminary computations showed that the topology changed at $\nu_5$, in contradiction with Albouy's previous conclusion. Intrigued, Meyer e-mailed a note to Albouy, who replied that he had three proofs for his result and saw no alternative. Soon Meyer understood that Albouy was right and convinced the others that the mistake must be their own. But it took several months of checking and rechecking their arguments until, to their relief, they found a mere computational error, which when corrected proved that the topology of $M^7$ was unaffected at $\nu_5$.

The work of McCord, Meyer, and Wang closes a century-long loop, solving a problem to which many others have made direct or indirect contributions. But this is not the end of the journey. New questions regarding the topology of integral manifolds associated to different restricted 3-body problems and to the n-body problem in general can now be attacked with the methods developed in all those years. Moreover, we might be able to understand better the geometry of the flow associated to the 3-body problem, a goal toward which Poincaré strived his entire life.

Acknowledgments

The author is indebted to Alain Albouy, Hildeberto Cabral, Chris McCord, Bob Easton, Andrei Iacob, Ken Meyer, Don
AU T HOR
-
[Ch, 1 978] Chen, X. Y. The topology of the integral manifold general three-body problem,
Acta Astra. Sinica 1 9
of the
M8
(1 978), 1 - 1 7 (in
Chinese). [Da,1 994] Dauben, J.W. Topology: lnvariance of dimension, in Companion Encyclopedia of the History and Philosophy of the
vol. 2, pp. 939-946 Grattan-Guinness, 1 . ,
Mathematical Sciences,
•.
editor, Routledge, London and New York, 1 994. [Di , 1 992] Diacu, F.
Singularities of the N-Body Problem -An Introduc
Les Publications CRM, Montreal, 1 992.
tion to Celestial Mechanics ,
[DH, 1 996] Diacu. F. and Holmes, P. of Chaos and Stability,
Celestial Encounters - The Origins
Princeton University Press, Princeton, N.J.,
1 996. [D,1 989] Dieudonne, J. A History of Algebraic and Differential
FLORIN DIACU
Department of Mathematics and Statistics University of Victoria, Victoria,
P.O.
[E, 1 97 1 ] Easton, R. Some topology of the three-body problem,
Box 3045
Differential Eq. 1 0
British Columbia
Differential Eq. 1 9
e-mail:
[email protected]
[1, 1 973] lacob, A
www .math.uvic.ca/faculty/diacu/index.html
Journ.
(1 971 ), 371 -377.
[E,1 975] Easton, R. Some topology of the n-body problem.
Canada, VSW 3P4 webs�e:
Topology
1900-1960, Birkhi:iuser, Boston, Ma. , 1 989.
Journ.
(1 975), 258-269.
Metode Topologice In Mecanica Clasica,
Editura
Academiei, Bucure§ti, 1 973. Florin Diacu was born in Romania, obtained his doctoral de gree from the University of Heidelberg, Germany, and is cur rently the UVic-Site Director of the Pacific Institute for the
[K, 1 970] Kuhn, T.S.
The Structure of Scientific Revolutions,
[MW,1 998] McCord, CK, Meyer, K.R., and Wang, Q. The integral man
Mathematical Sciences. He is known in mathematics for in
ifolds of the three body problem,
troducing the study of differential equations given by quasi
[P, 1 895] Poincare, H. Analysis Situs,
homogenous potentials, which have applications in astron
(2)1 , (1 895 1 - 1 2 1 ), or in
omy, physics, and chemistry. He coauthored with Philip
Villars, Paris, 1 953.
Holmes
Celestial Encounters - The Origins of Chaos and
Stability,
a best-seller translated into several languages. He
also wrote an introductory differential-equations text entitled Order and Chaos
that will be published this fall with W. H .
Freeman & Camp. His hobbies include reading, fitness work outs, and hiking.
3rd. edi
tion, University of Chicago Press, Chicago, Ill .. 1 996. Memoirs AMS 1 32,
Oeuvres,
No. 628 (1 998).
Journal de I'Ecole Polytechnique,
tome 6, pp. 1 93-288, Gauthier
[P , 1 92 1 ] Poincare, H. Analyse de ses travaux scientifique, 38 (1 92 1 ), 3-1 35.
[P, 1 953] Poincare, H.
Oeuvres ,
tome 6, Gauthier-Villars, Paris, 1 953.
[R, 1 953] Riemann, B. Fragment aus der Analysis Situs, in Works of Bernhard Riemann,
Publ., New York, 1 953.
Acta Math.
The Collected
H . Weber. editor, pp. 479-482, Dover
[Sa, 1 984] Saari, D.G. From rotation and inclination to zero configura tional velocity surface, I. A natural rotating coordinate system, Celestial Mech. 33
(1 984), 299-31 8.
[Sa, 1 987a] Saari, D.G. From rotation and inclination to zero configura
Saari, and Don Wang for discussions or for suggesting improvements to earlier versions of the manuscript. Sup ported in part by NSERC Grant OGP0122045.
tional velocity surface, II. The best possible configurational velocity surface,
Celestial Mech . 40
Dynamical Systems
editors.
REFERENCES
[AM , 1 978] Abraham, R. and Marsden, J.
Foundations of Mechanics,
[AI, 1 993] Albouy, A Integral manifolds of the N-body problem, Poincare and the Three Body Problem,
[8, 1 87 1 ] Betti, E. Sopra gli spazi di un numero qualunque di dimen Annali di Matematica
Transactions of the American Mathematical Society 14
( 1 9 1 3),
Companion
pp. 927-938, Grattan-Guinness,
1.,
editor,
[Si, 1 975] Sim6, C. El conjunto de bifurcacion en problema espacial de Acta I Asarnblea Nacional de Astronornia y Astrofisica
(1 975), pp. 21 1 -2 1 7 , Univ. de Ia Laguna. Spain. lnventiones Math. 1 0
(1 970), 305-331 . [S, 1 970b] Smale, S. Topology and mechanics II,
lnventiones Math. 1 1
(1 970), 45--64 .
1 4-22. Dynamical Systems,
American Mathematical
Society, Providence, R.I., 1 927. [C, 1 973] Cabral, H. On the integral manifolds of the N-body problem, lnventiones Math. 20
81 (1 988). 23-42.
[S, 1 970a] Smale, S. Topology and mechanics I,
(2)4(1 87 1 ) . 1 40-1 58.
[Bi , 1 9 1 3] Birkhoff, G.D. Proof of Poincare's geometric theorem,
[Bi , 1 927] Birkhoff, G.D.
vol. 2,
Sciences,
tres cuerpos,
American Mathematical Society, Providence, R.I., 1 997. sioni,
Contemporary Math.
[Sc , 1 994] Scholz, E. Topology: geometric, algebraic, in
Routledge, London and New York, 1 994.
(1 993), 463-488.
[BG,1 997] Barrow-Green, J.
Hamiltonian
(Boulder, CO 1 987), K.R. Meyer and D.G. Saari,
Encyclopedia of the History and Philosophy of the Mathematical
2nd ed. , Addison-Wesley, New York, 1 978. lnventiones Math. 1 1 4
(1 987), 1 97-223.
[Sa,1 987b] Saari, D.G. Symmetry in n-particle systems,
(1 973), 59-72.
[W, 1 99 1 ] Wang, Q. The global solution of the N-body problem, Mech. 50
Celestial
(1991), 73-88.
[Wi, 1 94 1 ] Wintner, Mechanics,
A
The
Analytical
Foundations
of
Celestial
Princeton University Press, Princeton, N.J., 1 941 .
VOLUME 22, NUMBER 2, 2000
Mathematically Bent
Colin Adams, Editor

Research Announcement

The proof is in the pudding.

Opening a copy of The Mathematical Intelligencer you may ask yourself uneasily, "What is this anyway - a mathematical journal, or what?" Or you may ask, "Where am I?" Or even "Who am I?" This sense of disorientation is at its most acute when you open to Colin Adams's column. Relax. Breathe regularly. It's mathematical, it's a humor column, and it may even be harmless.

Please send all submissions to the column editor, Colin Adams, Department of Mathematics, Bronfman Science Center, Williams College, Williamstown, MA 01267 USA

BiPhdMA seeks same for long weekends of commutative algebra, including walks in the woods discussing math, candlelit dinners over some excellent local rings, and tranquil evenings spent staring into the fire as we contemplate ideals. Tired of the conference scene? Sincere mathematicians only need respond. Let's trade papers and see where it goes from there.

Code: Bi: Gender irrelevant. Phd: Ph.D. M: Mathematician. A: Algebraist.

***

Dear BiPhdMA,
I'm not the kind of mathematician who answers personal ads, but I found yours intriguing. I have included a reprint of an article I wrote on birational extensions as well as some jottings on maximal ideals of formal fiber rings. What do you say we get together over a few lemmas and see where it goes from there?
Ahmad Ashanti

***

Dear Ahmad,
I can't tell you how much I enjoyed our time together. Your lemmas have a certain swagger, pushing the mathematical envelope as they do. And when you finally stated your proposition, I thought I would swoon. Who would have thought that Noetherian rings could behave so badly? Thank you for making me look forward to each new day, and the mathematics that it will contain.
Yours,
Craig

***

Dear Craig,
I too had a wonderful time. I find myself thinking about how you showed that the associated prime ideal in that localization of a polynomial ring needn't be embedded. I cannot get it out of my head. I am impatiently waiting for the next time we can meet.
With greatest affection for your mathematics,
Ahmad

***

Craig Dearborn and Ahmad Ashanti announce the initiation of a collaboration on research on the Formal Fibers of Excellent Rings. The union will be celebrated with a tag team talk at 4:00, on Saturday, June 11, to be followed by tea and cookies.

***

Craig Dearborn and Ahmad Ashanti announce the birth of their newest result, Theorem 1.7, born at 11:42 A.M. Monday, December 11, 1999, weighing in at 7 ounces. It has been named the Dearborn Ashanti Rigidity Theorem.

***

Well, we never thought we would be writing one of these letters, but so much has happened this year, and we finally had to face up to the fact we wouldn't be able to personally contact all of you whom we hold dear.

It has been quite a year for us. January started things off with a bang, when it was announced that Theorem 1.7 [DA Rigidity Thm.] would be appearing in the Annals of Mathematics. You can imagine the thrill. Craig's theorem from a previous collaboration, Theorem 2.2.1 [Dearborn Kawauchi 3], has had a great year as well, having been cited in no fewer than three papers. And the big news is that we will be expecting a new arrival in August, although a few details still have to be worked through.

Craig has been elected to the editorship of the Rhode Island Journal of Mathematics (RIJM), a dream of his for many years. And Ahmad has been no slouch either, having refereed no fewer than fifteen papers this year, three for journals other than the RIJM. Quite the popular referee.

Here's hoping your year has been as fruitful as ours.
With Best Holiday wishes,
Craig and Ahmad

***

Hope this holiday letter finds you in better spirits than we are experiencing around here this year. A small logical inconsistency was discovered in Theorem 1.7 [The Rigidity Theorem], and it has not been doing well ever since. It appears the inconsistency is spreading and all attempts at treatment have failed. But we are ever hopeful.

Craig has resigned the editorship of the Rhode Island Journal of Mathematics in order to spend more time with the Rigidity Theorem. Ahmad has been helping out when he can, but the complete homomorphic images property, and its implications for excellent rings, take up a lot of time.
Season's Best,
Craig and Ahmad

***

Retraction: Craig Dearborn and Ahmad Ashanti are saddened to report that Theorem 1.7 has been retracted, an announcement of which will appear in the Annals of Mathematics. In lieu of flowers, donations should be sent to the Centennial Fellowships of the American Mathematical Society.

***

This year, our research program seems to have stalled. We have consulted with various experts. They agree that this area is still fertile ground, and have no explanation for why we are not producing. Most likely, we just need to get back into the rhythm. Unfortunately, that will be difficult, as Ahmad will be spending the spring semester in Australia where he will be working with the excellent ring theorists at Melbourne University. Then he will be in Berkeley for the summer. Craig is obligated to stay home and finish up several joint preprints, but his heart will be in Melbourne.
Our Best to You,
Craig and Ahmad

***

Surveillance Report: Aug. 12, 11:32: Subject Ahmad Ashanti was seen entering the Mathematical Sciences Research Institute with Australian mathematician Hugh Rubenstein. Discussion appeared animated. Telephoto lens caught subject and Rubenstein taking turns writing equations vigorously on the blackboard. It appears that an exchange of preprints did occur.

***

These papers, filed with the American Mathematical Society, officially announce the cessation of collaboration between Craig Dearborn and Ahmad Ashanti. The reprints of their joint papers will be divided as follows: 40% for Dearborn, 40% for Ashanti, and 20% for the lawyers. All future correspondence on the joint work should be directed to the law firm of Rudin & Rudin.

***

BiPhdMA seeks same for long term research relationship. Seeking a book level commitment here. Researchers trolling for a fast paper or two need not apply.

THE MATHEMATICAL INTELLIGENCER © 2000 SPRINGER-VERLAG NEW YORK
ARTHUR M. LESK

The Unreasonable Effectiveness of Mathematics in Molecular Biology*

*Based on a talk delivered at the final symposium of the program, "Biomolecular Function and Evolution in the Context of the Genome Project," at The Isaac Newton Institute for the Mathematical Sciences, Cambridge, U.K., 20 Dec. 1998.

My title is an emulation of that of the well-known paper by E.P. Wigner, "The unreasonable effectiveness of mathematics in the natural sciences" [1]. Of course the irony cuts in opposite ways in physics and molecular biology. In physics, mathematics is obviously effective - many of the giants on whose shoulders physicists stand are mathematicians - and the surprise is Wigner's suggestion that this is unreasonable. In molecular biology, the proper role of mathematics is not obvious, and there is fear, far more credible than for physics, that it may be unreasonable to expect mathematics to be effective. Of course, many common tools of computational molecular biology - for instance, searching in databases for sequences similar to a probe sequence - are certainly based on mathematics and computer science. But whether our ultimate understanding of living processes will be expressed in the language of mathematics - in the way, for example, that concepts of symmetry underlie the statement of laws of physics - or in the traditional descriptive "anecdotal" language of biology, is still moot.

Why might it be reasonable to doubt the effectiveness of mathematics in biology? Observed properties of living systems are determined by a combination of

• The laws of physics and chemistry
• The mechanism of evolution
• Historical accident

It is difficult to sort out their effects, and a creative tension among them pervades our investigations. Many of the laws of physics describe the natural world - including living systems - by specifying relations between initial and final conditions. In biology the complexity of the set of possible initial conditions creates difficulties. The large role of historical accident hinders and humbles us: Even if fundamental laws of physics and chemistry have simple consequences that would provide detailed descriptions of life processes, we may not be able to discover them, because our observables are more complex, resist simplifying idealizations, and show features dominated by a choice of initial conditions from a very large and diverse set of possibilities. Apples, in biology, do much more than fall on people's heads.

The Subject Matter of Computational Molecular Biology

The objects of our study at least have a form to which we can attempt to apply mathematics. These include:

• DNA sequences of genes
• amino acid sequences of proteins
• protein structures
• protein functions

Readers will have heard of the genome projects, which are the determinations of the complete sequences of the DNA in organisms: the set of blueprints. The DNA sequences in genomes contain all the information required by the organism to be born, to develop and grow, and to die. With the completion of the sequencing of the yeast genome in 1996, we know as much about a yeast cell as a yeast cell does. This statement is not as arrogant as it sounds. We do possess all the information. Admittedly, we may not be able to interpret it as effectively as a yeast cell can, but we do have the complete set of blueprints. But blueprints give only a static description of structure and potential activity; it remains to extend our observations to the integration of protein expression and function, in time and space within an organism. Collection of these data is known as the "proteome project," and it is gathering momentum for a role in the post-genomic era.

The rate of measurement of gene sequences is extremely large, and increasing. In 1998 the complete sequence of the DNA from a worm, Caenorhabditis elegans, was completed ($9.7 \times 10^7$ bases). It is likely that 1999 and 2000 will see, respectively, the completion of the sequencing of the DNA from the fruit fly ($1.8 \times 10^8$ bases) and the human genome ($3.4 \times 10^9$ bases), as well as numerous other organisms both large and small. Louis XV could say, "Après moi, le déluge." Noah, in contrast, could not; nor can we.

The sequences and structures that we study are interrelated in important ways. On the molecular level, the DNA sequences of genes encipher the amino acid sequences of proteins. Then the amino acid sequences of proteins determine the three-dimensional structures of proteins. Protein structure then determines protein function (Figure 1). A precise three-dimensional structure is necessary for protein function because the required interactions depend on bringing together different parts of molecules in exact spatial relationships. Finally, the feedback from protein function back to gene sequence - by evolution through natural selection - closes the loop.

In our computers DNA sequences are character strings: one-dimensional objects. Genes, substrings of genome sequences, are translated into amino acid sequences of proteins, by a nearly-universal cipher. Amino acid sequences of proteins are also represented by one-dimensional character strings. Next, proteins fold spontaneously to unique "native" three-dimensional structures. (The evidence for this is that they can be denatured - the three-dimensional structure destroyed - by heating, and when cooled they resume their original form, like shape-memory alloys.) The spontaneous folding of proteins is the point at which Nature makes the leap from the one-dimensional sequences of genes to the three-dimensional world we inhabit.

The Goals of Computational Molecular Biology

What are our goals in dealing with this material? First, simply to describe the similarities and differences among sequences and among structures, and to classify them. What are the topologies of sequence space, structure space, and the space of protein function? What are the mappings among these spaces? We wish to be able to describe and predict the interrelationships among protein sequence, structure, and function, using evolution as the organizing principle.

How do we go about this work? Sydney Brenner once lamented: "The problem with biology is that it has no harmonic oscillator." By this he meant that in biology, unlike physics, there is no escape from complexity, even via idealization. In physics, the harmonic oscillator is a simple problem, solvable exactly by many methods; it applies exactly to some phenomena, and is a useful approximation to others. In physics it is the traditional testbed for new methods. In fact, computational molecular biology has two "harmonic oscillators" in Brenner's sense: Sequence alignment, and Structure superposition. These operations, which can be carried out exactly and efficiently, are basic to many analyses of sequence-structure relationships in molecular biology. Now, it will come as no surprise that the real world is often anharmonic. Nevertheless, many valuable tools have been developed from these simple cases. However, tools provide answers but not questions. Research in this field continues to depend on the interaction of the human scientist with the data, using mathematical and computational methods in support.

The leading mathematician I.M. Gelfand, who is also a leading physiologist, hotly denies being a mathematical biologist. To Wigner's Principle - the unreasonable effectiveness of mathematics in the physical sciences - he counterposes the unreasonable ineffectiveness of mathematics in the biological sciences, which some of his colleagues advocate calling the Wigner-Gelfand Principle.

Sequences and Sequence Alignment

[Figure 1 panels: "A Sequence of Bases in DNA ..." (left); "... Is Translated to a Sequence of Amino Acids in a Protein ..." (center); "... Which Folds Spontaneously to a Precise Three-Dimensional Structure" (right).]

Genetic Code "Translation Table" (Three Bases to One Amino Acid)

    UUU F      UCU S      UAU Y        UGU C
    UUC F      UCC S      UAC Y        UGC C
    UUA L      UCA S      UAA "EOF"    UGA "EOF"
    UUG L      UCG S      UAG "EOF"    UGG W
    CUU L      CCU P      CAU H        CGU R
    CUC L      CCC P      CAC H        CGC R
    CUA L      CCA P      CAA Q        CGA R
    CUG L      CCG P      CAG Q        CGG R
    AUU I      ACU T      AAU N        AGU S
    AUC I      ACC T      AAC N        AGC S
    AUA I      ACA T      AAA K        AGA R
    AUG M      ACG T      AAG K        AGG R
    GUU V      GCU A      GAU D        GGU G
    GUC V      GCC A      GAC D        GGC G
    GUA V      GCA A      GAA E        GGA G
    GUG V      GCG A      GAG E        GGG G

Figure 1. Information flow during "readout" from a gene. Genes, the basic blueprints of organisms, are contained in the structure of DNA (left). At the left of the figure is the double helix of DNA, containing two intertwining strands - one drawn in narrow lines, the other in bold. The "staircase effect" is created by a set of chemical subunits called the "bases". At each position on either strand there are four possible bases: A, T, G, and C. The sharp-eyed reader can see that the bases - the "risers" of the helical staircase - come in different forms. Each base interacts with another base at the same level, on the other strand, and these interactions demand strict complementarity: presence of an A on one strand requires a T opposite it on the other, a G on one strand requires a C on the other, and vice versa: a T on one strand is complementary to an A and a C to a G. In this way, each strand contains enough information to direct the synthesis of its partner. Logically, the way to replicate DNA is to take the strands apart and synthesize the complement of each of the separated strands. The sequences of bases in genes encipher the amino acid sequences of proteins, by a direct translation table known as "the genetic code" (center). The genetic message is written in the four-character A, T, G, C alphabet. Proteins are also polymers containing a sequence of chemical residues such that each position contains one of twenty amino acids, with mnemonics A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y. To specify a total of 20 amino acids, each requires more than two bases; in fact the DNA sequence is read three bases at a time, with a redundancy that is quite important for evolution. (Three triplets of bases are reserved for "End-of-file" terminator signals.) Proteins fold spontaneously to native, active three-dimensional structures (right). This is the point at which Nature makes the leap from the one-dimensional sequences of genes to the three-dimensional world we inhabit. The example shown is a toxin from a sea snake, one of the many protein structures determined by X-ray crystallography. Each gene has a sequence of bases that dictates first the amino acid sequence of a protein, thereby its three-dimensional structure, and thereby its function.
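The readout described in the caption of Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `CODON_TABLE`, `complement`, and `translate` are introduced here and do not come from the article, the table is the standard genetic code written with U in place of T, and "*" stands for the three "EOF" terminator triplets.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order UUU, UUC, UUA, UUG, UCU, ...
# "*" marks the three terminator ("EOF") triplets. Names here are illustrative.
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def complement(strand):
    """Position-by-position complementary strand: A pairs with T, G with C."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in strand.upper())


def translate(dna):
    """Read a DNA string three bases at a time and return the amino acid string.

    Translation stops at the first terminator triplet; trailing bases that do
    not fill a complete triplet are ignored.
    """
    rna = dna.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For example, `translate("ATGTTTTGGTAA")` reads the triplets AUG, UUU, UGG and stops at UAA. Because 61 triplets map onto 20 amino acids, several different DNA strings yield the same protein string, which is the redundancy the caption mentions.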
Gene and protein sequences take the form of character strings. For gene sequences the characters are chosen from a set of four, {A, T, G, C}, symbolizing the nucleotides Adenine, Thymine, Guanine, and Cytosine. For protein sequences the characters are chosen from a set of twenty, symbolizing twenty canonical amino acids.

Alignment

The alignment of two character strings is the determination of a meaningful correspondence between their elements.
Given two character strings:

    g c t g a a c

and

    c t a t a a t c

then two possible alignments are:

    g c t g a - a - - c
    - c t - a t a a t c

and

    g c t g - a a - c
    - c t a t a a t c
How can we decide which of these two, or of many other possibilities, is the best alignment? Can we devise a metric for character strings and define distances between them? Measures of dissimilarity between character strings include:

(1) The Hamming distance, defined between two strings of equal length, is the number of positions with mismatching characters.

(2) The Levenshtein distance, between two strings of not necessarily equal length, is the minimal number of "edit operations" required to change one string into the other, where an edit operation is a deletion, insertion, or alteration of a single character in either sequence. A given sequence of edit operations induces a unique alignment, but not vice versa.

In molecular biology, we know that insertions and deletions have occurred in gene and protein sequences. Therefore the Hamming distance is not general enough. Moreover, there is evidence that some changes are more likely to have occurred than others. Even the Levenshtein distance must therefore be generalized, to include differential weighting of different edit operations, based on our underlying evolutionary model. For instance, mutations are likely to be conservative: the replacement of one amino acid in a protein by another with similar size or physicochemical properties is more likely than its replacement by another amino acid with more dissimilar properties. To reflect this, instead of a discrete counting of edit operations we assign a "cost" ($\in \mathbb{R}$) to each change in the sequence. Also, there is evidence that the cost of a gap is not proportional to its length as in the Levenshtein model; however, the proper choice of gap weights as a function of length is a matter of considerable delicacy. Many schemes apply a linear function with one parameter $\alpha$ for gap initiation and another, smaller, parameter $\beta$ for gap extension, to give a gap cost of the form $\alpha + \beta \times (\text{gaplength} - 1)$.

Algorithms exist to determine best alignments by minimizing the sum of the costs of the edit operations that transform one string into the other.

A formal statement of the problem of optimal sequence alignment is as follows: We are given two character strings: $A = a_1 a_2 \cdots a_n$ and $B = b_1 b_2 \cdots b_m$, with each $a_i$ and $b_j$ a member of an alphabet set $\mathcal{A}$. Let $\mathcal{A}^+ = \mathcal{A} \cup \{\phi\}$.

A sequence of edit operations is a set of ordered pairs $(x, y)$, with $x, y \in \mathcal{A}^+$. Individual edit operations include:

• Substitution of $b_j$ for $a_i$, represented $(a_i, b_j)$.
• Deletion of $a_i$ from sequence $A$, represented $(a_i, \phi)$.
• Deletion of $b_j$ from sequence $B$, represented $(\phi, b_j)$.

A cost function $d$ is defined on edit operations:

$d(a_i, b_j)$ = cost of a mutation
$d(a_i, \phi)$ or $d(\phi, b_j)$ = cost of a deletion or insertion.

The minimum weighted distance between sequences $A$ and $B$ is

$$D(A, B) = \min_{A \to B} \sum d(x, y),$$

where $x, y \in \mathcal{A}^+$ and the minimum is taken over all sequences of edit operations that convert $A$ to $B$. If $d(x, y)$ is a metric on $\mathcal{A}^+$, $D(A, B)$ is a metric on strings of characters from $\mathcal{A}^+$. (This statement of the problem assumes length-independent gap costs; more realistic gap-weighting schemes are generalizations.)
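The dissimilarity measures and gap weights described above can be made concrete with a short Python sketch. The function names and default costs below (`hamming`, `affine_gap_cost`, `alignment_cost`, unit mismatch cost) are assumptions chosen for illustration, not values taken from the article; `alignment_cost` scores an alignment that is already given, charging each run of gap characters $\alpha + \beta \times (\text{gaplength} - 1)$.

```python
def hamming(a, b):
    """Number of mismatching positions between two strings of equal length."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def affine_gap_cost(length, alpha, beta):
    """Cost of one gap: alpha to open it, beta for each further position."""
    return alpha + beta * (length - 1)


def alignment_cost(row_a, row_b, mismatch=1.0, alpha=1.0, beta=1.0, gap="-"):
    """Total cost of a given gapped alignment (two rows of equal length).

    Matches cost nothing, mismatches cost `mismatch`, and every run of gap
    characters in one row is charged as in `affine_gap_cost`.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    total = 0.0
    open_side = None  # row ("a" or "b") that currently has an open gap run
    for x, y in zip(row_a, row_b):
        if x == gap and y == gap:
            raise ValueError("a column cannot consist of two gaps")
        if x == gap or y == gap:
            side = "a" if x == gap else "b"
            total += beta if open_side == side else alpha
            open_side = side
        else:
            open_side = None
            if x != y:
                total += mismatch
    return total
```

With unit costs (`alpha = beta = mismatch = 1`) this reduces to counting edit operations: the two alignments of `gctgaac` and `ctataatc` shown earlier cost 5 and 4 respectively. Making gap opening more expensive than gap extension changes the totals but, for this pair, not their order.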
The problem is to find $D(A, B)$ and one or more of the alignments that correspond to it. An algorithm that solves this problem in $O(mn)$ time has been known for a long time, and has been applied to many problems including text editing, speech recognition, and analysis of birdsongs [2]. It entered molecular biology in a seminal paper by Needleman and Wunsch [3]. Several features of this algorithm are noteworthy.

• It produces a global optimum: recall that this is the first of computational molecular biology's two "harmonic oscillators." We have a method guaranteed not to get trapped in local minima.

• That was the good news. The bad news is that interpretation of the results is not so straightforward. Although a sequence of edit operations derived from an optimal alignment may correspond to an actual evolutionary pathway, it is impossible to prove that it does. The larger the edit distance, the larger the number of reasonable evolutionary pathways. Not only may the optimal alignment be nonunique, but there may be many suboptimal alignments that score quite close to the optimal. For example, Fitch and Smith analysed the chicken genes for $\alpha$ and $\beta$ haemoglobin [4]. They found 17 optimal alignments, one of which agreed with the alignment based on the known haemoglobin structures, and over a thousand alignments scoring within 5% of optimum.
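A minimal dynamic-programming sketch of such an $O(mn)$ procedure is given below, assuming length-independent gap costs as in the formal statement. It follows the general textbook scheme rather than reproducing the exact formulation of [3]; the function name `align`, its parameters, and the default unit costs are assumptions made for illustration.

```python
def align(a, b, sub_cost=None, gap_cost=1, gap="-"):
    """Return (D(A, B), row_a, row_b) for one optimal global alignment.

    D[i][j] holds the minimum cost of converting a[:i] into b[:j]; the table
    has (n + 1) * (m + 1) entries, each filled in constant time.
    """
    if sub_cost is None:
        # Illustrative default: unit cost for a mismatch, zero for a match.
        sub_cost = lambda x, y: 0 if x == y else 1
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + gap_cost
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(
                D[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution
                D[i - 1][j] + gap_cost,                          # a[i-1] unpaired
                D[i][j - 1] + gap_cost,                          # b[j-1] unpaired
            )
    # Trace back from the corner to recover one of the optimal alignments.
    row_a, row_b = [], []
    i, j = n, m
    while i or j:
        if i and j and D[i][j] == D[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]):
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i and D[i][j] == D[i - 1][j] + gap_cost:
            row_a.append(a[i - 1])
            row_b.append(gap)
            i -= 1
        else:
            row_a.append(gap)
            row_b.append(b[j - 1])
            j -= 1
    return D[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))
```

Calling `align("gctgaac", "ctataatc")` gives a distance of 4 under unit costs. The traceback returns only one optimal alignment; which one depends on the order in which ties are broken, which illustrates the non-uniqueness discussed in the second point above.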
Problems with Pairwise Sequence Alignment
It is observed that as proteins evolve, the amino acid sequence diverges more quickly than the structure. In many cases we can detect an evolutionary relationship between two protein structures even though there is no detectable similarity between the gene sequences or the amino acid sequences. What is happening is this: The genes are exploring the space of DNA sequences, but natural selection is acting as a brake on changes in structure, in order to conserve function. The redundancy in the genetic code - the facts that several different triplets of bases encipher the same amino acid, and that many single-base changes interconvert amino acids with similar physico-chemical properties - moderates the structural consequences of sequence changes.

Even if similarity is detectable at the sequence level, for distantly-related proteins the optimal pairwise sequence alignment often gives the wrong answer, relative to comparison of the structures, which is the court of last resort.

However, if many related sequences are available, multiple sequence alignment gives more significant and accurate results than pairwise sequence alignment. Why do multiple alignments enhance sequence information? It is the appearance of patterns of conservation. The extent and nature of the variation at individual positions is an important guide to the structural or functional role of different regions of the sequence (Figure 2). For instance, residues conserved over an entire family of proteins are usually involved in function, or at least are usually essential for the structure. Conversely, regions in which insertions and deletions are very common usually correspond to peripheral regions.
    TYLWEFLLKLLQDR.EYCPRFIKWTNREKGVFKLV..DSKAVSRLWGMHKN.KPD
    VQLWQFLLEILTD..CEHTDVIEWVG.TEGEFKLT..DPDRVARLWGEKKN.KPA
    IQLWQFLLELLTD..KDARDCISWVG.DEGEFKLN..QPELVAQKWGQRKN.KPT
    IQLWQFLLELLSD..SSNSSCITWEG.TNGEFKMT..DPDEVARRWGERKS.KPN
    IQLWQFLLELLTD..KSCQSFISWTG.DGWEFKLS..DPDEVARRWGKRKN.KPK
    IQLWQFLLELLQD..GARSSCIRWTG.NSREFQLC..DPKEVARLWGERKR.KPG
    IQLWHFILELLQK..EEFRHVIAWQQGEYGEFVIK..DPDEVARLWGRRKC.KPQ
    VTLWQFLLQLLRE..QGNGHIISWTSRDGGEFKLV..DAEEVARLWGLRKN.KTN
    ITLWQFLLHLLLD..QKHEHLICWTS.NDGEFKLL..KAEEVAKLWGLRKN.KTN
    LQLWQFLVALLDD..PTNAHFIAWTG.RGMEFKLI..EPEEVARLWGIQKN.RPA
    IHLWQFLKELLASP.QVNGTAIRWIDRSKGIFKIE..DSVRVAKLWGRRKN.RPA
    RLLWDFLQQLLNDRNQKYSDLIAWKCRDTGVFKIV..DPAGLAKLWGIQKN.HLS
    RLLWDYVYQLLSD..SRYENFIRWEDKESKIFRIV..DPNGLARLWGNHKN.RTN
    IRLYQFLLDLLRS..GDMKDSIWWVDKDKGTFQFSSKHKEALAHRWGIQKGNRKK
    LRLYQFLLGLLTR..GDMRECVWWVEPGAGVFQFSSKHKELLARRWGQQKGNRKR
Figure 2. Multiple alignment of partial sequences from a family of proteins called ETS domains. Each line corresponds to the amino acid sequence from one protein, specified as a sequence of letters each specifying one amino acid. Looking down any column shows the amino acids that appear at that position in each of the proteins in the family. In this way patterns of preference are made visible. For instance, the third position is a leucine, L, in every sequence; this implies that some structural or functional constraint has dissuaded evolution from changing this position. Letters underneath the table indicate positions of invariant residues (upper case) or invariant with one exception (lower case). Note the uneven distribution of variability in the different columns. The periodicity of conserved residues (3, 4 or 8) suggests that the proteins contain helices, which is true. Other patterns are hidden more deeply, and require computational analysis to identify them. Such patterns might include correlations between the distributions of amino acids at different positions: For instance, in the fourth column from the left the amino acid tyrosine, Y, appears only in the last two sequences; the others contain tryptophan, W. There is approximate correlation in the pattern of variability between this column and the fourth and fifth columns from the right. It is widely believed (or at least hoped) that correlations of patterns of variability at different positions of such a table of sequences should give some clues about sites that interact in three dimensions. Unfortunately the signal is quite weak.
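One common way to put a number on correlated variability between two alignment columns is their mutual information. The sketch below is an illustration under assumptions: the article does not name a specific statistic, and the function name `mutual_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j (0-based)
    of an alignment given as equal-length strings. Gap characters,
    if present, are treated as ordinary symbols."""
    pairs = [(row[i], row[j]) for row in alignment]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # Sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) p(b)) ).
    return sum(
        (c / n) * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )

# Hypothetical toy alignment: column 0 (W/Y) varies together with
# column 2 (K/R), while column 1 is invariant.
toy = ["WLK", "WLK", "WLK", "YLR", "YLR"]
print(mutual_information(toy, 0, 2))  # positive: the columns co-vary
print(mutual_information(toy, 0, 1))  # zero: an invariant column carries no signal
```

With only a handful of sequences such estimates are strongly biased, which is one reason the signal described in the caption is weak in practice.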
(One sequence plays coy about its structure. A pair of aligned sequences whisper about their structure. Three or more sequences shout about their structure out loud.)
If sequences give only an indirect glimpse of structures, why not deal with structures alone? The reason is that the amount of sequence data known far exceeds the amount of structural data. For about 20 organisms the entire genome is sequenced, giving us the complete sequences of all the genes. Only for a small minority of these genes do we have the three-dimensional structures of the corresponding proteins.
Analysis of Protein Structures
The first problem in analyzing the structures of molecules as complex as proteins is one of presentation. Computer graphic techniques have been developed to draw simplified representations of proteins. Figure 3 illustrates, for a small protein molecule, the difficulty in interpreting a fully detailed, literal representation, and the kind of simplified pictures that programs produce to give us visual access to the material. An active cottage industry has produced many different representations; that is, many people have proposed different simplified representations, and these become subsumed into general graphics packages. A skilled molecular illustrator will combine them to show different aspects of a structure in finely-tuned degrees of detail. Such pictures, rendered in full color and with fancy (but unrealistic, given the size of the molecules relative to the wavelength of visible light) shadowing effects, adorn journals, posters, and even T-shirts and mugs.
Figure 3. Proteins are sufficiently complex structures that it has been necessary to develop specialized tools to present them. This figure shows a relatively small protein, acylphosphatase, at three different degrees of simplification. Top: complete skeletal model; mainchain bolder than sidechains. Center: the course of the chain is represented by a smooth interpolated curve, the chevrons indicating the direction of the chain. Bottom: schematic diagram, in which cylinders represent helices and arrows represent strands of sheet. The solid objects in the picture are represented as "translucent" by altering lines that pass behind them to broken lines. To superpose adjacent representations, rotate the page by 90° and view in stereo (not for too long!).
We now know the structures of 10000 proteins, and see in them a vast variety of spatial patterns. To Rutherford's comment, "All science is either physics or stamp collecting," I reply that the study of protein structure combines the best of both fields! We have the spectacular variety, but also have faith in the existence of underlying unifying principles.
Every protein consists of a linear (that is, unbranched) repetitive polymer mainchain with different amino acid sidechains hung on it at regular intervals. A protein is analogous to a string of Christmas tree lights, with the wire corresponding to the repetitive mainchain, and the sequence of colors of the lights to the individuality of the sequence of sidechains.
The mainchain describes a space curve which is stabilized by favourable interactions between the sidechains that are brought into contact. Such a space curve is most easily seen in the center frame of Figure 3. Two regions at the front of the picture have the form of helices (they look like a classic barber pole) with their axes almost vertical. This is one of the two standard structures that local regions of the chain adopt. The other standard structure is the almost extended strand of sheet: the protein in Figure 3 contains four strands of sheet, approximately vertical in orientation. These strands interact laterally to stabilize their assembly. In the bottom frame of Figure 3, the helices and strands are represented as "icons": helices as cylinders and strands of sheet as large arrows. The top frame of Figure 3 shows the most detailed representation of the structure, including mainchain and sidechains; the contrast demonstrates the importance of simplification in producing a visually intelligible picture of even a small protein.
The initial stage of "parsing" a new structure involves identifying the regions of helix and sheet. This is the information required to convert the representation in Figure 3, center frame, to that of the bottom frame. The most common type of helix in protein structures contains 3.6 residues per turn. Features in the sequence that show this periodicity suggest helical regions.
Superposition of Structures
As in the case of sequences, a fundamental question in analyzing structures is to devise and compute a measure of similarity. Suppose that we have coordinate sets representing two structures:
$P_i = (x_i, y_i, z_i)$, $i = 1, \ldots, N$ and $Q_j = (x_j, y_j, z_j)$, $j = 1, \ldots, M$.
Just as in the case of sequences, the question of alignment arises. Consider the contrast between three related problems that arise in computational chemistry:
(1) Measure the similarity of two sets of atoms with known correspondences: $P_i \leftrightarrow Q_i$, $i = 1, \ldots, N$. (The analog for sequences is the Hamming distance.) This problem can be solved exactly and efficiently; it is the second "harmonic oscillator" of computational molecular biology.
(2) Measure the similarity of two sets of atoms with unknown correspondences, but for which the molecular structure (specifically the linear order of the residues) restricts the correspondence. In the case of proteins the alignment must retain the order along the chain: $P_{i(k)} \leftrightarrow Q_{j(k)}$, $k = 1, \ldots, K \le N, M$, with the constraint that $k_1 > k_2 \Rightarrow i(k_1) > i(k_2)$ and $j(k_1) > j(k_2)$. This can be thought of as corresponding to the Levenshtein distance between character strings, or to sequence alignment with gaps.
(3) Measure the similarity between two sets of atoms with unknown correspondence, with no restrictions on the correspondence: $P_{i(k)} \leftrightarrow Q_{j(k)}$, $k = 1, \ldots, K \le N, M$.
This problem arises in the following important case: Suppose two (or more) molecules have similar biological effects, such as a common pharmacological activity. It is often the case that the structures share a common constellation of a relatively small subset of their atoms that is responsible for the biological activity, called a pharmacophore. To identify the pharmacophore it is useful to be able to find the maximal subsets of atoms from two or more molecules that have similar structure.
Problems (2) and (3) require determination of the alignment of the points. Alignment methods based exclusively on the coordinates (that is, not on the amino acid sequences) are called structural alignments. In structural alignments, corresponding residues are identified because they occupy the same position relative to the structure as a whole. One must think of extracting the maximal common substructure and basing the alignment on this. (For instance, the maximal common substructure of the letters B and R is the letter P.) Residues outside the maximal common substructure are unalignable, a fact that cannot be detected by pairwise sequence alignment; this is one of its weaknesses.
The most general approach to these three problems is based on the solution of problem (1), the case of known correspondence $P_i \leftrightarrow Q_i$. Two identical objects can be superposed by a rigid-body translation and rotation of one of them onto the other. Two objects that are similar can be brought into approximate superposition by rotation and translation. If the objects are ordered sets of points, a measure of their similarity is the root-mean-square deviation $\Delta$ after optimal superposition:
$\Delta^2 = \min_{R,t} \frac{1}{N} \sum_{i=1}^{N} \| R P_i + t - Q_i \|^2$,
where $R$ is a proper rotation matrix ($\det R = 1$) and $t$ is a translation vector. In the optimal superposition, the mean
positions (colloquially, the "centers of gravity") of the two sets coincide. The problem of determining the correct relative orientation is known as the "Orthogonal Procrustes problem," and solutions based on standard techniques of linear algebra are available [5].
Solution of the maximal common substructure problem provides the basis of a metric for structures. It allows detection of partial and tenuous similarities, and induces a classification tree of the entire corpus of protein structures.
Approaches to maximal common substructure calculations have been based on two representations of structures: (1) in terms of lists of coordinates $P_i = (x_i, y_i, z_i)$, $i = 1, \ldots, n$, or alternatively (2) in terms of distance matrices $D_P(i,j) = \| P_i - P_j \|$. The main advantage of distance matrices is that they provide an origin- and orientation-independent representation of the structure. In terms of distance matrices, the maximal element of the difference distance matrix, $\max_{i,j} | D_P(i,j) - D_Q(i,j) |$, provides one measure of the structural difference between two aligned point sets.
Coordinates and distance matrices are nearly equivalent representations of a point set. Calculation of the distance matrix from the coordinates is trivial. It is less obvious that the coordinates can be recovered exactly and directly from the distance matrix, but this can be done by a matrix diagonalization [6]. To be sure, the distance matrix specifies equally both the original structure and its enantiomorph (thus corresponding right and left gloves are two enantiomorphs), but this ambiguity is not a serious problem for applications to molecular biology. Position and orientation information are of course also lost.
The major difficulty in maximal common substructure calculations of types 2 and 3 is the combinatorial complexity of considering the many possible alignments. Algorithms based on distance matrices have proved more effective than those based on coordinate sets in dealing with this. Related matrix representations based on structural elements such as helices or sheets rather than on atomic coordinates provide compact representations of protein folding patterns. Extraction of maximal common submatrices reveals the largest substructures with a common folding pattern. Such representations also permit enumeration of all possible protein folding patterns. It is estimated empirically that all natural proteins have no more than about 1000 different folding patterns. Complete enumeration allows us to examine nature's choices, to try to distinguish between historical accident and architectural necessity.
Protein Evolution
Protein evolution is the study of how corresponding amino acid sequences and protein structures differ in related species. It is an informative type of investigation, which helps us in understanding sequence-structure relationships. For although we know that a single amino acid sequence contains all the information necessary to specify the structure of the protein, we do not yet understand how to reason from the sequence to the structure. Think of this as the "integral form" of the protein folding problem. It is an unsolved problem. In studying protein evolution we observe how changes in sequence are reflected in changes in structure; these should be easier to understand (think of this as the "differential form" of the protein folding problem):
Topic:              Protein folding         Protein evolution
Observation:        sequence → structure    change in sequence → change in structure
Form of problem:    "integral form"         "differential form"
Status of problem:  unsolved                unsolved but should be easier
A simple argument suggests that structure should be a nearly "continuous" function of sequence, at least for naturally evolved sequences and structures. Suppose that there were a protein such that any mutation (any change in the amino acid sequence) produced an unstable structure. Then, nature could not ever have achieved this structure by evolutionary processes, because it could not have had any stable precursor. It follows that natural structures must be robust. Most small changes in sequence should leave the structure intact. (This need not apply to artificially engineered protein structures.)
Indeed, natural proteins with very similar sequences have very similar structures. Before synthetic human insulin became available, pig insulin was an effective clinical treatment of diabetes in human patients, even though the amino acid sequences of pig and human insulin are not identical. Confidence in such similarities provides a method for predicting the structures of proteins from known structures of close relatives, a procedure known as "homology modelling." However, as evolution proceeds, sequences and structures eventually diverge more radically. Figure 4 shows two distantly related proteins, plastocyanin and azurin, in which the region at the right, containing two sheets packed face-to-face, forms a conserved "core" of the structure, whereas the long loopy region at the left has an entirely different conformation in the two structures.
Protein Structure Prediction
Nature has an algorithm which specifies the three-dimensional structure of a protein from its amino acid sequence alone. We ought to be able to discover it. We should then be able to predict the structures of the proteins inherent in the gene sequences in the human and other genomes, and apply them to practical problems such as drug design.
Protein structure prediction has proved to be a difficult problem. Many approaches have been taken, and many claims advanced. However, at present there is no computational method that can consistently produce even a qualitatively correct prediction of protein structure from amino acid sequence, unless a close relative is available.
Suppose you were given the amino acid sequence of a new protein, and asked to predict its structure. What might you
try to predict? The most complete information that a prediction might provide is a full set of three-dimensional coor-
roughly, in decreasing order of difficulty. Most scientists accept that "granting agencies" is the appropriate level to aim at!
Whom must you satisfy?
1. Crystallographers
2. NMR spectroscopists
3. Granting agencies
4. Referees of papers
5. Colleagues
6. Your mother
Figure 4. During evolution, gene sequences accumulate mutations, and protein sequences and structures diverge as a result. This figure shows two related electron-transport proteins, poplar leaf plastocyanin and bacterial azurin. The portion of the structures in the right half of the picture, containing the solid and dashed arrays of "ribbon-like" regions called sheets and the copper binding site, is well conserved during evolution, whereas the portion of the structure in the left half of the picture has diverged more radically.
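Statements such as "well conserved" and "diverged more radically" are what the measures from the section on superposition of structures are meant to quantify. As a hedged sketch, assuming known correspondences (problem (1)) and assuming NumPy is available, the code below computes the r.m.s. deviation after optimal superposition by the SVD-based solution of the orthogonal Procrustes problem (often called the Kabsch procedure, a name the article does not use), together with the maximal element of the difference distance matrix. The function names and the test coordinates are hypothetical.

```python
import numpy as np

def superpose_rmsd(P, Q):
    """R.m.s. deviation between point sets P and Q (N x 3 arrays, row i of
    P corresponding to row i of Q) after optimal rigid-body superposition
    of P onto Q. Returns (rmsd, R, t) with R a proper rotation."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    X, Y = P - p_mean, Q - q_mean              # move both centroids to the origin
    U, _, Vt = np.linalg.svd(X.T @ Y)          # SVD of the 3 x 3 covariance matrix
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # -1 would mean a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T    # best proper rotation, det R = +1
    t = q_mean - R @ p_mean
    diff = (R @ P.T).T + t - Q
    return np.sqrt((diff ** 2).sum() / len(P)), R, t

def distance_matrix(P):
    """Matrix of all interpoint distances D(i, j) = |P_i - P_j|."""
    P = np.asarray(P, dtype=float)
    return np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)

def max_distance_difference(P, Q):
    """Maximal element of the difference distance matrix of P and Q."""
    return np.abs(distance_matrix(P) - distance_matrix(Q)).max()

# Hypothetical test coordinates: five points with no mirror symmetry.
P = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 1]], dtype=float)
Rz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)  # 90 degrees about z
Q_moved = P @ Rz.T + np.array([1.0, 2.0, 3.0])   # rigidly moved copy of P
Q_mirror = P * np.array([-1.0, 1.0, 1.0])        # enantiomorph of P

print(superpose_rmsd(P, Q_moved)[0])          # close to zero
print(superpose_rmsd(P, Q_mirror)[0])         # clearly nonzero
print(max_distance_difference(P, Q_mirror))   # zero: distances cannot see handedness
```

The mirror-image case shows the trade-off between the two representations: the distance matrix is independent of origin and orientation but cannot tell a structure from its enantiomorph, whereas the superposition restricted to proper rotations can.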
How in fact could you convince people that you had a successful method for protein structure prediction? Two types of claims are in principle untestable. One is that you can predict the structure of a protein, the structure of which is already known. The other is that you have predicted the structure of a protein, the experimental structure of which is unknown and is likely to remain so for a long time. One must work in the intermediate domain between the known and the not-to-be-known-for-a-long-time, coor
There is evidence that natural selection has shaped not only the final native states of proteins, but their folding pathways as well. For not only must proteins have evolved to form a stable active conformation, they must achieve that conformation, starting from an unfolded state containing a mixture of random conformations, in a reasonable time. A simple calculation, based on rates of atomic motions in solution, shows that exhaustive exploration of the space of possible conformations would not be fast enough by many orders of magnitude. (This is sometimes called Levinthal's paradox.) There is no evidence that the folding pathway actually affects the final state, although theoretically it could. If alternative folded states were possible, but the pathway evolved to promote one, then we predictors would be forced to adopt the Nature rather than the Nudger approach.
Where is the obstacle to structure prediction? We think that we understand the forces that stabilize native protein conformations. It is even possible to write down an explicit conformational energy function of the coordinates. All we need to do is to minimize it. However, it is important to recognize that proteins are, in thermodynamic terms, only marginally stable. In fact the conformational energy of a folded protein is a very small difference between very large opposing terms, a numerical analyst's nightmare.
Is the difficulty that we can't write down the energy function accurately enough, or is the function so complicated that we can't optimize it? One test is to minimize the conformational energies of proteins, starting from their known native states. Such calculations do converge to minimum-energy conformations close to the starting point, showing that the energy functions are adequate in the vicinity of the right answer. (Not too surprising, because the functions are defined by fitting parameters to reproduce observed native states.) But this is not enough. A function correct in the vicinity of the minimum will not necessarily provide a complete set of trajectories in conformation space that can enable a program to find the global minimum from an arbitrary starting point.
There are two problems. First, many of the forces that stabilize proteins are short-range. Even if we knew the energy function exactly, if we started a minimization from a random extended, non-compact, conformation, we would find that there are no long-range forces driving the system towards the correct structure. Second, even if we achieved collapse to a compact state, the landscape of the energy as a function of coordinates contains many local minima, separated by high barriers. Many of these local minima will be candidates for the native state. Real proteins overcome these problems by a combination of (1) extensive "parallel processing" in which all the residues simultaneously explore their own local dimensions of conformation space, and (2) evolving folding pathways that channel the system towards the right answer. Our computers cannot achieve the parallel processing, our energy functions cannot account for the long-range folding pathways, and our algorithms cannot easily find the global
minimum of a complicated multivariate nonlinear function. (Where is the harmonic oscillator when we need it?!)
The difficulty of the a priori case has led, as we have noted, to development of empirical methods, based on the known sequences and structures. Prediction methods that use databanks include (1) methods for homology modelling (prediction of the target structure from a closely related protein of known structure); and (2) methods for fold recognition (assessing the compatibility of the amino acid sequence with the library of known protein folding patterns). These methods are growing more powerful, partly but not entirely because of the growth in the databanks. The more sequences and structures that are known, the more likely that a new protein will be similar to one that is already known. In contrast, ab initio methods are improving more sluggishly. A grudging comment about them after a recent CASP competition was that at least ". . . failure can no longer be guaranteed" [8]. A pessimist might predict that the growth of databanks will mean that information-based methods will provide pragmatic solutions to such a large majority of questions, that interest in and support for the development of ab initio methods will wane. It would be a shame if one of the most interesting of biological computations were thereby lost to computational biology!
I thank the organizers of the Newton Institute Programme, P. Donnelly, W. Fitch, and N. Goldman, for the opportunity to participate in the project; Prof. H. K. Moffatt, N. Goldman, and L. Lo Conte for comments on the manuscript; and the Wellcome Trust for support.
AUTHOR
ARTHUR M. LESK
Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2QH, U.K.
e-mail: [email protected]
Arthur Lesk received his Ph.D. in Physics and Physical Chemistry from Princeton University in 1966. He became a Professor of Chemistry at Fairleigh Dickinson University in New Jersey. Since 1977 he has worked at the MRC Laboratory of Molecular Biology in Cambridge, England, where he is now also on the faculty of the University's School of Clinical Medicine. He was a founder member of the Biocomputing Programme at the European Molecular Biology Laboratory in Heidelberg. His education and career choices have been governed by the beliefs that to understand biology you must understand chemistry; to understand chemistry you must understand physics; to understand physics you must understand mathematics; so you might as well start by learning mathematics and then work your way along the list.
REFERENCES
[1] Wigner, E.P. (1960). The unreasonable effectiveness of mathematics in the natural sciences. Communications in Pure and Applied Mathematics 13, 1-14.
[2] Sankoff, D. and Kruskal, J.B., eds. (1983). Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, Reading, Mass.
[3] Needleman, S.B. and Wunsch, C.D. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443-453.
[4] Fitch, W.M. and Smith, T.F. (1983). Optimal sequence alignments. Proc. Natl. Acad. Sci. USA 80, 1382-1386.
[5] Golub, G. and van Loan, C., Matrix Computations. Johns Hopkins Press, Baltimore, 2nd ed. 1989.
[6] Young, G. and Householder, A.S. (1938). Discussion of a set of points in terms of their mutual distances. Psychometrika 3, 19-22. (For background and history see [7].)
[7] Blumenthal, L.M. (1938). Distance Geometries. A study of the development of abstract metrics. University of Missouri Studies 13, #2.
[8] New York Times, March 25, 1997.
Ambigrams Burkard Polster Department of Pure Mathematics University of Adelaide Australia
Fra' Giovanni's Intarsias in Verona
Hartwig Thomas and Kim Williams
THE MATHEMATICAL INTELLIGENCER © 2000 SPRINGER-VERLAG NEW YORK
Does your hometown have any mathematical tourist attractions such as statues, plaques, graves, the cafe where the famous conjecture was made, the desk where the famous initials are scratched, birthplaces, houses, or memorials? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks.
Please send all submissions to Mathematical Tourist Editor, Dirk Huylebrouck, Aartshertogstraat 42, 8400 Oostende, Belgium e-mail: [email protected]
Figure 1. The Santa Maria in Organo, Verona.
Travelers are drawn to Verona usually by its arena, an opera performance, the antiquities, artistic treasures, or the house where Juliet Capulet was born. Beyond the curve in the river, however, on a stroll to the somewhat out-of-the-way Giusti Gardens, one comes across the Santa Maria in Organo (figure 1), whose gorgeous tower invites you to take a glance inside. Yet only those who ask or join in on a tour of the church discover that they are only steps
away from marvelous mosaics made with pieces of inlaid wood. In the dark apse of the church there is a wonderful choirstall with intarsias by Fra' Giovanni da Verona (1457-1525). Vasari (1511-1574) wrote that the adjacent sacristy of this former Olivetan monastery was the most beautiful in all of Italy. It contains a collection of intarsias as well as, in a fresco above the door, a portrait of the artist Fra' Giovanni. This extraordinary artist brought the art of inlaid woodwork to a high point in Renaissance Italy, and was active as a sculptor and architect. Indeed, he designed the elegant church tower of Santa Maria in Organo (figure 2).
Figure 2. The elegant church tower was Fra' Giovanni's work.
In this sacristy, one unexpectedly comes across two intarsias devoted to mathematics from the time between 1519 and 1525. These naturally draw the attention of the mathematical tourist. If one were to see their warm tones of brown and the shimmering grainy nature of their surface structure first in a reproduction, one might be deceived by the illusion and think that it was an illustration of an arranged three-dimensional set-up, for example here a small cupboard with half-open doors, books, and wooden models of the polyhedra. This intarsia is, however, a flat, varnished panel in reality. The illusion is created by the care with which the artist obeyed the laws of perspective, as well as the skill with which he even used the grain of the wood in order to achieve this effect. Under the varnish, the intarsia is made from hundreds of tiny pieces of different kinds of wood, which were glued onto a wooden substrate (figure 3).
Figure 3. Intarsias in the Santa Maria in Organo, Verona.
Fra' Giovanni's pictures with obvious mathematical references reflect the mathematical topics and techniques of the time. The systematic investigation of perspective had made considerable progress in the fifteenth century. Although manuscripts on the subject by Brunelleschi and Piero della Francesca circulated among Italian artists, no published works on perspective were at Fra' Giovanni's disposal when he learned his craft. Even Albrecht Dürer wrote his friend Willibald Pirckheimer from Venice, on his second trip to Italy in October 1506, "After ten days I'll be free here and then I will ride to Bologna for the sake of the art of secret perspective, which someone wants to teach me . . ."
Representations of a Campanus sphere, an icosahedron, and a truncated icosahedron with twenty hexagons and twelve pentagons are found in Fra' Giovanni's intarsia. All three polyhedrons can be traced back to illustrations from Luca Pacioli's book (De) Divina Proportione, published in Venice in 1509. It contains many geometric constructions and drawings which, according to Pacioli, are by Leonardo da Vinci. The first, the Campanus sphere, even refers as far as to Euclid XII, 17/18, where it was proved that the volumes of two spheres relate as the third powers of their radii.
From the fifteenth century, polyhedrons have been objects of continuing fascination, as can be concluded from the list of very recent references below. Their algebraic characteristics of symmetry have been the concern of mathematicians such as Felix Klein in his Vorlesungen über das Ikosaeder und die Auflösung der Gleichungen vom fünften Grade (Leipzig, 1884). Graphic artists and architects were interested in the more pragmatic surface characteristics of such bodies, such as the fact that one cannot construct a closed body only from hexagons. Architect Richard Buckminster Fuller, for instance, used the principle that for a closed sphere, which contains only pentagons and hexagons, one needs twelve pentagons. This is exemplified by the internationally standardized soccer ball, still the most popular polyhedron in Italy today.
Figure 4. An illustration out of Luciano Rognini's Some brief artistic notes about the marquetries of S. Maria in Organo in Verona, edited with the support of the Banca Popolare di Verona. It is a special five-page document, locally provided to tourists visiting the Verona church.
Mathematical tourist Prof. Benno Artmann, of the Georg-August-Universität Göttingen, wondered if the Verona intarsia contain a mistake, on the bottom of the polyhedron below. Maybe Fra' Giovanni did not dare to alter Pacioli's work and thus oppose the great da Vinci?
REFERENCES
[1] Peter Cromwell, Polyhedra, Cambridge University Press, 1997.
[2] J.V. Field, Rediscovering the Archimedean Polyhedra: Piero della Francesca, Luca Pacioli, Leonardo da Vinci, Albrecht Dürer, Daniele Barbaro, and Johannes Kepler, Archive for History of Exact Sciences, vol. 50, no. 3, 241-289, 1996.
[3] Helmuth Gericke, Mathematik im Abendland, Springer-Verlag, 1990.
[4] George W. Hart, website on polyhedra, http://www.li.net/~george/virtual-polyhedra/intarsia.html.
[5] Alan Tormey and Judith Farr Tormey, Renaissance Intarsia: The Art of Geometry, Scientific American, vol. 247, 136-143, July, 1982.
Hartwig Thomas [text], Sempacherstrasse 22, CH-8032 Zurich, Switzerland
Kim Williams [pictures], Via Mazzini 7, 50054 Fucecchio-Firenze, Italy
A Geometrically Decorated Renaissance Box
B. Artmann
Well after the high point of the art of intarsia in Italy from about 1470 to 1520, works of inlaid wood became popular in southern Germany. Towards 1570 a highly skilled artisan created the box that is presented in the "Museum für Kunsthandwerk" in Frankfurt am Main, Germany. Its size is approximately 50 × 50 cm, its height 18 cm. Various kinds of wood, mother-of-pearl, and ebony (for the mythological scenes on the sides) were used for the complicated pattern of some 20 regular and semiregular polyhedra. E. Brieskorn of the Mathematics Department at Bonn has given a detailed classification of these polyhedra. Dr. Hildegard Hoos incorporated it into a comprehensive description of the various art-historical aspects of the box in "Ein Renaissance-Pultkasten aus dem Museum für Kunsthandwerk Frankfurt am Main." This document is available at the Museum, Schaumainkai 17, Frankfurt am Main, D-60594, Germany.
The designs of the individual polyhedra are very similar to the pictures of about 120 polyhedra published in the voluminous book of Wenzel Jamnitzer. His "Perspectiva corporum regularium" appeared in Nürnberg in 1568, and perhaps Jamnitzer himself suggested the design of this nice piece of mathematical art. In 1973, his work was reprinted by the Academische Verlagsanstalt, edited in Graz. An even more recent proof of the revival of interest in the subject is Fleur Richter's work "Die Ästhetik geometrischer Körper in der Renaissance" published in Stuttgart, Hatje Publications, in 1995.
Mathematisches Institut, Bunsenstrasse 3-5, D-37073 Göttingen, Germany
Figure 1. The decorated Renaissance box, reproduced with kind permission of the museum.
© 2000 SPRINGER-VERLAG NEW YORK, VOLUME 22, NUMBER 2. 2000
Figure 2. Three details, showing some individual polyhedra.
ARCADII Z. GRINSHPAN, MOURAD E. H. ISMAIL,* AND DAVID L. MILLIGAN
Complete Monotonicity and Diesel Fuel Spray
"Of the many varieties of truth, mathematical truth does not stand the lowest." -Norbert Wiener
The applied mathematician is very fortunate because she/he has access to a vast treasury of pure mathematical results to use. In fact the problem often faced by the applied mathematician is to catch the right tool at the right time. On the other hand, applied mathematicians repay their debt by formulating problems which puzzle and dazzle pure mathematicians and frequently lead to intriguing and difficult mathematical problems. We selected the diesel fuel spray problem as a sample of this two-way relationship for two important reasons. The first is that it has the right mix of pure and applied ideas. The second reason is that this is something we can write about. We shall formulate a purely mathematical problem that arose from our analysis of a diesel fuel spray model.
Diesel Fuel Spray: an Idealized Model
Fuel spray (Figure 1) is an important feature of Diesel Engine/Fuel Injection Equipment systems. Ideally we would like to describe a given system analytically. This might allow us to solve the diesel optimization problem, the dream of every diesel engineer. Unfortunately, the physical and chemical aspects of the question are very complex and are not quite understood. Some researchers use empirical equations to overcome these difficulties. The
*Research partially supported by NSF grant DMS-9970865
Figure 1. An enhanced version after an original photograph of diesel fuel spray made by S.A. Romanov and others in CNITA (St. Petersburg, Russia).
published mathematical models of both the entire system and diesel fuel spray are complicated and involve certain restrictions (e.g., [9], [7]). However, surprise, surprise, sometimes it is mathematically more enlightening to consider a simplified engineering problem where one is not overwhelmed by the technical complexity.
Prior to discussing an idealized model of diesel fuel spray [10], let us point out that it and its later generalizations (see [9] and its references) were based on the results from specific experiments [26]. These experiments have established that in a stationary gas environment the developing fuel spray has two main components: the spray "tail" and the spray front zone (Figure 2). The spray tail is characterized by the high volume concentration of the atomized liquid fuel, which impedes interaction of particles (i.e., fuel droplets) with the ambient medium. The spray tail serves, so to speak, as a tunnel to deliver fuel to the front zone. In the head part, where the spray interacts with the ambient gas, the spray front zone is formed, and its motion in the combustion chamber determines the spray penetration.
Fuel particle velocities are different in the two components of the spray. In the tail the particle velocity is determined by the injection rate. When this rate is a constant, the velocities of all particles are roughly the same. As to the front zone, intensive interaction of the fuel droplets with the medium results in aerodynamic resistance to the spray penetration. The front zone travels at the speed of its leading edge, which is less than the speed in the tail. Therefore, new portions of fuel constantly join the front zone as long as fuel is injected. The idealization of the model assumes constant injection pressure and particle velocity in the tail, negligible evaporation, and a sufficiently large combustion chamber containing the gas medium.
With the conditions described, the penetration of the front zone may be considered as a moving variable-mass body under some force. This leads to [10]
$$ (V - V_f)\,d\mathcal{M} = \left[\mathcal{M}\,\frac{dV_f}{d\tau} + Q\right] d\tau, \tag{1} $$
where $\tau$ is time, $\mathcal{M} = \mathcal{M}(\tau)$ is the fuel mass in the front zone, $d\mathcal{M}$ is the arriving fuel mass for the period $d\tau$, $V$ is the rate of injection, and $V_f = V_f(\tau)$ is the velocity of the front zone.
The left-hand side of (1) is the momentum of the arriving portion of the fuel in the relative coordinates. The bracketed expression in (1) is the sum of the forces acting: inertial force and force of aerodynamic resistance $Q = Q(\tau)$. In (1) and below, the units of seconds, meters, kilograms, and their combinations, are used but not indicated explicitly. We have that $d\mathcal{M} = (V - V_f)\,c_e \sigma_f\,d\tau$, where $c_e$ is the mass concentration of the fuel per unit volume of the front zone and $\sigma_f$ is the cross-sectional area at the top of the tail. Since the particle velocity in the tail is constant, $c_e = \mathcal{M}_{sec}/(V\sigma_f)$, where $\mathcal{M}_{sec}$ is the fuel mass which goes through the nozzle per second. The last two equations imply that
Figure 2. Illustration of diesel fuel spray, with the fuel penetration length marked.
$$ d\mathcal{M} = \mathcal{M}_{sec}\,(1 - V_f/V)\,d\tau, \tag{2} $$
and after integration,
$$ \mathcal{M}(\tau) = \mathcal{M}_{sec} \int_0^{\tau} (1 - V_f(t)/V)\,dt. \tag{3} $$
Combination of (1), (2), and (3) gives [10]
$$ (V - V_f(\tau))^2 = V_f'(\tau) \int_0^{\tau} (V - V_f(t))\,dt + Q(\tau)V/\mathcal{M}_{sec}. \tag{4} $$
We assume that $Q(\tau)$ equals the sum of resistance forces $Q_k = Q_k(\tau)$ of all droplets in the front zone: $Q = \sum_{k=1}^{m} Q_k$, $m = m(\tau)$. The aerodynamic resistance force of a single droplet with diameter $d_k = d_k(\tau)$ is defined by $Q_k = C_x \pi d_k^2 \rho_g V_f^2/8$, where $C_x$ is the resistance coefficient and $\rho_g$ is the gas density. When one deals with experimental data describing the resistance to a body moving in a gas, the resistance coefficient $C_x$ is often taken to be proportional to a power of the so-called Reynolds number, which is $\rho_g d_k V_f/\mu_g$ in our case ($\mu_g$ is the gas viscosity). This empirical approach gives [10]
$$ Q = A\,\rho_g^{\alpha-1} \mu_g^{2-\alpha} d_{av}^{\alpha-3}\, V_f^{\alpha}\, \mathcal{M}/\rho_f. \tag{5} $$
Here $A$ and $\alpha$ are some empirical coefficients, usually $A \in (4, 9)$ and $\alpha \in (1, 2)$. Based on the results of extensive experiments involving various types of injection conditions, and interpreted by a mathematician and engineer team [10], it has been determined that 15/2 and 3/2 serve well as approximate values of $A$ and $\alpha$ for the usual injection conditions. For earlier experimental background we refer the reader to the monograph [21]. In (5), $\rho_f$ is the fuel density and $d_{av} = d_{av}(\tau)$ is the average diameter of the droplets in the front zone, defined by
$$ d_{av}^{\alpha-3} = \sum_{k=1}^{m} d_k^{\alpha} \Big/ \sum_{k=1}^{m} d_k^{3}. \tag{6} $$
In (6), it is tacitly assumed that $0 < \alpha < 3$. The quantity $d_{av}$ can be physically interpreted as the common diameter of identical droplets (in the front zone) whose total mass is $\mathcal{M}$ and which collectively generate an aerodynamic resistance force equivalent to that produced by the original $m$ droplets. Condition (6) can be restated in the integral form $d_{av}^{\alpha-3} = \int_0^{\infty} \beta^{\alpha-3}\,d\gamma(\beta)$, where $\gamma$ is the distribution of the cubes of diameters in the front zone, normalized by $\int_0^{\infty} d\gamma(\beta) = 1$ (cf. [21]). In our model $d_{av}$ is constant.
Equations (4), (5), and (3) imply that (see [10], [5]) the fuel penetration length (Figure 2)
$$ s(\tau) = \int_0^{\tau} V_f(t)\,dt, $$
subject to the initial conditions $s(0) = 0$, $s'(0) = V_f(0) = V$, is represented by
$$ s(\tau) = s(\tau, V, K) = \frac{V^{2-\alpha}}{K}\, y(x), \qquad x = K V^{\alpha-1} \tau. \tag{7} $$
In (7), $K = A\,\rho_g^{\alpha-1} \mu_g^{2-\alpha} d_{av}^{\alpha-3}/\rho_f$ is a constant depending on the physical properties of the gas and fuel involved, as well as on the fineness of the fuel's atomization; and the function $y$ satisfies the nonlinear initial-value problem
$$ y'' = \frac{(1 - y')^2}{x - y} - (y')^{\alpha}, \quad x > 0; \qquad y(0) = y'(0) - 1 = 0. \tag{8} $$
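The step from (4), (5), and (7) to the initial-value problem can be sketched as follows. This is a reconstruction under the assumption that (8) reads $y'' = (1-y')^2/(x-y) - (y')^{\alpha}$ and that $K$ is the constant multiplying $V_f^{\alpha}\mathcal{M}$ in (5); the small-$x$ coefficients at the end are worked out here and are not quoted from the article.

```latex
% Assumption: Q = K V_f^alpha M, with K as in (7); s' = V_f.
\begin{align*}
\frac{Q V}{\mathcal{M}_{sec}}
  &= K V_f^{\alpha} \int_0^{\tau} (V - V_f(t))\,dt
  && \text{by (3)},\\
(V - s')^2 &= \bigl(s'' + K (s')^{\alpha}\bigr)\,(V\tau - s)
  && \text{by (4)}.
\end{align*}
% Substitute s = (V^{2-alpha}/K) y(x), x = K V^{alpha-1} tau:
\begin{align*}
s' = V y', \qquad s'' = K V^{\alpha} y'', \qquad
V\tau - s = \frac{V^{2-\alpha}}{K}\,(x - y),
\end{align*}
% so that both sides carry the common factor V^2:
\begin{align*}
V^2 (1 - y')^2 = V^2 \bigl(y'' + (y')^{\alpha}\bigr)(x - y)
\quad\Longrightarrow\quad
y'' = \frac{(1 - y')^2}{x - y} - (y')^{\alpha}.
\end{align*}
% The point x = 0 is a 0/0 singularity. Inserting y = x + a_2 x^2 + a_3 x^3 + ...
% and matching powers of x gives
\begin{align*}
a_2 = -\tfrac{1}{6}, \qquad a_3 = \tfrac{\alpha}{42},
\qquad\text{i.e.}\qquad
y(x) = x - \tfrac{x^2}{6} + \tfrac{\alpha x^3}{42} + O(x^4).
\end{align*}
```

The expansion is what a numerical solver would use to step away from the singular point $x = 0$.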
Thus, for a fixed $\alpha$, if $K$ and $V$ are given independently, i.e., there is no additional information on any connection between them, then the function $s(\tau)$ is determined by these two parameters. However, if a relation $K(V)$ is known, then $s(\tau)$ is determined by the single parameter $V$. On physical grounds, $K$ is always a function of $V$ (the average diameter $d_{av}$ depends on the injection rate), so the independence of $K$ and $V$ should be interpreted as a failure as yet to describe a function $K(V)$. Of course, one should have physical sense when choosing independent values of $K$ and $V$ (see references in [5]). In (8), the empirical coefficient $\alpha$ from (5) is a parameter which belongs to the interval (1, 2). By our assumptions and from physical considerations, $s$ is a positive-valued monotonically increasing function of time $\tau$, $\tau \in [0, \infty)$. Hence $y$, which is known as the diesel parameter function, is expected to be a positive-valued monotonically increasing function of $x$, $x \in [0, \infty)$.
Inverse problems: an example. In contrast to the usual direct problem of calculating the diesel fuel spray parameters based on the given conditions of injection, we call a problem an inverse problem if we are required to determine some conditions of the injection (or, more generally, some relationships among the injection conditions) on the basis of certain known (desirable) parameters of the fuel spray. Presumably the desired fuel spray parameters are those that produce the most effective fuel spray for the considered diesel system (see [11], [5], and their references).
The mathematical aspects of diesel spray inverse problems seem to be more challenging than those of direct problems. Note too that one direct problem gives rise to several inverse problems, depending on the desired qualitative and quantitative features of the spray.
With the idealized model as outlined above, we consider the following (now-settled) inverse problem: Determine the conditions of injection, i.e., the values of $K$ and $V$ in (7), from two points $s_1$ and $s_2$ at the respective times $\tau_1$ and $\tau_2$ ($\tau_1 < \tau_2$) on the spray development curve $s = s(\tau)$. Necessary and sufficient conditions were found, namely
$$ \left(\frac{\tau_2}{\tau_1}\right)^{1/3} < \frac{s_2}{s_1} < \frac{\tau_2}{\tau_1}, $$
in order for the corresponding values of $K$ and $V$ to be determined uniquely (in the most practical case where $\alpha = 3/2$); this has useful applications to diesel engineering (see [5] and references there). In the analysis, the following nontrivial inequality was used:
$$ \left[\frac{x\,y'(x)}{y(x)}\right]' < 0, \qquad x > 0. \tag{9} $$
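The inverse problem just described can be sketched numerically. The code below is only an illustration under stated assumptions, not the method of [5]: it assumes (8) in the form $y'' = (1-y')^2/(x-y) - (y')^{\alpha}$, starts the integration from a hand-derived local expansion $y \approx x - x^2/6 + \alpha x^3/42$, and uses a fixed-step RK4 table with linear interpolation. The names `tabulate_y`, `make_y`, and `recover_K_V` are hypothetical. With $c = KV^{\alpha-1}$, relation (7) gives $s_2/s_1 = y(c\tau_2)/y(c\tau_1)$, which by (9) decreases in $c$, so $c$ can be found by bisection.

```python
def tabulate_y(alpha, x_max, step=1e-3, x0=1e-2):
    """Tabulate y on the grid x0, x0 + step, ... by classical RK4.

    Assumes y'' = (1 - y')**2 / (x - y) - (y')**alpha. Since x = 0 is a
    0/0 singular point, the march starts at x0 from the local expansion
    y = x - x**2/6 + alpha*x**3/42 (an assumption of this sketch).
    The step should stay well below x0 to keep RK4 stable near the start.
    """
    def ypp(x, y, yp):
        return (1.0 - yp) ** 2 / (x - y) - yp ** alpha

    a2, a3 = -1.0 / 6.0, alpha / 42.0
    x = x0
    y = x + a2 * x ** 2 + a3 * x ** 3
    yp = 1.0 + 2.0 * a2 * x + 3.0 * a3 * x ** 2
    ys = [y]
    for _ in range(int(round((x_max - x0) / step))):
        k1y, k1p = yp, ypp(x, y, yp)
        k2y = yp + 0.5 * step * k1p
        k2p = ypp(x + 0.5 * step, y + 0.5 * step * k1y, k2y)
        k3y = yp + 0.5 * step * k2p
        k3p = ypp(x + 0.5 * step, y + 0.5 * step * k2y, k3y)
        k4y = yp + step * k3p
        k4p = ypp(x + step, y + step * k3y, k4y)
        y += step * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        yp += step * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        x += step
        ys.append(y)
    return ys


def make_y(alpha, x_max, step=1e-3, x0=1e-2):
    """Return a callable y(x) on (0, x_max]: series below x0, table above."""
    ys = tabulate_y(alpha, x_max, step, x0)

    def y(x):
        if x <= x0:
            return x - x ** 2 / 6.0 + alpha * x ** 3 / 42.0
        i = min(int((x - x0) / step), len(ys) - 2)
        t = (x - x0) / step - i
        return ys[i] + t * (ys[i + 1] - ys[i])

    return y


def recover_K_V(s1, tau1, s2, tau2, alpha=1.5, x_max=20.0):
    """Recover (K, V) from two points on s(tau), using s = (V/c) * y(c*tau)."""
    y = make_y(alpha, x_max)
    target = s2 / s1
    lo, hi = 1e-9, x_max / tau2          # search range for c = K * V**(alpha - 1)
    if not (y(hi * tau2) / y(hi * tau1) < target < tau2 / tau1):
        raise ValueError("ratio s2/s1 is outside the range covered by the table")
    for _ in range(100):                 # bisection: the ratio decreases in c
        mid = 0.5 * (lo + hi)
        if y(mid * tau2) / y(mid * tau1) > target:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    V = c * s1 / y(c * tau1)
    K = c / V ** (alpha - 1.0)
    return K, V
```

A round trip (choose $K$ and $V$, generate two points with the same table, then recover them) is a convenient self-check; the finite table limits the admissible ratios to a sub-range of the interval in the condition above.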
It was conjectured in [8] that (9), the basic inequality for this inverse problem, is just an instance of the monotonicity of the function $x y'(x)/y(x)$ and its derivatives for many values of the parameter $\alpha$. In order to develop these thoughts further, we digress to the representation of functions possessing certain sign regularity in terms of integrals with respect to positive measures, and in particular to the concept of complete monotonicity.
From Sign Regularity to Integrals with Positive Measures
We will give several classical examples involving sign regularity with the following common thread. A class of functions defined by analytic or geometric properties is characterized by representing its members in terms of an integral of a fixed kernel (depending on the class) with respect to a positive measure (depending on the function under consideration). Combining various aspects of this approach with some numerical results on the fuel spray penetration length, we will derive in subsequent sections an exponential-integral representation of this length in terms of a positive measure. By a positive measure $\mu$ we mean a nondecreasing function defined on an interval, say $[a, b]$. Each integral $\int_a^b f\,d\mu$ below is a Stieltjes integral.
i) A complex-valued function $F$ defined on $(-\infty, \infty)$ generates the forms $\sum_{j,k=1}^{n} c_j \bar{c}_k F(x_j - x_k)$ for every choice of $n$ real numbers $x_1, \ldots, x_n$ ($n = 1, 2, \ldots$). Of particular interest in harmonic analysis are the positive-definite forms, which are positive for any nontrivial $c_1, \ldots, c_n$. Positive-definite functions are those functions which generate positive-definite forms for all $n$ and all choices of the $x_j$'s. In other words,
$$ \sum_{j,k=1}^{n} c_j \bar{c}_k F(x_j - x_k) > 0 $$
except when all the $c_j$'s are zero. Bochner's theorem (e.g., [19, Chapter 6]) states that $F$ is positive-definite and continuous if and only if
$$ F(x) = \int_{-\infty}^{\infty} \exp(ixt)\,d\mu(t) $$
for a positive measure $\mu$, $\int_{-\infty}^{\infty} d\mu < \infty$. A similar theorem holds for locally compact Abelian groups. For $2\pi$-periodic functions, $F$ is positive-definite and continuous if and only if $F(\theta) = \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ with $a_n \ge 0$, where the $a_n$'s are not all zero and $\sum_{n=-\infty}^{\infty} a_n < \infty$ [19, Chapter 7].
ii) Now we turn to geometric function theory. Let $E$ denote the open unit disk $\{z : |z| < 1\}$, and let $\mathcal{P}$ be the class of all functions $p(z)$, $p(0) = 1$, that are analytic in $E$, and such that for $z \in E$, $\mathrm{Re}\{p(z)\} > 0$. According to the Herglotz theorem (e.g., [4, Chapter 7]), a function $p(z)$ belongs to the class $\mathcal{P}$ if and only if
$$ p(z) = \int_0^{2\pi} \frac{1 + z e^{it}}{1 - z e^{it}}\,d\mu(t), $$
where $\mu$ is a probability measure on $[0, 2\pi]$, i.e., $\int_0^{2\pi} d\mu = 1$ and $\mu$ is positive. This representation, known as the Herglotz representation formula, has many useful applications. We mention two of them, having in mind some deeper sign-regularity conditions and exponential-integral representations.
A function $F(z)$, $F(0) = F'(0) - 1 = 0$, is said to be typically real in $E$ if it is analytic in $E$ and if for every nonreal $z \in E$, $\mathrm{sign}(\mathrm{Im}\{F(z)\}) = \mathrm{sign}(\mathrm{Im}\{z\})$. For example, a function $F(z) = z + a_2 z^2 + a_3 z^3 + \cdots$, analytic in $E$, satisfies the above condition if it is univalent in $E$ and all of its Taylor coefficients $a_n$ are real. According to the Rogosinski criterion [4, Chapter 10], a function $F(z)$ is typically real if and only if $F(z) = p(z)\,z/(1 - z^2)$, where $p(z) \in \mathcal{P}$. The Robertson theorem, based on the Rogosinski criterion and the Herglotz formula, states that a function $F(z)$ is typically real if and only if it can be written in the form
$$ F(z) = \int_{-1}^{1} \frac{z}{1 - 2zt + z^2}\,d\mu(t), $$
where $\mu$ is a probability measure on $[-1, 1]$ [4, Chapter 10].
The second application of the Herglotz formula involves normalized starlike functions, i.e., conformal mappings $F(z)$, $F(0) = F'(0) - 1 = 0$, of $E$ onto a domain starlike with respect to the origin. These functions satisfy the Nevanlinna criterion [4, Chapter 8]: $\mathrm{Re}\{zF'(z)/F(z)\} > 0$, $z \in E$. One can use this criterion and the Herglotz formula to show that the function $F(z)$ is a normalized starlike function if and only if it has the following exponential-integral representation:
$$ F(z) = z \exp\left\{ -2 \int_0^{2\pi} \log(1 - z e^{it})\,d\mu(t) \right\}, $$
where $\mu$ is a probability measure on $[0, 2\pi]$ [4, Chapter 8].
iii) Finally, we consider completely monotonic functions. An infinitely differentiable function $g$ is called completely monotonic on an interval $I$ if $(-1)^n g^{(n)}(x) \ge 0$ on $I$ ($n = 0, 1, 2, \ldots$). We say that a function $g$ is strictly completely monotonic on an interval $I$ if $(-1)^n g^{(n)}(x) > 0$ on $I$ for all $n$. A typical completely monotonic function on $(0, \infty)$ is $\exp(-tx)$, $t \ge 0$. It is clear that positive linear combinations of such functions are also completely monotonic. A theorem of S. N. Bernstein (see [3, Chapter 13], [27, Chapter 4], and [1] for this and related results) states that a function $g$ is completely monotonic on $(0, \infty)$ if and only if there is a positive measure $\mu$ supported in $[0, \infty)$ such that
$$ g(x) = \int_0^{\infty} \exp(-xt)\,d\mu(t). $$
Another useful fact is that if $g(x) = \exp(-u(x))$ and $u'(x)$ is completely monotonic on an interval $I$, then $g$ is completely monotonic on $I$. This follows from the rule for differentiating composite functions and induction (see di Bruno's formula in [25, Chapter 2]).
Originally, completely monotonic functions arose as Laplace transforms of nonnegative functions [27], [15]. Now they appear in many areas of mathematics (e.g., [16], [14]).
An instance of complete monotonicity in probability theory is the concept of infinite divisibility. A non-negative random variable $X$ with distribution measure $\mu$ is called infinitely divisible if and only if for every positive integer $n$ there are independent non-negative random variables $X_1, \ldots, X_n$, each having the same distribution $\mu_n$, whose sum is $X$, that is,
$$ \int_0^{\infty} \exp(-xt)\,d\mu_n(t) = \left[ \int_0^{\infty} \exp(-xt)\,d\mu(t) \right]^{1/n}. $$
It turns out that a probability measure is infinitely divisible if and only if the Laplace transform $\int_0^{\infty} \exp(-xt)\,d\mu(t) = \exp(-u(x))$, where $u(0) = 0$ and $u'$ is completely monotonic on $(0, \infty)$, i.e.,
$$ \int_0^{\infty} \exp(-xt)\,d\mu(t) = \exp\left\{ \int_0^{\infty} [\exp(-xt) - 1]\,t^{-1}\,d\nu(t) \right\}, $$
where $\nu$ is a positive measure. This characterization as an exponential-integral representation is beneficial in establishing the complete monotonicity of several probability distributions (like the Student t-distribution and the F-distribution) using the theory of special functions, for the Laplace transforms of probability measures are usually special functions, and $u'$, being the logarithmic derivative of a Laplace transform, is a quotient of two special functions. For information see [17], where the approach uses integral representations. In fact some of the work on complete monotonicity of quotients of special functions in [17] led naturally to certain probability distributions which were later found to be hitting-time distributions for Brownian motion [24]. The complete monotonicity of logarithmic derivatives of special functions turned out to be a problem of independent interest; advances along this direction are exemplified in Hartman's works [12], [13].
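As a concrete check of the exponential-integral representation, consider a textbook case that the article itself does not discuss: the gamma distribution with shape $a$ and unit scale, whose Laplace transform is $(1+x)^{-a}$. Taking $d\nu(t) = a\,e^{-t}\,dt$ (an assumption of this sketch, justified by the Frullani integral $\int_0^\infty (e^{-(1+x)t} - e^{-t})\,t^{-1}\,dt = -\log(1+x)$), the right-hand side can be evaluated by a simple midpoint rule. The function names are hypothetical.

```python
import math


def levy_exponent(x, a, t_max=60.0, n=200_000):
    """Midpoint-rule value of  int_0^inf [exp(-x t) - 1] t**-1 dnu(t)
    for the assumed measure dnu(t) = a * exp(-t) dt.

    The integrand tends to -a*x as t -> 0, so midpoints avoid dividing by 0;
    the tail beyond t_max is negligible because of the factor exp(-t).
    """
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.expm1(-x * t) * math.exp(-t) / t
    return a * h * total


def gamma_laplace_transform(x, a):
    """Laplace transform of the gamma(a, 1) distribution."""
    return (1.0 + x) ** (-a)
```

Here $u(x) = a\log(1+x)$, so $u'(x) = a/(1+x)$ is completely monotonic on $(0, \infty)$, in line with the criterion quoted above; comparing `math.exp(levy_exponent(x, a))` with `gamma_laplace_transform(x, a)` for a few values of `x` illustrates the identity numerically.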
The Logarithmic Derivative of the Diesel Parameter Function and a Related Conjecture
Now we provide a bridge from the diesel parameter function $y$ with which we began to complete monotonicity. For a given $\alpha$, $1 \le \alpha < \infty$, let [8]
$$ h(x) = h(x, \alpha) = \frac{x\,y'(x)}{y(x)}, \quad x > 0, \quad \text{and} \quad h(0) = 1, \tag{10} $$
where $y$ is defined by (8). Note that $\alpha$ is now allowed to go beyond the "diesel interval," i.e., the domain (1, 2) specified by the diesel spray model [10]. In the asymptotic case ($\alpha = \infty$) we define [8]
$$ h(x, \infty) = 1 + \lim_{\alpha \to \infty} \alpha \left[ h\!\left(\frac{x}{\alpha}, \alpha\right) - 1 \right], \quad x > 0, \quad \text{and} \quad h(0) = 1. \tag{11} $$
It follows that for $\alpha = \infty$,
$$ h(x) = 1 + f'(x) - \frac{f(x)}{x}, \quad x > 0, \tag{12} $$
where $f(x)$ satisfies the nonlinear initial-value problem
$$ f'' = -\frac{(f')^2}{f} - e^{f'}, \quad x > 0; \qquad f(0) = f'(0) = 0. \tag{13} $$
We can think of $h$ as a family of functions parametrized by $\alpha$. It turns out that for many values of $\alpha \in [1, \infty]$, the function $h(x)$ (Figure 3) exhibits certain features of complete monotonicity on some (maximal) interval $[0, \eta(\alpha))$. In reality, the lifetime of a diesel spray is very short, and thus $x$, defined by (7), belongs to quite a restricted segment. It would be nice if this were a subset of $[0, \eta(\alpha))$ for "diesel" values of $\alpha$. It will also be of interest to determine whether or not $\eta(\alpha)$ attains its global maximum just on the diesel interval, and to find this maximum. Extensive computations of the function $h$ and its derivatives for various finite values of $\alpha$, as well as for $\alpha = \infty$ (see [8], [23], and below) support the following modified version of the conjecture in [8].
For some $\alpha_0 \ge 3/2$ and each $\alpha \in [\alpha_0, \infty]$, the function $h$, defined by (8) and (10)-(13), is strictly completely monotonic on an interval $[0, \eta(\alpha))$. It is likely that $\alpha_0 = 3/2$ and that $\max_{[\alpha_0, \infty]} \eta(\alpha)$ is attained for some $\alpha$ close to $\alpha_0$.
We shall refer to the conjecture above as the DFS (diesel fuel spray) conjecture. We believe that although $h$ may not be completely monotonic on any interval $[0, \eta)$ for $\alpha < \alpha_0$, its conjectured complete monotonicity for $\alpha \ge \alpha_0$ seems to spill over to smaller values of $\alpha$, producing regular sign behavior of the function $h$ and its derivatives.
Implications of the DFS Conjecture
A proof of the DFS conjecture would give new information about the nature of the initial-value problem (8) and the model in [10] and [9]. Indeed the complete monotonicity of the function $h(x)$ on an interval $[0, \eta)$, $\eta = \eta(\alpha)$, $\alpha_0 \le \alpha < \infty$, will bring the exponential-integral structure to the study of the fuel-spray penetration length, $s(\tau)$. This structure will provide a companion (or possibly an alternative) to the differential equation approach. More precisely, let
$$ g(\lambda) = h(\eta(1 - e^{-\lambda})), \quad \lambda \in [0, \infty). $$
If $h$ is completely monotonic on $(0, \eta)$ then $g$ is completely monotonic on $(0, \infty)$, and Bernstein's theorem supplies a representation
$$ g(\lambda) = \int_0^{\infty} e^{-\lambda t}\,d\mu_{\alpha}(t), \tag{14} $$
where $\mu_{\alpha}$ is a positive measure supported in $[0, \infty)$. Since $h(0) = 1$, it must be the case that $\int_0^{\infty} d\mu_{\alpha} = 1$. Consequently, the truth of the DFS conjecture, together with equations (14), (10) and (7), would give the following representation of the fuel-spray penetration length $s(\tau)$:
$$ s(\tau) = V\tau \exp\left\{ \int_0^{\infty} \left[ \int_0^{a\tau} \frac{(1 - x)^t - 1}{x}\,dx \right] d\mu_{\alpha}(t) \right\}, \quad \tau < \frac{1}{a}, \tag{15} $$
where $a = a(\alpha) = K V^{\alpha-1}/\eta(\alpha)$.
The representation (15) is a characterization of physical and geometric properties of diesel sprays converted, via a mathematical model [10], to an analytic expression displaying a sign regularity property comparable to those of the examples in i)-iii). It involves an empirical coefficient $\alpha$ and a probability measure $\mu_{\alpha}$ depending on $\alpha$, neither of which is known explicitly. Nevertheless this representation, and its possible generalization to evaporated fuel sprays which are generated by given time-dependent injection pressures, would be useful. In particular this would be important if it is required to solve inverse problems concerning fuel spray (see above). The question of the existence of a solution to a typical inverse problem is closely connected with monotonicity and other qualitative properties involving various combinations of the function $s(\tau)$ and its derivatives. It turns out that the representation (15), if proven, will be a convenient tool to answer such a question (see [5] and references there). Our belief that (15) is valid is borne out by computer experiments.
Figure 3a. Graph of the function $h(x)$ in the case $\alpha = 1$ ($0 \le x \le 1000$).
Figure 3b. Graph of the function $h(x)$ in the case $\alpha = 3/2$ ($0 \le x \le 1000$).
Figure 3c. Graph of the function $h(x)$ in the case $\alpha = \infty$ ($0 \le x \le 1000$).
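The passage from the definition of $h$ and the Bernstein representation of $g$ to the exponential-integral formula for $s(\tau)$ can be sketched in a few lines. The derivation below assumes $h(x) = x y'(x)/y(x)$ as in (10), $g(\lambda) = h(\eta(1 - e^{-\lambda})) = \int_0^\infty e^{-\lambda t}\,d\mu_\alpha(t)$ with $\mu_\alpha$ a probability measure, and $s = (V^{2-\alpha}/K)\,y(x)$, $x = KV^{\alpha-1}\tau$ as in (7); the integration variables $u$ and $w$ are introduced here for illustration.

```latex
\begin{align*}
% (10) says (log y)' = h/x, and y(x)/x -> 1 as x -> 0:
\log\frac{y(x)}{x} &= \int_0^{x} \frac{h(u) - 1}{u}\,du .\\
% With u = eta (1 - e^{-lambda}) we have e^{-lambda} = 1 - u/eta, hence
h(u) &= \int_0^{\infty} \Bigl(1 - \frac{u}{\eta}\Bigr)^{t} d\mu_\alpha(t),
\qquad 0 \le u < \eta .\\
% Use \int d\mu_\alpha = 1, substitute u = eta w, and exchange the integrals:
\log\frac{y(x)}{x} &= \int_0^{\infty} \left[ \int_0^{x/\eta}
  \frac{(1 - w)^{t} - 1}{w}\,dw \right] d\mu_\alpha(t),
\qquad x < \eta .\\
% Finally (V^{2-alpha}/K) x = V tau and x/eta = a tau with a = K V^{alpha-1}/eta:
s(\tau) &= V\tau \exp\left\{ \int_0^{\infty} \left[ \int_0^{a\tau}
  \frac{(1 - w)^{t} - 1}{w}\,dw \right] d\mu_\alpha(t) \right\},
\qquad \tau < \frac{1}{a} .
\end{align*}
```

The restriction $\tau < 1/a$ is just $x < \eta(\alpha)$ rewritten in terms of time.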
Numerical Analysis
A computer "confirmation" of the DFS conjecture has been based upon a parametric coefficient approach, using both the Taylor coefficient properties of differential equation solutions and the proper equations with certain auxiliary parameters. The detailed analysis will appear elsewhere. Here we discuss some results derived through the Taylor polynomial approximation.
Let $\{F\}_n$ denote the coefficient of $x^n$ in the Taylor series expansion of a function $F(x)$ about $x = 0$. For a given $\alpha$, $\alpha \ge 1$, one obtains the sequence $\{h\}_n$, $n \ge 1$, using equations (10) and (8) and triple recursion if $\alpha$ is finite, or equations (12) and (13) and double recursion if $\alpha = \infty$ [8]. Let $n(\alpha)$ be the smallest natural $n$, if any, such that $(-1)^n \{h\}_n < 0$. Calculations suggest that there exists a finite $n(\alpha)$ for each $\alpha < 3/2$. In other words the function $h$ is not completely monotonic on any interval $[0, \eta)$ if $\alpha < 3/2$ (Figure 4).
For $\alpha \ge 3/2$, we take into account some properties of the $n$-sequences $\{h\}_n (x^n)^{(k)}$ with suitable values of $x > 0$ and $k = 0, 1, 2, \ldots$ ($n$ up to 50,000 in our experiments), and use the Taylor polynomial approximation of the function $h(x\rho)$. Here $\rho$ is a positive parameter which is close to $\limsup_{n \to \infty} |\{h\}_n|^{-1/n}$ if the degree of the related Taylor polynomial is large. This allows us to demonstrate that many successive derivatives of $h$ alternate in sign on a suitably chosen interval $[0, \eta)$, which is contained in the interval of convergence of the Taylor series expansion of $h$ around the origin (Figures 5 and 6 provide some examples). In addition one can use differential equation (8) and some auxiliary parameters to see whether or not for a finite $\alpha$ a number of successive derivatives of $h$ alternate on larger intervals. Equations (8) and (13) themselves allow us easily to compute at least the function $h$ for values of $x$ on a large interval, showing that $h(x)$ is positive valued and decreases for various $\alpha$ (Figure 3).
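A small-scale version of the Taylor-coefficient experiment can be sketched as follows. This is not the authors' program: it is one possible recursion, derived here by multiplying the assumed form of (8), $y'' = (1-y')^2/(x-y) - (y')^{\alpha}$, through by $x - y$ and matching powers of $x$, with the power $(y')^{\alpha}$ expanded by the standard recurrence for powers of a series. Exact rational arithmetic is used, so $\alpha$ should be passed as a `Fraction`; all function names are hypothetical.

```python
from fractions import Fraction


def y_coefficients(alpha, N):
    """Return [a_0, ..., a_N] with y(x) = sum a_n x**n (requires N >= 2).

    Matching x**m in (x - y) y'' - (1 - y')**2 + (x - y) (y')**alpha = 0
    gives a_2 = -1/6 and, for m >= 3, a linear equation for a_m whose
    leading coefficient is (m + 4)(m - 1)/6.
    """
    alpha = Fraction(alpha)
    a = [Fraction(0), Fraction(1), Fraction(-1, 6)] + [Fraction(0)] * (N - 2)
    w = [Fraction(1), 2 * a[2]]        # w[i] = (i + 1) * a[i + 1]: series of y'
    p = [Fraction(1), alpha * w[1]]    # series of (y')**alpha
    for m in range(3, N + 1):
        # Each sum is evaluated with a[m] still equal to 0 ("known" part only).
        t1 = sum(-a[j] * (m - j + 2) * (m - j + 1) * a[m - j + 2]
                 for j in range(2, m + 1))                    # (x - y) * y''
        t2 = sum((i + 1) * a[i + 1] * (m - i + 1) * a[m - i + 1]
                 for i in range(1, m))                        # (1 - y')**2
        t3 = sum(-a[j] * p[m - j] for j in range(2, m + 1))   # (x - y) * (y')**alpha
        known = t1 - t2 + t3
        a[m] = -6 * known / ((m + 4) * (m - 1))
        w.append(m * a[m])
        n = m - 1                      # next coefficient of (y')**alpha
        p.append(sum(((alpha + 1) * k - n) * w[k] * p[n - k]
                     for k in range(1, n + 1)) / n)
    return a


def h_coefficients(alpha, N):
    """Return [{h}_0, ..., {h}_N] for h = x y'/y, by power-series division."""
    a = y_coefficients(alpha, N + 1)
    num = [(i + 1) * a[i + 1] for i in range(N + 1)]   # series of y'
    den = [a[i + 1] for i in range(N + 1)]             # series of y/x, den[0] = 1
    h = []
    for n in range(N + 1):
        h.append(num[n] - sum(den[k] * h[n - k] for k in range(1, n + 1)))
    return h


def first_sign_failure(alpha, N):
    """Smallest n <= N with (-1)**n * {h}_n < 0, or None if there is none."""
    for n, c in enumerate(h_coefficients(alpha, N)):
        if (-1) ** n * c < 0:
            return n
    return None
```

For example, `first_sign_failure(Fraction(5, 4), 60)` searches the first 61 coefficients for $\alpha = 5/4$; exact rationals grow quickly, so reaching orders like those quoted in the text would need a different arithmetic.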
It turns out that when $\alpha = \infty$ the possible interval of complete monotonicity $[0, \eta(\alpha))$ is a proper subset of the interval of convergence above. Of course, this case of the DFS conjecture takes us well beyond the realm of engineering. However, the asymptotic DFS conjecture looks simpler (with its reduced recursion) and might be handled (due to the exponential composition in (13)) by an exponentiation approach (e.g., Milin's approach [22]). Furthermore, if for $\alpha = \infty$ it is shown that $h$ is strictly completely monotonic on some interval $[0, \eta)$, then one would expect that for any $\tilde{\eta} \in (0, \eta)$ and for sufficiently large values of $\alpha$, $h$ is strictly completely monotonic on $[0, \tilde{\eta}/\alpha)$. It is also possible that this property can be extended to certain smaller values of $\alpha$.
The simplest case of numerical examples is when $\alpha = \infty$ and parameter $\rho$ equals 1. The details are too long to be included here and will appear in a future work. Figure 6 gives a related illustration when fifteen initial derivatives of the function $h$ are considered on the interval [0, 2]. Then it is sufficient to use the corresponding Taylor polynomials of orders 400 to 500. The omitted sequence of graphs between those shown there all confirm the DFS conjecture. However, we cannot conclude from this example that $\eta(\infty) \ge 2$. More extensive calculations suggest that $\eta(\infty) = \lim_{\alpha \to \infty} \alpha\,\eta(\alpha) = 1/2$.
Figure 4. Illustration of the function $\varphi(\alpha) = \log n(\alpha)$ on $[1, 3/2)$.
Figure 5. Selected graphs of $h^{(k)}$ on [0, 2] ($\alpha = 3/2$).
Figure 6. Selected graphs of $h^{(k)}$ on [0, 2] ($\alpha = \infty$).
Some Sources of Ideas
As mentioned above, P. Hartman [12], [13] investigated the complete monotonicity of the logarithmic derivatives of the solutions of linear ordinary differential equations. This raises the question of possibly extending Hartman's theory to include nonlinear differential equations such as (8) and (13). Equation (8) can be transformed to an Abel differential equation of the second kind in terms of $u = x - y$ as a function of $v = 1 - y'$. This makes $v \in [0, 1)$. A similar transformation works for equation (13). For Abel differential equations and related topics, see E. Kamke [18, Chapter 1].
One technique to prove complete monotonicity of a function $g$, which predates Hartman's work (see references in [17]), is to derive integral representations for $(\log g)'$ and then use the theory of special functions. This introduces another idea to this set of problems. Also, as we remarked, complete monotonicity of quotients of Bessel functions, first derived through integral representations, is related to infinite divisibility of hitting-time distributions of Brownian motion. In turn, Brownian motion is related to diffusion processes and heat conduction boundary value problems.
Further, the DFS conjecture brings to mind the analysis by L. de Branges [2] of the behavior of Milin's functionals along a Loewner chain generated by a continuously increasing family of simply connected domains (see [22, Chapters 2 and 3] and Loewner's paper [20]; a simplified version of de Branges's proof can be found in [6]). In this case we deal with a relatively simple partial differential equation involving one real and one complex variable (Loewner's equation) and a corresponding sequence of monotonic functions defined by the logarithmic coefficients of a solution. The presence of the sequence of monotonic functions in this instance is a substitute for complete monotonicity.
Besides, representation (15) (if valid) can be rewritten in terms of the binomial coefficients (the Taylor coefficients of the function $(1 - x)^t$), which were used by I. M. Milin to describe his exponentiation approach.
As far as we know there is no analytic way to handle the DFS conjecture to date. However, we hope that an idea or a combination of ideas from the sources above will be effective in finding one. At the same time it would be of great interest to get a direct physical proof of the representation (15), and thereby of the DFS conjecture.
These observations hint that there is more work to be done on flows arising from heat/mass transfer problems like diesel fuel spray and geometric flows which share some of the properties of Loewner's chains. We may actually be seeing only a few peaks in a whole mountain range where complete monotonicity, or a concept generalizing it to possibly a complex variable or several variables, plays a fundamental role, and where physical, geometric, and engineering models meet partial-differential equations and function theory. What seems to be important is the ubiquity of logarithmic derivatives and some kind of monotonicity. It would be nice to have a unified theory which explains these interconnections.
In summary we propose that an exponential structure involving logarithmic derivatives may be behind results in an engineering model for diesel fuel spray. The conjectured exponential structure may resemble known structures in function theory like the one used to describe univalent functions in the unit disk.
REFERENCES
[1] R. P. Boas, Jr., Signs of derivatives and analytic behavior, Amer. Math. Monthly 78 (1971) 1085-1093.
[2] L. de Branges, A proof of the Bieberbach conjecture, Acta Math. 154 (1985) 137-152.
[3] W. Feller, An Introduction to Probability Theory and Its Applications, volume 2, John Wiley & Sons, New York, 1966.
[4] A. W. Goodman, Univalent Functions, V. 1, Polygonal Publ. House, Washington, NJ, 1983.
[5] A. Z. Grinshpan, The use of parameterization for modeling diesel fuel spray, SAE Technical Paper Series 921728 (1992) 1-6.
[6] A. Z. Grinshpan, The Bieberbach conjecture and Milin's functionals, Amer. Math. Monthly 106 (1999) 203-214.
[7] A. Z. Grinshpan and S. V. Belyi, The system theory approach to some engineering problems, SAE Technical Paper Series 932460 (1993) 1-7.
[8] A. Z. Grinshpan and M. E. H. Ismail, On a parametric diesel function, Math. Rep. Acad. Sci. Canada 18 (1996) 53-58.
[9] A. Z. Grinshpan and S. A. Romanov, Improvement of diesel engine performance through fuel-injection equipment optimization, SAE Technical Paper Series 911820 (1991) 1-16.
[10] A. Z. Grinshpan, S. A. Romanov, and Yu. B. Sviridov, Analytic model of liquid fuel spray penetration in a stationary gas atmosphere, Trudy CNITA 64 (1975) 17-23 (in Russian).
[11] A. Z. Grinshpan, S. A. Romanov, and Yu. B. Sviridov, Determination of fuel injection conditions in diesel engines on the basis of some known mixture formation parameters, Trudy CNITA 75 (1980) 3-11 (in Russian).
[12] P. Hartman, Difference equations, disconjugacy, principal solutions, Green's functions, complete monotonicity, Trans. Amer. Math. Soc. 246 (1978) 1-30.
[13] P. Hartman, Uniqueness of principal values, complete monotonicity of logarithmic derivatives of principal solutions, Math. Ann. 241 (1979) 257-281.
[14] H. Hattori and K. Mischaikow, A dynamical system approach to a phase transition problem, J. Differential Equations 94 (1991) 340-378.
[15] I. I. Hirschman and D. V. Widder, The Convolution Transform, Princeton University Press, Princeton, 1956.
[16] M. E. H. Ismail, Complete monotonicity of modified Bessel functions, Proc. Amer. Math. Soc. 108 (1990) 353-361.
[17] M. E. H. Ismail and D. H. Kelker, Special functions, Stieltjes transforms and infinite divisibility, SIAM J. Math. Anal. 10 (1979) 884-901.
AUTHORS

ARCADII Z. GRINSHPAN
Department of Mathematics
University of South Florida
Tampa, FL 33620-5700 USA
e-mail: azg@math.usf.edu

Arcadii Z. Grinshpan was born in St. Petersburg, Russia. He graduated from St. Petersburg State University; his Ph.D. work in complex analysis was done under direction of Isaak M. Milin. He spent about 20 years as senior mathematician at CNITA, the Scientific and Production Institute for Fuel Systems of Engines, and as an active member of the Goluzin seminar on geometric function theory in St. Petersburg. He then moved to the Center for International Environmental Cooperation, Russian Academy of Sciences, and to the University of South Florida, first as a visiting professor and later as industrial mathematics coordinator. His research interests range from complex function theory to mathematical modeling in diesel engineering and numerical analysis. A lot of his work was inspired by the Bieberbach conjecture and other coefficient problems for univalent functions.

MOURAD E. H. ISMAIL
Department of Mathematics
University of South Florida
Tampa, FL 33620-5700 USA
e-mail: [email protected]

Mourad E. H. Ismail was born and raised in Cairo, Egypt. He holds a B.Sc. from Cairo University, and M.Sc. and Ph.D. from the University of Alberta in Canada. He has been interested in classical analysis, mostly in special functions and orthogonal polynomials, asymptotics, integral transforms, and inequalities. Under the influence and encouragement of Richard A. Askey and Dennis W. Stanton he developed a side interest in enumeration problems and partition identities in combinatorics. More recently Grinshpan twisted his arm to study the fuel spray problem and indeed they found grounds for collaboration. He has held positions at Cairo University, McMaster University, and Arizona State University, and is currently a professor at the University of South Florida.

DAVID L. MILLIGAN
Department of Mathematics
University of South Florida
Tampa, FL 33620-5700 USA
[email protected]

David L. Milligan was born in Tampa, Florida. He recently graduated from the University of South Florida, aided by two major professors - the other authors to the present paper. He is attached to the industrial mathematics group working through USF. His research interests include numerical analysis, computer graphing, image processing, graph theory, and number theory. From his childhood he has been an ardent philatelist.
[18] E. Kamke, Differentialgleichungen: Lösungsmethoden und Lösungen, Chelsea, New York, 1974.
[19] Y. Katznelson, An Introduction to Harmonic Analysis, 2-nd edition, Dover, New York, 1976.
[20] K. Löwner (C. Loewner), Untersuchungen über schlichte konforme Abbildungen des Einheitskreises. I, Math. Ann. 89 (1923) 103-121.
[21] A. S. Lyshevskii, Fuel Atomization in Marine Diesels, Sudostroenie, St. Petersburg, 1971 (in Russian).
[22] I. M. Milin, Univalent Functions and Orthonormal Systems, Amer. Math. Soc., Providence, RI, 1977.
[23] D. L. Milligan, How Mathematics Aids Engineering and Engineering Stimulates Mathematics, Thesis, USF, 1997.
[24] J. Pitman and M. Yor, Bessel processes and infinite divisibility laws, in: Stochastic integrals, Lecture Notes in Mathematics (Springer-Verlag, Berlin, 1981) 285-370.
[25] J. Riordan, An Introduction to Combinatorial Analysis, John Wiley & Sons, New York, 1958.
[26] Yu. B. Sviridov, Mixture Formation and Combustion in Diesel Engines, Mashinostroenie, St. Petersburg, 1972 (in Russian).
[27] D. V. Widder, The Laplace Transform, Princeton University Press, Princeton, 1946.
VOLUME 22, NUMBER 2. 2000
53
UDO HERTRICH-JEROMIN
The Surfaces Capable of Division into Infinitesimal Squares by their Curves of Curvature: A nonstandard-analysis approach to classical differential geometry

Classically, isothermic surfaces are characterized as those surfaces which are "divisible into infinitesimal squares by their curvature lines". Taking this definition as an example, the paper is devoted to the analysis of the classical "geometric approach" to differential geometry: notions of "infinitesimal geometry" are introduced following Nelson's approach to nonstandard analysis, and some basic facts in differential geometry are derived following the classical authors. The above characterization is also the direct analog to the definition of discrete isothermic nets. Thus, this "geometric approach" may be a good mediator between discrete net theory and differential geometry.

Introduction
After being a forgotten side branch in (differential) geometry for a long time [25], discrete net theory gained immediate interest after a relation with integrable system theory became apparent [2], causing a considerable number of
54
THE MATHEMATICAL INTELLIGENCER © 2000 SPRINGER-VERLAG NEW YORK
publications over the past few years. Discrete net theory deals with discrete analogs of objects and concepts in smooth differential geometry. With this formulation, one of the major problems of discrete net theory already surfaces: the term "analog" is not well defined. Indeed, it is fortunate that it is not well defined: thus space is left for developments in various directions without prejudging them. This is a frequent theme in discrete net theory: to seek the "correct" definitions for discrete analogs of differential geometric objects, and in each case to try to justify the definition in view of the relations between the smooth and discrete concepts. For example, when is it justified to consider a discrete net $F : \mathbb{Z}^2 \to \mathbb{R}^3$ as a "discrete isothermic net" analogous to a smooth isothermic net? Should one ask for similar theorems to hold; or should one require that smooth isothermic nets can be obtained as certain limits from discrete isothermic nets?

Another observation is that the methods of proof in discrete net theory are quite different from those in (modern) differential geometry: often, analytic arguments and calculations have to be replaced by geometric arguments; sometimes, there seem to be no parallels in proofs of analogous facts at all. Thus, is it possible to take a viewpoint that brings to light more methodological analogies between the discrete and smooth theories?

Here, a look into some of the classical literature is a startling experience: proofs and arguments in differential geometry can look much more geometric than the modern ones do; they resemble more the arguments used in the discrete theory (cf. [7], [14], [21] or [26]). At one point, Darboux [8] even explicitly mentions the possibility of giving different types of proofs, geometric or analytic. Let us distinguish, then, two different approaches, the "geometric" and the "analytic" approach to differential geometry. A modern differential geometer is usually unable to follow any argument in the geometric approach, and can only collect the facts and redo the proofs using analytic formalism. However, it might be useful to understand this classical geometric approach better, particularly, as I have said, with discrete net theory in mind.

The present paper aims to be a small contribution in that direction: I will set forth some facts that seem necessary (for a modern mathematician) to understand this geometric approach, and then describe some basic facts in differential geometry using it. At the same time, to bring out the relations with discrete net theory, I will discuss the classical definition of isothermic surfaces as an example. Their discrete analogs are a topic of current research.

Figure 1. An isothermic surface and an analogous discrete isothermic net.
A discrete analog¹ for isothermic surfaces was first defined by Alexander Bobenko and Ulrich Pinkall [1] (cf. [12]):

Definition. A map $F : \mathbb{Z}^2 \to \mathbb{R}^3$ is said to be a discrete isothermic net if all elementary quadrilaterals $[F_{m,n}; F_{m+1,n}; F_{m+1,n+1}; F_{m,n+1}]$ are "conformal squares," i.e., have cross-ratio $R = -1$.

Figure 2. A discrete isothermic net.

¹ Note that this is not the only possible definition - it is not even clear whether this definition provides the "best analogy" with the theory of smooth isothermic surfaces, in all aspects.

Here "conformal square" means a quadrilateral (in space) which is the image of a square under a conformal (Möbius) transformation of the ambient space. In particular, its vertices are concircular (its cross-ratio is real) and the two "diagonal" circles, intersecting the surrounding circle orthogonally in opposite vertices, are perpendicular (Fig. 3): its cross-ratio $R$ is $-1$.

Figure 3. Conformal Squares.

In [1], this definition is motivated by studying the limit behavior of quadrilaterals formed by parameter lines on the surface: it is shown that the parameter lines form a conformal curvature line net iff the cross-ratios of parameter net quadrilaterals around each point tend to $-1$ quadratically. Thus, it conforms with the common

Definition. A surface in 3-space is called isothermic if it allows conformal curvature line parameters.
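The cross-ratio condition in the definition of a discrete isothermic net can be tried out numerically. The following is only a minimal sketch under stated assumptions: it treats planar quadrilaterals, identified with points of the complex plane, and it assumes the cross-ratio convention $R = (z_1 - z_2)(z_3 - z_4) / ((z_2 - z_3)(z_4 - z_1))$, which may differ from the convention of [1] by a reordering of the vertices. The function names and the coefficients of the Möbius map are illustrative choices, not taken from the article.

```python
def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio of a planar quadrilateral given as four complex numbers.

    Assumed convention: R = (z1 - z2)(z3 - z4) / ((z2 - z3)(z4 - z1)).
    """
    return (z1 - z2) * (z3 - z4) / ((z2 - z3) * (z4 - z1))


def moebius(z, a=1 + 2j, b=0.5, c=0.3j, d=1.0):
    """A hypothetical Moebius transformation z -> (a z + b) / (c z + d), ad - bc != 0."""
    return (a * z + b) / (c * z + d)


# A square, a non-square rectangle, and the Moebius image of the square.
square = [0j, 1 + 0j, 1 + 1j, 1j]
rectangle = [0j, 2 + 0j, 2 + 1j, 1j]
conformal_square = [moebius(z) for z in square]

r_square = cross_ratio(*square)
r_image = cross_ratio(*conformal_square)
r_rectangle = cross_ratio(*rectangle)

print("square:           ", r_square)
print("conformal square: ", r_image)
print("rectangle:        ", r_rectangle)
```

With this convention the square and its Möbius image both give a cross-ratio equal to $-1$ up to rounding, while the 2-by-1 rectangle gives a real value different from $-1$: its vertices are concircular, but it is not a conformal square.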
Meanwhile, several results on discrete isothermic nets were obtained, in particular on the Christoffel [1] and Darboux transformations [12] (that I will address towards the end of the paper), which further corroborate the analogy with smooth isothermic surfaces. However, consulting the classical literature, it seems that the analogy would have been completely obvious to any geometer working around the turn of the century: a first hint [6] is the classical

Definition. A surface in (Euclidean) 3-space is called isothermic if it is capable of division into infinitesimal squares by means of its curves of curvature.

This definition can be considered as part of the geometric approach to differential geometry (which Christoffel still used when introducing his transformation [7]), in contrast to the previous definition that belongs rather to the analytic approach. At this point, we seem to need no further indication that the classical geometric approach to differential geometry may provide a good mediator between discrete net theory and modern differential geometry. Thus I turn to the main part of the paper: the analysis of this classical definition² and its environment, the "geometric approach" to differential geometry.

Infinitesimals
First, the notion of infinitesimals has to be clarified: even though there exist earlier attempts to explain the notion of infinitesimals [5], a definition satisfactory to most modern mathematicians was found only in the sixties [24]. In the meantime, there exist two conceptually quite different approaches; their equivalence has been proven in [9].
In Robinson's original approach, an ordered field extension ${}^*\mathbb{R}$ of $\mathbb{R}$ is introduced; a number $\varepsilon \in {}^*\mathbb{R}$ is then called "infinitesimal" if its absolute value is smaller than any positive real: $\forall r \in \mathbb{R} \subset {}^*\mathbb{R},\ r > 0 : |\varepsilon| < r$. Besides Robinson's approach via higher-order nonstandard models [24] or the possibility of introducing ${}^*\mathbb{R}$ axiomatically [22], there are two constructive ways of obtaining ${}^*\mathbb{R}$: either by adjoining [17] an ideal element $\varepsilon$ (similar to the way the complex number field can be obtained from the reals) or by considering suitable equivalence classes of real sequences [18] (cf. [10], [16]), similar to the way the reals are constructed from the field of rationals. This possibility to "construct" infinitesimals might be an advantage of this approach to nonstandard analysis.

In Nelson's approach, on the other hand, the underlying field $\mathbb{R}$ of reals remains unchanged; instead, a new concept, "standard", is introduced [19], [23]: every real number (in fact, every mathematical object) is provided with a predicate "standard", or its opposite "nonstandard." An aspect which makes this approach very attractive, in particular with respect to applications, is that the introduction of "standard" can be considered as the introduction of an "ideal scale" (cf. [15]): a number is standard if it is "measurable" or "accessible" with regard to this ideal scale, and it is infinitesimal if it is too small to be measured:

Definition. A real number $\varepsilon \in \mathbb{R}$ is called infinitesimal if its absolute value is smaller than any positive standard number: $\varepsilon \simeq 0 :\Leftrightarrow \forall^{s} r > 0 : |\varepsilon| < r$; also: $x, y \in \mathbb{R}$ are called infinitely close ($x \simeq y$) if $x - y \simeq 0$, $x \in \mathbb{R}$ is called finite if $|x| < r$ for some standard $r \in \mathbb{R}$, $y \in \mathbb{R}$ is called infinite ($y \simeq \infty$) if $|y| \geq r$ for all standard $r \in \mathbb{R}$.
The use of the attribute "standard" is ruled by three axioms³:

(I) Idealization: $\forall^{s\,\mathrm{fin}} F \; \exists x_F \; \forall y \in F : R(x_F, y) \iff \exists x \; \forall^{s} y : R(x, y)$, i.e., a (classical) relation $R$ satisfies a domination property for all standard elements if and only if it satisfies this domination property for every finite standard subset. If, for example, $R$ is the relation "<" on the positive reals $\mathbb{R}^+$, then idealization provides the existence of infinitesimals [23]. "The intuition behind (I) is that we can only fix a finite number of objects at a time. To say that there is a y such that for all fixed x we have R is the same as saying that for any fixed finite set of x's there is a y such that R holds for all of them."

(S) Standardization: $\exists^{s} A_P \; \forall^{s} x : [x \in A_P \iff x \in X \wedge P(x)]$, i.e., for any property $P$ (classical or not) there exists a standard subset $A_P$ having for standard elements precisely those satisfying $P$. This axiom reflects the fact that the attribute "standard" can usually not be used to form (regular) subsets⁴. "The intuition behind (S) is that if we have a fixed set, then we specify a fixed subset of it by giving a criterion for judging whether each fixed element is a member of it or not."

(T) Transfer: $\forall^{s} x : F(x) \Rightarrow \forall x : F(x)$, i.e., a (classical) formula $F$ holds for all $x$ if and only if it holds for all standard $x$. This axiom implies that any standard set or standard function is uniquely determined by its standard elements, resp., by its behavior on standard arguments; moreover, it implies that any object which is describable without using the attribute "standard" is standard⁵. "The intuition behind (T) is that if something is true for a fixed, but arbitrary x, then it is true for all x."

Using transfer (sum and product of two standard reals is standard, etc.) it is now possible to prove the obvious rules: sum and product of two infinitesimals is infinitesimal, sum of an infinitesimal and a finite number is finite, the product of an infinitesimal and a finite number, . . . . Moreover,

Lemma (standard part). If $x \in \mathbb{R}$ is finite, then there exists a unique standard real ${}^{s}x \in \mathbb{R}$ (the "standard part" ${}^{s}x = \mathrm{st}(x)$ of $x$) with $x \simeq {}^{s}x$.

The proof nicely demonstrates the use of the axioms: to show uniqueness, let $s_1 \simeq x \simeq s_2$ with $s_1$ and $s_2$ standard; then, $s_2 - s_1 \simeq 0$ and standard (by transfer), consequently $s_2 - s_1 = 0$. Existence follows from the completeness⁶ of $\mathbb{R}$. Let $A_x$ be the standard set containing all standard $y \leq x$ (standardization). $A_x$ is bounded (because $x$ is finite) and hence ${}^{s}x := \sup A_x$ exists; ${}^{s}x$ is standard (transfer), and $x - {}^{s}x \simeq 0$ by construction.

Differentiability

In nonstandard analysis, as in standard analysis, slopes of secant lines can be used to find the derivative of a function, the slope of its tangent line:

Definition. A function $f$ is called differentiable⁷ at a standard point $x \in \mathbb{R}$ if the slopes of secant lines through infinitely close points $y, z \simeq x$ are all infinitely close to one standard number $m_x \simeq (f(y) - f(z))/(y - z)$.

As usual, differentiability at a (standard) point $x$ implies continuity: $f(y) - f(x) \simeq m_x (y - x) \simeq 0$ as soon as $y \simeq x$. Note that (so far) differentiability and continuity are only defined at standard points; for example, if $x \simeq \infty$, then $x + 1/x \simeq x$ but $(x + 1/x)^2 \not\simeq x^2$, so $f(x) = x^2$ cannot even be proven continuous at infinite points (cf. [23]). Another simple investigation shows that the function $f(x) = x^2 \sin 1/x$, which is usually given as an example of a differentiable function whose derivative is not continuous, is not differentiable at $x = 0$ in the sense of the above definition. As we will see, differentiability as defined above is close to $C^1$, continuous differentiability. Here, a problem arises: the above definition works only at standard points and consequently does not provide a derivative, a function which could be checked for continuity. There are two possibilities to solve the problem: it can be shown that the assignment $x \mapsto m_x$ extends to a (standard) function [23]; or,

Definition. A function $f$ is called differentiable on $(a, b)$ if there is a standard function $f' : (a, b) \to \mathbb{R}$, the derivative of $f$, such that for all $x \not\simeq a, b$ and $dx \simeq 0$, $f(x + dx) = f(x) + f'(x)\,dx + \varepsilon\,dx$ with $\varepsilon \simeq 0$.

Now, this definition has to be related to the previous one: first, transfer implies that the derivative $f'$ of $f$ is unique and takes standard values on standard arguments, since $f'$ is required to be standard. Moreover, by interchanging the roles of $x$ and $x + dx$, it is easily seen that $f'(z) \simeq f'(x)$ whenever $z \simeq x$, i.e., $f'$ is continuous. Together with $f(y) - f(z) = f'(x)(y - z) + [f'(z) - f'(x) + \varepsilon](y - z)$ for $y, z \simeq x$ this shows that, at every standard $x \in (a, b)$ of an interval where $f'$ exists, $f$ is differentiable in the sense of the first definition with $m_x = f'(x)$.

In a completely analogous way, using higher-order Taylor polynomials, higher-order derivatives and differentiability on an interval can be introduced [23]. In this context, analyticity very nicely arises as a stronger form of infinite differentiability: here, the error of an infinite Taylor polynomial $\varepsilon \simeq 0$ on a finite range $|dx| < r$.

Still, the points $x \simeq a, b$ infinitely close to the end points of the domain interval, in particular infinite points $x \simeq \infty$, are excluded in the first-order Taylor formula defining the derivative. When considering a function $f$ as a planar curve (by identifying it with its graph), it becomes clear that this restriction depends on the special choice of coordinate system in the plane; it becomes unnecessary in the definition of a submanifold:

Definition. A standard subset $M^m \subset \mathbb{R}^n$ is called an $m$-dimensional $C^1$-submanifold if there exists a standard tangent plane map $T : M \to G(m, n)$ into the Grassmannian of $m$-planes such that for every point $p \in M$:
(a) $p$ lies on its tangent plane, $p \in T_pM$;
(b) the orthogonal projection $\pi_p : M \to T_pM$ is an infinitesimal bijection⁸;
(c) the angle between $T_pM$ and the secant line through $p$ and any infinitely close point $q \in M$, $q \simeq p$, is infinitesimal: $\frac{|q - \pi_p(q)|}{|q - p|} \simeq 0$.

Part (b) ensures that $M$ has dimension $m$; parts (a) and (c) say that $M$ has first-order contact with its tangent planes. This definition can be shown to be equivalent to the existence of a $C^1$-atlas on $M$ [28]. Since we are going to study curvature properties, we will soon need to assume higher-order differentiability. As higher-order differentiability of functions can be introduced via the existence of higher-order approximating Taylor polynomials, I define smooth ($C^2$) submanifolds through the existence of osculating paraboloids at each point (cf. [14]):

Definition. A $C^1$-manifold $M^m \subset \mathbb{R}^n$ is called smooth ($C^2$-manifold) if there is a standard map which assigns to each point $p \in M$ an osculating paraboloid $Q_p^m$, i.e., a paraboloid $Q_p$ with vertex $p$ which has second-order contact with $M$: $\frac{\mathrm{dist}(q, Q_p)}{|q - p|^2} \simeq 0$ for any $q \in M$, $q \simeq p$.

Figure 4. Geometry of a Curve.

² At this point, I would like to thank Konrad Voss for repeatedly asking me for the "exact" meaning of this definition. This paper may be considered a comprehensive answer to his question.
³ The quotes are taken from Nelson's unpublished book [20]. At this point, I would like to thank Edward Nelson for helpful discussions.
⁴ For example, if one tries to form the subset $\{x \in \mathbb{R} \mid x \simeq 0\}$, the "halo of 0", one deduces that this set cannot have a least upper bound, $r$, even though it is bounded above: both assumptions, $r \simeq 0$ and $r \not\simeq 0$, lead to contradictions. Thus, the halo is not a set, or, at least, it is not a usual set. Note that this behavior agrees with the idea of an "ideal scale."
⁵ This statement might seem to contradict the existence of infinitesimals; it reflects the fact that we deal with an ideal scale. In fact, applying idealization to the set $\{A \subset \mathbb{R} \mid A \text{ is finite}\}$ of all finite subsets of $\mathbb{R}$ and the relation "$\ni$", it can be proven that all standard elements are contained in a (nonstandard) finite set: in his lifetime, any mathematician can only describe finitely many mathematical objects (explicitly).
⁶ Actually, the existence of the standard part is equivalent to the completeness of $\mathbb{R}$.
⁷ In the literature, this is often called "strong differentiability"; see the discussion below.
⁸ This means: if $p \simeq q_1, q_2 \in M$ and $\pi_p(q_1) = \pi_p(q_2)$ then $q_1 = q_2$ (infinitesimal injection), and if $p \simeq q' \in T_pM$ then there is a $q \simeq p$ in $M$ with $q' = \pi_p(q)$ (infinitesimal surjection). The first of these two conditions can be proven using part (c) of the definition.
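As a worked check of the pointwise definition of differentiability in the section on differentiability, the following sketch computes secant slopes for $f(x) = x^2$ at a standard point and for $f(x) = x^2 \sin(1/x)$ at $0$. It assumes an infinite (nonstandard) positive integer $N$, whose existence follows from idealization; the particular points $y$ and $z$ are choices made here for illustration and do not come from the text.

```latex
% Case 1: f(x) = x^2 at a standard point x, with y \simeq x, z \simeq x, y \neq z.
\begin{align*}
\frac{f(y) - f(z)}{y - z} &= y + z \simeq 2x,
  &&\text{so } m_x = 2x .
\end{align*}

% Case 2: f(x) = x^2 \sin(1/x), f(0) = 0, at x = 0, with N an infinite positive integer.
\begin{align*}
y &= \frac{1}{2\pi N + \pi/2}, \qquad z = \frac{1}{2\pi N - \pi/2},
  \qquad y \simeq 0 \simeq z, \\
f(y) &= y^2, \qquad f(z) = -z^2, \qquad y - z = -\pi\, y z, \\
\frac{f(y) - f(z)}{y - z}
  &= -\frac{1}{\pi}\left(\frac{y}{z} + \frac{z}{y}\right) \simeq -\frac{2}{\pi},
  \qquad\text{whereas}\qquad
  \frac{f(y) - f(0)}{y - 0} = y \simeq 0 .
\end{align*}
```

In the second case two secant lines through points infinitely close to $0$ have slopes that are not infinitely close to each other, so no single standard number $m_0$ can serve, in agreement with the remark that $x^2 \sin(1/x)$ is not differentiable at $0$ in this strong sense.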
Infinitesimal Geometry
In order to get acquainted with the types of arguments used in the classical literature, and their modifications related to the modern notion of infinitesimals, let us first discuss the notion of curvature of planar curves and surfaces. This might also be of interest to those readers curious about the origins of the notion of curvature.

Some of the classical authors directly consider the infinitesimal angle $d\vartheta$ between the tangent lines (or, the normals) at two infinitely close points $p, q \in C$ of a curve $C \subset \mathbb{R}^2$ as the curvature⁹ of $C$ between $p$ and $q$. Generically, the two normal lines of $C$ at $p$ and $q$ will intersect in a point $m$ (Fig. 4). Elementary trigonometry yields $r_p = \frac{ds \cos d\vartheta_q}{\sin d\vartheta}$; and consequently, $r_p \simeq \frac{ds}{d\vartheta}$ since $\cos d\vartheta \simeq 1$ as well as $(1/d\vartheta)(\sin d\vartheta) \simeq 1$ for infinitesimal $d\vartheta$. This is how the curvature at a standard point $p \in C$ is obtained in modern differential geometry: $\kappa_p = \mathrm{st}\left(\frac{d\vartheta}{ds}\right)$, i.e., the angle $d\vartheta$ between the tangent lines is measured with respect to the arc length $ds = |dp|$. However, in order for $\kappa_p$ to be well defined, we have to assume higher-order differentiability (cf. [28]): otherwise, $\kappa_p$ might not be independent of the second point $q$, or, worse, $r_p$ could even be infinitesimal, in which case there would be no standard part for the infinite number $1/r_p$.

Clearly, $0 = dp + r_q n_q - r_p n_p$. But $\frac{r_q - r_p}{ds}$ is infinitesimal, since $\frac{r_q - r_p}{(d\vartheta_q - d\vartheta_p)\,ds} = \frac{\cos d\vartheta_p - \cos d\vartheta_q}{(d\vartheta_q - d\vartheta_p) \sin d\vartheta} \simeq \frac{1}{2}$; then scalar multiplication with $dp$ and division by $ds^2 = dp^2$ provides us with $0 \simeq 1 + r_p \frac{dn \cdot dp}{dp^2}$. This is the formula that is commonly used to introduce the curvature:

$$\kappa_p = -\mathrm{st}\left(\frac{dn \cdot dp}{dp^2}\right).$$

One possible way to measure the curvature of a surface $S \subset \mathbb{R}^3$ at a point $p \in S$ is by means of its normal curvatures: restricting our attention to one of the normal planes $N$ of $S$ at $p$, the normal curvature¹⁰ $\kappa_{p,dp}$ of $S$ at $(p, dp)$ is just the curvature of the intersection curve $C = N \cap S$ between $p$ and $q = p + dp \in S$ (see Fig. 5). Note that here, in contrast to the case of planar curves, the angle $d\vartheta$ between the tangents of $C$ at $p$ and $q$ cannot be treated symmetrically in $p$ and $q$: the corresponding normal planes of $S$ at $p$ and $q$ will only coincide if the normal lines $\mathbb{R} n_p$ and $\mathbb{R} n_q$ intersect (or are parallel). In that case, the direction $dp/|dp|$ is infinitely close to a principal direction of $S$ at $p$; in fact, if $S$ is smooth and $p$ is a standard point, the principal curvature directions of $S$ can be introduced this way. Another observation is that the normal lines of $S$ at two infinitely close points $p$ and $q$ intersect iff the intersection line of the two tangent planes $T_pS$ and $T_qS$ is perpendicular to $dp = q - p$ (see Fig. 6), i.e., the principal directions of $S$ at a (nonumbilic) point $p$ are exactly those directions which are perpendicular to their conjugate directions¹¹ (cf. [26]). Moreover, $Q_p$ denoting the osculating paraboloid at $p$, the principal directions of $S$ at $p$ can be shown to be exactly the principal axes of the Dupin indicatrix $\{\pi_p(q) \mid q \in Q_p,\ q \cdot n_p = \pm 1\}$, the indicatrix's orthogonal conjugate axes [14].

Figure 5. Curvature of a Surface.

Figure 6. Conjugate Directions.

Our purpose is the study of certain curve nets on surfaces. Peterson [21] introduces this notion of a curve net on a surface as follows:

Definition. Curves lying infinitely close side by side on a surface, form a curve system: two systems of crossing curves form a curve net.

As any 1-form on a (2-dimensional) surface $S$ has an integrating factor, any curve net on $S$ gives rise to a (generically regular) coordinate system $(u, v) : S \to \mathbb{R}^2$, such that the curves of the two systems are given as the level-curves $u = \mathrm{const}$ and $v = \mathrm{const}$, respectively.

On the other hand, we may describe the geometry of a (smooth) surface relative to a (smooth) coordinate system. Since $\frac{dp}{du} \simeq p_u$ and $\frac{dp}{dv} \simeq p_v$ (the Gaussian basis fields) along the net curves $v = \mathrm{const}$ and $u = \mathrm{const}$, we can use $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2$, where $E = p_u^2$, $F = p_u \cdot p_v$, and $G = p_v^2$ denote the first fundamental quantities, as a good description for the arc length between two infinitely close points: $\frac{ds}{|dp|} \simeq 1$. With the second fundamental quantities, $L = -p_u \cdot n_u$, $M = -p_u \cdot n_v$, and $N = -p_v \cdot n_v$, we also can express the normal curvature of $S$ at $p, dp$:

$$\kappa_{p,dp} \simeq -\frac{dn \cdot dp}{dp^2} \simeq \frac{1}{ds^2}\left(L\,du^2 + 2M\,du\,dv + N\,dv^2\right).$$

As indicated earlier, the conjugate direction of a direction $dp/ds$ at a point $p$ is (nearly) parallel to the intersection line $\mathbb{R}[n \times (n + dn)]$ of the tangent planes $T_pS$ and $T_{p+dp}S$ at infinitely close points (Fig. 6). For the net directions $p_u$ and $p_v$ this reads $0 \simeq (n \times n_v) \times p_u = M \cdot n$. Thus, by transfer, the net directions $p_u$ and $p_v$ at a standard point are conjugate iff $M = 0$; or (transfer, again):

Definition. A curve net $(u, v) : S \to \mathbb{R}^2$ is called
(i) a conjugate net if its curves intersect at each point in conjugate directions, i.e., if $M = 0$, and
(ii) a curvature line net if its curves intersect in orthogonal conjugate directions, i.e., if $F = M = 0$.

Note that these notions are independent of the (regular) coordinate system $(u, v)$ used to describe the curve net.

Now, the classical authors prove that a conjugate net divides a surface $S$ into (nearly) planar infinitesimal quadrilaterals by the following argument: Two opposite edges of an infinitesimal net quadrilateral are both (nearly) parallel to the intersection line of the two tangent planes that (nearly) contain the edges. Thus, the quadrilateral is (nearly) a parallelogram, and, in particular, planar. Because the curvature line net of a surface is orthogonal and conjugate at the same time, it clearly divides the surface (nearly) into infinitesimal rectangles.

To a modern mathematician, it might not be that clear what "nearly" in the above argumentation means. Let us try to clarify this notion by calculating¹² the radius $r$ of the sphere containing the vertices

$$p_1 = p - [p_u\,du + p_v\,dv] + \frac{p_{uu}\,du^2 + 2p_{uv}\,du\,dv + p_{vv}\,dv^2}{2} + \varepsilon_1\,ds^2$$
$$p_2 = p + [p_u\,du - p_v\,dv] + \frac{p_{uu}\,du^2 - 2p_{uv}\,du\,dv + p_{vv}\,dv^2}{2} + \varepsilon_2\,ds^2$$
$$p_3 = p + [p_u\,du + p_v\,dv] + \frac{p_{uu}\,du^2 + 2p_{uv}\,du\,dv + p_{vv}\,dv^2}{2} + \varepsilon_3\,ds^2$$
$$p_4 = p - [p_u\,du - p_v\,dv] + \frac{p_{uu}\,du^2 - 2p_{uv}\,du\,dv + p_{vv}\,dv^2}{2} + \varepsilon_4\,ds^2$$

of an infinitesimal net quadrilateral, with $\varepsilon_i \simeq 0$ and $\frac{du}{dv} \not\simeq 0, \infty$: we find $\frac{1}{r} \simeq \frac{M}{F}$, i.e., the sphere has infinitesimal curvature iff $M \simeq 0$ (and $F \not\simeq 0$) and, consequently, is "nearly" a plane. Unfortunately, in case that $F \simeq 0$ we can only conclude $M \simeq 0$ if the sphere is nearly a plane. But, in that case, the cross-ratio $R$ of the net quadrilateral becomes $R \simeq -\frac{E\,du^2}{G\,dv^2}$. In particular, its imaginary part is infinitesimal¹³, showing that the vertices of the net quadrilateral are "nearly" concircular. At this point, the willing reader might be convinced of the following

Theorem. A (smooth) net $(u, v) : S \to \mathbb{R}^2$ is a conjugate net iff all infinitesimal net quadrilaterals are (nearly) planar, and it is the surface's curvature line net iff it divides the surface into infinitesimal rectangles.

The classical characterization of isothermic surfaces is a

Corollary. A surface $S \subset \mathbb{R}^3$ is isothermic, i.e., allows conformal curvature line coordinates $(u, v) : S \to \mathbb{R}^2$, if and only if it is divisible into infinitesimal squares by its curvature line net.

Figure 7. A Darboux transform of the sphere.

Conclusions

This corollary is the goal: to understand the classical definition of isothermic surfaces as a part of the "geometric approach," and to get some sketches of this classical approach to differential geometry: to see how it differs from the modern approach and how it can be given some foundation using nonstandard analysis. Let me close with some remarks on transformations of isothermic surfaces and discrete isothermic nets¹⁴:

Assume there is a point-to-point correspondence between two surfaces $S_1$ and $S_2$ which preserves angles and the curvature line net. If corresponding points can be joined by circles that intersect both surfaces orthogonally (i.e., $S_1$ and $S_2$ envelop a sphere congruence), then, generically, both surfaces are isothermic [8], and they are said to form a "Darboux pair" (one is a Darboux transform of the other). Similarly, if the surfaces have parallel tangent planes in corresponding points, both surfaces are generically isothermic [7], and they form a "Christoffel pair". Darboux pairs of isothermic surfaces naturally arise in 1-parameter families [4] (which fact gives rise to the integrable system description). When this parameter becomes infinitesimal, each surface $S_i$ is in the halo of a point $p_i$, $p_1 \not\simeq p_2$. Sending one of the points to $\infty$ by a Möbius transformation, the circles which join corresponding points on $S_1$ and $S_2$ become nearly straight lines; thus, the surfaces form (nearly) a Christoffel pair. The two surfaces are no longer standard, though: one surface, lying in the halo of a finite point, is infinitesimal and the other is infinite.

Figure 8. Another Darboux transform of the sphere.

The Darboux transforms of an isothermic surface can be obtained as solutions of a system of Riccati type partial differential equations [11]. Reinterpreting these equations in terms of the "geometric method," they can be considered as conditions for the cross-ratios of those quadrilaterals formed by two infinitely close points on a curvature line of one surface and their corresponding points on the other surface. This was the starting observation which led to the definition of the Darboux transformation for discrete isothermic nets [12].

Except for the left picture in Figure 1 (which shows a smooth Darboux transform, and which took about 70 times longer to calculate than its discrete cousin), all pictures in this article show discrete isothermic nets that were obtained as Darboux transforms of a discrete isothermic net on the 2-sphere.

In the smooth case, Darboux transforms of a spherical net are surfaces of constant mean curvature 1 in a hyperbolic space [13]. One might be led to use this characterization as a definition in the discrete case. Similar approaches already proved useful for discrete minimal [1] and discrete cmc nets [12] in Euclidean space ("cmc" stands for "constant mean curvature"). Then, all shown surfaces would be cmc-1 nets in hyperbolic space, its infinity sphere being pretty well indicated by the net's end behavior in all figures. These ideas were elaborated further in a 1999 preprint "Transformations of discrete isothermic nets and discrete cmc-1 surfaces in hyperbolic space" by the author.

I have tried to show why I believe that the classical "geometric approach" to differential geometry can not only be useful in helping to understand the relations between the discrete and smooth theories, but it may also stimulate further results in either field.

⁹ Peterson [21] explicitly prefers that definition over the usual one, which will be considered in a moment. Note that this notion of curvature does not require higher-order differentiability.
¹⁰ Again, the normal curvature can be introduced without assuming smoothness, but, once $S$ is smooth, $\kappa_{p,dp}$ will only depend on the direction of $dp$, not on the actual second point $q = p + dp$.
¹¹ The notion of conjugate directions will be discussed more comprehensively in a moment.
¹² The center of a sphere containing three points $p_1$, $p_2$ and $p_3$ in general position and $0 \in \mathbb{R}^3$ is $m = \frac{1}{2}\,\frac{|p_1|^2\, p_2 \times p_3 + |p_2|^2\, p_3 \times p_1 + |p_3|^2\, p_1 \times p_2}{\det(p_1, p_2, p_3)}$.
¹³ In fact, - Im R ds = M G 2 - du2 G dv2 (cf. [1]).
¹⁴ These transformations seem also to play a crucial role for the existence of an analogous discrete geometry (cf. [2], [3]).
Acknowledgments

Partially supported by the Alexander von Humboldt Foundation and NSF Grant DMS93-12087.

REFERENCES

1. A. Bobenko, U. Pinkall: Discrete Isothermic Surfaces; J. reine angew. Math. 475 (1996) 187-208
2. A. Bobenko, U. Pinkall: Discretization of surfaces and integrable systems; in A. Bobenko, R. Seiler, Discrete Integrable Geometry and Physics, Oxford Univ. Press, Oxford 1999
3. A. Bobenko, U. Hertrich-Jeromin: Orthogonal nets and Clifford algebras; preprint 1998
4. F. Burstall, U. Hertrich-Jeromin, F. Pedit, U. Pinkall: Isothermic surfaces and curved flats; Math. Z. 225 (1997) 199-299
5. L. Carnot: Réflexions sur la Métaphysique du Calcul Infinitésimal; Gauthier Villars, Paris 1921
6. A. Cayley: On the surfaces divisible into squares by their curves of curvature; Proc. London Math. Soc. IV (1871) 8-9
7. E. Christoffel: Ueber einige allgemeine Eigenschaften der Minimumsflächen; J. reine angew. Math. 67 (1867) 218-228
8. G. Darboux: Sur les surfaces isothermiques; Comptes Rendus 122 (1899) 1299-1305, 1483-1487, 1538
9. F. Diener, K. Stroyan: Syntactical methods in infinitesimal analysis; in N. Cutland, Nonstandard Analysis and its Applications, London Math. Soc. Student Texts 10, Cambridge Univ. Press, Cambridge 1988
10. J. Henle: Non-nonstandard analysis: Real infinitesimals; Math. Intell. 21:1 (1999) 67-73
11. U. Hertrich-Jeromin, F. Pedit: Remarks on the Darboux transform of isothermic surfaces; Doc. Math. J. DMV 2 (1997) 313-333
12. U. Hertrich-Jeromin, T. Hoffmann, U. Pinkall: A discrete version of the Darboux transform for isothermic surfaces; in A. Bobenko, R. Seiler, Discrete Integrable Geometry and Physics, Oxford Univ. Press, Oxford 1999
13. U. Hertrich-Jeromin, E. Musso, L. Nicolodi: Möbius geometry of surfaces of constant mean curvature 1 in hyperbolic space; preprint 1998
14. F. Joachimsthal: Anwendung der Differential- und Integralrechnung auf die allgemeine Theorie der Flächen und der Linien doppelter Krümmung; Teubner, Leipzig 1890
15. F. Klein: Anwendung der Differential- und Integralrechnung auf Geometrie: Eine Revision der Principien; Teubner, Leipzig 1907
16. D. Laugwitz, C. Schmieden: Eine Erweiterung der Infinitesimalrechnung; Math. Z. 69 (1958) 1-39
17. D. Laugwitz: Zahlen und Kontinuum; Bibliographisches Institut, Zürich 1986
18. T. Lindstrøm: An invitation to Nonstandard Analysis; in N. Cutland, Nonstandard Analysis and its Applications, London Math. Soc. Student Texts 10, Cambridge Univ. Press, Cambridge 1988
19. E. Nelson: Internal set theory: A new approach to Nonstandard Analysis; Bull. Amer. Math. Soc. 83 (1977) 1165-1198
20. E. Nelson: Internal Set theory; unpublished
21. K. Peterson: Ueber Curven und Flächen; A. Lang's Buchhandlung, Moscow 1868
22. M. Richter: Ideale Punkte, Monaden und Nichtstandardmethoden; Vieweg, Wiesbaden 1982
23. A. Robert: Nonstandard Analysis; Wiley, New York 1988
24. A. Robinson: Nonstandard Analysis; North-Holland, Amsterdam 1966
25. R. Sauer: Differenzengeometrie; Springer, Berlin 1970
26. G. Scheffers: Anwendung der Differential- und Integralrechnung auf Geometrie; Band II: Einführung in die Theorie der Flächen; Verlag von Veit & Comp., Leipzig 1902
27. K. Stroyan, W. Luxemburg: Introduction to the Theory of Infinitesimals; Academic Press, New York 1976
28. K. Stroyan: Infinitesimal analysis of curves and surfaces; in J. Barwise, Handbook of Mathematical Logic, North-Holland, Amsterdam 1977

AUTHOR

UDO HERTRICH-JEROMIN
Department of Mathematics, Sekr. MA 8-3
Technische Universität Berlin
Strasse des 17 Juni 136
D-10623 Berlin
Germany
e-mail: udo@sfb288.math.tu-berlin.de

Udo Jeromin was born and raised in Berlin (then, West Berlin). He completed his doctorate at the Technische Universität there in 1994. He spent three years in visiting positions at the University of Massachusetts Amherst (where this paper was written) and the ETH in Zürich, before returning to the TU. He is primarily a differential geometer with interest in discrete nets; his work on non-standard analysis arose in that connection, as this article recounts. Much though he enjoyed the hobby of classical music in his youth, he pays more attention nowadays to flying kites.
VOLUME 22, NUMBER 2, 2000
Mathematical Communities
The Kovalevskaia Fund

Ann Hibner Koblitz and Neal Koblitz

Marjorie Senechal, Editor

This column is a forum for discussion of mathematical communities throughout the world, and through all time. Our definition of "mathematical community" is the broadest. We include "schools" of mathematics, circles of correspondence, mathematical societies, student organizations, and informal communities of cardinality greater than one. What we say about the communities is just as unrestricted. We welcome contributions from mathematicians of all kinds and in all places, and also from scientists, historians, anthropologists, and others.

Please send all submissions to the Mathematical Communities Editor, Marjorie Senechal, Department of Mathematics, Smith College, Northampton, MA 01063, USA; e-mail: [email protected]

It was August 1995, and we were in Hanoi for a celebration of the 10th anniversary of the Kovalevskaia Fund. As part of the festivities, the Vietnam Women's Union, supported by the Fund, had brought to Hanoi 15 talented women undergraduate students from different parts of Vietnam in order to learn about scientific careers. We joined the group at the Hanoi Mathematical Institute, where the director, Hoang Tuy,* was talking with them about why a young person should pursue a career that she/he truly loves, such as scientific research, rather than one that might be more lucrative or more favored by friends and relatives. The young women were spellbound; most likely none of them had ever heard an eminent scientist speak from the heart in such a manner. Seeing this interaction between the students and a leading scientist, we once again experienced the satisfaction that comes from our travels for the Kovalevskaia Fund.

In 1983 Ann published a biography of the Russian mathematician, socialist, and feminist Sofia Kovalevskaia. We decided to use the royalties from sales of the book for a project that would honor Kovalevskaia's memory. We initiated the Kovalevskaia Fund as a subcommittee of the U.S. Committee for Scientific Cooperation with Vietnam, an organization in which we had been active for several years. But because we wanted to be involved in other countries besides Vietnam, in 1985 we set up the Fund as an independent tax-exempt foundation. Its purpose is to encourage women in science and technology in developing countries.

The Fund has supported a number of activities, such as prizes for women scientists in various countries, scholarships for women students, and occasional conferences. We publish a Newsletter (in English and Spanish) that has readers in over 100 countries. In the area of mathematics most of the Fund's activities are in secondary and undergraduate education.

• The Sadosky Prize was started in 1997 to reward and encourage girls who do well in Vietnam's national mathematical olympiads. The prize honors the memory of Cora Ratto de Sadosky (1912-1981), who inspired and nurtured scores of students at the University of Buenos Aires. It is supported by her daughter, Cora Sadosky, and other family members.
• For the last five years the Kovalevskaia Fund, with support from mathematicians Dirk and Rebekka Struik, has been giving a scholarship at the Gauss School in Peru in memory of the mathematician Ruth Ramler Struik (1894-1993). A low-cost private school in a working-class suburb of Lima, the Gauss School has, as its name implies, a special focus on mathematics.
• Starting last year, the Kovalevskaia Fund has been sponsoring an annual scholarship at the University of the Western Cape (UWC) in Cape Town, South Africa. The scholarship pays for a year's tuition and fees for a woman student in the mathematical sciences. The first Kovalevskaia Scholar is Mobone Prescilla Mamabolo, a third-year undergraduate majoring in mathematics and mathematical statistics.
• The Kovalevskaia Fund has arranged with the Peruvian committee on math olympiads to pay half of the airfare for girls who represent Peru in the regional and international math olympiads. The first young woman to receive this support, Liz Vivaz, won a series of medals in various competitions and is now a top math student at the Catholic University in Lima.

*A lengthy interview with this remarkable intellectual appeared in 1990 in The Mathematical Intelligencer, Vol. 12, No. 3.
THE MATHEMATICAL INTELLIGENCER © 2000 SPRINGER-VERLAG NEW YORK
When we travel in connection with these and other projects, we often visit schools to present math enrichment lessons. This is a great way to meet people and get beyond the formality that is often a barrier to meaningful international friendship and collaboration. Our experiences with school children in other countries also contribute to our understanding of pedagogical issues. (Unfortunately, most American educators who write about math education have observed classrooms only in the U.S., and so lack a broader perspective.)

One of our most memorable school visits was to an isolated village called Pacaycasa in the Peruvian Andes, an hour or two drive from the provincial city of Ayacucho. We met with middle school-age children, taking them through some graph theory story problems, arithmetic games, and a proof-without-words of the Pythagorean Theorem using geo-boards. We talked with the children in Spanish, which was as much a foreign language for them as for us, their mother tongue being the indigenous language Quechua.

The Pacaycasa children liked the geometry most of all. The teachers pointed out that they could make some geo-boards for themselves, using wood and nails that are available in town. Someone would have to go in to Ayacucho for rubber bands, but in the meantime pieces of string would do well enough. After about two hours we had to cut off our session. Although the children showed no sign of tiring or losing interest, we noticed that they were having difficulty seeing the blackboard. The sun was going down, and the school did not have electricity.

Hanoi middle-school children practice their English with Neal after a lesson on the misuses of statistics.

On another occasion we were visiting a school in Cape Town for Xhosa children (the largest Black African ethnic group in the Cape Province). A Xhosa mathematician at UWC named Loyiso Nongxa helped by translating and expanding upon our explanations of the math enrichment topics. At one point he couldn't help laughing: he had heard one of the children comment that "this American sure speaks good Xhosa!" As he explained to us, the kids found it easier to imagine that a Black American visitor spoke unaccented Xhosa than that a member of their tribal group had become a professional mathematician.

Prize, and her father, the famous General Vo Nguyen Giap. Giap masterminded the French defeat at Dien Bien Phu in 1954 and the Tet Offensive against the Americans in 1968.

One of the most rewarding aspects of our travels is the opportunity to get to know some remarkable women. Several such friendships have grown out of Ann's participation in three congresses of the Third World Organization of Women in Science (in Trieste in 1988, Cairo in 1993, and Cape Town in 1999). One example out of many is the Ghanaian physicist Aba Andam. After receiving her Ph.D. in nuclear physics in Great Britain (where she was the only woman in her department), she returned to Ghana to find that her theoretical training had little relevance to the problems of her country. She re-tooled and, along with her students, has been going around Ghana taking baseline measurements of pollutants such as radon gas in mines, dormitories, public buildings, and villages.

We combine our Kovalevskaia Fund activities with other interests when we travel: talking with mathematical colleagues (Neal), gathering data on gender issues (Ann), and snorkeling (both of us). For instance, in 1997 in Malawi, in addition to academic activities, we spent a few days on Lake Malawi, which probably offers the best freshwater snorkeling in the world.
Girls at a math enrichment lesson in Zomba, Malawi. They are fortunate to be in school: because of increasing fees combined with sexist attitudes, more and more Malawian parents are cutting short their daughters' education.
Sometimes even the non-recreational part of our travels is so interesting that between ourselves we speak of the "Kovalevskaia Fun," omitting the final d. For example, in 1989 we visited Phnom Penh at the invitation of the Cambodian delegates to a women-in-science meeting the Fund had sponsored in Hanoi in 1987. Our visit happened to coincide with a gala celebration of the 10th anniversary of the ouster of Pol Pot, and we turned out to be almost the only Western "representatives" in town for the occasion. We were treated as if the Kovalevskaia Fund were a small country. The Prime Minister greeted Ann (our "chief of delegation") on the tarmac when we arrived at the airport.
During a break at a Kovalevskaia Fund-sponsored Women in Science conference in Peru, Liz Vivas explains to Neal her solution that won her the highest score in a regional math olympiad.
The Ministry of Foreign Affairs put us in a hotel with the diplomatic delegations from Angola, Nicaragua, and Mozambique. The morning of the main festivities we were on the reviewing platform for an extravagant parade: we had the best possible seats for the world-famous Khmer dancers.

Perhaps the most surreal moment occurred during a group excursion to a crocodile farm on the outskirts of Phnom Penh. Mongolia's ambassador to Indochina mentioned to us that it was unfortunate that the Mongolians had not been invited to send someone to the Hanoi conference of women scientists; the slight was still rankling him two years later. In her politest Russian (which was the common language she had with him), Ann assured him that, as far as the Kovalevskaia Fund was concerned, we had fully expected that they would participate. But invitations to the socialist countries had been the responsibility of the Vietnamese. So it was the Vietnamese, not the Kovalevskaia Fund, who had unintentionally snubbed the Mongolians. Ann happened to have brought a spare copy of the Proceedings of the conference, which she gave to the Ambassador as an expression of our good will. This "diplomatic" conversation was taking place as we were all making the rounds of the crocodiles.

The Kovalevskaia Fund has had a high profile in Vietnam since 1987, when our women-in-science conference was featured on the national TV news and the participants were brought to the Presidential Palace to meet with the legendary Pham Van Dong, who was then Prime Minister. The Kovalevskaia Prizes are well known throughout the country. In fact, if a name-recognition poll were conducted in Vietnam for all foundations, the Kovalevskaia Fund would easily beat out the Ford and Rockefeller Foundations.

Not everyone agrees with everything the Kovalevskaia Fund does. Some have said, for example, that it's unfair to offer a prize only for women; we should give another prize for men. Our response to this has been that the Kovalevskaia Fund's purpose is to support women in science, but that this does not preclude offering a prize for men. Such a prize could be for the husband of a scientist who is most supportive of his wife's career. A local women's group could organize the competition, which would include a cooking and cleaning contest for the men. Unfortunately, when we suggested this possibility in Peru, no one wanted to take us up on it.

All of our projects depend upon the enthusiasm and dedication of the local people, who have the burden (with no remuneration from us) of organizing and administering them. Thus, an idea that might appeal to our fancy but does not seem appropriate to them (e.g., prizes for househusbands of scientists) will never be attempted. We do not impose an alien agenda on anyone.

Since we rely on a large commitment of time and energy by local people, occasionally we have had to discontinue projects in a country because of changing conditions there. For example, in 1998 we stopped giving Kovalevskaia Prizes for university researchers in Peru. Several members of the Prize Committee had left the universities or were working at two jobs. The Peruvian government's financial support of the universities had diminished (a consequence of the ideology of "privatization"), and morale had declined among academics. There was simply no one left to organize the Prize competition.

There are occasional setbacks, and even in the best of circumstances the accomplishments of the Fund are hard to pin down. There's an intangible aspect of what the Fund does in boosting the morale of women scientists who often work in extremely unfavorable conditions and in helping them to feel less isolated.

The amount of money that we and others have invested in the Fund is relatively modest. On our part it's essentially the income we get from miscellaneous sources: royalties from books and Ann's $5000 consultant's fee that the U.N. paid for a cross-national comparison of women in science for the volume World's Women. Even from a purely selfish point of view, it's well worth it: we learn a lot, enjoy ourselves, and get to know some wonderful people in far-flung parts of the world.
AUTHORS

ANN HIBNER KOBLITZ
Department of Women's Studies
Arizona State University
Tempe, AZ 85287
USA
e-mail: [email protected]

NEAL KOBLITZ
Department of Mathematics
University of Washington
Seattle, WA 98195
USA
e-mail: [email protected]

Ann Hibner Koblitz is on the faculty in Women's Studies at Arizona State University and is the author of two books: A Convergence of Lives: Sofia Kovalevskaia (2nd edition, Rutgers University Press, 1993) and Science, Women, and Revolution in Russia (Harwood Academic Publishers, 2000). Neal Koblitz, a professor at the University of Washington, has research interests in number theory and cryptography. He has written five books, the most recent of which is Algebraic Aspects of Cryptography (Springer-Verlag, 1998). Photo by Robert J. Koblitz.
DANIEL W. STROOCK

Doing Analysis by Tossing a Coin

This little note is based on an expository lecture which I gave in January 1999 to M.I.T. undergraduates. My purpose was to provide an elementary example of the way in which probabilistically natural considerations uncover interesting, and occasionally profound, insights into real analysis. In order to keep everything as elementary as possible, the example given was based on coin-tossing. If one is willing to move from the consideration of coins to the study of more complicated objects, like Brownian paths, then one can produce a much richer array of examples.^1

At some point in their education, nearly all mathematicians learn that every monotone function is almost everywhere (in the sense of Lebesgue) differentiable.^2 This reassuring revelation is often followed by the disturbing news that there are non-constant, continuous, monotone functions which are singular in the sense that their derivative vanishes at almost every point.^3 Because they call into question the universal applicability of the Fundamental Theorem of calculus, such functions are usually considered to be pathological. That is, most people suspect that these are not the sort of function on which they are likely to stumble unwittingly. Worse, the simplest example has intervals of constancy whose union has full measure. This can be fixed, but the remedy looks unnatural.

If nothing else, I hope that this note will improve the reputation of continuous, monotone, singular functions. Indeed, it will be shown that such functions arise inevitably whenever one deals with infinitely many tosses of an unfair coin. R.B. Burckel pointed out to me that a similar attempt [B] was made by Patrick Billingsley, who also chose to get at singular functions by tossing a coin.

The Weak Law of Large Numbers

A coin-tossing game in which the coin comes up heads with probability $p$ ($0 < p < 1$) and tails with probability $q = 1 - p$ is modeled mathematically by recording the outcomes as mutually independent random variables $X_1, \dots, X_n, \dots$, where $X_n = 1$ if the $n$th toss results in a heads and $X_n = 0$ if it comes up tails. That is, if $(\epsilon_1, \dots, \epsilon_n)$ is a string of 0's and 1's, then^4

$$\mathbb{P}_p(X_1 = \epsilon_1, \dots, X_n = \epsilon_n) = p^{\sum_{m=1}^n \epsilon_m}\, q^{\,n - \sum_{m=1}^n \epsilon_m}.$$

Next, let $S_n = \sum_{m=1}^n X_m$ be the number of heads among the first $n$ tosses. Since^5 $\mathbb{E}_p[X_m] = 1p + 0q = p$,

$$\mathbb{E}_p[S_n] = \mathbb{E}_p\Bigl[\sum_{m=1}^n X_m\Bigr] = \sum_{m=1}^n \mathbb{E}_p[X_m] = np.$$

^1 See, for example, [D] or [S2].
^2 See, for example, §1.2 in [RN].
^3 The canonical example is the Cantor-Lebesgue function. See, for instance, Exercise 8.2.12 in [S1].
^4 Below, and elsewhere, $\mathbb{P}_p$ is used to denote the probability measure determined by independent tosses of a coin which, on each toss, comes up heads with probability $p$; and, if $\Gamma$ is some event based on such a coin-tossing game, then $\mathbb{P}_p(\Gamma)$ is the probability of $\Gamma$. Thus, in the formula which follows, the left-hand side should be read "the probability that $X_1 = \epsilon_1, X_2 = \epsilon_2, \dots, X_n = \epsilon_n$."
^5 We will use $\mathbb{E}_p$ to denote expectation values which are computed relative to $\mathbb{P}_p$. Recall that if $X$ is a random variable, then its expectation value relative to the probability measure $\mathbb{P}$ is nothing more or less than the integral $\int X \, d\mathbb{P}$. In particular, if $X$ takes on only a countable number of values, then $\mathbb{E}[X] = \sum_x x\, \mathbb{P}(X = x)$.
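As a sanity check on the model just described, the product formula for $\mathbb{P}_p$ and the identity $\mathbb{E}_p[S_n] = np$ can be verified by brute-force enumeration for small $n$. The following Python sketch is an illustration added here, not part of the lecture; the function names `string_probability`, `total_mass`, and `expected_heads` are hypothetical choices.

```python
from itertools import product


def string_probability(eps, p):
    """P_p(X_1 = eps_1, ..., X_n = eps_n) = p^(#ones) * q^(n - #ones)."""
    q = 1.0 - p
    heads = sum(eps)
    return p ** heads * q ** (len(eps) - heads)


def total_mass(n, p):
    """Sum of the probabilities of all 2^n outcome strings (should be 1)."""
    return sum(string_probability(eps, p) for eps in product((0, 1), repeat=n))


def expected_heads(n, p):
    """E_p[S_n], computed directly as a sum over all 2^n outcome strings."""
    return sum(sum(eps) * string_probability(eps, p)
               for eps in product((0, 1), repeat=n))


if __name__ == "__main__":
    n, p = 6, 0.3
    print(total_mass(n, p))      # close to 1
    print(expected_heads(n, p))  # close to n * p
```

The enumeration costs $2^n$ terms, so it is only meant for small $n$; its purpose is to make the meaning of $\mathbb{P}_p$ and $\mathbb{E}_p$ concrete.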
Similarly, if $\bar X_m = X_m - \mathbb{E}_p[X_m] = X_m - p$ and $\bar S_n = \sum_{m=1}^n \bar X_m$, then

$$\mathbb{E}_p[\bar S_n] = 0 \quad\text{and}\quad \mathbb{E}_p[\bar S_n^2] = \sum_{m,m'=1}^n \mathbb{E}_p[\bar X_m \bar X_{m'}].$$

But

$$\mathbb{E}_p[\bar X_m \bar X_{m'}] = \mathbb{E}_p[\bar X_m^2] = q^2 p + p^2 q = pq(p+q) = pq \quad\text{if } m = m',$$

$$\mathbb{E}_p[\bar X_m \bar X_{m'}] = q^2p^2 - 2(pq)^2 + p^2q^2 = 0 \quad\text{if } m \ne m',$$

and so

$$\mathbb{E}_p[\bar S_n] = 0 \quad\text{and}\quad \mathbb{E}_p[\bar S_n^2] = npq = np(1-p) \le \frac{n}{4}. \tag{1.1}$$

Something slightly subtle has happened here. Because $\bar S_n^2$ involves $n^2$ terms, a priori one would expect $\mathbb{E}_p[\bar S_n^2]$ to grow quadratically in $n$. Indeed, this is the case when $\bar S_n^2$ is replaced by $S_n^2$:

$$\mathbb{E}_p[S_n^2] = \mathbb{E}_p[(\bar S_n + np)^2] = \mathbb{E}_p[\bar S_n^2] + 2np\,\mathbb{E}_p[\bar S_n] + (np)^2 = npq + n^2p^2.$$

Of course, what accounts for the difference is the mean-0 property of the $\bar X_m$'s which, together with independence, leads to the vanishing of the off-diagonal terms in the expansion of $\mathbb{E}_p[\bar S_n^2]$. Because, for any $R > 0$,^6

$$\mathbb{E}_p[\bar S_n^2] \ge \mathbb{E}_p[\bar S_n^2,\ |\bar S_n| \ge R] \ge R^2\, \mathbb{P}_p(|\bar S_n| \ge R),$$

(1.1) yields

$$\mathbb{P}_p(|\bar S_n| \ge R) \le \frac{npq}{R^2} \le \frac{n}{4R^2}. \tag{1.2}$$

As an application,

$$\mathbb{P}_p\Bigl(\Bigl|\frac{S_n}{n} - p\Bigr| \ge R\Bigr) = \mathbb{P}_p(|S_n - np| \ge nR) = \mathbb{P}_p(|\bar S_n| \ge nR) \le \frac{1}{4nR^2}. \tag{1.3}$$

That is, with probability at most $(4nR^2)^{-1}$ will $n^{-1}S_n$ (which is the average number of heads) differ from $p$ by at least $R$. The qualitative conclusion that

$$\lim_{n\to\infty} \mathbb{P}_p\Bigl(\Bigl|\frac{S_n}{n} - p\Bigr| \ge \epsilon\Bigr) = 0 \quad\text{for each } \epsilon > 0 \tag{1.4}$$

is called the weak law of large numbers.

A Small but Important Refinement

One should ask whether the estimate in (1.3) is sharp. In particular, is $n^{-1}$ the true rate at which the left hand side tends to 0? In the hope that it will shed light on this question, we consider the expected value of $\bar S_n^4$ instead of $\bar S_n^2$. Clearly,

$$\mathbb{E}_p[\bar S_n^4] = \sum_{\mathbf m} \mathbb{E}_p[\bar X_{m_1} \cdots \bar X_{m_4}],$$

where the sum runs over all $\mathbf m = (m_1, m_2, m_3, m_4)$ with $1 \le m_i \le n$ for $i \in \{1, 2, 3, 4\}$. Notice that, either by direct computation or by the mean-0 property combined with independence,

$$\mathbb{E}_p[\bar X_{m_1} \cdots \bar X_{m_4}] = 0 \text{ unless, for some permutation } \pi \text{ of } \{1,2,3,4\},\ m_{\pi 1} = m_{\pi 2} \text{ and } m_{\pi 3} = m_{\pi 4}. \tag{2.1}$$

Hence, at most $4!\,n^2$ terms are non-zero, and each of them is no larger than $\mathbb{E}_p[\bar X_1^4] \le pq(1 - pq) \le \tfrac{3}{16}$. In other words,

$$\mathbb{E}_p[\bar S_n^4] \le \frac{9n^2}{2}. \tag{2.2}$$

Now, by repeating the argument with which we passed from (1.1) to (1.3) via (1.2), we see that

$$\mathbb{P}_p\Bigl(\Bigl|\frac{S_n}{n} - p\Bigr| \ge R\Bigr) \le \frac{9}{2n^2R^4}. \tag{2.3}$$

Obviously, the preceding success makes one suspect that one can do better than $n^{-2}$ too, and, indeed, one can. In fact, from

$$\mathbb{E}_p[e^{\lambda \bar S_n}] = \prod_{m=1}^n \mathbb{E}_p[e^{\lambda \bar X_m}] = (pe^{\lambda q} + qe^{-\lambda p})^n \quad\text{for all } \lambda \in \mathbb{R},$$

one can check first that $\mathbb{E}_p[e^{\lambda \bar S_n}] \le e^{n\beta_p\lambda^2}$ for some $\beta_p \in (0, 1]$, then that

$$\mathbb{P}_p\Bigl(\frac{S_n}{n} - p \ge R\Bigr) \vee \mathbb{P}_p\Bigl(\frac{S_n}{n} - p \le -R\Bigr) \le \exp[-n\lambda R + n\beta_p\lambda^2] \quad\text{for } \lambda > 0,$$

and finally, after taking $\lambda = \frac{R}{2\beta_p}$, that

$$\mathbb{P}_p\Bigl(\Bigl|\frac{S_n}{n} - p\Bigr| \ge R\Bigr) \le 2\exp\Bigl[-\frac{nR^2}{4\beta_p}\Bigr].$$

However, for our purposes here, (2.3) will suffice. Namely, what we are seeking is the replacement of the weak law (cf. (1.4)) by the strong law of large numbers.^7 That is, we want to show that

$$\mathbb{P}_p\Bigl(\lim_{n\to\infty} \frac{S_n}{n} = p\Bigr) = 1. \tag{2.4}$$

To this end, observe that, from (2.3), we have that

$$\mathbb{P}_p\Bigl(\Bigl|\frac{S_n}{n} - p\Bigr| \ge n^{-1/8} \text{ for some } n \ge m\Bigr) \le \sum_{n=m}^\infty \mathbb{P}_p\Bigl(\Bigl|\frac{S_n}{n} - p\Bigr| \ge n^{-1/8}\Bigr) \le \frac{9}{2}\sum_{n=m}^\infty n^{-3/2} \to 0 \quad\text{as } m \to \infty.$$

^6 The notation $\mathbb{E}[X, A]$ means that the expectation of the random variable $X$ is being computed over the set $A$. Equivalently, $\mathbb{E}[X, A] = \int_A X \, d\mathbb{P}$.
^7 The sense in which the strong law is stronger than the weak law is a little subtle and probably requires an appreciation of the difference between almost everywhere convergence and convergence in measure. See, for example, § 3.3 in [S1].
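Because $S_n$ has a binomial distribution, the moment identities and tail estimates of the last two sections can be checked exactly for moderate $n$. The Python sketch below is an added illustration under that observation; the helper names `centered_moment` and `tail_probability` are hypothetical. It computes $\mathbb{E}_p[(S_n - np)^k]$ and $\mathbb{P}_p(|S_n/n - p| \ge R)$ by summing binomial weights, so that they can be compared with $npq$, with the fourth-moment bound $\tfrac92 n^2$, and with the Chebyshev-type bounds $1/(4nR^2)$ and $9/(2n^2R^4)$.

```python
from math import comb


def centered_moment(n, p, k):
    """E_p[(S_n - n p)^k], computed exactly from the binomial law of S_n."""
    q = 1.0 - p
    return sum(comb(n, h) * p ** h * q ** (n - h) * (h - n * p) ** k
               for h in range(n + 1))


def tail_probability(n, p, R):
    """P_p(|S_n / n - p| >= R), again by summing binomial weights."""
    q = 1.0 - p
    return sum(comb(n, h) * p ** h * q ** (n - h)
               for h in range(n + 1) if abs(h / n - p) >= R)


if __name__ == "__main__":
    n, p, R = 50, 0.3, 0.15
    q = 1.0 - p
    print(centered_moment(n, p, 2), n * p * q)        # second moment vs n p q
    print(centered_moment(n, p, 4), 4.5 * n ** 2)     # fourth moment vs (9/2) n^2
    print(tail_probability(n, p, R), 1 / (4 * n * R ** 2))
```

For these parameters the second-moment bound is the sharper of the two; the fourth-moment bound only wins once $nR^2$ is large, which is exactly the regime used in the passage to the strong law.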
Hence, given any $\epsilon \in (0, 1)$, we can arrange that

$$\mathbb{P}_p\Bigl(\Bigl|\frac{S_n}{n} - p\Bigr| \le n^{-1/8} \text{ for all } n \ge m\Bigr) \ge 1 - \epsilon$$

by simply taking $m$ sufficiently large; and so, with probability arbitrarily close to 1, $n^{-1}S_n \to p$ as $n \to \infty$.

Another Representation of Coin-Tossing

The next step in our program entails our encoding the outcomes of an infinite coin-tossing game as a real number. Namely, we want to think of our $\{0, 1\}$-valued random variables $\{X_m : m \ge 1\}$ as being the coefficients in the dyadic expansion of the random number

$$Y = \sum_{m=1}^\infty 2^{-m} X_m \in [0, 1]. \tag{3.1}$$

Of course, this is not a perfect encoding procedure because, although the $X_m$'s uniquely determine $Y$, $Y$ does not always uniquely determine the $X_m$'s. The problem comes from those $t \in [0, 1]$ for which there is an $n \ge 0$ such that $2^n t$ is an integer. In order to remove the ambiguity caused by such $t$'s, we adopt the convention that the $m$th coefficient $\epsilon_m(t)$ in the dyadic expansion of $t \in [0, 1)$ should be determined so that

$$0 \le t - \sum_{m=1}^n 2^{-m} \epsilon_m(t) < 2^{-n} \quad\text{for all } n \ge 1. \tag{3.2}$$

Every $t \in [0, 1)$ then completely determines $\{\epsilon_m(t) : m \ge 1\} \subseteq \{0, 1\}$. Indeed, the $\epsilon_m(t)$'s are generated inductively by the rules: $\epsilon_1(t) = 0$ if $0 \le t < 2^{-1}$ and $\epsilon_1(t) = 1$ if $2^{-1} \le t < 1$; and

$$\epsilon_{n+1}(t) = 0 \quad\text{if } 0 \le t - \sum_{m=1}^n 2^{-m}\epsilon_m(t) < 2^{-n-1},$$

$$\epsilon_{n+1}(t) = 1 \quad\text{if } 2^{-n-1} \le t - \sum_{m=1}^n 2^{-m}\epsilon_m(t) < 2^{-n}.$$

More graphically, the first digit splits $[0, 1)$ into $[0, \tfrac12)$, where $\epsilon_1 = 0$, and $[\tfrac12, 1)$, where $\epsilon_1 = 1$; the first two digits split it into $[0, \tfrac14)$, where $\epsilon_1 = 0, \epsilon_2 = 0$; $[\tfrac14, \tfrac12)$, where $\epsilon_1 = 0, \epsilon_2 = 1$; $[\tfrac12, \tfrac34)$, where $\epsilon_1 = 1, \epsilon_2 = 0$; and $[\tfrac34, 1)$, where $\epsilon_1 = 1, \epsilon_2 = 1$; etc.

The importance of these considerations for us is that they lead to the conclusion that

$$\mathbb{P}_p\bigl(X_n = \epsilon_n(Y) \text{ for all } n \ge 1\bigr) = 1. \tag{3.3}$$

To see this, notice that

$$Y - \sum_{m=1}^n 2^{-m} X_m = \sum_{m=n+1}^\infty 2^{-m} X_m \le 2^{-n}$$

with equality only if $X_m = 1$ for all $m \ge n + 1$. But

$$\mathbb{P}_p(\text{there exists } n \ge 1 \text{ such that } X_m = 1 \text{ for all } m \ge n + 1) \le \sum_{m=1}^\infty \mathbb{P}_p(X_n = 1 \text{ for all } n > m) = 0,$$

because

$$\mathbb{P}_p(X_n = 1 \text{ for all } n > m) = \lim_{M\to\infty} \mathbb{P}_p(X_n = 1 \text{ for all } m < n \le M) = \lim_{M\to\infty} p^{M-m} = 0.$$

Hence,

$$\mathbb{P}_p\Bigl(Y - \sum_{m=1}^n 2^{-m} X_m < 2^{-n} \text{ for all } n \ge 1\Bigr) = 1,$$

and this is equivalent to (3.3). As an immediate consequence of (3.3) combined with the strong law of large numbers (2.4), we know that

$$\mathbb{P}_p\Bigl(\lim_{n\to\infty} \frac{\Sigma_n(Y)}{n} = p\Bigr) = 1, \tag{3.4}$$

where (cf. (3.2))

$$\Sigma_n(t) \equiv \sum_{m=1}^n \epsilon_m(t) \quad\text{for } t \in [0, 1). \tag{3.5}$$

To understand why such a statement might be interesting, notice that

$$\mathbb{P}_p\bigl(2^{-n}m \le Y \le 2^{-n}(m+1)\bigr) = p^{\Sigma_n(2^{-n}m)}\, q^{\,n - \Sigma_n(2^{-n}m)}, \quad n \ge 1 \text{ and } 0 \le m < 2^n. \tag{3.6}$$

Hence, if $0 \le a \le b < 1$, then

$$\mathbb{P}_p(a \le Y \le b) = \lim_{n\to\infty} \sum_{m = 2^nL_n(a)}^{2^nL_n(b)} p^{\Sigma_n(2^{-n}m)}\, q^{\,n - \Sigma_n(2^{-n}m)}, \tag{3.7}$$

where

$$L_n(t) \equiv \sum_{m=1}^n 2^{-m}\epsilon_m(t) \quad\text{and}\quad R_n(t) \equiv L_n(t) + 2^{-n} \tag{3.8}$$

are, respectively, the closest $n$th-order dyadic points to the left and right of $t \in [0, 1)$. In general, the limit in (3.7) is essentially impossible to compute. However, when $a = b$, the sum degenerates to a single term which, because $\max\{p, q\} < 1$, tends exponentially fast to 0 as $n \to \infty$. Hence,

$$\mathbb{P}_p(Y = x) = 0 \quad\text{for all } x \in [0, 1] \text{ and } p \in (0, 1). \tag{3.9}$$

Another case when the computation is possible is when the coin being tossed is fair, and therefore $p = \tfrac12$. In this case, all the terms in the sum are equal to $2^{-n}$. Thus, since the number of terms lies between $2^n(b - a)$ and $2^n(b - a) + 1$, we conclude that

$$\mathbb{P}_{1/2}(a < Y \le b) = \mathbb{P}_{1/2}(a \le Y \le b) = b - a, \quad 0 \le a \le b \le 1. \tag{3.10}$$

Equivalently,

$$p = \tfrac12 \implies Y \text{ is a uniform random variable on } [0, 1]. \tag{3.11}$$

Now a uniform random variable is a reasonable model for a point which is chosen at random from $[0, 1]$. So
when p = t (3.4) says that the ratio of O's to 1 's in the dyadic expansion of a typical x E [0, 1] is 1. More pre cisely, if a random x is chosen uniformly from [0, 1], then ��n (X) � t. This observation was made around the tum
of the century by E. Borel and is discussed beautifully and in detail by M. Kac [K]. More recently, G. Goodman [G] discussed some interesting variations on the theme of Borel and Kac. A Problem in Measure Theory
Motivated by these considerations, one should want to de velop some feeling for how many numbers x E [0, 1] are non-random. In particular, if (cf. (3.5)) .1p
=
{t E [0,
�n(t) p},
.
1] : hm
n--. co
--
n
=
(4.1)
how large is the complement .1�t2 of J1v2? A detailed an swer to this question is quite difficult (cf. [DR] for an in teresting discussion from an entirely different point of view). Nonetheless, a few qualitative statements can be made without much effort. I begin by showing that .1�t2 must be reasonably large, in the sense that it must contain an uncountable number of points. Indeed, given any p E (0, 1)\HJ, .1p \:: .1�t2 . We will lrnow that .1�t2 is uncountable as soon as we show, for ex ample, that .11/3 is uncountable. But suppose we could count the points in .1v3, and let (xnl'l be an enumeration of them. Then, (3.4) and (3.9) with p = i would lead us to the contradiction co
1
=
!Pvs(Y E .11/3)
=
I IPv3(Y = Xn) = 0. n= 1
We conclude that .1�t2 is uncountable. Of course, exactly the same reasoning shows that .1p is uncountable for each p E (0, 1), and obviously .1P is disjoint from .1p' when p =I= p'. Knowing that .1�!2 ;;;;? Upnt2 .1p and .1p's are mutu ally disjoint sets each of which is uncountable, one might start believing that .1�t2 is reasonably large. In spite of the evidence to the contrary just given, we lrnow that .1�t2 must also be quite small. Indeed, we lrnow that, with probability 1, a uniform random variable misses .1'it2 entirely. In the terminology of Lebesgue's measure the ory, .1�!2 is a set of measure 0, and sets of measure 0 are characterized by the property that they can be covered by a countable number of intervals the sum of whose lengths is arbitrarily small. That is, for each E > 0, there exists a sequence (In l'l of open intervals such that 00
00
J1'i12 \:: U In and yet I !In! < n= 1 n =1
E
where !In ! is the length of In.
In summary, although .11!2 is not countable, it can nonethe less be covered by a countable number of intervals the sum of whose lengths is as small as you like. Some Exotic Increasing Functions
For $0 < p < 1$, consider the function $F_p : [0, 1] \to [0, 1]$ given by
\[ F_p(x) = \mathbb{P}_p(Y \le x), \tag{5.1} \]
where $Y$ is the random variable in (3.1). Clearly, $F_p$ is non-decreasing, $F_p(0) = 0$, and $F_p(1) = 1$. It is easy to see from (3.7) that
\[ x < y \implies F_p(x) < F_p(y). \tag{5.2} \]
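For readers who would like to experiment, here is a minimal sketch of how $F_p$ can be evaluated numerically. It assumes the convention that the binary digit 1 ("heads") occurs with probability $p$ and the digit 0 with probability $q = 1 - p$, independently, which is the convention consistent with the interval probabilities $p^{\Sigma_n(x)} q^{n - \Sigma_n(x)}$ used later in the text. The function name `biased_cdf` and the `depth` cut-off are illustrative choices, not part of the article.

```python
def biased_cdf(p, x, depth=60):
    """Approximate F_p(x) = P_p(Y <= x) by reading the binary digits of x.

    Assumption (not stated in this section): digit 1 has probability p,
    digit 0 has probability q = 1 - p, independently for each digit.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    q = 1.0 - p
    value = 0.0
    prefix = 1.0  # probability that Y agrees with x on all digits read so far
    for _ in range(depth):
        x *= 2.0
        if x >= 1.0:
            # x has digit 1 here: every Y with the same prefix and digit 0 lies below x
            value += prefix * q
            prefix *= p
            x -= 1.0
        else:
            prefix *= q
        if x == 0.0:  # x is a dyadic rational; no digits remain
            break
    return value


if __name__ == "__main__":
    for x in (0.25, 0.5, 0.75):
        print(x, biased_cdf(0.3, x), biased_cdf(0.5, x))
```

With $p = 1/2$ the sketch returns $x$ itself on dyadic rationals, and for other $p$ the values along a dyadic grid are strictly increasing, in line with (5.2).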
That is, $F_p$ is strictly increasing. In addition, because, by (3.9),
\[ \lim_{y \searrow x} F_p(y) = \mathbb{P}_p(Y \le x) = \mathbb{P}_p(Y < x) + \mathbb{P}_p(Y = x) = \mathbb{P}_p(Y < x) = \lim_{y \nearrow x} F_p(y), \]
we know that $F_p$ is continuous. In other words, we can say that, for each $p \in (0, 1)$, $F_p$ is a strictly increasing, continuous function with $F_p(0) = 0$ and $F_p(1) = 1$.
When $p = 1/2$, we can say much more: from (3.10), $F_{1/2}(x) = x$ for all $x \in [0, 1]$. However, when $p \ne 1/2$, the function $F_p$ is somewhat mysterious. It is difficult to picture its graph. For example, suppose that we examine its arclength. Recall that, for any continuous $F : [0, 1] \to \mathbb{R}$, the arclength $\mathrm{Arc}(F)$ of its graph can be computed by taking the limit⁸
\[ \mathrm{Arc}(F) = \lim_{n\to\infty} \sum_{m=0}^{2^n - 1} \sqrt{(2^{-n})^2 + \big(F((m+1)2^{-n}) - F(m2^{-n})\big)^2}, \]
which may or may not be $+\infty$. For any non-decreasing $F$, then, because $a \le \sqrt{a^2 + b^2} \le a + b$ for all non-negative $a$ and $b$,
\[ F(1) - F(0) = 1 \implies \sqrt{2} \le \mathrm{Arc}(F) \le 2. \]
By direct computation (or the Pythagorean Theorem), the lower bound is achieved when $F = F_{1/2}$. On the other hand, it is hard to imagine how a continuous $F$ could achieve the upper bound. It would seem that the graph of such an $F$ would, at every point, have to be going horizontally or vertically, thereby ruling out the possibility that the function being graphed is continuous. For this reason, it is interesting that
\[ p \in (0, 1) \setminus \{1/2\} \implies \mathrm{Arc}(F_p) = 2. \tag{5.3} \]
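The statement (5.3) can be explored numerically before it is proved. The sketch below computes the $n$th polygonal approximation to the arclength of $F_p$, using the fact that the increment of $F_p$ over a dyadic interval of length $2^{-n}$ is $p^k q^{n-k}$, where $k$ is the number of 1's among the first $n$ binary digits of the interval; there are $\binom{n}{k}$ such intervals for each $k$. The name `arc_level` is hypothetical, and the computation is only an illustration in floating point, not a proof.

```python
import math


def arc_level(p, n):
    """n-th polygonal approximation to the arclength of the graph of F_p.

    Groups the 2**n dyadic intervals by the number k of 1's in their
    first n binary digits; each such interval carries increment p**k * q**(n-k).
    """
    q = 1.0 - p
    width_sq = 4.0 ** (-n)  # (2**-n)**2, the squared horizontal step
    total = 0.0
    for k in range(n + 1):
        increment = p ** k * q ** (n - k)
        total += math.comb(n, k) * math.sqrt(width_sq + increment * increment)
    return total


if __name__ == "__main__":
    for n in (1, 5, 20, 60):
        print(n, arc_level(0.5, n), arc_level(0.3, n), arc_level(0.1, n))
```

For $p = 1/2$ every level returns $\sqrt{2}$ up to rounding, while for $p \ne 1/2$ the values increase with $n$ and creep up toward 2, faster the further $p$ is from $1/2$.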
To see (5.3), one can⁹ proceed as follows. We are finding the limit of
\[ L_n = \sum_{m=0}^{2^n - 1} \sqrt{4^{-n} + \big(F_p((m+1)2^{-n}) - F_p(m2^{-n})\big)^2}. \]
⁸ The existence of the limit is assured by the fact that the expression on the right is non-decreasing as $n$ increases. To see this, interpret the $m$th summand in the $n$th sum as the length of the 2-dimensional vector whose components are $2^{-n}$ and $F((m+1)2^{-n}) - F(m2^{-n})$, and note that this vector is the sum of the $2m$th and $(2m+1)$st vectors in the $(n+1)$st sum. Thus, the asserted monotonicity is just the triangle inequality for the lengths of vectors in the plane.
⁹ The elegant argument which follows was suggested to me by Alex Perlin. For those who know a little more classical analysis, I have provided an appendix in which it is shown that the graph of any continuous, non-decreasing $F$ on $[0, 1]$ has arclength equal to $1 + F(1) - F(0)$ if and only if $F$ is Lebesgue-singular.
VOLUME 22, NUMBER 2, 2000
Set
\[ A_n(R) = \{ 0 \le m < 2^n : F_p((m+1)2^{-n}) - F_p(m2^{-n}) > 2^{-n}R \}, \]
\[ B_n(R) = \{ 0 \le m < 2^n : F_p((m+1)2^{-n}) - F_p(m2^{-n}) \le 2^{-n}R \}. \]
Obviously, for any $R > 0$,
\[ L_n \ge \sum_{m \in A_n(R)} \sqrt{4^{-n} + \big(F_p((m+1)2^{-n}) - F_p(m2^{-n})\big)^2} + \sum_{m \in B_n(R)} 2^{-n}. \]
Hence, since¹⁰ $x \ge R \implies (1 + R^{-1})(1 + x^{-2})^{1/2} \ge 1 + x^{-1}$,
\[ L_n \ge (1 + R^{-1})^{-1} \sum_{m \in A_n(R)} \Big( 2^{-n} + F_p((m+1)2^{-n}) - F_p(m2^{-n}) \Big) + \sum_{m \in B_n(R)} 2^{-n} \]
\[ \ge (1 + R^{-1})^{-1} \Big\{ \sum_{m \in A_n(R)} 2^{-n} + \sum_{m \in B_n(R)} 2^{-n} + \sum_{m \in A_n(R)} \big( F_p((m+1)2^{-n}) - F_p(m2^{-n}) \big) \Big\}. \]
But the first two sums inside the braces add up to 1, and
\[ \sum_{m \in A_n(R)} \big( F_p((m+1)2^{-n}) - F_p(m2^{-n}) \big) = 1 - \sum_{m \in B_n(R)} \big( F_p((m+1)2^{-n}) - F_p(m2^{-n}) \big). \]
Therefore, we now know that, for every $R > 0$,
\[ L_n \ge (1 + R^{-1})^{-1} \times \Big( 2 - \sum_{m \in B_n(R)} \big( F_p((m+1)2^{-n}) - F_p(m2^{-n}) \big) \Big). \]
At the same time, by (3.3) and (3.6),
\[ \sum_{m \in B_n(R)} \big( F_p((m+1)2^{-n}) - F_p(m2^{-n}) \big) = \mathbb{P}_p\big( p^{S_n} q^{n - S_n} \le 2^{-n}R \big) = \mathbb{P}_p\Big( \big( (2p)^{n^{-1}S_n} (2q)^{1 - n^{-1}S_n} \big)^n \le R \Big), \]
where to obtain the first equality we have taken advantage of the fact that $\bigcup_{m \in B_n(R)} [m2^{-n}, (m+1)2^{-n})$ is precisely the set of $x \in [0, 1)$ for which $p^{\Sigma_n(x)} q^{n - \Sigma_n(x)} \le 2^{-n}R$. Thus we will know that $\mathrm{Arc}(F_p) \ge 2$ as soon as we show that, for arbitrarily large $R > 0$,
\[ \lim_{n\to\infty} \mathbb{P}_p\Big( \big( (2p)^{n^{-1}S_n} (2q)^{1 - n^{-1}S_n} \big)^n \le R \Big) = 0. \]
To this end, we first note that
\[ p \in (0, 1) \setminus \{1/2\} \implies (2p)^p (2q)^q > 1. \tag{5.4} \]
This is because (5.4) is equivalent to
\[ p \log 2p + q \log 2q = \frac{2p \log 2p + 2q \log 2q}{2} > 0, \]
and $x \in (0, 2) \mapsto x \log x \in \mathbb{R}$ is a strictly convex function which vanishes at $x = 1$. Finally, by combining (2.4) with (5.4), we arrive at
\[ \mathbb{P}_p\Big( \big( (2p)^{n^{-1}S_n} (2q)^{1 - n^{-1}S_n} \big)^n \le R \Big) \to 0, \]
and clearly this completes the proof that $\mathrm{Arc}(F_p) \ge 2$. Because we already know that the opposite inequality must hold, (5.3) is now verified.
¹⁰ Square both sides to check this.

The First Derivative of $F_p$
This is already grounds for regarding the graphs of the functions $F_p$ when $p \ne 1/2$ as quite strange indeed. Namely, (5.3) indicates that, at essentially any point, the tangent to the graph of $F_p$ must be either horizontal or vertical. In this section I will provide further evidence that this picture is correct. To be more precise, recall the sets $\Delta_p$ introduced in (4.1). What we are going to do here is check that
\[ p \in (0, 1) \setminus \{1/2\} \implies F_p'(x) = \begin{cases} 0 & \text{if } x \in \Delta_{1/2} \\ \infty & \text{if } x \in \Delta_p, \end{cases} \tag{6.1} \]
where
\[ F_p'(x) = \lim_{h \to 0} \frac{F_p(x + h) - F_p(x)}{h} \]
is the derivative of $F_p$ at $x$. (Of course, part of the assertion is that this limit exists at the indicated points.)
To prove (6.1), it suffices to check it for the left derivative. That is, it is enough to show that
\[ \lim_{h \searrow 0} \frac{F_p(x) - F_p(x - h)}{h} = \begin{cases} 0 & \text{if } x \in \Delta_{1/2} \\ \infty & \text{if } x \in \Delta_p. \end{cases} \tag{6.2} \]
Indeed, if we knew (6.2) for every $p \in (0, 1) \setminus \{1/2\}$, then, because
\[ \Delta_q = \{ 1 - x : x \in \Delta_p \} \quad\text{and}\quad F_q(x) = 1 - F_p(1 - x), \]
it would follow that
\[ \lim_{h \searrow 0} \frac{F_p(x + h) - F_p(x)}{h} = \begin{cases} 0 & \text{if } x \in \Delta_{1/2} \\ \infty & \text{if } x \in \Delta_p. \end{cases} \]
In other words, everything comes down to proving (6.2).
Now let $p \in (0, 1) \setminus \{1/2\}$ and $x \in \Delta_{1/2}$ be given. Referring to the notation introduced in (3.8) and using (3.6), we have that
\[ \frac{F_p(R_n(x)) - F_p(L_n(x))}{R_n(x) - L_n(x)} = 2^n \mathbb{P}_p\big( L_n(x) < Y \le R_n(x) \big) = 2^n p^{\Sigma_n(x)} q^{n - \Sigma_n(x)} = \big( 2\sqrt{pq} \big)^n \Big( \frac{p}{q} \Big)^{\Sigma_n(x) - n/2}. \]
Because $p \ne 1/2$, $pq < 1/4$ and therefore $\rho_p \equiv 2\sqrt{pq} < 1$. Hence, since $\Sigma_n(x)/n \to 1/2$,
\[ \log\Big( \big( 2\sqrt{pq} \big)^n \Big( \frac{p}{q} \Big)^{\Sigma_n(x) - n/2} \Big) = n \Big[ \log \rho_p + \Big( \frac{\Sigma_n(x)}{n} - \frac{1}{2} \Big) \log \frac{p}{q} \Big] \to -\infty, \]
which means that
\[ \lim_{n\to\infty} \frac{F_p(R_n(x)) - F_p(L_n(x))}{R_n(x) - L_n(x)} = 0 \quad\text{exponentially fast.} \tag{6.3} \]
Although this computation does not provide a definitive proof, it strongly indicates that (6.2) holds at each $x \in \Delta_{1/2}$. To complete the proof, choose $n(x)$ to be the first $n > 1$ such that $\Sigma_n(x) \ge 2$, and, for $n \ge n(x)$, let $M_n(x)$ be the largest $1 \le m \le n$ for which $\Sigma_n(x) - \Sigma_m(x) = 1$. Observe that, since
\[ n \ge n(x) \implies L_{M_n(x)}(x) \le x - 2^{-n} < x \le R_{M_n(x)}(x), \]
we have
\[ n \ge n(x) \text{ and } 2^{-n-1} < h \le 2^{-n} \implies \frac{F_p(x) - F_p(x - h)}{h} \le 2^{n - M_n(x) + 1} \, \frac{F_p(R_{M_n(x)}(x)) - F_p(L_{M_n(x)}(x))}{R_{M_n(x)}(x) - L_{M_n(x)}(x)}; \]
and so it follows, from this and (6.3), that it suffices to show
\[ \lim_{n\to\infty} \frac{M_n(x)}{n} = 1. \tag{*} \]
What would it mean for (*) to fail? There would have to be an $r < 1$ and there would have to be infinitely many values of $n$ for which heads at the $(n+1)$st toss ($\varepsilon_{n+1}(x) = 1$) was preceded by a run of more than $(1 - r)n$ tails ($\varepsilon_k(x) = 0$ for $rn \le k \le n$, so that $\Sigma_k(x)$ remains constant over the run). But then for $k$ at the beginning of the run,
\[ \frac{\Sigma_k(x)}{k} - \frac{\Sigma_n(x)}{n} \ge \Big( \frac{1}{r} - 1 \Big) \frac{\Sigma_n(x)}{n}. \]
In order for $n^{-1}\Sigma_n(x)$ to approach $1/2$, as is required by our assumption that $x \in \Delta_{1/2}$, the left-hand side would have to approach 0 (the Cauchy criterion) and the right-hand side to approach a non-zero limit: contradiction. Therefore (*) must hold. Hence, we have now proved (6.2), and therefore the first half of (6.1), for every $x \in \Delta_{1/2}$.
To complete the proof of (6.1), again let $p \in (0, 1) \setminus \{1/2\}$ be given, but now suppose that $x \in \Delta_p$. For each $n \ge 1$, let $N_n(x)$ be the smallest $m > n$ such that $\Sigma_m(x) - \Sigma_n(x) = 1$, and observe that, since $x - 2^{-n} \le L_n(x) < L_{N_n(x)}(x) \le x$,
\[ 2^{-n+1} > h \ge 2^{-n} \implies \frac{F_p(x) - F_p(x - h)}{h} \ge 2^{n-1} p^{\Sigma_n(x)} q^{N_n(x) - \Sigma_n(x)} = \frac{1}{2} \Big[ (2p)^p (2q)^q \Big( \frac{p}{q} \Big)^{n^{-1}\Sigma_n(x) - p} q^{\,n^{-1}N_n(x) - 1} \Big]^n, \]
and the right-hand side tends to $+\infty$ as $n \to \infty$. Because $n^{-1}\Sigma_n(x) - p \to 0$, one can proceed, just as in the demonstration of (*) above, to verify that $n^{-1}N_n(x) - 1 \to 0$. Finally, as we saw in (5.4), $(2p)^p (2q)^q > 1$, and so the desired conclusion should now be evident.
In summary, the first part of (6.1) shows that, when $p \ne 1/2$, $F_p'(x) = 0$ when $x \in \Delta_{1/2}$ (what we called in (3.11) being chosen "at random"). On the other hand, to compensate for the first part and enable $F_p$ to be strictly increasing, $F_p'(x) = \infty$ when $x \in \Delta_p$. Both sets are dense.

Appendix
Given a continuous, non-decreasing function $F$ on $[0, 1]$ with $F(0) = 0$ and $F(1) = 1$, set
\[ I = \Big\{ x \in [0, 1] : \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = 0 \Big\}. \]
One says that $F$ is Lebesgue-singular if the set $I$ has Lebesgue measure 1. Alternatively, if $\mu$ denotes the Borel measure on $[0, 1]$ determined by
\[ \mu((a, b]) = F(b) - F(a) \quad\text{for all } 0 \le a < b \le 1, \]
then $F$ being Lebesgue-singular is equivalent to $\mu$ being singular to Lebesgue measure.¹¹ The purpose of this appendix is to prove that
\[ \mathrm{Arc}(F) = 2 \quad\text{if and only if $F$ is Lebesgue-singular.} \tag{A.1} \]
I begin by introducing a little notation. Namely, set $\Delta_{m,n} = F((m+1)2^{-n}) - F(m2^{-n})$ for $n \ge 0$ and $0 \le m < 2^n$. By definition,
\[ \mathrm{Arc}(F) = \lim_{n\to\infty} L_n \quad\text{where}\quad L_n = \sum_{m=0}^{2^n - 1} \sqrt{4^{-n} + \Delta_{m,n}^2}, \]
and, as already pointed out, $\mathrm{Arc}(F)$ lies between $\sqrt{2}$ and 2. Next note that
\[ 2 - L_n = 2^{-n} \sum_{m=0}^{2^n - 1} \Big( (1 + 2^n \Delta_{m,n}) - \sqrt{1 + (2^n \Delta_{m,n})^2} \Big). \]
But, for any non-negative $a$,
\[ \frac{a}{1 + a} \le 1 + a - \sqrt{1 + a^2} \le \frac{2a}{1 + a}, \]
so we now know that
\[ \frac{2 - L_n}{2} \le 2^{-n} \sum_{m=0}^{2^n - 1} \frac{2^n \Delta_{m,n}}{1 + 2^n \Delta_{m,n}} \le 2 - L_n. \]
Equivalently, if $f_n : [0, 1) \to [0, \infty)$ is defined so that $f_n(x) = 2^n \Delta_{m,n}$ when $m2^{-n} \le x < (m+1)2^{-n}$, then
\[ \frac{2 - L_n}{2} \le \int_{[0,1)} \frac{f_n(x)}{1 + f_n(x)} \, dx \le 2 - L_n. \tag{A.2} \]
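As a sanity check on (A.2), the following sketch evaluates both $2 - L_n$ and the integral of $f_n/(1 + f_n)$ in the special case $F = F_p$, where $f_n$ takes the value $(2p)^k (2q)^{n-k}$ on the dyadic intervals of length $2^{-n}$ whose first $n$ binary digits contain $k$ ones. The function name is hypothetical, and the identity $1 + a - \sqrt{1 + a^2} = 2a/(1 + a + \sqrt{1 + a^2})$ is used only to avoid cancellation in floating point.

```python
import math


def gap_and_integral(p, n):
    """Return (2 - L_n, integral of f_n / (1 + f_n)) for F = F_p at level n."""
    q = 1.0 - p
    gap = 0.0
    integral = 0.0
    for k in range(n + 1):
        # total Lebesgue measure of the level-n intervals with exactly k ones
        weight = math.comb(n, k) / 2 ** n
        # value of f_n = 2**n * (increment of F_p) on those intervals
        a = (2.0 * p) ** k * (2.0 * q) ** (n - k)
        # 1 + a - sqrt(1 + a*a), rewritten to avoid cancellation for large a
        gap += weight * 2.0 * a / (1.0 + a + math.sqrt(1.0 + a * a))
        integral += weight * a / (1.0 + a)
    return gap, integral


if __name__ == "__main__":
    for p in (0.5, 0.3, 0.1):
        for n in (1, 10, 40):
            gap, integral = gap_and_integral(p, n)
            print(p, n, gap / 2.0, integral, gap)
```

In each printed row the middle number should sit between the outer two, and for $p \ne 1/2$ all three shrink as $n$ grows, which is the numerical face of $L_n \to 2$.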
In order to complete our program starting from (A.2), we need to invoke some measure theory. In the first place, an immediate consequence of (A.2) is the conclusion that
\[ L_n \to 2 \iff f_n \to 0 \text{ in Lebesgue measure.} \tag{A.3} \]
Secondly, we need to know that $f_n \to 0$ in Lebesgue measure if and only if the measure $\mu$ is singular to Lebesgue measure. This second fact, which is also the basis for the comment in footnote 11, is a corollary of a variant (Theorem 5.2.26 in [S1]) of Lebesgue's Differentiation Theorem. Namely, $f_n$ converges Lebesgue almost everywhere to the Radon-Nikodym derivative of the absolutely continuous part of the measure $\mu$. Thus $f_n \to 0$ in Lebesgue measure if and only if the absolutely continuous part of $\mu$ vanishes; and, knowing this, one gets (A.1) as an immediate consequence of (A.3).
¹¹ In fact, one can show that the preceding set $I$ is assigned measure 0 by the absolutely continuous part of $\mu$.
ACKNOWLEDGMENTS
The author acknowledges support from NSF grant DMS 9625782. He is also grateful to G. Goodman for providing him with most of his bibliographic information.

REFERENCES
[B] Billingsley, P., The singular function of bold play, American Scientist 71 (1983), 392-397.
[DR] de Rham, G., Sur certaines équations fonctionnelles, l'Ouvrage publié à l'occasion de son centenaire par l'École polytechnique de l'Université de Lausanne, pp. 95-97.
[D] Durrett, R., Brownian Motion and Martingales in Analysis, Wadsworth, Belmont, CA, 1984.
[G] Goodman, G., Statistical independence and normal numbers, an aftermath to Mark Kac's Carus monograph, American Math. Monthly (1999), 112-126.
[K] Kac, M., Statistical Independence in Probability, Analysis and Number Theory, Carus Math. Monograph Series #12, J. Wiley, NY, 1959.
[RN] Riesz, F. & Sz.-Nagy, B., Functional Analysis, translated from the 2nd French edition, Frederick Ungar, New York, 1955.
[S1] Stroock, D., A Concise Introduction to the Theory of Integration, 3rd Edition, Birkhäuser, Boston, 1999.
[S2] --, Probability Theory, an Analytic View, Cambridge U. Press, Cambridge, UK and NY, USA, 1991.

AUTHOR
DANIEL W. STROOCK
Department of Mathematics
MIT
Cambridge, MA 02139-4307
USA
e-mail: [email protected]

Daniel Stroock got his doctorate in 1966 at Rockefeller University, with Mark Kac. Since 1984 he has been a Professor at MIT. He was the 1997 Colloquium Lecturer of the American Mathematical Society. In addition to his public persona as a leader in probability theory, he is sometimes to be found riding horses in Colorado. (Well, actually, you can not find him, for he has not disclosed where in Colorado he does his riding.)
Jeremy Gray, Editor
Vita: Friedrich Wilhelm Wiener HAROLD P. BOAS AND DMITRY KHAVINSON
In a recent note [4], we proved a multidimensional analog of the following classical theorem of Harald Bohr [6]. (For subsequent developments in the multidimensional theory, see [1-3].)
THEOREM 1 (Bohr). Suppose that a power series $\sum_{k=0}^{\infty} c_k z^k$ converges for $z$ in the unit disk, and $\big| \sum_{k=0}^{\infty} c_k z^k \big| < 1$ when $|z| < 1$. Then $\sum_{k=0}^{\infty} |c_k z^k| < 1$ when $|z| < 1/3$. Moreover, no larger radius than $1/3$ will do.
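To get a feeling for the radius $1/3$, one can experiment with the functions $f(z) = (a - z)/(1 - az)$ for $0 < a < 1$, which map the unit disk into itself and have Taylor coefficients $c_0 = a$ and $c_k = -(1 - a^2)a^{k-1}$ for $k \ge 1$. The sketch below, in which the function name and the truncation length are illustrative choices, sums $|c_k| r^k$ numerically; it is offered as an illustration of the statement, not as the argument used by Bohr or Wiener.

```python
def bohr_majorant(a, r, terms=2000):
    """Truncated sum of |c_k| * r**k for f(z) = (a - z) / (1 - a*z), 0 < a < 1.

    The coefficients are c_0 = a and c_k = -(1 - a*a) * a**(k - 1) for k >= 1.
    """
    total = a              # |c_0|
    coeff = 1.0 - a * a    # |c_1|
    power = r              # r**1
    for _ in range(1, terms):
        total += coeff * power
        coeff *= a
        power *= r
    return total


if __name__ == "__main__":
    for a in (0.1, 0.5, 0.9, 0.99):
        print(a, bohr_majorant(a, 1.0 / 3.0), bohr_majorant(a, 0.34))
```

At $r = 1/3$ the sum stays below 1 for every $a$ tried, while at $r = 0.34$ it exceeds 1 once $a$ is close enough to 1, which is consistent with the claim that no larger radius will do.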
In one part of the proof, we adapted to higher dimensions an elegant argument which Bohr attributed to Wiener. Since Bohr mentioned this name in the same sentence with the names of Riesz and Schur, we assumed it to be the famous Norbert Wiener, and we added the initial "N" in our attribution. Our assumption was false. Lawrence Zalcman brought to our attention that Edmund Landau mentioned the name of one F. Wiener in connection with Bohr's theorem [9, §4].

Wiener's Life
Column Editor's address: Faculty of Mathematics, The Open University, Milton Keynes, MK7 6AA, England
Having never heard of a mathematician F. Wiener, we investigated. We report here on what information we have discovered about the life and work of F. Wiener, hoping that his name may be preserved in mathematical history for another generation. One thing we did not find is any record of a relationship to Norbert Wiener.
According to the curriculum vitae accompanying his dissertation, Friedrich Wilhelm Wiener was born in 1884 in Meseritz, then part of the Prussian province of Posen and now part of Poland. After completing high school (gymnasium), he pursued studies in Göttingen. After a year of compulsory military service in 1904-1905, he resumed studies in Berlin. He returned to Göttingen in 1909, the same year that Landau was called there as Minkowski's successor. Wiener attended lectures of such famous mathematicians as Frobenius, Hilbert, Landau, Schottky, Schur, and Schwarz. He completed his doctoral dissertation [15] under the supervision of Landau in 1911.
Wiener published one journal article [14] in 1910, which is cited in standard books [7, 11]. After a promising beginning, he seems to have published nothing further, not even his dissertation.
There is no evidence that Wiener was ever a member of the Deutsche Mathematiker-Vereinigung (DMV); no obituary notice for Wiener appeared in the DMV Jahresbericht [10]. Although we do not know the circumstances of Wiener's death, this must have occurred no later than 1921, as the index published that year to volumes 51-80 of Mathematische Annalen lists Wiener as deceased. We conjecture that Wiener may have been a casualty of the war.

Wiener's Work
The focus of Wiener's mathematical work was to discover simple proofs of known theorems. Both of his papers have the word "elementary" in the title.

Hilbert's Inequality
Wiener's 1910 paper concerns Hilbert's double-series theorem stating the boundedness in $\ell^2$ of the quadratic form $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} x_m x_n/(m + n)$.
THEOREM 2 (Hilbert).
\[ \Big| \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{x_m x_n}{m + n} \Big| \le C \sum_{n=1}^{\infty} |x_n|^2. \]
Moreover, the inequality holds with $C = \pi$, and no smaller value of the constant $C$ will do.
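A quick numerical experiment makes the role of the constant $\pi$ tangible. The sketch below evaluates finite sections of the double series for real vectors and compares them with $\sum x_n^2$; since a finite vector can be padded with zeros, the ratio can never exceed $\pi$. The function names and the test vectors are illustrative choices, and the computation is no substitute for any of the proofs discussed here.

```python
import math
import random


def hilbert_form(x):
    """Finite section of Hilbert's form: sum of x_m * x_n / (m + n), m, n = 1..N."""
    size = len(x)
    # list indices start at 0, so index i stands for m = i + 1
    return sum(x[i] * x[j] / (i + j + 2) for i in range(size) for j in range(size))


def hilbert_ratio(x):
    """|form(x)| divided by the sum of squares; Theorem 2 says this is at most pi."""
    norm_sq = sum(t * t for t in x)
    return abs(hilbert_form(x)) / norm_sq


if __name__ == "__main__":
    random.seed(0)
    random_vector = [random.uniform(-1.0, 1.0) for _ in range(50)]
    slow_decay = [n ** -0.5 for n in range(1, 201)]
    print(hilbert_ratio(random_vector), hilbert_ratio(slow_decay), math.pi)
```

Slowly decaying sequences such as $x_n = n^{-1/2}$ push the ratio upward as the section grows, which hints at why no constant smaller than $\pi$ can work.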
Hilbert's proof was first published in the dissertation [13] of his student Hermann Weyl in 1908. The theorem attracted a great deal of attention, and numerous proofs and generalizations were published subsequently. The classical book by Hardy, Littlewood, and Pólya [7] devotes a whole chapter to this inequality.
© 2000 SPRINGER-VERLAG NEW YORK, VOLUME 22, NUMBER 2, 2000
At the time of Wiener's work, it was not known that the sharp value of the constant $C$ is $\pi$. Schur proved this the following year [12].
What Wiener meant by an "elementary" proof of Hilbert's inequality was a proof that used no integration and no function theory. His proof consists of the following elementary steps:
1. Reduce to the case that $\{x_n\}_{n=1}^{\infty}$ is a decreasing sequence of positive real numbers.
2. Group the terms in the inner sum into blocks whose terms have indices running between consecutive squares.
3. Apply the Cauchy-Schwarz inequality to both the inner sum and the outer sum.
4. Interchange the order of summation.
5. Invoke Cauchy's condensation test for convergence of series.

Wiener's Dissertation
In his dissertation, Wiener addresses two questions in the theory of entire functions of one complex variable. The first part of the dissertation concerns the minimum modulus of an entire function $f$. Let $m(r) = \min\{ |f(re^{i\theta})| : 0 \le \theta \le 2\pi \}$. Since $m(r)$ is zero when $f$ has a zero of modulus $r$, the natural question to ask about a lower bound for $m(r)$ is whether $m(r)$ is frequently large: Is there some reasonable comparison function $c(r)$ such that $\limsup_{r\to\infty} m(r)/c(r) > 0$?
If $f$ is an entire function of finite order at most $\rho$, meaning that $\lim_{|z|\to\infty} |f(z)| e^{-|z|^{\rho+\varepsilon}} = 0$ for every positive $\varepsilon$, then Hadamard's factorization theorem implies that $\limsup_{r\to\infty} m(r) e^{r^{\rho+\varepsilon}} = \infty$ for every positive $\varepsilon$. In other words, $m(r)$ cannot tend to zero too fast. This weak estimate cannot be improved in general. For example, the exponential function $e^z$ has order 1 and $m(r) = e^{-r}$.
On the other hand, if $f$ is a nonconstant polynomial, then $m(r)$ tends to infinity like a power of $r$. The question arises whether an entire function of sufficiently small order is enough like a polynomial that its minimum modulus must be unbounded. In 1905, A. Wiman confirmed [16] that the minimum modulus of every nonconstant entire function of order $\rho$ strictly less than $1/2$ is indeed unbounded. Moreover, $\limsup_{r\to\infty} m(r) e^{-r^{\rho-\varepsilon}} = \infty$ when $0 < \varepsilon < \rho < 1/2$. The cutoff at $1/2$ is sharp, for the convergent infinite product $\prod_{n=1}^{\infty} (1 - z/n^2)$, which equals $(\sin \pi\sqrt{z})/(\pi\sqrt{z})$, has order $1/2$ and $m(r) \le r^{-1/2}/\pi$. (See [5, Chap. 3] for more about the minimum modulus of entire functions of small order.)
Wiener's dissertation gives a new proof of Wiman's theorem. The proof is elementary in the sense that it uses only arguments about series and products of real numbers; it avoids using theorems from function theory.
Wiener's proof even supplements Wiman's theorem by giving some information in the endpoint cases $\rho = 0$ and $\rho = 1/2$; namely, Wiener shows that if $f(z) = \prod_{n=1}^{\infty} (1 - z/a_n)$, where $\{a_n\}_{n=1}^{\infty}$ is a sequence of nonzero complex numbers of increasing modulus, and if $\lim_{n\to\infty} n^2/|a_n| = 0$, then $\limsup_{r\to\infty} m(r) r^{-k} = \infty$ for every positive $k$. This result applies to all transcendental entire functions of order 0 [e.g., to $\prod_{n=1}^{\infty} (1 - z/n^n)$] and to some entire functions of order $1/2$ [e.g., to $\prod_{n=2}^{\infty} (1 - z/(n^2 \log n))$].
The second part of Wiener's dissertation is motivated by a theorem of Landau [8] that generalizes Picard's little theorem.
THEOREM 3 (Landau). There is a positive function $R$ such that every polynomial of the form $a_0 + z + a_2 z^2 + \cdots + a_n z^n$ assumes at least one of the values 0 and 1 in the disk $\{ z : |z| \le R(a_0) \}$.
One might hope that a theorem about polynomials would have an elementary proof, which would then yield an elementary proof of Picard's theorem. Wiener was able to find an elementary proof (using Rouché's theorem, but nothing else from function theory) of Landau's theorem under an additional hypothesis about the location of the zeros of the polynomial; namely, he assumed that the zeros are located within the two equal acute angles determined by two lines intersecting at the origin. If the radian measure of the acute angle is $\tfrac{1}{2}\pi - \beta$, then one can take $R(a_0) = 28 |a_0| \log |a_0| / \sin \beta$. (The cases $a_0 = 0$ and $a_0 = 1$ are of no concern, because then the polynomial takes the value 0 or 1 at the origin.)

ACKNOWLEDGMENTS
For assistance in this project of identifying and tracing F. Wiener, we thank Samuel J. Patterson (Georg-August-Universität Göttingen), Constance Reid, Heinrich Wefelscheid (Gerhard-Mercator-Universität Gesamthochschule Duisburg), and Lawrence A. Zalcman (Bar-Ilan University). We are especially indebted to Professor Wefelscheid for locating and sending us a copy of Wiener's dissertation. We thank Heidemarie Wörmann Boas for help with German translation. The authors' research was partially supported by grants from the National Science Foundation.

REFERENCES
1. L. Aizenberg, A. Aytuna, and P. Djakov, "An abstract approach to Bohr's phenomenon," Proc. Am. Math. Soc. (in press).
2. L. Aizenberg, A. Aytuna, and P. Djakov, "Generalization of Bohr's theorem for arbitrary bases in spaces of holomorphic functions of several variables," 1998, preprint.
3. L. Aizenberg, "Multidimensional analogues of Bohr's theorem on power series," Proc. Am. Math. Soc. (in press).
4. H. P. Boas and D. Khavinson, "Bohr's power series theorem in several variables," Proc. Am. Math. Soc. 125(10) (1997), 2975-2979.
5. R. P. Boas, Entire Functions, Academic Press, New York, 1954.
6. H. Bohr, "A theorem concerning power series," Proc. London Math. Soc. (2) 13 (1914), 1-5.
7. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, Cambridge University Press, Cambridge, 1934; second edition 1952.
8. E. Landau, "Über eine Verallgemeinerung des Picardschen Satzes," Sitzungsber. Königl. Preuss. Akad. Wissensch. (1904), 1118-1133.
9. E. Landau, Darstellung und Begründung einiger neuerer Ergebnisse der Funktionentheorie, Springer-Verlag, Berlin, 1916, second edition 1929, third edition 1986 edited and supplemented by Dieter Gaier.
10. Deutsche Mathematiker-Vereinigung, Nachrufe, an index of the obituary notices from the DMV Jahresbericht is available on the World-Wide Web at the address http://www.mathematik.uni-bielefeld.de/DMV/archiv/nachrufe.html.
11. G. Pólya and G. Szegő, Problems and Theorems in Analysis, Springer-Verlag, Heidelberg, 1972, Vol. 1.
12. I. Schur, "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen," J. Reine Angew. Math. 140 (1911), 1-28.
13. H. Weyl, "Singuläre Integralgleichungen mit besonderer Berücksichtigung des Fourierschen Integraltheorems," Ph.D. thesis, Göttingen, 1908.
14. F. Wiener, "Elementarer Beweis eines Reihensatzes von Herrn Hilbert," Math. Ann. 68 (1910), 361-366.
15. F. W. Wiener, "Elementare Beiträge zur neueren Funktionentheorie," Ph.D. thesis, Georg-August-Universität, Göttingen, 1911.
16. A. Wiman, "Sur une extension d'un théorème de M. Hadamard," Ark. Math. Astron. Fys. 2(14) (1905), 1-5.

AUTHORS
HAROLD P. BOAS
Department of Mathematics
Texas A&M University
College Station, TX 77843-3368 USA
e-mail: [email protected]

Harold P. Boas received his A.B. and S.M. from Harvard in 1976 and his Ph.D. from the Massachusetts Institute of Technology in 1980. After 4 years as a J.F. Ritt Assistant Professor at Columbia University in New York, he joined the faculty of Texas A&M University in College Station in 1984. He and his collaborator, Emil J. Straube, shared the 1995 Stefan Bergman Prize for their research on function theory in multidimensional complex space. Currently, he is lost in cyberspace.

DMITRY KHAVINSON
Department of Mathematical Sciences
University of Arkansas
Fayetteville, AR 72701 USA
e-mail: [email protected]

Dmitry Khavinson was born in Moscow, Russia. He obtained his M.S. from Moscow State Pedagogical Institute in 1978 and his Ph.D. from Brown University in 1983. Since 1983, he has been teaching at the University of Arkansas in Fayetteville. He has been a visiting professor at the Royal Institute of Technology (Stockholm), the Technion (Haifa), University of Michigan, University of Alabama, and Universidad de La Laguna (Tenerife). Since 1991, he has served as an Associate Editor for the international journal Complex Variables.
i;i§lil§i:{j
.J et W i m p , Ed itor
I The reader may recall that
The Intelligencer ran in its Spring 1999 issue a review The Man Who
by Marion Cohen of Paul Hoffman's biography of Paul Erdos,
Loved Only Numbers. Here follow some observations about the related biogra phy, M y Brain Is Open, penned by a well-known mathematician who knew Erdos intimately.
impressed me. I remember him more
My Brain Is Open: The Mathematical Journeys of Paul Erdos
for his eccentricities, which were ap parent even then. sixties
US
Feel like writing a review for The Mathematical Intelligencer? You are welcome to submit an unsolicited review of a book ofyour choice; or, if
$25.00;
ISBN
0684846357
As a teenager in the
was hard not to admire
Erdos's social iconoclasm. He would leave the table in the middle of meals
by Bruce Schechter
NEW YORK: SIMON & SCHUSTER,
it
and wander about or fall asleep in front of guests invited specifically to meet
1 998. 224 pp.
him. (In retrospect this seems more the behavior of a depressed man than
REVIEWED BY PETER BORWEIN
charming eccentricity.)
Some thoughts evoked on reading ''My Brain Is Open". . . .
ity well, particularly when it is accom
I
Mathematicians tolerate eccentric panied by serious ability. My mother would talk of "earned eccentricity,"
remember the black shoes and the
and perhaps no one had earned his ec
dark suit of a blind Russian. I was five.
centricity more than Erdos. My wife is
I like to think it was Pontryagin. It might
less tolerant. Eccentricity, even ac
have been. We were on a family cruise
companied by serious ability, is not an
associated with the Edinburgh Inter
excuse for rudeness, in her eyes. My
you would welcome being assigned
national Congress of
I am from a
suspicion is that the entire community
a book to review, please write us,
mathematical family. My father, a math
has become less tolerant in this re
telling us your expertise and your predilections.
1958.
ematical grandchild of Hardy, was a lec
spect as fashions change and industrial
turer in St. Andrews in
sponsorship becomes more important.
this
period. I
must have met Erdos around that time
Erdos had a profound influence on
or shortly afterwards. He visited our
my career. My first paper, while still a
family in St. Andrews several times
student, was originally titled
and on many more occasions after we
terexample to a question of Erdos."
Somewhere
(This was changed by a referee who didn't seem to like an Erdos conjecture
1963.
that followed my sister's birth in
1961.
career, though I don't actually know,
tricks of dexterity with amphetamine
but it really wasn't a very good ques
bottles (or anything else). I do re
tion he posed and it didn't have a very
member him visiting the
difficult answer.
Copsons'
wearing sandals without socks-more
In subsequent years, I have been
shocking in the innocent fifties in
consumed by some questions of Erdos
Scotland than juggling amphetamines.
of far more substance. Beautiful ques
Mrs. Copson, now in her nineties, was
tions
the wife of E. T. Copson, a first-rate
Number Theory, and Probability, ques
complex
Philadelphia, PA 1 91 04 USA.
76
being wrong.) This probably helped my
I don't remember Erdos performing
analyst
and
chair
in
St.
Andrews. She was also the daughter of
of Mathematics, Drexel University,
coun
we still have the "What is e3?" postcard
moved to Canada in
Column Editor's address: Department
"A
tions
on
the
that
progress for
border
have
of
resisted
Analysis, serious
40 years.
Sir Edmund Whittaker, the most influ
Erdos's problems were a bit like his
ential Scottish mathematician of the
papers, they rather vomited forth and
first half of the century.
had to be distilled later.
Some
of
Erdos's legendary ability to charm
Erdos's work is wonderful, and some
children seems not to have particularly
isn't. If it was actually written by Erdos
THE MATHEMATICAL INTELLIGENCER © 2000 SPRINGER·VERLAG NEW YORK
himself, it probably wasn't very well written. His problems shake themselves out by either surviving or not surviving. Presumably, a similar process will sort the wheat from the chaff with respect to his papers.
I met Erdos many times over the years. I admired and liked him both personally and professionally. At one meeting in Urbana in the late eighties I ended up, after a long day of meetings, at a party with him. A group had been chatting with Erdos, until eventually only I survived. Erdos's vision was badly compromised at this time by cataracts. After a while he asked me where I was from. I replied "Halifax." He said, "You must know my good friend Peter Borwein." I said I did; and the conversation ambled on. Bruce Schechter, in his very readable biography, recounts a similar anecdote and attributes it to a Vancouver mathematician Elliot Mendelson (who, to the best of my knowledge, does not exist, at least not as a Vancouver mathematician). This is a small quibble - the flavor of the anecdote is exact.
I liked Schechter's portrait of Erdos. It is something of a caricature, but then so was Erdos. Equally, Schechter's attempts to explain some of the mathematics are much more successful than average for a work of this type. The question of Erdos's place in history is left hanging (though the account of the Erdos-Selberg controversy is well handled and illuminating).
Erdos's place in twentieth-century mathematics will not be clarified for quite a while. I suspect he will end up being a very significant figure. My model of the progress of mathematics weights the relative importance of the tentative beginnings of new fields over the definitive closure of old ones. Some of the traditional elite in pure mathematics have trouble positioning Erdos in the mathematical firmament. Schechter alludes to this and offers some comments of Saunders MacLane on the topic. One Fields medalist is reported to have described Erdos as a "brilliant talent fractured into a thousand pieces." In some sense this is fair, but, of course, a different Erdos would have had a different career.
I like to compare Erdos to Picasso. Both were
prodigiously productive, both have their easy pieces, both were masters of many techniques but not the undisputed leaders of any, and both are perhaps better respected and better understood dead. The comparison is not of course perfect. Picasso and Erdos were very different types of people.
Some of Erdos's eccentricity was studied. (Schechter also alludes to this.) I recall driving around Budapest in a cab with Erdos while he lined up some amphetamines. It didn't pose any problem. But certainly not all of Erdos's peculiarities were assumed.
Many mathematicians are peculiar. The handbook of the American Psychiatric Association has this entry:
301.40 COMPULSIVE PERSONALITY DISORDER
Differential Diagnosis. Obsessive Compulsive Disorder.
Diagnostic Criteria. At least four of the following are characteristic of the individual's current and long-term functioning, are not limited to episodes of illness, and cause either significant impairment in social or occupational functioning or subjective distress:
(1) restricted ability to express warm and tender emotions, e.g., the individual is unduly conventional, serious and formal, and stingy;
(2) perfectionism that interferes with the ability to grasp "the big picture," e.g., preoccupation with trivial details, rules, order, organization, schedules, and lists;
(3) insistence that others submit to his or her way of doing things and lack of awareness of the feelings elicited by this behavior, e.g., a husband stubbornly insists his wife complete errands for him regardless of her plans;
(4) excessive devotion to work and productivity to the exclusion of pleasure and the value of interpersonal relationships;
(5) indecisiveness: decision-making is either avoided, postponed, or protracted, perhaps because of an inordinate fear of making a mistake, e.g., the individual cannot get assignments done on time because of ruminating about priorities.
Most mathematicians fit at least several of these. Most of us have colleagues that fit more. "Attention to detail" is of some considerable utility to mathematicians. I am not particularly suggesting that Erdos fits this bill (he was anything but stingy), but probably he fits some bill. He might well even have been "curable" - though whether either the world or Erdos would have been happier or better off for this is open to question.
However one views Erdos, one must concede that he was an outlier. He is an author on 1530 papers according to Math Reviews. Like Joe DiMaggio's hits in consecutive games, this is going to be a hard record to break. The next closest contenders appear to be Richard Bellman and Saharon Shelah, each with between 500 and 600. Erdos's name occurs in the titles of 813 papers by others, quite probably also a record for an (almost) living author, though one should compare this to Hilbert at 8291. The numbers don't tell the whole story, but they do give some quantification to the extraordinary extent to which Erdos touched the mathematical community.
Centre for Experimental and Constructive Mathematics
Simon Fraser University
Burnaby, BC V5A 1S6
Canada
e-mail:
[email protected]
The Applicability of Mathematics as a Philosophical Problem
by Mark Steiner
CAMBRIDGE, MA. HARVARD UNIVERSITY PRESS, 1998. viii + 256 PP.
US$39.95, ISBN 0-674-04097-X (alk. paper)
REVIEWED BY RICHARD CONN HENRY
How large, and how significant a problem dare you tackle in your research?
VOLUME 22, NUMBER 2, 2000
Astronomers (such as myself) have it easy: there are lots of easily answerable questions to be asked, and, in addition, astronomy has produced some big answers. Having your cake and eating it too! However, the danger for astronomers is that it is all too easy to disappear into the endless small tractable publishable questions.
Some career choices are for bolder people: philosophers and theologians restrict themselves to big questions; questions so big, in fact, that they do not yield the same kind of progressive advance on which less ambitious investigators pride themselves. Or, at least, they have not so far.
Yet in this review, we find an astronomer commenting on a book by a philosopher. How can that be? It must be that the astronomer at some point gathered his courage, and went where astronomers are overwhelmingly too wise to go! Just so. In 1975, after a two-year stint as Deputy Director of the Astrophysics Division of NASA, I returned to The Johns Hopkins University, entirely cured of administritis, and ready to continue research in astronomy - and the teaching of physics.
Five years of teaching quantum mechanics - at the beginning, thinking (even as I "explained" it to the students), "What the hell am I talking about"; but at the end, wishing Feynman were alive and I could claim to him, "I understand quantum mechanics," and see what he said.
I felt like writing a book! However, not being that sure that I actually did understand quantum mechanics, I thought I'd better publish a paper, not a book: "Quantum Mechanics Made Transparent," in the American Journal of Physics, November 1990. The paper was a joy to write (it was all in my head already), and a joy to work on, with the aid of four referees. Three referees dropped out early, but the final referee worked with me long and hard, for which I am very grateful.
Then I sat back, expecting and hoping to be punched in the nose. But nothing happened. Nine years passed. Then, just a few weeks ago, an agent walked into my office and asked if I wanted to write a book. In a few short weeks I was (and am) under contract, and in 2000 you will be able to read The Universe Does Not Exist, by Richard C. Henry.
Before signing the contract, I went to the Science Citation Index. Not a single citation of my paper in nine years. A few days after the contract was signed, an e-mail came from Jet Wimp asking if I would write the present book review. Anything for a laugh: yes, of course! A few days later, I looked in the book's index. My God! Turn to page 177: ". . . the following "derivation" of quantum mechanics, which was inspired by, and resembles, that of Henry 1990 . . ." Referred to at last, and by a Professor of Philosophy at the Hebrew University of Jerusalem!
So, what is Mark Steiner's book about? "My claim is that an anthropocentric policy was a necessary factor in discovering today's fundamental physics . . . that ours appears to be an intellectually 'user-friendly' universe, a universe which allows our species to discover things about it - I mean this claim to stand as an empirical hypothesis, and as the conclusion of this book."
Steiner is practicing metaphysics, the investigation of what lies behind the so-called laws of physics. Well, to let the cat out of the bag, so am I practicing metaphysics. With reference to my paper on QM (the only substantial paper on the subject that I have ever published), Steiner says, "Henry's aim was completely different from mine. His treatment was meant for the classroom, to persuade students that QM is 'inevitable.' Needless to say, I dissociate myself from that goal." AJP is a pedagogical journal, and indeed, pedagogy was one of my aims. But in fact the paper is a perfect example of Steiner's own thesis, which he expresses thus: "my goal in this book is to show in what way scientists have quite recently and quite successfully adopted an anthropocentric point of view in applying mathematics."
The genesis of my QM paper was anthropocentric. (No, better to say noöcentric; I will come back to this in a moment.) Having taught special relativity, I was overwhelmed with its simple logic and its inevitability. So, I made my first sally at changing the world: "Special Relativity Made Transparent" (1985), in which I first made use of Galileo's gang of three (Sagredo, etc.). To me, now, SR was not the least bit mysterious, but instead was necessary and inevitable: if you want what we call time in your Universe, there is no other way. And so, reasoning by mathematical analogy, I deduced that, improbable as it seemed at first blush, QM must be vastly simple, and must be inevitable. I set out to show that it is. If I succeeded, it may reinforce Steiner's thesis.
But was I successful? I am still by no means sure! I found no new physics (not that I was trying to, or expected to), and I knew the answer (QM) before I started. Now, when I am setting a physics test for students, I avoid questions of the type "given that . . . blah, blah, blah . . . show that A = B," because it is such hard work to grade: every student will write some stuff, followed by "therefore, A = B." I've done it myself, as an undergraduate. What I am worried about still, with my derivation of QM from scratch, is that knowing the answer, I of course darn well got there.
The reader is invited to check my story. The test is the following: if you could time-travel, and could visit with Newton in his old age, could you, guided by the strategy of my paper, induce Newton, by means of the Socratic method, to derive QM? My claim is that you could. (You would have to teach Ike matrices first.)
While I am worried that my approach may contain circular reasoning or worse, I am not worried about the fundamental idea. Someone of a more rigorous cast of mind is invited to redo my work better. But that it can be done, I have no shard of doubt. My introductory material is right (I will not recapitulate it here), and QM itself is surely inevitable, independent of my paper. Steven Weinberg himself, it seems to me, established that fact by attempting to show that QM as we know it was not inevitable (Weinberg 1989), and failing (Weinberg 1992).
I have more than once read remarks (e.g., Weinberg 1992) about how the world could have been classical, but is not. That seems to me to be not so; the world surely could not have been classical. QM is the inevitable result of simple symmetries. While the existence of the Universe is deeply mysterious, QM
itself is not the least bit mysterious (Henry 1989).
My quarrel with Steiner is on two counts, on one of which I am much less radical than he is, and on one of which I am much more radical. It is clear, if somewhat puzzling (given his emphasis on anthropocentricism), that Steiner rejects what he calls "metaphysical Pythagoreanism, which simply identifies the Universe . . . with mathematical objects or structures." In these terms, I am a metaphysical Pythagorean. Indeed, I regard the case for this as overwhelming, and I regard society as being in a metastable situation with regard to accepting the fact. That is where I am (at the moment) more radical than Steiner.
I am much more conservative than Steiner in that I see no evidence for a special role for the human species, just for life. Steiner himself refers to "minds like the human mind, if there are any." He is clearly referring to other worlds, but if he looks about, he will find elephants and cats. Their mathematics is rudimentary, but the differences between their minds and ours are small.
Steiner argues for the criterion of beauty in mathematics as anthropomorphic. But there is ugly mathematics (four-color theorem) that is correct. One can argue that a powerful selection effect is at work; mathematicians find it much easier to find the beautiful (typically, symmetric) theorems, than to find the ugly-but-true theorems. Also, to suggest that appreciation of what we call beauty is specifically human seems to me to be wrong; for example, we humans are at one with both the bees and the butterflies with regard to the beauty of the flowers.
Steiner shows how correct physics follows even from our notation, e.g. from Taylor series. Indeed, I was much struck, at first, how the Gaussian distribution arises so directly from a Taylor series expansion of ln(P). Why ln, I thought? Why not something else? But of course you can expand P^(1/10) if you like; you will just have to include more terms to get as good an approximation to the Binomial Distribution, which is all that the Gaussian is, as you got with ln(P).
Steiner's book is very scholarly, but I am pleased to see joy and excitement leak in: "the consequences are startling . . . in order for angular momentum to be in the same Hilbert space as the other quantities, it must be quantized!" This is one of the most glorious things that I
know. I think we need a more Hasidic physics: Sing the Torah of physics! Dance to express your joy! Steiner quotes Peirce extensively; Peirce clearly felt the vibes of the Universe in his bones. Physics needs a Blake, someone who can fill our children with the power and the beauty of mathematical physics.
REFERENCES
R. C. Henry, "Special Relativity Made Transparent," The Physics Teacher, December, 536, 1985.
R. C. Henry, "Teaching QM: True, Trivial, Inevitable," in Bell's Theorem, Quantum Theory and Conceptions of the Universe, ed. M. Kafatos, 175, Kluwer Academic Publishers, 1989.
R. C. Henry, "Quantum Mechanics Made Transparent," Amer. J. of Physics, 58, 1087, 1990.
Weinberg, S., "Precision Tests of Quantum Mechanics," Phys. Rev. Letters 62, 485, 1989.
Weinberg, S., Dreams of a Final Theory, New York: Pantheon Books, 237, 1992.
Henry A. Rowland Department of Physics and Astronomy
The Johns Hopkins University
Baltimore, MD 21218-2686
USA
e-mail: [email protected]
Robin Wilson
Mathematics and Art II: Albrecht Dürer
Florence Fasanelli and Robin Wilson
Albrecht Dürer (1471-1528) traveled to Italy as a young man to learn the secrets of perspective, probably from Luca Pacioli. A mathematician with much original work to his credit including new curves, perspective machines, the nets with which to build polyhedra and some new polyhedra, Dürer also wrote for the practitioner on how to use mathematics in construction and design. These books were copied widely over 300 years and inspired much later mathematics. In St. Jerome in his Cell (1513-14), one of the three most famous of Dürer's engravings, we see an example of exact geometric perspective with an approach which is distinctly different from that used in Italy. The extreme shortness of the perspective distance indicates that the room would be at most four feet long if actually constructed. The vanishing point is 1/4-inch from the right edge, and as a consequence of this placement the picture seems cosy and uncramped. Each object is placed with perspective in mind, and so is orthogonal to the picture, parallel to it, or at a 45° angle. Note the slippers on the left: one is parallel to the wall and the other is perpendicular to it.
The celebrated, studied, and often copied engraving, Melancolia I, on a miniature sheet from Aitutaki, contains more than two dozen objects for us to interpret. Possibly the concern of the dark androgynous winged figure is the quadrature of the circle: note the sphere, compass, and polyhedron which is truncated so we can see four faces. Agrippa's 4×4 magic square (which Dürer probably learned of from Pacioli) is affixed to the alchemists' tower behind the figure. The date of the engraving, 1514, appears in the bottom line of the magic square.
Florence Fasanelli
Mathematical Association of America
1529 18th Street NW
Washington, D.C. 20036
USA
e-mail:
[email protected]
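The remark about the magic square in Melancolia I can be checked directly. What follows is a minimal sketch in Python, not part of the column itself: the sixteen entries are supplied here from the engraving as it is commonly reproduced (they are not listed in the text above), and the function names are purely illustrative.

```python
# Entries of the 4x4 magic square as commonly reproduced from the engraving.
# ASSUMPTION: these values come from the engraving, not from the column text.
DURER_SQUARE = [
    [16, 3, 2, 13],
    [5, 10, 11, 8],
    [9, 6, 7, 12],
    [4, 15, 14, 1],
]


def line_sums(square):
    """Return the sums of every row, every column, and both main diagonals."""
    n = len(square)
    rows = [sum(row) for row in square]
    cols = [sum(square[r][c] for r in range(n)) for c in range(n)]
    diag = sum(square[i][i] for i in range(n))
    anti = sum(square[i][n - 1 - i] for i in range(n))
    return rows + cols + [diag, anti]


def is_magic(square):
    """True if all rows, columns, and main diagonals share a single sum."""
    return len(set(line_sums(square))) == 1


def bottom_row_date(square):
    """Read the two middle entries of the bottom row together as a year."""
    bottom = square[-1]
    return int(f"{bottom[1]}{bottom[2]}")


if __name__ == "__main__":
    print(is_magic(DURER_SQUARE), bottom_row_date(DURER_SQUARE))
```

Under these assumed entries every row, column, and main diagonal sums to 34, and the two middle entries of the bottom row, 15 and 14, read together as 1514.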
Albrecht Dürer self-portrait
Dürer's monogram
Albrecht Dürer self-portrait
Albrecht Dürer self-portrait
Please send all submissions to the Stamp Corner Editor, Robin Wilson,
Faculty of Mathematics,
The Open University, Milton Keynes, MK7 6AA, England e-mail:
[email protected]
St Jerome and the Lion
THE MATHEMATICAL INTELLIGENCER © 2000 SPRINGER-VERLAG NEW YORK
Melencolia I